Newsletter - June 2025
Monthly Newsletter
Research Data Management at Universities: Storage Solutions
Disclaimer: The views expressed here are my own in my professional capacity as a Data Steward and Research Data Manager. They do not necessarily reflect the official opinions or positions of the Novo Nordisk Foundation Center for Stem Cell Medicine, reNEW.
Introduction
Universities play a critical role in generating and managing vast and complex research datasets, from social science surveys to high-resolution biomedical imaging and genomic sequencing.
Effective Research Data Management (RDM) is essential to:
Ensure the accuracy and reproducibility of research outcomes.
Protect the security and integrity of datasets.
Facilitate accessibility and long-term preservation.
Meet institutional, legal, and funder compliance requirements.
As projects scale in size, interdisciplinarity, and technical sophistication, universities must provide sustainable, equitable, and technically robust storage solutions to support their research communities.
Key Challenges in University Research Data Storage
1. Multidisciplinary Demands
Infrastructure must accommodate a wide range of data types and sizes, from large imaging datasets to compact text-based files.
2. Variable Data Lifespans
Some datasets are short-lived, while others, such as longitudinal studies, must be preserved for decades.
3. Collaborative Workflows
International, cross-institutional collaborations require secure, high-availability sharing mechanisms.
4. Compliance and Ethics
Sensitive data, particularly in human subjects research, must comply with strict legal and ethical standards.
Best Practices and Solutions
On-Premise Storage
High-speed, institution-managed data centers for analysis-intensive work.
Cloud Storage
Flexible, scalable solutions such as AWS S3, Google Cloud Storage, or Microsoft Azure for global collaboration.
Hybrid Models
Combining local high-speed access with cloud scalability for cost-performance balance.
Data Repositories
Institutional or domain-specific platforms that ensure discoverability, metadata richness, and FAIR compliance.
Secure Storage for Sensitive Data
Advanced encryption, access controls, and auditing for clinical or personal data.
Data Archiving
Low-cost, long-term storage for data that is infrequently accessed but must be preserved.
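As a rough illustration of how "infrequently accessed" data might be identified before moving it to an archive, the sketch below walks a directory tree and lists files that have not been modified for a given number of days. The function name `find_archive_candidates` and the one-year default are assumptions made for this example, not part of any institutional tooling, and modification time is used only as a simple stand-in for access frequency.

```python
import time
from pathlib import Path


def find_archive_candidates(root, min_idle_days=365, now=None):
    """Return (path, size_bytes) pairs for files untouched for `min_idle_days`.

    Hypothetical helper: the threshold and the use of modification time
    are illustrative assumptions, not an institutional policy.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    candidates = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            stat = path.stat()
            # Modification time is used because access time (atime) is
            # often disabled or unreliable on shared file systems.
            if stat.st_mtime < cutoff:
                candidates.append((str(path), stat.st_size))
    return candidates
```

Summing the sizes in the returned list gives a first estimate of how much data could leave active storage if an archival option were available.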
Local Context: UCPH IT Storage Costs
At the University of Copenhagen (UCPH), recent changes have introduced direct storage charges for research groups.
Current state:
Only active storage tiers are available; no true Tier 3 cold storage exists for archival data.
All storage is priced equally, regardless of access frequency.
Implications:
Cost Equity: Large archival datasets incur disproportionately high costs, disadvantaging underfunded projects.
No Low-Cost Archival Option: Data for archival storage remains in higher-cost active tiers.
Risk to Preservation: Without affordable options, researchers may delete valuable data prematurely, undermining reproducibility and Open Science commitments.
Practical Recommendations
Implement Tiered Storage Models: Include genuine low-cost archival storage for rarely accessed data.
Align Pricing with Usage: Match costs to access frequency and storage volume.
Invest in Institutional Infrastructure: Develop scalable, secure, and researcher-friendly storage systems.
Protect Research Assets: Ensure archival-quality storage to prevent loss of irreplaceable datasets.
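To make the first two recommendations concrete, here is a minimal sketch comparing a flat price with a two-tier model in which rarely accessed datasets move to a cheaper archive tier. All prices, the access threshold, and the example datasets are hypothetical values chosen for illustration. They are not actual UCPH rates.

```python
from dataclasses import dataclass

# Hypothetical prices in currency units per TB per year (not real UCPH rates).
FLAT_PRICE = 100.0
TIER_PRICES = {"active": 100.0, "archive": 20.0}


@dataclass
class Dataset:
    name: str
    size_tb: float
    accesses_per_year: int


def assign_tier(ds, archive_threshold=4):
    """Place datasets accessed fewer than `archive_threshold` times a year in the archive tier."""
    return "archive" if ds.accesses_per_year < archive_threshold else "active"


def annual_cost_flat(datasets):
    """Every terabyte costs the same, regardless of access frequency."""
    return sum(ds.size_tb * FLAT_PRICE for ds in datasets)


def annual_cost_tiered(datasets):
    """Each dataset is charged at the price of its assigned tier."""
    return sum(ds.size_tb * TIER_PRICES[assign_tier(ds)] for ds in datasets)


# Illustrative datasets: one large, rarely touched archive and one active project.
datasets = [
    Dataset("imaging_2019", size_tb=40.0, accesses_per_year=1),
    Dataset("active_sequencing", size_tb=10.0, accesses_per_year=200),
]

print(annual_cost_flat(datasets))    # 5000.0
print(annual_cost_tiered(datasets))  # 1800.0
```

With these assumed numbers, the large archival dataset accounts for most of the bill under flat pricing, which is the cost-equity problem described above. A tiered model shifts that volume to the cheaper tier.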
Looking Ahead
As data volumes continue to grow, sustainable, equitable, and well-structured storage strategies will be vital to preserving research integrity and enabling discovery. Institutions that balance performance, security, and cost will be best positioned to lead in the era of Open Science.