Newsletter - June 2025

Monthly Newsletter

Research Data Management at Universities: Storage Solutions

Disclaimer: The views expressed here are my own in my professional capacity as a Data Steward and Research Data Manager. They do not necessarily reflect the official opinions or positions of the Novo Nordisk Foundation Center for Stem Cell Medicine – reNEW.

Introduction

Universities play a critical role in generating and managing vast and complex research datasets, from social science surveys to high-resolution biomedical imaging and genomic sequencing.

Effective Research Data Management (RDM) is essential to:

  • Ensure the accuracy and reproducibility of research outcomes.

  • Protect the security and integrity of datasets.

  • Facilitate accessibility and long-term preservation.

  • Meet institutional, legal, and funder compliance requirements.

As projects scale in size, interdisciplinarity, and technical sophistication, universities must provide sustainable, equitable, and technically robust storage solutions to support their research communities.

Key Challenges in University Research Data Storage

1. Multidisciplinary Demands

  • Infrastructure must accommodate a wide range of data types and sizes, from large imaging datasets to compact text-based files.

2. Variable Data Lifespans

  • Some datasets are short-lived, while others, such as longitudinal studies, must be preserved for decades.

3. Collaborative Workflows

  • International, cross-institutional collaborations require secure, high-availability sharing mechanisms.

4. Compliance and Ethics

  • Sensitive data, particularly in human subjects research, must comply with strict legal and ethical standards.

Best Practices and Solutions

On-Premise Storage

  • High-speed, institution-managed data centers for analysis-intensive work.

Cloud Storage

  • Flexible, scalable solutions such as AWS S3, Google Cloud Storage, or Microsoft Azure for global collaboration.

Hybrid Models

  • Combining local high-speed access with cloud scalability for cost-performance balance.

Data Repositories

  • Institutional or domain-specific platforms that ensure discoverability, metadata richness, and FAIR compliance.

Secure Storage for Sensitive Data

  • Advanced encryption, access controls, and auditing for clinical or personal data.

Data Archiving

  • Low-cost, long-term storage for data that is infrequently accessed but must be preserved.
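
One common way to safeguard the integrity of archived data is to record a checksum for every file at deposit time and re-verify it periodically. The following is a minimal sketch of that idea, not a description of any particular institution's system; the function names (sha256_of, build_manifest, changed_files) are hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries that are now missing or have a different checksum."""
    current = build_manifest(root)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built when the dataset is archived can be stored alongside it; running changed_files later against the same directory returns an empty list if nothing has been altered or lost.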

Local Context: UCPH IT Storage Costs

At the University of Copenhagen (UCPH), recent changes have introduced direct storage charges for research groups.

Current state:

  • Only active storage tiers are available; no true Tier 3 cold storage exists for archival data.

  • All storage is priced equally, regardless of access frequency.

Implications:

  • Cost Equity – Large archival datasets incur disproportionately high costs, disadvantaging underfunded projects.

  • No Low-Cost Archival Option – Data intended for archiving has to remain in the higher-cost active tiers.

  • Risk to Preservation – Without affordable options, researchers may delete valuable data prematurely, undermining reproducibility and Open Science commitments.
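
To make the cost-equity point concrete, here is a small sketch comparing a flat per-terabyte price with a two-tier scheme. All prices and dataset sizes are illustrative assumptions, not actual UCPH rates, and the function names are hypothetical.

```python
# Illustrative only: every price below is a hypothetical placeholder,
# not an actual UCPH rate.
FLAT_PRICE_PER_TB_YEAR = 100.0     # assumed price when all storage costs the same
ACTIVE_PRICE_PER_TB_YEAR = 100.0   # assumed price for frequently accessed data
ARCHIVE_PRICE_PER_TB_YEAR = 20.0   # assumed price for a cold/archival tier


def flat_cost(active_tb: float, archival_tb: float) -> float:
    """Yearly cost when all data is billed at a single rate."""
    return (active_tb + archival_tb) * FLAT_PRICE_PER_TB_YEAR


def tiered_cost(active_tb: float, archival_tb: float) -> float:
    """Yearly cost when archival data is billed at a cheaper rate."""
    return (active_tb * ACTIVE_PRICE_PER_TB_YEAR
            + archival_tb * ARCHIVE_PRICE_PER_TB_YEAR)


if __name__ == "__main__":
    # Hypothetical research group whose holdings are mostly archival.
    active_tb, archival_tb = 5.0, 45.0
    print(f"flat:   {flat_cost(active_tb, archival_tb):.0f} per year")
    print(f"tiered: {tiered_cost(active_tb, archival_tb):.0f} per year")
```

Under these assumed numbers, the group pays 5000 per year with flat pricing and 1400 with tiered pricing. The gap widens as the archival share of the data grows, which is why flat pricing weighs most heavily on groups holding large, rarely accessed datasets.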

Practical Recommendations

  1. Implement Tiered Storage Models – Include genuine low-cost archival storage for rarely accessed data.

  2. Align Pricing with Usage – Match costs to access frequency and storage volume.

  3. Invest in Institutional Infrastructure – Develop scalable, secure, and researcher-friendly storage systems.

  4. Protect Research Assets – Ensure archival-quality storage to prevent loss of irreplaceable datasets.
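
As a rough illustration of how recommendations 1 and 2 could work in practice, the sketch below suggests a storage tier for a dataset from the time since it was last accessed. The one-year threshold, the tier names, and the function name suggest_tier are assumptions made for this example; a real policy would be set by the institution.

```python
from datetime import date

# Hypothetical threshold: data untouched for a year is a candidate for archiving.
ARCHIVE_AFTER_DAYS = 365


def suggest_tier(last_accessed: date, today: date) -> str:
    """Return 'archive' for data idle at least ARCHIVE_AFTER_DAYS, else 'active'."""
    idle_days = (today - last_accessed).days
    return "archive" if idle_days >= ARCHIVE_AFTER_DAYS else "active"


if __name__ == "__main__":
    today = date(2025, 6, 1)
    # Hypothetical datasets with their last access dates.
    datasets = {
        "imaging_2021": date(2022, 3, 15),
        "survey_wave_4": date(2025, 5, 20),
    }
    for name, last_accessed in datasets.items():
        print(name, "->", suggest_tier(last_accessed, today))
```

A rule like this only produces a suggestion. Researchers would still need a way to override it, for example to keep a dataset active ahead of a planned reanalysis.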

Looking Ahead

As data volumes continue to grow, sustainable, equitable, and well-structured storage strategies will be vital to preserving research integrity and enabling discovery. Institutions that balance performance, security, and cost will be best positioned to lead in the era of Open Science.
