Biomedical Data Life Cycle

(Biomedical) Research Data Lifecycle by LMA Research Data Management Working Group is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Find all versions and materials for the lifecycle in Zenodo.

Below is a stage-by-stage summary of the Biomedical Research Data Lifecycle covering all major RDM (research data management) stages. It aligns with institutional, ethical, and funder expectations relevant to reNEW, UCPH, and European biomedical research environments.

🔄 Biomedical Research Data Lifecycle – Summary


Each stage entry below gives the Lifecycle Stage, its Purpose & Key Outputs, Best Practices, and Common Risks & Mitigations.

1. Plan & Design

Purpose & Key Outputs: Define what data will be collected, created, or reused and how it will be managed. Output: Data Management Plan (DMP).

Best Practices:

  1. Use institutional/funder-compliant DMP templates (e.g., DMPonline)

  2. Integrate UCPH-approved storage solutions (ERDA, REDCap)

  3. Address GDPR and consent for sensitive data

  4. Update the DMP throughout the project lifecycle (see the sketch after this stage)

Common Risks & Mitigations:

Risk: Static or incomplete DMP
Mitigation: Schedule regular DMP reviews and assign update responsibility
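One way to keep the DMP from going stale is to hold a machine-readable copy under version control next to the project, so each scheduled review produces a traceable update. The sketch below is illustrative only: the dmp.json filename and all field names are assumptions, not an institutional or funder template.

```python
import json
from datetime import date

# Illustrative, simplified DMP skeleton kept under version control alongside the
# project; the field names below are examples, not an official DMP template.
dmp = {
    "title": "Example biomedical imaging study",        # hypothetical project title
    "last_reviewed": date.today().isoformat(),          # updated at each scheduled review
    "responsible_for_updates": "project data steward",  # assumed role
    "datasets": [
        {
            "name": "confocal_images_raw",
            "contains_personal_data": False,
            "storage": "UCPH-approved network storage (e.g., ERDA)",
            "retention_years": 5,                        # per UCPH/funder policy
        }
    ],
}

with open("dmp.json", "w") as fh:
    json.dump(dmp, fh, indent=2)
```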

2. Collect & Create

Purpose & Key Outputs: Capture or generate data through experimental, observational, or computational methods. Output: Raw data and metadata.

Best Practices:

  1. Use electronic lab notebooks (ELNs) to capture contextual metadata

  2. Standardize file naming and folder structures (see the sketch after this stage)

  3. Implement data quality control procedures at the point of collection

Common Risks & Mitigations:

Risk: Metadata not captured
Mitigation: Require lab-level metadata templates

Risk: Insecure data storage
Mitigation: Use managed network storage, not personal devices
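A naming convention is easiest to enforce if it can be checked automatically as data come in. The following is a minimal sketch, assuming a hypothetical convention of `<project>_<sample>_<YYYYMMDD>_<assay>.<extension>` and a folder named raw_data; adapt both to the lab's own standard.

```python
import re
from pathlib import Path

# Hypothetical convention: <project>_<sample>_<YYYYMMDD>_<assay>.<extension>
# e.g. reNEW01_S042_20240315_rnaseq.fastq.gz
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+_S\d{3}_\d{8}_[a-z0-9]+\.[\w.]+$")

def non_conforming_files(folder: str) -> list[str]:
    """Return file names in `folder` that do not match the naming convention."""
    return [p.name for p in Path(folder).iterdir()
            if p.is_file() and not NAME_PATTERN.match(p.name)]

if __name__ == "__main__":
    for name in non_conforming_files("raw_data"):  # assumed folder name
        print(f"Non-conforming file name: {name}")
```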

3. Analyze & Collaborate

Purpose & Key Outputs: Transform raw data into analyzable formats; conduct computational and statistical analyses. Output: Processed data, results, and analysis scripts.

Best Practices:

  1. Use version control (e.g., Git) for code and datasets

  2. Record all processing steps (e.g., workflows, software versions; see the sketch after this stage)

  3. Apply reproducible workflow tools (Snakemake, Nextflow, Galaxy)

Common Risks & Mitigations:

Risk: Untraceable changes
Mitigation: Track provenance and document transformations

Risk: Irreproducibility
Mitigation: Automate and share workflows
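Recording software versions and input checksums alongside each result is a lightweight form of provenance tracking that complements full workflow managers. The sketch below is a minimal example; the output filename provenance.json and the example input and package names are assumptions.

```python
import hashlib
import json
import platform
from importlib import metadata

def sha256(path: str) -> str:
    """Checksum an input file so later runs can verify the same data were used."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_provenance(inputs: list[str], packages: list[str], out: str = "provenance.json") -> None:
    """Write software versions and input checksums next to the analysis results."""
    record = {
        "python_version": platform.python_version(),
        "packages": {p: metadata.version(p) for p in packages},
        "inputs": {p: sha256(p) for p in inputs},
    }
    with open(out, "w") as fh:
        json.dump(record, fh, indent=2)

# Example call; the file and package names are placeholders:
# write_provenance(["counts_matrix.tsv"], ["numpy", "pandas"])
```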

4. Evaluate & Preserve

Purpose & Key Outputs: Securely store final datasets and documentation for long-term access. Output: Archived data package (data + metadata + documentation).

Best Practices:

  1. Deposit in domain-specific or institutional repositories (e.g., Zenodo, PRIDE, ArrayExpress)

  2. Use open, non-proprietary file formats (see the sketch after this stage)

  3. Assign persistent identifiers (e.g., DOIs)

  4. Define retention periods per UCPH/funder policy

Common Risks & Mitigations:

Risk: Data loss or inaccessibility
Mitigation: Use repositories with long-term guarantees and metadata standards
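Converting files to open formats before archiving and recording checksums in the package manifest keeps the archive usable and verifiable. The sketch below assumes a hypothetical spreadsheet measurements.xlsx and uses pandas (with the openpyxl engine installed) to write CSV; the file names are placeholders.

```python
import hashlib
import pandas as pd

# Convert a proprietary spreadsheet to an open format (CSV) before archiving.
# File names are placeholders; reading .xlsx requires the openpyxl package.
df = pd.read_excel("measurements.xlsx")
df.to_csv("measurements.csv", index=False)

# Record a checksum in the archive manifest so integrity can be verified later.
with open("measurements.csv", "rb") as fh:
    digest = hashlib.sha256(fh.read()).hexdigest()
print(f"measurements.csv sha256: {digest}")
```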

5. Share & Publish

Purpose & Key Outputs: Make data available to others, either openly or with controlled access. Output: Public or restricted dataset, Data Availability Statement.

Best Practices:

  1. Select an appropriate repository (subject-specific, institutional, or controlled access)

  2. Anonymize or pseudonymize sensitive data (see the sketch after this stage)

  3. Choose a proper license (e.g., CC-BY, CC0)

  4. Comply with consent and legal agreements

Common Risks & Mitigations:

Risk: GDPR breach
Mitigation: Use secure repositories with access control; consult the DPO

Risk: Sharing undocumented data
Mitigation: Require README, data dictionary, and metadata files
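Pseudonymization replaces direct identifiers with codes so the shared file alone cannot identify participants. A minimal sketch follows, assuming a hypothetical participants.csv with columns participant_id, name, and email; the salted-hash approach, column names, and salt handling are illustrative, and real projects should follow the DPO's guidance and keep the secret outside the shared dataset.

```python
import hashlib
import pandas as pd

# Minimal pseudonymization sketch: replace direct identifiers with a salted hash.
# Column and file names are assumptions; keep the salt/key under access control,
# outside the shared dataset, and follow the DPO's guidance for real data.
SALT = "replace-with-a-secret-kept-outside-the-shared-dataset"

def pseudonymize(value: str) -> str:
    """Derive a stable pseudonym from an identifier plus a secret salt."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:16]

df = pd.read_csv("participants.csv")
df["participant_id"] = df["participant_id"].astype(str).map(pseudonymize)
df = df.drop(columns=["name", "email"], errors="ignore")  # remove direct identifiers
df.to_csv("participants_pseudonymized.csv", index=False)
```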

6. Discover & Reuse

Purpose & Key Outputs: Enable others (and yourself) to find and reuse data to generate new knowledge. Output: Citable, documented, reusable datasets.

Best Practices:

  1. Ensure data are FAIR (Findable, Accessible, Interoperable, Reusable)

  2. Include citations to datasets in publications

  3. Register datasets in data catalogs or registries (see the sketch after this stage)

  4. Promote reuse through metadata quality and open licensing

Common Risks & Mitigations:

Risk: Low discoverability
Mitigation: Use persistent identifiers and index datasets in catalogs

Risk: Misuse
Mitigation: Apply standard licenses and clarify reuse terms
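Discoverability largely comes down to publishing machine-readable metadata that catalogs and search engines can index. Below is a minimal sketch of a schema.org "Dataset" record written as JSON-LD; every value (title, DOI, creator, keywords) is a placeholder to be replaced with the real dataset's details.

```python
import json

# Minimal schema.org "Dataset" record as JSON-LD, the kind of machine-readable
# metadata that data catalogs and search engines can index. All values are
# placeholders to be replaced with the real dataset's details.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example single-cell RNA-seq dataset",
    "description": "Processed count matrices and accompanying analysis scripts (placeholder).",
    "identifier": "https://doi.org/10.5281/zenodo.0000000",  # placeholder DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": [{"@type": "Person", "name": "Jane Researcher"}],  # placeholder author
    "keywords": ["biomedical", "FAIR", "example"],
}

with open("dataset_metadata.json", "w") as fh:
    json.dump(record, fh, indent=2)
```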
