
Research Data Management

Good research starts with good data. This guide will help you understand how to organize, store, protect, and share your data throughout your research journey.

Why Data Organization Matters

Keeping your files well organized isn't just 'nice to have': it's the backbone of reproducible, collaborative, and long-lasting research. The main benefits are listed below.

⚡

Efficiency

Faster retrieval of data and documents enhances productivity.

🔄

Reproducibility

A clear structure allows for easy reproduction of analyses.

👥

Collaboration

Team members can quickly locate files, improving teamwork.

♾️

Longevity

Well-managed files ensure continuity despite personnel changes.

Common Practice

Folder structure design in research data management refers to the organized arrangement of digital folders and files to systematically store, access, and manage research data. A well-designed folder structure ensures consistency, makes data easy to locate, and supports collaboration across research teams.


✅ Best Practices

  • Limit depth to ≤ 4 levels
  • Aim for 5-20 items per folder
  • Use descriptive folder names
  • Maintain a README manifest
  • Separate working vs. archive

❌ Common Mistakes

  • Too many nested levels
  • Overcrowded folders
  • Vague folder names
  • No documentation
  • Mixing working and archive files
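
The depth, item-count, and README guidelines above can be checked automatically. The following is a minimal Python sketch, not an established tool: the function name `audit_tree`, the choice to count the project root as level 0, and the decision to flag only the upper limit of 20 items are all assumptions made for illustration.

```python
import os
from pathlib import Path

MAX_DEPTH = 4    # guideline above: limit depth to <= 4 levels
MAX_ITEMS = 20   # guideline above: aim for 5-20 items per folder


def audit_tree(root):
    """Return a list of warnings for folders that break the guidelines.

    Depth is counted relative to `root` (the root itself is level 0).
    """
    root = Path(root)
    warnings = []
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        depth = len(current.relative_to(root).parts)
        if depth > MAX_DEPTH:
            warnings.append(f"too deep ({depth} levels): {current}")
        item_count = len(dirnames) + len(filenames)
        if item_count > MAX_ITEMS:
            warnings.append(f"overcrowded ({item_count} items): {current}")
    # A README at the project root documents the layout.
    if not any(p.name.lower().startswith("readme") for p in root.iterdir()):
        warnings.append(f"no README found in: {root}")
    return warnings
```

Calling `audit_tree("path/to/project")` returns an empty list when the tree passes these checks. Vague folder names cannot be detected this way and still need a human review.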

File naming in research data management involves creating clear, consistent, and descriptive names for files to make them easy to identify, organize, and retrieve. Good file naming practices help avoid confusion, support collaboration, and ensure long-term usability of research data.

 

✅ Good Examples

SpineReview_TK_20250720_SurveyData_v01.csv
Manuscript_CYT_20250725_Draft_v1.0.docx
Analysis_Script_JD_20250722_Clustering_v02.py

❌ Poor Examples

data final FINAL.xlsx (spaces, unclear version)
survey@results#2.csv (special characters)
stuff.docx (not descriptive)

Key Rules

✅ Do

  • Use underscores instead of spaces
  • Include dates as YYYYMMDD
  • Use consistent abbreviations
  • Include version numbers
  • Be descriptive but concise

❌ Don't

  • Use spaces or special characters
  • Use vague terms like "final"
  • Make names too long
  • Forget version information
  • Use inconsistent formats
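
The rules above can be turned into a small helper that builds and checks file names. This is a hedged sketch in Python: it assumes one specific pattern, Project_Author_YYYYMMDD_Type_vNN.ext with a two-digit version, and the names `build_name`, `is_valid_name`, and `NAME_PATTERN` are hypothetical. The pattern only checks that the date is eight digits, not that it is a real calendar date.

```python
import re
from datetime import date

# Assumed pattern: Project_Author_YYYYMMDD_Type_vNN.ext
NAME_PATTERN = re.compile(
    r"^[A-Za-z0-9]+"        # project
    r"_[A-Za-z]+"           # author initials
    r"_\d{8}"               # date as YYYYMMDD
    r"_[A-Za-z0-9]+"        # content type
    r"_v\d{2}"              # two-digit version
    r"\.[A-Za-z0-9]+$"      # extension
)


def build_name(project, author, day, kind, version, ext):
    """Assemble a file name following the assumed pattern."""
    return f"{project}_{author}_{day:%Y%m%d}_{kind}_v{version:02d}.{ext}"


def is_valid_name(name):
    """Return True if `name` matches the assumed pattern."""
    return NAME_PATTERN.match(name) is not None


name = build_name("SpineReview", "TK", date(2025, 7, 20), "SurveyData", 1, "csv")
print(name, is_valid_name(name))
print(is_valid_name("data final FINAL.xlsx"))
```

The pattern is deliberately strict: names with an underscore inside the project part or a dotted version such as v1.0 are rejected, so relax the expression to match whichever convention your team documents.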

Version control in research data management is the practice of tracking and managing changes to data files, code, and documents over time. It allows researchers to record revisions, revert to earlier versions, and collaborate without overwriting each other's work.

 

Example version history for a single file:

  • v1.00 (2025-07-20), Initial Analysis: first complete analysis with basic visualizations
  • v1.01 (2025-07-22), Bug Fixes: fixed data cleaning issues and updated charts
  • v1.02 (2025-07-25), Additional Analysis: added correlation analysis and statistical tests
  • v2.00 (2025-07-30), Major Revision: complete restructure with new methodology
🌿

Git

Distributed version control for code and small files

✅ Best for: Code, scripts, documentation

❌ Avoid for: Large datasets

🗄️

DVC

Data Version Control for large datasets

✅ Best for: Large datasets, ML models

❌ Learning curve required

⏰

Timestamped Backups

Simple manual versioning with timestamps

✅ Best for: Simple projects, beginners

❌ Manual process, limited features
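
For the timestamped-backup approach, a short script can take over the copying and renaming. The sketch below is one possible way to do it in Python, not a recommended standard: the function name `timestamped_backup`, the `YYYYMMDD_HHMMSS` suffix, and the optional `now` parameter (included so the timestamp can be fixed for testing) are assumptions for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def timestamped_backup(source, backup_dir, now=None):
    """Copy `source` into `backup_dir` with a YYYYMMDD_HHMMSS suffix."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    if target.exists():
        # Never overwrite an earlier backup.
        raise FileExistsError(f"backup already exists: {target}")
    shutil.copy2(source, target)  # copy2 also tries to preserve file metadata
    return target
```

For example, `timestamped_backup("analysis.csv", "backups")` would leave the working file untouched and add a dated copy under `backups`. It remains a manual process: someone still has to run it, which is the limitation noted above.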

Version Management Best Practices

Sequential Versioning

  • v01, v02, v03: simple sequential
  • v1.00, v1.01: Major.Minor format
  • final, revision2: ambiguous names to avoid
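
Simple sequential versioning is easy to automate. The following Python sketch covers only the `_vNN` style (for example `_v01`), not the Major.Minor format; the function name `next_version` and the requirement that the version sits directly before the file extension are assumptions.

```python
import re

# Matches "_v" plus digits directly before the final extension.
VERSION_RE = re.compile(r"_v(\d+)(?=\.[^.]+$)")


def next_version(filename):
    """Return `filename` with its trailing _vNN counter increased by one."""
    match = VERSION_RE.search(filename)
    if match is None:
        raise ValueError(f"no _vNN version suffix in: {filename}")
    digits = match.group(1)
    bumped = str(int(digits) + 1).zfill(len(digits))  # keep zero padding
    return filename[:match.start(1)] + bumped + filename[match.end(1):]


print(next_version("SpineReview_TK_20250720_SurveyData_v01.csv"))
```

A name without a version suffix, such as `stuff.docx`, raises a `ValueError` instead of being given a guessed version.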

File Protection

  • 🔒 Set raw data as read-only
  • 📦 Archive milestone versions
  • 🗑️ Delete intermediate drafts
  • ✏️ Document changes in changelog
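
Setting raw data as read-only can also be scripted. Here is a minimal Python sketch, assuming the raw files live together in one folder; the function name `make_read_only` is hypothetical. On Windows, `chmod` only toggles the read-only flag, so the effect is more limited than on Linux or macOS.

```python
import stat
from pathlib import Path


def make_read_only(folder):
    """Remove write permission from every file under `folder`."""
    changed = []
    for path in Path(folder).rglob("*"):
        if path.is_file():
            mode = path.stat().st_mode
            # Clear the write bits for owner, group, and others.
            path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
            changed.append(path)
    return changed
```

This guards against accidental edits rather than deliberate ones: anyone who owns the files can restore write permission, so archived milestone versions and backups are still needed.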

Summary

Key practices and example tools by category:

🗂️ Folder Structure
  • Hierarchy ≤ 4 levels
  • README manifest
  • Separate working and archive
  Example tools: File Explorer, Finder

🏷️ File Naming
  • Project_Author_YYYYMMDD_Type_vXX
  • No spaces or special characters
  • Document conventions
  Example tools: Bulk Rename Utility

🌿 Versioning
  • Sequential: v01, v1.00
  • Changelog documentation
  • Protect raw data
  Example tools: Git, DVC, Git LFS

☁️ Backup & Sync
  • Automated cloud backups
  • Scheduled full backups
  • Version history
  Example tools: OneDrive, rsync, cron