Mr. Isaac Tai
Assistant Librarian (User Services)
Email: cytai@eduhk.hk
Tel: (852) 2948-6681
We warmly welcome faculty members, researchers, and academic departments to partner with us in organizing tailored workshops that support students' research skills development and academic success.
Data sharing is a core component of Research Data Management: it involves preparing, documenting and depositing your datasets in trusted repositories so that other researchers can discover, access and reuse them. Applying clear metadata standards, persistent identifiers (e.g., DOIs) and appropriate licensing helps your work remain findable, interpretable and citable long after publication.
Data sharing makes research findings more accessible and discoverable to the global community.
Sharing data fosters collaboration among researchers, leading to more comprehensive studies.
Shared data allows others to verify and reproduce research results, enhancing credibility.
Researchers save time by reusing existing data rather than collecting new datasets.
Data sharing reduces duplication of effort and the costs associated with data collection and storage.
Shared data can be used across disciplines, maximizing the impact of research.
Open data accelerates scientific progress by enabling new analyses and discoveries.
Shared datasets serve as valuable resources for teaching and training purposes.
When preparing to share data, prioritize transparency and reusability by including well-structured datasets, clear documentation, and reproducible code. However, always assess whether the data contain confidential, sensitive, or restricted elements, and take appropriate steps such as anonymization or controlled access to ensure compliance with ethical and legal standards.
✅ Data You Are Encouraged to Share
Type of Data | Description | Purpose |
---|---|---|
Raw Data | Original observations, measurements, or readings collected during research. | Enables replication, validation, or reanalysis by other researchers. |
Processed/Analyzed Data | Cleaned, transformed, or aggregated data used in your publications. | Allows others to reproduce findings and perform secondary analyses. |
Metadata | Descriptive information about the dataset (e.g., variables, units, data collection methods). | Facilitates data discovery, interpretation, and reuse. |
Documentation | README files, codebooks, protocols, and lab notes that describe the dataset. | Ensures proper understanding and reuse of data by others. |
Code & Scripts | Analytical scripts, software code, or workflows used for data processing and analysis. | Enhances reproducibility and transparency of research workflows. |
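The documentation row above mentions codebooks. As a minimal sketch of what a starting point might look like, the Python snippet below builds one codebook entry per column of a CSV file using only the standard library. The function name `build_codebook`, the column names and the sample values are all hypothetical and used for illustration only; a real codebook would also record units, allowed values and collection methods.

```python
import csv
import io


def build_codebook(csv_text, descriptions):
    """Return one codebook entry (a dict) per column of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        # Count empty or absent cells as missing values.
        missing = sum(1 for v in values if v is None or v.strip() == "")
        codebook.append({
            "variable": name,
            "description": descriptions.get(name, "TODO: describe this variable"),
            "n_values": len(values) - missing,
            "n_missing": missing,
        })
    return codebook


# Hypothetical sample data, for illustration only.
sample = "participant_id,age_years,score\nP01,21,0.82\nP02,,0.75\n"
codebook = build_codebook(
    sample,
    {"age_years": "Age in completed years", "score": "Test score (0 to 1)"},
)
for entry in codebook:
    print(entry)
```

Columns without a supplied description are flagged with a TODO so that gaps in the documentation are visible before deposit.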
⚠️ Data You Should Not Share Without Safeguards
Type of Data | Risk/Concern | Recommendation |
---|---|---|
Personally Identifiable Information (PII) | Risk of identifying individuals from the data. | Anonymize or de-identify the data before sharing. |
Sensitive Data | May include health records, genetic data, or confidential social data. | Obtain ethical review and apply data access controls. |
Proprietary or Restricted Data | Owned by third parties or subject to license/contractual restrictions. | Obtain permission or provide summary data if sharing is blocked. |
Data Under Embargo | Temporarily restricted due to publication or funding requirements. | Share after the embargo period ends, if allowed. |
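To illustrate the recommendation to anonymize or de-identify data before sharing, here is a minimal Python sketch using only the standard library. It drops direct identifiers, replaces the participant ID with a keyed hash, and generalizes exact ages into ten-year bands. The column names, the helper names and the key are hypothetical. Note that this is pseudonymization rather than full anonymization: combinations of the remaining fields may still identify individuals, so it does not replace ethical review or a disclosure risk assessment.

```python
import hashlib
import hmac

# Hypothetical column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonym(value, secret_key):
    """Replace an ID with a keyed hash; keep the key private and unshared."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def age_band(age, width=10):
    """Generalize an exact age into a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record, secret_key):
    """Drop direct identifiers, pseudonymize the ID and coarsen the age."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonym(record["participant_id"], secret_key)
    cleaned["age"] = age_band(int(record["age"]))
    return cleaned


# Hypothetical record, for illustration only.
record = {
    "participant_id": "S1234",
    "name": "Example Person",
    "email": "person@example.org",
    "age": "27",
    "score": "0.82",
}
shared = deidentify(record, secret_key=b"replace-with-a-private-key")
print(shared)
```

A keyed hash is used instead of a plain hash so that someone who knows the original ID format cannot simply recompute the pseudonyms without the key.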
Choosing the right repository is essential for ensuring your research data is discoverable, accessible, citable, and preserved in the long term. Researchers should select a repository that aligns with their discipline, data type, or institutional policies.
Repository Type | Description | Examples | Best For | Key Features |
---|---|---|---|---|
General-purpose | Accepts data from any discipline. Easy to use and broadly accessible. | Zenodo, Figshare, Dryad, OSF (Open Science Framework) | Multidisciplinary projects or when no subject-specific repository exists | DOI assignment, version control, basic metadata support, easy sharing |
Disciplinary | Tailored for specific research domains, often with community standards. | GenBank (genetics), ICPSR (social sciences), PANGAEA (earth sciences) | Discipline-specific data types and formats | Domain metadata, strong community uptake, citation metrics |
Institutional | Hosted by universities or research institutes to support affiliated staff. | HKU Scholars Hub, DataHub@PolyU | Internal data sharing, policy compliance, institutional visibility | Authentication, institutional branding, data preservation policies |
Journal/Publisher-linked | Connected to journal submissions; some require mandatory data deposit. | Elsevier’s Mendeley Data, Springer Nature’s figshare, Dryad (linked) | Journal articles with data availability policies | Integration with peer review, citation linking, curated datasets |
Government/Funder Repositories | Required or recommended by funding agencies or government bodies. | NIH dbGaP, UK Data Service, European Open Science Cloud (EOSC) | Funded research, compliance with data mandates | Secure access control, compliance with legal/ethical frameworks, persistent archiving |
Well-described and well-organized data is easier to discover, reuse, and cite. Adopting established metadata standards helps you and others understand the structure, provenance, and context of your dataset, which makes sharing smoother and helps you meet both institutional and funder requirements.
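As a rough illustration of what a structured metadata record might contain, the Python sketch below describes a dataset and checks it for missing fields. The field list is loosely modeled on the mandatory properties of the DataCite Metadata Schema (identifier, creator, title, publisher, publication year, resource type), but the exact spellings, the simplified structure and all values here are assumptions for illustration. Your chosen repository's own deposit form or schema remains the authority.

```python
import json

# Assumed field names, loosely modeled on DataCite's mandatory properties.
REQUIRED_FIELDS = [
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
]


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


def doi_url(doi):
    """Build the resolver link for a DOI via the doi.org resolver."""
    return "https://doi.org/" + doi


# Hypothetical record; the DOI is a placeholder, not a registered identifier.
record = {
    "identifier": "10.0000/placeholder",
    "creators": [{"name": "Researcher, Example"}],
    "titles": [{"title": "Example survey dataset"}],
    "publisher": "Example Repository",
    "publicationYear": 2024,
    "resourceType": "Dataset",
    "rights": "CC BY 4.0",
}

print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
print("Resolver link:", doi_url(record["identifier"]))
```

Running a check like this before deposit is one way to catch incomplete records early; most repositories perform a similar validation when you submit.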