Data Sharing Working Group
Sharing expertise to better provide support for sharing research data at IU
The Data Sharing Working Group provides an opportunity to share expertise and collaborate to solve problems across individual services and/or units, with the aim of providing better support for sharing research data at IU.
Members
- Charles Brandt
- Heather Coates
- Tom Crowe
- Levi Dolan
- Ethan Fridmanski
- Danielle Giltner
- Beth Johnson
- Alicia Libla
- Charles McClary
- Emily Meanwell
- Beth Plale
- Katie Wright
Supporting Resources
Pathways for Sharing Research Data
How to use this information
This resource is meant to support researchers by clearly identifying options for sharing research data. In this document, research data sharing is limited to sharing for reuse via deposit into one or more data repositories. Commonly, this is done to meet funder or publisher requirements or to extend open research practices. Ideally, this guidance is considered during the proposal development process. When used during the planning phases, specifically to support the creation of Data Management and Sharing Plans (DMSPs), researchers will be better prepared to meet funder obligations and manage data for sharing as planned.
Description
This use case includes projects involving human research participants and data generated by or about them as determined by the IU Human Research Protection Program (HRPP).
Criteria for sharing
The existing data or research agreement(s) (e.g., data use agreement (DUA), data sharing agreement (DSA), collaborative agreement) allow you to share the data.
Human research participants consented to sharing data beyond the original study.
The data files do not contain any protected elements (e.g., date of birth, social security number) or other sensitive information associated with special requirements to safeguard the data.
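As a rough first pass on the last criterion, the column headers of a tabular data file can be screened for names that suggest protected elements. The sketch below is illustrative only: the pattern list, the function name `flag_protected_columns`, and the sample headers are assumptions made for this example, not an IU checklist. A clean result does not show that a file is free of protected elements, and it does not replace a disclosure risk assessment.

```python
import re

# Hypothetical patterns that suggest protected elements; extend for your data.
PROTECTED_PATTERNS = {
    "date of birth": re.compile(r"(^|_)(dob|birth_?date|date_?of_?birth)($|_)"),
    "social security number": re.compile(
        r"(^|_)(ssn|social_?security(_?(number|no))?)($|_)"
    ),
}


def normalize(header: str) -> str:
    """Lowercase a header and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def flag_protected_columns(headers):
    """Return {header: label} for headers whose names match a pattern."""
    flagged = {}
    for header in headers:
        name = normalize(header)
        for label, pattern in PROTECTED_PATTERNS.items():
            if pattern.search(name):
                flagged[header] = label
                break
    return flagged


# Made-up headers for illustration only.
sample_headers = ["participant_id", "Date of Birth", "SSN", "survey_score"]
flagged = flag_protected_columns(sample_headers)
print(flagged)
```

Name-based screening only catches columns that are labeled plainly; identifiers stored in free-text fields or under opaque column names need manual review.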
Considerations/Decision Points
What do you want/are required to do with the data during the project?
What do you want/are required to do with the data after the project is complete?
Can the data set you plan to share be made available via open access? If the data contain any Personally Identifiable Information (PII) or other protected elements, they cannot be shared via open access mechanisms.
If open access is not appropriate, what requirements need to be addressed in the controlled-access process and data agreement?
Are there data or domain-specific controlled-access data repositories available?
Do you need to budget for data sharing/curation/preservation expenses?
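The open-access versus controlled-access decision points above can be sketched as a small function. This is a minimal illustration, assuming the relevant facts have already been established through the consultations this guidance describes; the `DataSet` fields, the function name `sharing_pathway`, and the returned labels are hypothetical and carry no policy weight.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Hypothetical summary of the facts the decision points ask about."""
    agreement_allows_sharing: bool
    participants_consented: bool
    contains_pii_or_protected_elements: bool


def sharing_pathway(ds: DataSet) -> str:
    """Return a coarse label for the pathway to explore next."""
    # Sharing is off the table unless the agreement and consent both allow it.
    if not (ds.agreement_allows_sharing and ds.participants_consented):
        return "not shareable as-is"
    # PII or other protected elements rule out open access mechanisms.
    if ds.contains_pii_or_protected_elements:
        return "explore controlled access"
    return "open access may be possible"


example = DataSet(
    agreement_allows_sharing=True,
    participants_consented=True,
    contains_pii_or_protected_elements=True,
)
print(sharing_pathway(example))
```

The labels are deliberately tentative: each outcome is a prompt for the next conversation, not a determination.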
Next Steps (Processes & Resources)
First, consult with Research Contracting to assess your options based on terms in the existing data or research agreement(s), if applicable. You may also need to discuss an appropriate data agreement (e.g., DUA, DSA).
Second, consult with a statistician (see list below) to assess disclosure risk and determine whether the data set can be de-identified and remain usable.
IU Bloomington
IU Indianapolis
- Biostatistics & Health Data Science
Refer to the guidance on choosing a data repository from University Library
Review the list of NIH-supported Data Repositories, if applicable
Consult the IU Recommended Generalist Data Repositories Comparison Matrix
Related policies
DM-02, IU HRPP Policies, IU HRPP Informed Consent Policy, HRPP Research Data Management Policy
Description
This use case includes projects generating or using data that include protected health information (PHI) covered by the Health Insurance Portability and Accountability Act (HIPAA).
Criteria for sharing
Human research participants consented to sharing data beyond the original study.
Considerations/Decision Points
What do you want/are required to do with the data during the project?
What do you want/are required to do with the data after the project is complete?
Is sharing the full data set required? If not, can a data set be created for reuse that reduces the amount of PHI shared?
Can the data set you plan to share be made available via open access? If the data contain any direct identifiers or other protected elements, they cannot be shared via open access mechanisms.
If open access is not appropriate, what requirements need to be addressed in the controlled-access process and data agreement?
Are there data or domain-specific controlled-access data repositories available?
Processes & Resources
First, consult with Research Contracting to assess your options based on terms in the existing data or research agreement(s), if applicable. You may also need to discuss an appropriate data agreement (e.g., DUA, DSA).
Second, consult with a statistician (see list below) to assess disclosure risk and determine whether the data set can be de-identified and remain usable.
IU Bloomington
IU Indianapolis
Refer to the guidance on choosing a data repository from University Library
Review the list of NIH-supported Data Repositories, if applicable
Consult the IU Recommended Generalist Data Repositories Comparison Matrix
Related policies
DM-02, IU HRPP Policies, IU HRPP Informed Consent Policy, HRPP Research Data Management Policy, IU HRPP Use of Protected Health Information (PHI) in Research
Description
This use case includes projects sharing large-scale human genomic data via deposit into the dbGaP repository under an Institutional Certification.
Criteria for sharing
Human research participants consented to sharing data beyond the original study.
Considerations/Decision Points
Review the NIH Scientific Data Sharing site to determine what you are required to do before collecting the data.
What are you required to do to prepare the data for deposit into dbGaP?
Processes & Resources
First, contact the IU Human Research Protection Program (HRPP) at irb@iu.edu to discuss your project proposal.
For a Provisional Certificate, contact IU HRPP at irb@iu.edu.
Once the protocol is approved, submit as an amendment.
Related policies
DM-02, IU HRPP Policies, IU HRPP Informed Consent Policy, HRPP Research Data Management Policy, IU HRPP Use of Protected Health Information (PHI) in Research
Description
This use case includes any data that have been purchased (i.e., there is an associated fee) or licensed (i.e., there is no fee) from a third party for use in research, whether the data are available for use by all IU personnel or only a subset of individuals.
Criteria for sharing
The existing data or research agreement(s) (e.g., DUA, DSA, collaborative agreement) allows you to share the data.
The data files do not contain any protected elements (e.g., date of birth, social security number) or other sensitive information associated with special requirements to safeguard the data.
If human research participants were involved, they consented to sharing of data beyond the original study.
Considerations/Decision Points
What do you want/are required to do with the data during the project?
What do you want/are required to do with the data after the project is complete?
Is sharing the full data set required? If not, can a data set with reduced disclosure risk be created for reuse?
If open access is not appropriate, what requirements need to be addressed in the controlled-access process and data agreement?
Are there data or domain-specific controlled-access data repositories available?
Processes & Resources
For incoming data received by IU, you must consult with Research Contracting to assess your options based on terms in the existing data or research agreement(s), if applicable.
Additionally, for data being shared by IU with others, you must consult with Research Contracting to understand your rights and obligations and to develop an appropriate data agreement (e.g., DUA, DSA).
Review the list of NIH-supported Data Repositories, if applicable
Consult the IU Recommended Generalist Data Repositories Comparison Matrix
Related policies
DM-02, IU HRPP Policies, IU HRPP Informed Consent Policy, HRPP Research Data Management Policy
Definitions
- Controlled access: Access that is mediated through a form of review and approval, and that often involves a Data Sharing Agreement or other type of agreement.
- Data repository: A repository is a tool to share, preserve, and discover research outputs, including but not limited to data or datasets. While workflows and processes will vary across repositories, generally speaking, researchers submit and describe their own data which is then ingested into the repository for storage. Other researchers can then download, or request to download, the data directly from the repository. (Source: https://www.nnlm.gov/guides/data-glossary/repository)
- Research data sharing (for reuse): Providing data to individuals not associated with the project that generated the data, typically for reuse for research purposes.
- Disclosure risk: The risk of inappropriate attribution of information to an individual or organization without their approval.
- Open access sharing: Making data available to anyone for access and reuse; often includes redistribution.
- Open data: Open data and content can be freely used, modified, and shared by anyone for any purpose. (Source: https://opendefinition.org/od/2.1/en/)
- Limited data set (Defined under HIPAA): A limited data set is described as health information that excludes certain, listed direct identifiers (see below) but that may include city; state; ZIP Code; elements of date; and other numbers, characteristics, or codes not listed as direct identifiers. The direct identifiers listed in the Privacy Rule's limited data set provisions apply both to information about the individual and to information about the individual's relatives, employers, or household members. (Source: https://privacyruleandresearch.nih.gov/pr_08.asp)