According to the NIH,
> sharing data reinforces open scientific inquiry, encourages diversity of analysis and opinion, promotes new research, makes possible the testing of new or alternative hypotheses and methods of analysis, supports studies on data collection methods and measurement, facilitates the education of new researchers, enables the exploration of topics not envisioned by the initial investigators, and permits the creation of new datasets when data from multiple sources are combined.
Before sharing your data, consider the following questions:
- What are your ethical and legal obligations? Is there a need for confidentiality, privacy, or other protections?
- How will the data be accessed and distributed?
- What data (how much, which version, etc.) will you share?
- What file formats will be used to share the data?
- What will you allow others to do with the data (e.g., under a Creative Commons license)?
- How do you want others to give you credit (e.g., data citation)?
- How long will data be made available?
The most common ways researchers share data include:
- sharing raw data (common for national data centers like NOAA)
- sharing the data supporting published results
- sharing limited datasets, or datasets that have had all identifiable information removed
- sharing anonymized data, or data whose indirect identifiers have been masked, collapsed, top-coded, or treated with other strategies (see ICPSR's guide)
- providing restricted use data that include identifiable information and require a data use agreement
- sharing sensitive data in a secure environment, such as a data enclave
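
To make the masking, collapsing, and top-coding strategies mentioned in the list above more concrete, here is a minimal Python sketch. It is an illustration only, not ICPSR's procedure: the field names (`age`, `income`, `zip`), the income cap, the ten-year age bands, and the helper functions are all hypothetical choices made for this example. Real thresholds should come from your own disclosure-risk review.

```python
def top_code(value, cap):
    """Top-coding: report any value above `cap` as the cap itself."""
    return min(value, cap)


def collapse_age(age, width=10):
    """Collapsing: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def mask_zip(zip_code, keep=3):
    """Masking: keep the first `keep` characters and hide the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def treat_record(record, income_cap=150_000):
    """Apply all three treatments to one record (hypothetical field names)."""
    return {
        "age_band": collapse_age(record["age"]),
        "income": top_code(record["income"], income_cap),
        "zip": mask_zip(record["zip"]),
    }


# Made-up example records, used only to show the transformations.
records = [
    {"age": 34, "income": 52_000, "zip": "02139"},
    {"age": 87, "income": 410_000, "zip": "94305"},
]

treated = [treat_record(r) for r in records]
for row in treated:
    print(row)
```

Treatments like these reduce the risk of re-identification but do not by themselves guarantee anonymity, so they are usually combined with a review of which variables could identify someone when taken together.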