
Manage Your Research Data: Publishing your dataset

Research data play a crucial role in ensuring the reproducibility and transparency of scientific findings. Publishing them enables the reuse and re-analysis of data, as well as the integration of data from various sources, thereby opening up opportunities for further research and the generation of new knowledge. Ideally, reusability encompasses the rights to download, copy, distribute, and automatically process the data, all without financial, technical, or legal barriers. Publishing research data also enhances their citability, thereby boosting the scientific reputation of the authors.




 

Why publish research data?

Increased visibility

Your dataset is publicly accessible and indexed by major search engines, enhancing its visibility within the academic community and beyond.

Citable with DOI

A Digital Object Identifier (DOI) will be assigned, providing a permanent link to your dataset, making it easier to cite and track its usage.

Supports collaboration

Public access to your data facilitates collaboration with other researchers, institutions, and organisations, fostering further research opportunities.

Long-term accessibility

Research Online, the ECU institutional repository, provides secure, long-term storage, ensuring your dataset remains accessible and preserved for future research, promoting data reuse and reproducibility over time.

Reduces the cost and risk of duplicating data collection

When you share your dataset openly, others working in the same or related fields can reuse existing data instead of spending time and resources collecting similar data again.

Before publishing data, researchers or data authors must ensure that the dataset is thoroughly prepared. Here are the key steps to follow:

Clean and verify the dataset: Ensure the dataset is cleaned, verified for accuracy, and suitable for its intended use.
Structure the dataset: Organise the dataset in a well-structured manner.
Document the dataset: Provide comprehensive documentation, including rich metadata, methodology details, a codebook or variable descriptions, and any data collection tools like questionnaires, if applicable.
Address privacy, confidentiality, security: Consider any privacy, confidentiality, and security issues to determine if the dataset can be published. If necessary, anonymise or de-identify the data.
Use reusable file formats: Ensure the dataset is in reusable file formats, such as open standard formats like CSV files or widely accepted proprietary formats.
Check licensing: Apply an appropriate license to your dataset to clarify how others can use it. Consider using licenses like Creative Commons to specify usage rights.
Ensure accessibility: Make sure the dataset is accessible to a broad audience, including those with disabilities. This might involve providing alternative text for images or ensuring that data tables are screen reader-friendly.
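
As one illustration of the documentation step, the sketch below drafts a codebook skeleton from a CSV file using only the Python standard library. It is a minimal example under stated assumptions, not a prescribed ECU workflow: the function name build_codebook, the column names, and the sample values are all hypothetical, and the description field is deliberately left blank for the data author to complete.

```python
import csv
import io


def build_codebook(csv_text):
    """Return one codebook row per column of a CSV dataset.

    Each row records the variable name, how many values are missing,
    and an example value. The description is left blank so the data
    author can write it by hand.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        # Treat empty strings and absent cells as missing values.
        present = [v for v in values if v not in ("", None)]
        codebook.append({
            "variable": name,
            "missing": len(values) - len(present),
            "example": present[0] if present else "",
            "description": "",  # to be written by the data author
        })
    return codebook


# Hypothetical survey extract, used only for illustration.
sample = "participant_id,age_group,score\nP001,25-34,7\nP002,,9\n"

for entry in build_codebook(sample):
    print(entry)
```

A generated skeleton like this only covers the mechanical part of documentation. Variable meanings, units, and collection methods still need to be written by someone who knows the dataset.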

Find out more:

FAIR and CARE data

Sharing and access control are crucial in research data management, especially when considering the FAIR and CARE principles. By adhering to these principles, researchers can ensure their data is both useful and ethically managed, fostering trust and collaboration.

FAIR data means:
Findable

Data should be easy to find for both humans and machines.

How to achieve it?

Keep necessary software: Ensure you have the software needed to open and work with your data, keeping a copy if it’s not widely available.

Accessible

Data should be openly accessible, with clear conditions for use.

How to achieve it?

Check accessibility regularly: Periodically confirm that you can open and access your data files to catch any issues early.

Interoperable

Data should be able to integrate with other datasets and tools.

How to achieve it?

Use non-proprietary formats: Store your data in widely supported formats like .csv or .txt to ensure long-term accessibility, interoperability and flexibility.

Reusable

Data should be well-documented and provided with clear licenses to ensure that it can be reused for future research.

How to achieve it?

Archive raw data: Always keep a copy of your raw data for future reference and validation.

Maintain backups: Back up your data in a separate folder, ensuring compliance with any sensitivity and security concerns.
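
One possible way to confirm that a backup still matches the archived raw data is to compare file checksums. The sketch below is an assumption-laden example rather than a required procedure: the function names (sha256_of, build_manifest, changed_files) and the sample file are hypothetical, and SHA-256 is simply one widely used choice of checksum.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(original, backup):
    """List files whose backup copy is missing or differs from the original."""
    return [name for name, checksum in original.items()
            if backup.get(name) != checksum]


# Demonstration with two temporary folders standing in for raw data and backup.
with tempfile.TemporaryDirectory() as raw, tempfile.TemporaryDirectory() as backup:
    Path(raw, "survey.csv").write_text("id,score\n1,7\n")
    Path(backup, "survey.csv").write_text("id,score\n1,7\n")
    print(changed_files(build_manifest(raw), build_manifest(backup)))
```

Storing the manifest alongside the raw data gives a later reference point: if a file's checksum changes, the copy has been altered or corrupted.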

CARE data means:
Collective benefit

Data should be collected and used in ways that benefit the communities from which it originates.

How to achieve it?

Use data anonymisation techniques where necessary to protect the identity and privacy of individuals while still allowing for broader, community-wide benefits. Implement data access policies that promote open data sharing within ethical boundaries, while ensuring the protection of sensitive data.

Authority to control

Communities should have the authority to control the data that belongs to or comes from them.

How to achieve it?

Enable communities to have decision-making power over how their data is used. Ensure that communities retain the right to control access to data and decide whether it can be shared with third parties.

Responsibility

Researchers and organisations have a responsibility to manage data ethically and responsibly.

How to achieve it?

Data stewardship: Take responsibility for managing the integrity, security, and accessibility of data throughout its lifecycle. This includes ensuring data is accurate, updated, and securely stored.

Address biases in data and ensure that data use does not harm any individual or community.

Ethics

Ethical considerations should take priority in data management practices.

How to achieve it?

Ensure ethical data collection by obtaining informed consent from data contributors and explaining how their data will be used, shared, and protected. Data anonymisation ensures individuals' privacy is maintained while still enabling data to be used for broader societal benefits.

 

Managing sensitive data

Managing sensitive data responsibly is crucial for maintaining trust and ensuring compliance with ethical and legal standards. Here are some recommendations to consider:

Storing and managing sensitive data

Handle sensitive data with care: Follow ethical guidelines and ensure sensitive or identifiable information is managed securely.

Adhere to ethics and compliance: Stick to the procedures outlined in your ethics approval or participant consent documents, including storage and access requirements.

Plan data interaction: Consider how you will access, use, and share the data while maintaining security and privacy.

Prepare for data breach: Develop a response plan for potential data breaches, notify stakeholders promptly, take corrective actions, and implement preventive measures.

Publishing and sharing

De-identification techniques:
  • Use pseudonyms or identifiers instead of real names.
  • Group ages into ranges rather than using specific birth years.
  • Aggregate data by broader categories, such as region (e.g., urban, rural) instead of specific suburbs.
  • Remove key pieces of information that could lead to identification.
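
The techniques above can be sketched in a few lines of Python. This is an illustrative example only, and every name in it is an assumption: the record fields (name, age, suburb, score), the region lookup table, the ten-year age bands, and the functions age_band and deidentify are hypothetical and would need adapting to a real dataset and its ethics approval.

```python
def age_band(age, width=10):
    """Group an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(records, region_lookup):
    """Return copies of the records with direct identifiers removed.

    Names become pseudonyms, ages become ranges, suburbs become broad
    regions, and the original name and suburb are dropped entirely.
    """
    pseudonyms = {}
    cleaned = []
    for record in records:
        name = record["name"]
        if name not in pseudonyms:
            # The same person always receives the same pseudonym.
            pseudonyms[name] = f"P{len(pseudonyms) + 1:03d}"
        cleaned.append({
            "participant": pseudonyms[name],
            "age_range": age_band(record["age"]),
            "region": region_lookup.get(record["suburb"], "unknown"),
            "score": record["score"],
        })
    return cleaned


# Hypothetical records and lookup table, used only for illustration.
records = [
    {"name": "Alex Example", "age": 34, "suburb": "Suburb A", "score": 7},
    {"name": "Sam Sample", "age": 52, "suburb": "Town B", "score": 9},
]
region_lookup = {"Suburb A": "urban", "Town B": "rural"}

for row in deidentify(records, region_lookup):
    print(row)
```

The mapping from names to pseudonyms is itself identifying, so it should be stored securely and kept out of the published dataset. Because pseudonyms here are assigned in record order, shuffling the records first may be worth considering if that order is meaningful.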

Consider dataset combinations:

  • Keep in mind that combining datasets can increase the risk of re-identifying individuals.
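
A rough way to see this risk is to count how many records share each combination of attributes before and after datasets are joined. The sketch below is a simple group-size count, similar in spirit to a k-anonymity check. The function small_groups, the field names, and the threshold of 5 are assumptions chosen for illustration, not a recommended standard.

```python
from collections import Counter


def small_groups(records, fields, minimum=5):
    """Return attribute combinations shared by fewer than `minimum` records."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return {combo: n for combo, n in counts.items() if n < minimum}


# Hypothetical combined records, used only for illustration.
records = [
    {"age_range": "30-39", "region": "urban", "occupation": "nurse"},
    {"age_range": "30-39", "region": "urban", "occupation": "teacher"},
    {"age_range": "30-39", "region": "urban", "occupation": "nurse"},
]

# With two attributes, all three records fall into one group.
print(small_groups(records, ["age_range", "region"], minimum=2))

# Adding a third attribute from another dataset isolates one record.
print(small_groups(records, ["age_range", "region", "occupation"], minimum=2))
```

Combinations that appear only once or a handful of times are the ones most likely to point to a specific individual, so they are candidates for further aggregation or removal.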

Alternative sharing options:

  • Consider publishing only the metadata record or implementing mediated access to your data. This allows others to understand the dataset without exposing sensitive details.

Australian Research Council

Research Data Rights Management Guide

ARDC Research Data Rights Management Guide

Choose the copyright license that best meets the requirements for sharing your data and maximises the potential for reuse and innovation, while meeting legal, ethical and grant requirements.

The ARDC Creator flowchart offers data creators an easy-to-follow checklist for addressing licensing queries.

When using datasets created by others, it is important to exercise caution and fully understand the licensing terms. In some instances, seeking legal advice may be necessary.

The ARC encourages researchers to deposit data from their projects in publicly accessible repositories.

   
Edith Cowan University Data Management Guidelines: These guidelines stipulate that datasets should be made available for reuse unless restricted by compliance obligations.

Publishers on data sharing

The Taylor and Francis guide on sharing and citing data provides valuable insights into publisher policies and highlights the benefits of data sharing.

Springer Nature has implemented comprehensive data-sharing policies to increase transparency and reproducibility in research.

The 2018 Australian Code for the Responsible Conduct of Research from the NHMRC includes guidelines on the management of data and information in research.