
Manage Research Data: Reusing open data

About reusing datasets

With the recent push to make research more transparent and reproducible, more datasets are being made available for reuse by other researchers.

 How are data and research software found?

  • Ask colleagues or collaborators
  • As supplements to journal articles or links in the article
  • Data journals
  • Data registries
  • Open data portals
  • Institutional repositories (ECU has Research Online)
  • Discipline specific repositories
  • Project websites
  • Data discovery aggregators (Research Data Australia)
  • Library catalogues and databases

 Tips for searching for data:

  • Think about the data you need and why you need them.
  • Select the most appropriate resource.
  • Construct your query strategically.
  • Make the repository work for you.
  • Refine your search.
  • Assess data relevance and fitness-for-use.
  • Save your search and data-source details.
  • Look for data services, not just data.
  • Monitor the latest data.
  • Treat sensitive data responsibly.
  • Give back (cite and share data).

Considerations when reusing existing data

The data you are reusing was often collected for a different purpose than the one you intend. Aggregating data can also have unintended consequences, such as identifying individuals or revealing sensitive information.

 Some questions you can ask to gain an understanding of the data and the issues it may have:

  • Is there enough data documentation and metadata to enable you to make a judgment about its usability? (Why was it collected? By whom? When?)
  • Who funded the data collection? Be aware of potential biases and agendas.
  • Does the data include information about how it was collected? What methods were used, and are they appropriate?
  • Did the original collectors gather the data ethically? Did the participants give informed consent to the sharing of the data?
  • What biases might be relevant to your research? For example, region, internet access, age or gender.
  • What does the licensing allow? What formats can you use to manipulate and reproduce the data, and what types of use are allowed (e.g. commercial or for-profit use)?
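
The questions above can be turned into a simple check to run before committing to a dataset. The following is a minimal sketch in Python, assuming the dataset's metadata is available as a dictionary. The field names (`purpose`, `creator`, `funder` and so on) and the sample record are hypothetical and are not taken from any particular metadata standard.

```python
# Hypothetical metadata field names; adapt them to the repository's own schema.
REQUIRED_FIELDS = {
    "purpose": "Why was the data collected?",
    "creator": "Who collected it?",
    "collected": "When was it collected?",
    "funder": "Who funded the collection?",
    "methods": "How was it collected?",
    "consent": "Did participants consent to sharing?",
    "license": "What does the licence allow?",
}


def missing_documentation(metadata):
    """Return the questions that the metadata record leaves unanswered."""
    return [
        question
        for field, question in REQUIRED_FIELDS.items()
        # A field counts as unanswered if it is absent, None, or blank.
        if not str(metadata.get(field) or "").strip()
    ]


record = {  # illustrative record, not a real dataset
    "purpose": "Household travel survey",
    "creator": "Example Research Group",
    "collected": "2019",
    "license": "CC BY 4.0",
}

for question in missing_documentation(record):
    print("Unanswered:", question)
```

A check like this only shows whether documentation exists. Judging whether the methods, consent and licence suit your research still requires reading that documentation.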

Portals tend to be smaller and cater to a more specific audience, so they can be helpful if you are looking for specific data. Directories usually contain a wider range of research outputs from a variety of disciplines, or they may aggregate smaller portals or repositories.

Directories

An easy way to find, explore and reuse Australia's public data

A Comprehensive List of Open Data Portals from Around the World

A curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies.

Data sharing made easier: use Repository Finder to find the right repository for your data

Research. Shared. — all research outputs from across all fields of research are welcome! Sciences and Humanities, really!

Find, access and reuse data.

Mendeley Data is a secure cloud-based repository where you can store your data, ensuring it is easy to share, access and cite, wherever you are.

Find, access, and re-use data for research - from over one hundred Australian research organisations, government agencies, and cultural institutions

Portals and Repositories 

AURIN is crucial infrastructure for researchers, government and industry, accelerating research into our towns, cities and communities. We provide the data and tools to allow you to make evidence-based decisions quickly and confidently, and we can help you get in touch with world leading urban experts right here in Australia.

DataONE is a community driven project providing access to data across multiple member repositories, supporting enhanced search and discovery of Earth and environmental data. DataONE promotes best practices in data management through responsive educational resources and material.

Dryad is a nonprofit repository for data underlying the international scientific and medical literature.

GBIF—the Global Biodiversity Information Facility—is an international network and research infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.

An easily accessible data platform that enables users to store, search, access and manage datasets up to 2TB across a broad scope of topics. The IEEE platform also facilitates analysis of datasets and retains referenceable data for reproducible research.

PLOS has identified a set of established repositories which are recognized and trusted within their respective communities.

VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data.

When reusing open data, you must give recognition to the researchers who originally collected it.

Citing dynamic data

Some data may change after it has been published or made available. This may be by design or because of other factors:

  • New data may be added on a regular basis, such as sensor data or data from subjects in follow up visits
  • Errors or issues may be found in the data, and it may need to be updated or modified

In these cases, it is important to know which version of the data you used and to make that version clear in your citation.
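
One way to keep the version visible is to build the citation from recorded details rather than typing it by hand. The following is a minimal sketch in Python. The function name `cite_dataset`, the layout of the citation and all of the sample values (including the placeholder DOI) are assumptions for illustration only, so follow the citation style required by your discipline or publisher.

```python
from datetime import date


def cite_dataset(creators, year, title, version, publisher, identifier, accessed=None):
    """Format a dataset citation that states which version was used.

    `accessed` is optional and is useful for data that changes
    continuously without formal version numbers.
    """
    citation = (
        f"{creators} ({year}). {title} (Version {version}) [Data set]. "
        f"{publisher}. {identifier}"
    )
    if accessed is not None:
        citation += f" (accessed {accessed.isoformat()})"
    return citation


# Hypothetical dataset details, for illustration only.
print(cite_dataset(
    creators="Example, A. and Sample, B.",
    year=2023,
    title="Hypothetical sensor readings",
    version="2.1",
    publisher="Example Repository",
    identifier="https://doi.org/10.0000/example",
    accessed=date(2024, 3, 1),
))
```

Recording the version and access date when you download the data makes it much easier to produce an accurate citation later.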

Citation metrics for data

When data is cited correctly, usage metrics can be collected for it, much as they are for journal articles.
