
The 10 marine science (rd) Things: Thing 6

Thing 6: Describing data: metadata and controlled vocabularies

Metadata elements are essential for finding and reusing research data. Data is only as valuable as the metadata which describes and connects it. In addition to selecting a metadata standard or schema, whenever possible you should also use a controlled vocabulary. A controlled vocabulary provides a consistent way to describe data.

Activity 1: Metadata: your best friend

Metadata is structured information about a resource that describes characteristics such as content, quality, format, location and contact information. Creating metadata to describe research data is very similar to the process for descriptive cataloguing of library resources.

Metadata schemas are sets of metadata elements (or fields) for describing a particular type of information resource. Numerous metadata schemas exist for describing research data across different disciplines.
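As a rough illustration, a schema can be thought of as a list of named elements, some of them mandatory, against which a record is checked. The sketch below is in Python. The element names, the helper functions and the example record are all hypothetical and do not follow any particular published standard.

```python
# Hypothetical schema: element names are illustrative, not taken from a real standard.
REQUIRED_ELEMENTS = {"title", "creator", "date_collected", "location", "format", "contact"}
OPTIONAL_ELEMENTS = {"description", "licence", "keywords"}


def missing_elements(record: dict) -> set:
    """Return the required elements that are absent or empty in a record."""
    return {name for name in REQUIRED_ELEMENTS if not record.get(name)}


def unknown_elements(record: dict) -> set:
    """Return elements in the record that the schema does not define."""
    return set(record) - REQUIRED_ELEMENTS - OPTIONAL_ELEMENTS


# Made-up example record describing a marine dataset.
record = {
    "title": "Sea surface temperature transect",
    "creator": "A. Researcher",
    "date_collected": "2016-03-14",
    "location": "",  # left blank by the depositor
    "format": "CSV",
    "keywords": ["temperature", "oceanography"],
}

print(sorted(missing_elements(record)))  # ['contact', 'location']
print(sorted(unknown_elements(record)))  # []
```

A check like this only shows whether the expected fields are filled in. Whether the values are rich enough to support discovery and reuse still needs a human eye.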

1. Read the following short ANDS introduction to Metadata to understand what metadata is and why it is the lifeblood of research data sharing!

2. Let’s revisit one of the good quality metadata records for marine science related data we met in previous Things. Why do you think the following record is considered ‘good quality’? Hint: consider both the type and quality of information provided. Which metadata elements in this record help discovery and reuse of the data?

3. Explore the UK Digital Curation Centre’s Directory of Disciplinary Metadata. You might find a schema that is applicable to your research!

Consider: If metadata is the lifeblood of data discoverability and reuse, why is it so often neglected, or only thinly provided, when data is published?

Activity 2: Controlled vocabularies for data description

In addition to selecting a metadata standard or schema, whenever possible you should also use a controlled vocabulary. A controlled vocabulary provides a consistent way to describe aspects of data such as location, time, place name and subject.

Controlled vocabularies significantly improve data discovery. They make data more shareable with researchers in the same discipline because everyone is ‘talking the same language’ when searching for specific data, e.g. plants, animals, medical conditions or places.

1. Start by browsing Controlling your Language: a Directory of Metadata Vocabularies from JISC in the UK. Make sure you scroll down to section 5. Conclusion - it's worth a read.

2. We are going to see some controlled vocabularies in action in the Atlas of Living Australia (ALA). 

a. Do a search in the ALA search engine. Type “whale” in the search box. Choose your favourite whale species and click on the (red text) View record link.  

b. Any metadata field where you see Supplied... tells you that the value originally provided by the person who submitted the record (often a 'citizen scientist') has been changed to match the controlled vocabulary used for that field, e.g. Observer, Record date and Common name.
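The substitution described in step b can be sketched in a few lines of Python. This only illustrates the general idea and is not how the ALA does it: the vocabulary, the variants and the `normalise` helper are all made up for the example. The supplied value is kept alongside the controlled term so that nothing the contributor wrote is lost.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants (lower case).
VOCABULARY = {
    "Humpback Whale": {"humpback", "humpback whale"},
    "Southern Right Whale": {"southern right whale", "right whale"},
}

# Reverse lookup table: variant -> preferred term.
LOOKUP = {
    variant: preferred
    for preferred, variants in VOCABULARY.items()
    for variant in variants
}


def normalise(supplied: str) -> dict:
    """Map a supplied value to its controlled term, keeping the original text."""
    # Lower-case and collapse whitespace so trivial differences do not matter.
    key = " ".join(supplied.lower().split())
    return {"supplied": supplied, "controlled": LOOKUP.get(key)}


print(normalise("  Humpback  whale "))
# {'supplied': '  Humpback  whale ', 'controlled': 'Humpback Whale'}
print(normalise("big grey whale"))
# {'supplied': 'big grey whale', 'controlled': None}
```

Values with no match come back with `None` rather than a guess, so they can be flagged for someone to review.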

Consider: How do you think we could encourage people to use controlled vocabularies in their data descriptions?