Metadata are data about data. Under the FAIR principles they are an essential element of every dataset, and they are the key to finding, accessing, understanding and reusing research data. There are three types of metadata:

  1. Descriptive metadata - provide the information needed to find or identify a dataset, and can include elements such as a title, a summary, an author and keywords.
  2. Structural metadata - describe the relationships and dependencies between the objects and elements of a dataset, e.g. to make navigation easier.
  3. Administrative metadata - include information that helps to manage a resource, such as how and when it was created, its file type and the conditions of access. Administrative metadata have several subsets, two of which are often treated as separate types of metadata:
  • Rights management metadata - concern intellectual property rights
  • Preservation metadata - include the information needed to archive and maintain the resource

Metadata should describe the structure of the data, any restrictions on their use, what the data mean and how they should be cited. An example metadata record:

  • Dataset name: Determination of the influence of green corrosion inhibitors on aluminium alloys in alkaline media
  • Version: 1.0
  • Author/s: Ryl Jacek; Wysocka Joanna; Krakowiak Stefan; Cieślik Mateusz
  • Description: The studies are devoted to search for green corrosion inhibitors of aluminium and its alloys, offering high corrosion inhibition efficiency in alkaline media. The project will aim at development of instantaneous impedance measurements for accurate determination of the adsorption isotherms.
  • Format: DTA
  • Licence: Creative Commons Attribution 4.0 International
  • Funding Agency/ies: Ministry of Science and Higher Education, Republic of Poland
  • Keywords: corrosion inhibitor, green chemistry, aluminium alloys, instantaneous impedance measurements
  • DOI: (all datasets deposited in the MOST Data repository will receive unique DOI identifiers)
  • Discipline: Chemical sciences
  • Language: English

Initiatives have emerged worldwide to formalise metadata specifications so that data can be reused easily, such as the Research Data Alliance (RDA), OpenAIRE and Metadata 2020.

The aim of metadata standards is to systematise the way data are described. Metadata prepared according to a standard have a stable structure with explicitly defined fields, which makes the description understandable to both people and computers.
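As a rough illustration of what "explicitly defined fields" means in practice, the sketch below expresses part of the example record above using Dublin Core element names. The mapping of the record's fields to Dublin Core elements and the `metadata` wrapper element are assumptions made for this example, not a prescribed repository format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Values copied from the example record; the choice of Dublin Core
# elements for each field is an assumption made for illustration.
record = {
    "title": ["Determination of the influence of green corrosion inhibitors "
              "on aluminium alloys in alkaline media"],
    "creator": ["Ryl Jacek", "Wysocka Joanna", "Krakowiak Stefan", "Cieślik Mateusz"],
    "subject": ["corrosion inhibitor", "green chemistry", "aluminium alloys",
                "instantaneous impedance measurements"],
    "format": ["DTA"],
    "rights": ["Creative Commons Attribution 4.0 International"],
    "language": ["English"],
}

def to_dublin_core(rec):
    """Build an XML tree with one dc:* element per value in the record."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, values in rec.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return root

dc_xml = ET.tostring(to_dublin_core(record), encoding="unicode")
print(dc_xml)
```

Because every field has a fixed name and namespace, another program can read the creators or the licence back out without knowing anything else about the dataset.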

There are many metadata standards, which can be divided into general, domain-specific and institutional standards. Dublin Core, DataCite and the Data Documentation Initiative (DDI) are general standards that are domain-independent and widely used. Other metadata standards are used in particular disciplines and institutions, such as: DC (life sciences), EML (ecology), SDMX (ECB, EUROSTAT, IMF, OECD, UN), SAFE (ESA), INSPIRE ISO 19139 (Earth science), Project Open Data Metadata Schema v.1.1 (US federal agencies), TEI and CDW (the humanities).

Metadata also include descriptions of variables, codebooks and controlled vocabularies, covering:

  • names of variables (short and full forms, e.g. AGE and Age of the respondent)
  • units of measurement (e.g. mm)
  • allowed values (e.g. a range from 0 to 100)
  • variable definitions (e.g. Age=Age of the respondent in years)
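The items in this list can be captured in a small machine-readable codebook and used to check incoming data. The following is a minimal sketch, assuming a dictionary layout and a `validate_row` helper that are invented for this example; only the AGE variable, its definition and the 0 to 100 range come from the examples above.

```python
# Hypothetical codebook layout: one entry per variable.
codebook = {
    "AGE": {
        "label": "Age of the respondent",               # full form of the name
        "definition": "Age of the respondent in years",
        "unit": "years",
        "min": 0,                                        # allowed range
        "max": 100,
    },
}

def validate_row(row, codebook):
    """Return a list of problems found in one data row (empty if none)."""
    problems = []
    for name, value in row.items():
        spec = codebook.get(name)
        if spec is None:
            problems.append(f"{name}: not described in the codebook")
        elif not spec["min"] <= value <= spec["max"]:
            problems.append(
                f"{name}: {value} outside allowed range {spec['min']}-{spec['max']}"
            )
    return problems

print(validate_row({"AGE": 34}, codebook))
print(validate_row({"AGE": 130, "HEIGHT": 1800}, codebook))
```

The second call reports both an out-of-range value and a variable that the codebook does not describe, which is the kind of check a documented codebook makes possible.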

Examples of controlled vocabularies:

  • Biology – Convention on Biological Diversity Controlled Vocabulary (CBDVoc)
  • Economy and social sciences – Central Europe Glossary (CEG)
  • Medicine – Unified Medical Dictionary (UMD)
  • Education, social sciences – UNESCO thesaurus

Metadata can be saved in plain-text (txt) files, spreadsheets or XML files.
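To make the three options concrete, here is a small sketch that writes the same few fields as plain text, as CSV (which any spreadsheet program can open) and as simple XML. The field names and the `metadata` root element are assumptions chosen for the example and do not follow any particular standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A few fields from the example record; the key names are hypothetical.
metadata = {
    "name": "Determination of the influence of green corrosion inhibitors "
            "on aluminium alloys in alkaline media",
    "version": "1.0",
    "format": "DTA",
    "language": "English",
}

def to_text(meta):
    """Plain text: one 'key: value' pair per line."""
    return "\n".join(f"{key}: {value}" for key, value in meta.items())

def to_csv(meta):
    """CSV with a header row, suitable for opening in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "value"])
    writer.writerows(meta.items())
    return buffer.getvalue()

def to_xml(meta):
    """Simple XML: one child element per field."""
    root = ET.Element("metadata")
    for key, value in meta.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")

print(to_text(metadata))
print(to_csv(metadata))
print(to_xml(metadata))
```

The functions return strings so that the result can be inspected first and then written to a file with the usual `open(...).write(...)` call.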

There are many useful tools for creating metadata, such as Nesstar Publisher, which supports the DDI and Dublin Core standards, tools such as STATA, SPSS and Eenvplus, and Metadata Editor, designed for creating metadata in the INSPIRE standard.

Metadata harvesting is an automated way of gathering metadata from different sources in order to create aggregated metadata and related services. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) uses XML to exchange metadata over the web. OAI-PMH was introduced in 2001 and has been widely used by digital libraries, institutional repositories and digital archives to make the exchange of data between systems easier and to broaden access to their collections.
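A harvester typically sends an HTTP request with a `verb` parameter and parses the XML that comes back. The sketch below builds a `ListRecords` request URL and extracts identifiers and titles from a response. The endpoint `https://repository.example.org/oai` is hypothetical, and `SAMPLE_RESPONSE` is a hand-written, heavily shortened illustration rather than output from a real repository; no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

# Namespaces used in OAI-PMH responses that carry Dublin Core (oai_dc).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL of a ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

# Hand-written, shortened response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:language>English</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def harvest_titles(xml_text):
    """Return (identifier, title) pairs for every record in a response."""
    root = ET.fromstring(xml_text)
    harvested = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext("oai:metadata/oai_dc:dc/dc:title", namespaces=NS)
        harvested.append((identifier, title))
    return harvested

print(list_records_url(BASE_URL))
print(harvest_titles(SAMPLE_RESPONSE))
```

A real harvester would also need to fetch the URL, follow the protocol's resumption tokens for large result sets and handle error responses, all of which are left out here.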