Research Object

Tool for gathering and sharing scholarly information on the web

From Wikipedia, the free encyclopedia

In computing, a Research Object is a method for the identification, aggregation and exchange of scholarly information on the Web. The primary goal of the research object approach is to provide a mechanism to associate related resources about a scientific investigation so that they can be shared using a single identifier. As such, research objects are an advanced form of Enhanced publication.[1]

Current implementations build upon existing Web technologies and methods including Linked Data, HTTP, Uniform Resource Identifiers (URIs), the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) and the Open Annotation model, as well as existing approaches for identification and knowledge representation in the scientific domain including Digital Object Identifiers for documents, ORCID identifiers for people, and the Investigation, Study, and Assay (ISA) data model.

Principles and motivation

The research object approach is primarily motivated by a desire to improve the reproducibility of scientific investigations. Central to the proposal is the need to share research artifacts that are commonly distributed across specialist repositories on the Web, including supporting data, software executables, source code, presentation slides and presentation videos. Research Objects are not one specific technology but are instead guided by a set of principles. Specifically, research objects are guided by the three principles of identity, aggregation and annotation:[2]

  • Digital identity - Use unique identifiers as names for things, such as DOIs for publications or data, and ORCID ids for researchers.
  • Data aggregation - Use some form of aggregation to associate related things that are part of the broader study or investigation, so that others may more readily discover those related resources.
  • Annotation - Provide additional metadata about those things, how they relate to each other, their provenance, how they were produced etc.
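
The three principles can be illustrated with a small data structure. This is only a sketch and not part of any research object specification: the class names, the method names and the `example.org` identifiers are hypothetical, and the predicate strings are free-form labels chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    subject: str    # identifier of the resource being described
    predicate: str  # kind of statement, e.g. "author"
    value: str      # a literal, or the identifier of another resource


@dataclass
class ResearchObject:
    identifier: str                                  # identity: one name for the whole bundle
    aggregates: list = field(default_factory=list)   # aggregation: identifiers of related resources
    annotations: list = field(default_factory=list)  # annotation: statements about those resources

    def aggregate(self, resource_id):
        """Add a resource to the bundle, ignoring duplicates."""
        if resource_id not in self.aggregates:
            self.aggregates.append(resource_id)

    def annotate(self, subject, predicate, value):
        """Record one statement about an aggregated resource."""
        self.annotations.append(Annotation(subject, predicate, value))

    def statements_about(self, resource_id):
        """Return every annotation whose subject is the given resource."""
        return [a for a in self.annotations if a.subject == resource_id]


# Hypothetical identifiers, used only for illustration.
DATA = "https://example.org/data/results.csv"
CODE = "https://example.org/code/analysis.py"

ro = ResearchObject("https://example.org/ro/study-1")
ro.aggregate(DATA)
ro.aggregate(CODE)
ro.annotate(DATA, "author", "https://orcid.org/0000-0002-1825-0097")
ro.annotate(DATA, "generatedBy", CODE)
```

Sharing the single identifier of `ro` is then enough for a recipient to discover the aggregated resources and the statements that relate them.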
Communities

A number of communities are developing the research object concept.

ROSC W3C activity

A W3C community group entitled the Research Objects for Scholarly Communication (ROSC) Community Group was started in April 2013. The community charter states that the goals of the ROSC activity are:[3] "to exchange requirements and expectations for supporting a new form of scholarly communication"

The Community Group aims to produce the following types of deliverables:

  • Use cases for the representation, publishing, and exchange of research objects on the Web
  • Requirements and desiderata distilled from the use cases.
  • A survey of related work on supporting the representation, publishing, and exchange of research objects.
  • Various best practices and guidelines towards a community-wide practice of sharing, citing, and exchanging of research objects

FAIR digital objects

The FAIR digital object forum is a community that brings together experts from the FAIR data movement, semantic web, and digital publishing of scholarly work. The first conference on FAIR digital objects led the coalition to ratify the Leiden Declaration[4] on FAIR digital objects. The principles contained in the Leiden Declaration provide a prescriptive framework for infrastructure development around digital research objects. This framework draws from the FAIR data principles and from ideas around distributed infrastructure that relies on open protocols to prevent vendor lock-in and ensure access that is "as open as possible, as restricted as necessary". These principles enable discovery and reuse of Research Objects, including computational workflows, by both humans and machines. To promote uptake and share experience in creating FAIR Digital Objects, case studies have been published showing how to create them with the necessary machine-understandable semantic metadata. Specifications like the ISA metadata framework and RO-Crate support these ontology-based annotations of high-throughput experiments and analysis workflows, respectively.[5]

GitHub and Figshare

The Mozilla Science Lab has initiated an activity in collaboration with GitHub and Figshare to develop "Code as research object". The initial proposal is to allow users to transfer code from a GitHub repository to Figshare and to assign that code a Digital Object Identifier (DOI), providing a permanent record of the code that can be cited in future publications.

RO-Crate

The overall structure of an RO-Crate, combining research artifacts with standardized metadata

RO-Crate (Research Objects Crate) is a community initiative started around 2019 that provides an approach to package and aggregate research artefacts with their metadata and relationships. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way. It intends to apply “just enough” Linked Data standards to make research outputs FAIR while also enhancing research reproducibility.[6] It seeks to bridge the complexity gap in the tooling for metadata specifications by following four principles:

  1. being conceptually simple and easy to understand for developers;
  2. providing strong, easy tooling for integration into community projects;
  3. providing a strong and opinionated guide regarding current best practices;
  4. adopting de-facto standards that are widely used on the Web.[6]

While developing the standard, the base level for simplicity was friendliness to software developers. The team assumed a developer familiar with making Web applications with JSON data, which informed core design choices for the JSON-level documentation approach and RO-Crate serialization.[6] Additionally, in RO-Crate, a referenced contextual entity (e.g. a person identified by ORCID) should always be described within the RO-Crate Metadata File with at least a type and name, even where their persistent identifier (PID) might resolve to further Linked Data. This is so that clients are not required to follow every link for presentation purposes, for instance HTML rendering.[6]
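
The rule that a referenced contextual entity carries at least a type and a name can be checked mechanically. The sketch below is an illustration and not part of the RO-Crate tooling: the function names and the sample file names are hypothetical, and only `author` references are examined, to keep the example short.

```python
def is_described(graph, entity_id):
    """True if the graph holds an entity with this @id that has both @type and name."""
    return any(
        e.get("@id") == entity_id and "@type" in e and "name" in e
        for e in graph
    )


def undescribed_authors(graph):
    """Collect author references that lack a local description (type and name)."""
    missing = []
    for entity in graph:
        authors = entity.get("author", [])
        if not isinstance(authors, list):  # a single reference may appear as a bare object
            authors = [authors]
        for ref in authors:
            ref_id = ref.get("@id") if isinstance(ref, dict) else None
            if ref_id and not is_described(graph, ref_id) and ref_id not in missing:
                missing.append(ref_id)
    return missing


# Hypothetical @graph: the ORCID-identified person is present but has no name.
graph = [
    {"@id": "photo.jpg", "@type": "File",
     "author": {"@id": "https://orcid.org/0000-0002-1825-0097"}},
    {"@id": "notes.txt", "@type": "File", "author": [{"@id": "#alice"}]},
    {"@id": "#alice", "@type": "Person", "name": "Alice"},
    {"@id": "https://orcid.org/0000-0002-1825-0097", "@type": "Person"},
]

missing = undescribed_authors(graph)
```

Here `missing` contains the ORCID identifier, because a client rendering the crate would otherwise have to resolve that link just to display the author's name.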

UML diagram describing a simplified RO-Crate

An RO-Crate is defined as a self-described "Root Data Entity" that describes and contains data entities, which are further described by referencing contextual entities. A "data entity" is either a file (i.e. a byte sequence stored on disk somewhere) or a directory (i.e. a set of named files and other directories). A file does not need to be stored inside the RO-Crate Root; it can instead be referenced via a PID/IRI. A contextual entity exists outside the information system (e.g. a Person, a workflow language) and is represented solely by its metadata. The representation of a data entity as a byte sequence makes it possible to store a variety of research artefacts including not only data but also, for instance, software and text.[6][7]

The Root Data Entity is a directory, the RO-Crate Root, identified by the presence of the RO-Crate Metadata File ro-crate-metadata.json. RO-Crates can be stored, transferred or published in multiple ways, including downloadable ZIP archives in Zenodo or through dedicated online repositories, as well as published directly on the Web, e.g. using GitHub Pages.[6]
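
Because the RO-Crate Root is identified by the presence of the metadata file, locating it on disk can be sketched in a few lines. The helper names below are hypothetical, and the upward search from a nested path is a convenience assumed for this example, not something the specification requires.

```python
from pathlib import Path

METADATA_FILE = "ro-crate-metadata.json"


def is_crate_root(directory):
    """True if the directory directly contains the RO-Crate Metadata File."""
    return (Path(directory) / METADATA_FILE).is_file()


def find_crate_root(start):
    """Walk upward from start and return the nearest RO-Crate Root, or None."""
    current = Path(start).resolve()
    if current.is_file():
        current = current.parent
    for candidate in [current, *current.parents]:
        if is_crate_root(candidate):
            return candidate
    return None
```

For example, `find_crate_root("data/raw/sample.csv")` would return the closest enclosing directory that holds `ro-crate-metadata.json`, if any exists.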

A simple RO-Crate metadata file describing data entities (CSV and JPG files) with contextual entities (authors identified by name or ORCID) can be seen below:[6]

{ "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    { "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
      "about": {"@id": "./"}
    },
    { "@id": "./",
      "@type": "Dataset",
      "name": "A simplified RO-Crate",
      "author": {"@id": "#alice"},
      "license": {"@id": "https://spdx.org/licenses/CC-BY-4.0"},
      "datePublished": "2021-11-02T16:04:43Z",
      "hasPart": [
        {"@id": "survey-responses-2019.csv"},
        {"@id": "https://example.com/pics/5707039334816454031_o.jpg"}
      ]
    },
    { "@id": "survey-responses-2019.csv",
      "@type": "File",
      "about": {"@id": "https://example.com/pics/5707039334816454031_o.jpg"},
      "author": {"@id": "#alice"}
    },
    { "@id": "https://example.com/pics/5707039334816454031_o.jpg",
      "@type": ["File", "ImageObject"],
      "contentLocation": {"@id": "http://sws.geonames.org/8152662/"},
      "author": {"@id": "https://orcid.org/0000-0002-1825-0097"}
    },
    { "@id": "#alice",
      "@type": "Person",
      "name": "Alice"
    },
    { "@id": "https://orcid.org/0000-0002-1825-0097",
      "@type": "Person",
      "name": "Josiah Carberry"
    },
    { "@id": "http://sws.geonames.org/8152662/",
      "@type": "Place",
      "name": "Catalina Park"
    },
    { "@id": "https://spdx.org/licenses/CC-BY-4.0",
      "@type": "CreativeWork",
      "name": "Creative Commons Attribution 4.0"
    }
  ]
}
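
A consumer of such a metadata file might start at the metadata descriptor, follow `about` to the Root Data Entity, and then resolve each part's author to a display name. The following sketch assumes a trimmed copy of the example above and uses hypothetical function names; it relies only on the standard `json` module rather than on any RO-Crate library.

```python
import json

# Trimmed copy of the example metadata, kept inline so the sketch is self-contained.
CRATE_JSON = """
{ "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    { "@id": "ro-crate-metadata.json", "@type": "CreativeWork",
      "about": {"@id": "./"} },
    { "@id": "./", "@type": "Dataset",
      "hasPart": [ {"@id": "survey-responses-2019.csv"},
                   {"@id": "https://example.com/pics/5707039334816454031_o.jpg"} ] },
    { "@id": "survey-responses-2019.csv", "@type": "File",
      "author": {"@id": "#alice"} },
    { "@id": "https://example.com/pics/5707039334816454031_o.jpg",
      "@type": ["File", "ImageObject"],
      "author": {"@id": "https://orcid.org/0000-0002-1825-0097"} },
    { "@id": "#alice", "@type": "Person", "name": "Alice" },
    { "@id": "https://orcid.org/0000-0002-1825-0097", "@type": "Person",
      "name": "Josiah Carberry" }
  ]
}
"""


def index_by_id(crate):
    """Map each entity's @id to the entity, for constant-time lookups."""
    return {entity["@id"]: entity for entity in crate["@graph"]}


def root_data_entity(crate):
    """Follow the metadata descriptor's `about` link to the Root Data Entity."""
    entities = index_by_id(crate)
    descriptor = entities["ro-crate-metadata.json"]
    return entities[descriptor["about"]["@id"]]


def authors_by_part(crate):
    """Return {data entity @id: [author names]} for every part of the root."""
    entities = index_by_id(crate)
    result = {}
    for part in root_data_entity(crate).get("hasPart", []):
        authors = entities.get(part["@id"], {}).get("author", [])
        if not isinstance(authors, list):  # single author given as one object
            authors = [authors]
        # Fall back to the identifier when no local name is available.
        result[part["@id"]] = [
            entities.get(a["@id"], {}).get("name", a["@id"]) for a in authors
        ]
    return result


summary = authors_by_part(json.loads(CRATE_JSON))
```

Because each contextual entity is described inside the file with a type and a name, `summary` can be built without dereferencing the ORCID link.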