Archival Resource Key
Form of URLs used as persistent identifiers
From Wikipedia, the free encyclopedia
An Archival Resource Key (ARK) is a multi-purpose URL suited to being a persistent identifier for information objects of any type. It is widely used by libraries, data centers, archives, museums, publishers, and government agencies to provide reliable references to scholarly, scientific, and cultural objects. In 2019 it was registered as a Uniform Resource Identifier (URI) scheme.[1]
A URL that is an ARK is distinguished by the label ark: at the beginning of a path component. When submitted to a web browser, the URL terminated by '?info' (an ARK inflection) returns a metadata record that includes a commitment statement from the current service provider.
Implicit in the design of the ARK scheme is that persistence is purely a matter of service and not a property of a naming syntax. Moreover, a "persistent identifier" cannot be born persistent; an identifier from any scheme can only be proved persistent over time. The '?info' inflection provides information with which to judge an identifier's likelihood of persistence.
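As an illustration of the inflection described above, the following minimal Python sketch appends '?info' to an ARK URL. The function name `with_info_inflection`, the hostname example.org, and the NAAN and name in the usage line are made up for illustration; they do not refer to a real service.

```python
from urllib.parse import urlsplit, urlunsplit


def with_info_inflection(ark_url: str) -> str:
    """Return ark_url with 'info' as its query string (hypothetical helper)."""
    parts = urlsplit(ark_url)
    # An ARK URL carries the label 'ark:' at the beginning of a path component.
    if not any(segment.startswith("ark:") for segment in parts.path.split("/")):
        raise ValueError("not an ARK URL: no path component starts with 'ark:'")
    # Replace any existing query with the inflection and drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "info", ""))


# Hypothetical identifier, for illustration only.
print(with_info_inflection("https://example.org/ark:/12345/x6np1wh8k"))
```

Requesting the resulting URL would, for a provider that supports the inflection, return the metadata record rather than the object itself.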
History
Throughout the 1990s, the Internet Engineering Task Force and other organizations developed standards for persistent identifiers for web resources, including URN, PURL, Handle, and DOI. In each of these standards, indirect identifiers would resolve to URLs, which themselves changed over time. Many believed that such systems would contribute to the persistence of web resources over time.[2]
In 2001, John Kunze of the University of California and R. P. Channing Rodgers of the United States National Library of Medicine released the first draft of The ARK Persistent Identifier Scheme, designed in response to the needs of their two organizations, as an IETF working document.[3] In explaining their motivations for creating a new system, Kunze later wrote that “each [persistent identifier] system had specific problems.” In contrast to the decentralized structure of the web, with many independent publishers, Handle and DOI were related centralized systems which charged for inclusion; they were “antithetical,” according to Kunze, “to an implicit principle that Internet standards must not endorse control by any one entity, over access to the networked resources of another entity.” URNs were free, but lacked a resolver discovery service, and, wrote Kunze, “it seemed to me that the IETF community lost interest in creating a whole new Internet indirection infrastructure that would add little to existing web and DNS mechanisms, especially in light of the small part that indirection plays in keeping links from breaking.”[2]
In contrast to these other systems, the ARK scheme proposed that “persistence is purely a matter of service,… neither inherent in an object nor conferred on it by a particular naming syntax.” The most an identifier could do to solve the problem of persistence, then, was to indicate an organization’s commitment. Accordingly, in the ARK standard, identifiers would refer not only to a web resource, but also to “a promise of stewardship” and metadata about the resource. If a web server was queried with an ARK, it should return the resource itself or some surrogate for it, such as “a table of contents instead of a large complex document.” In inflected form, though, an ARK should return a description (metadata) instead, which “must at minimum answer the who, what, when, and where questions concerning an expression of the object.” (The scheme also included a guide to Electronic Resource Citations, a simple format for structuring this metadata.) The metadata should also describe the provider’s policies regarding “object persistence, object naming, object fragment addressing, and operational service support.”[3]
The California Digital Library began using ARKs in 2002, and released the Noid (Nice Opaque IDentifiers) software for managing ARKs and other identifiers in 2004. Other early adopters of ARKs included Portico, the Internet Archive, and the Bibliothèque nationale de France, the first of many francophone institutions to adopt the scheme.
In 2018, the California Digital Library and DuraSpace announced a collaboration, initially named ARKs-in-the-Open and then the ARK Alliance, to build an international community around ARKs and their use in open scholarship. By 2025, over 1600 institutions had registered to use ARKs.[2]
Structure
https://NMA/ark:/NAAN/Name[Qualifier]
- NAAN: Name Assigning Authority Number - mandatory unique identifier of the organization that originally named the object
- NMA: Name Mapping Authority - optional and replaceable hostname of an organization that currently provides service for the object
- Qualifier: optional string that extends the base ARK to support access to individual hierarchical subcomponents of an object,[4] and to variants (versions, languages, formats) of components.[5]
A complete NAAN registry[6] is maintained by the ARK Alliance and replicated at the Bibliothèque nationale de France and the US National Library of Medicine.
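The structure above can be sketched as a small Python parser. This is an illustration under stated assumptions, not a reference implementation: the names `Ark` and `parse_ark` are hypothetical, the example identifier is made up, and the rule that a qualifier begins at the first "/" or "." after the Name follows the ARK specification's convention for hierarchy and variants rather than anything stated in this article.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    nma: Optional[str]  # Name Mapping Authority (hostname), optional
    naan: str           # Name Assigning Authority Number
    name: str           # name assigned by the NAAN holder
    qualifier: str      # optional suffix; empty string if absent


# Assumption: the qualifier starts at the first '/' or '.' after the Name.
_ARK_RE = re.compile(
    r"^(?:https?://(?P<nma>[^/]+)/)?"  # optional, replaceable NMA
    r"ark:/?(?P<naan>[^/]+)/"          # 'ark:' label and NAAN
    r"(?P<name>[^/.?#]+)"              # Name
    r"(?P<qualifier>[^?#]*)"           # Qualifier, up to any query or fragment
)


def parse_ark(text: str) -> Ark:
    """Split an ARK (with or without a hostname) into its components."""
    match = _ARK_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognizable ARK: {text!r}")
    return Ark(match["nma"], match["naan"], match["name"], match["qualifier"])


# Hypothetical identifier, for illustration only.
print(parse_ark("https://example.org/ark:/12345/x6np1wh8k/c3.pdf"))
```

Because the NMA is optional and replaceable, the sketch keeps it separate from the NAAN and Name, which together form the part of the identifier that is meant to stay stable when the service provider changes.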
Application
ARKs may be assigned to anything digital, physical, or abstract. Below are examples, as reported (2020) to the ARK Alliance by the linked organizations.
- genealogical records (8 billion FamilySearch)
- publisher content (100 million Portico)
- scientific records (22 million INIST)
- scanned texts (30 million Internet Archive)
- bibliographic records (27 million BnF main catalog)
- museum specimens (15 million going on 100 million Smithsonian)
- public health documents, many from legal discovery (20 million UCSF IDL)
- digitized documents and objects (36 million CDL, 5 million BnF Gallica)
- historical persons, families, and organizations (4 million SNAC)
- educational resources (1.1 million University of Utah)
- fine art (490,000 Louvre museum)
- historic maps (334,000 Princeton University Libraries)
- vocabulary terms (30,000 Periodo, YAMZ)
Generic services
Three generic ARK services have been defined. They are described below in protocol-independent terms. The services may be delivered by many possible methods, using whatever technology is available now or in the future.
Access service (access, location)
- Returns (a copy of) the object or a redirect to the same, although a sensible object proxy may be substituted (for instance a table of contents instead of a large document).
- May also return a discriminated list of alternate object locators.
- If access is denied, returns an explanation of the object's current (perhaps permanent) inaccessibility.
Policy service (permanence, naming, etc.)
- Returns declarations of policy and support commitments for given ARKs.
- Declarations are returned in either a structured metadata format or a human readable text format; sometimes one format may serve both purposes.
- Policy subareas may be addressed in separate requests, but the following areas should be covered:
  - object permanence,
  - object naming,
  - object fragment addressing, and
  - operational service support.
Description service
- Returns a description of the object. Descriptions are returned in either a structured metadata format or a human readable text format; sometimes one format may serve both purposes.
- A description must at a minimum answer the who, what, when, and where questions concerning an expression of the object.
- Standalone descriptions should be accompanied by the modification date and source of the description itself.
- May also return discriminated lists of ARKs that are related to the given ARK.
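The minimum requirement for a description can be sketched as a simple check. The dictionary keys and the function name `missing_elements` below are assumptions chosen for illustration; the scheme itself is protocol-independent and does not prescribe this representation. The sample record is invented.

```python
# The four questions a description must answer at a minimum.
REQUIRED_ELEMENTS = ("who", "what", "when", "where")


def missing_elements(record: dict) -> list:
    """Return the required elements that are absent or blank in record."""
    return [
        key
        for key in REQUIRED_ELEMENTS
        if not str(record.get(key, "")).strip()
    ]


# Invented record, for illustration only; 'where' is missing.
record = {"who": "Example Author", "what": "Example Title", "when": "2001"}
print(missing_elements(record))
```

A description service built along these lines would treat a record with a non-empty result as incomplete.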
See also
- Persistent identifier
- Digital object identifier (DOI)
- Handle System (Handle)
- Persistent uniform resource locator (PURL)
- Uniform resource name (URN)
- Info URI scheme