DCPPC-DRAFT-Title: KC2 Globally Unique Identifier Services
DCPPC-DRAFT-Type: Design Principle
Name of Contact Person: Mercè Crosas
Email of Contact Person: firstname.lastname@example.org
Submitting Team: KC2
Requested DCPPC-DRAFT posting start date: September 4, 2018
Date Emailed for consideration: September 3, 2018
DCPPC-DRAFT-Status: Active and Open for Comment
URL Link to this Document:
License: This work is licensed under a CC-BY-4.0 License
Globally Unique Identifier Services
Version 10, September 28, 2018
Tim Clark, University of Virginia, School of Medicine & Data Science Institute
Daniel S. Katz, University of Illinois Urbana-Champaign, NCSA, CS, ECE, iSchool
Manuel Bernal Llinares, EMBL/European Bioinformatics Institute
Claris Castillo, University of North Carolina at Chapel Hill, RENCI
Kyle Chard, University of Chicago, Computation Institute
Mercè Crosas, Harvard University, Institute for Quantitative Social Sciences
Martin Fenner, DataCite
Zachary Flamig, University of Chicago, Center for Data Intensive Science
Paul Groth, Elsevier Labs
Ray Idaszak, University of North Carolina at Chapel Hill, RENCI
Carl Kesselman, University of Southern California, Information Sciences Institute
John Kunze, University of California, California Digital Library
David Steinberg, University of California Santa Cruz, Genomics Center
Sarala Wimalaratne, EMBL/European Bioinformatics Institute
This document introduces a common model for globally unique identifier (GUID) registration, resolution and landing services, in the NIH Data Commons Program Pilot Consortium (DCPPC). It provides guidelines and context required for developing and harmonizing a set of common APIs, and principles of operation for the set of federated services it describes at a conceptual level. GUIDs and GUID services in DCPPC Phase I focus on identification and interoperability of data, but the intention is to identify all important digital research objects.
Associated required, recommended, and optional object-level Core Metadata are defined in a separate document.
This document introduces a basic conceptual model for persistent globally unique identifier (GUID) registration, resolution and landing services in the NIH Data Commons Program Pilot Consortium (DCPPC), along with the core use cases it is meant to support. We intend this document, along with the Core Metadata for GUIDs (Fenner et al. 2018), to be a foundation of alignment for constituent DCPPC teams with specialized and somewhat differing GUID management practices, who must engage across a set of common practices and approaches in order to interoperate successfully.
These services are foundational for interoperability in the DCPPC.
We present a conceptual model here, to establish general agreement on principles. A related specification for dealing with persistent GUIDs will then be presented and aligned on API signatures and operational models reflecting this central conceptual model. Extensions and local capabilities beyond the core API model are appropriate so long as interoperability is supported through a common core.
Through this core model we hope to support computational interoperability within and between full stack environments in the DCPPC cloud(s); and interoperability with the FAIR scholarly ecosystem as a whole (see Section 3, Core Use Cases). This FAIR communications ecosystem is now moving towards full persistent registration, archiving, and citability of all digital research objects involved in creating a work of scientific research.
DCPPC as a whole currently supports and utilizes four main types of persistent GUIDs: DataCite DOIs, ARKs (including minids), Compact Identifiers, and DataGUIDs. DOIs, ARKs and DataGUIDs require registration and management of object-level metadata, and presence of a landing service, to which the identifier resolves.
Compact Identifiers require only registration and management of namespace-level metadata; they do not register object-level metadata. They are meant to be used for local identifiers in data repositories that do not publicly register object-level metadata.
Core Use Cases
The services model discussed in this document implements the FAIR Data principles (specifically F1, F2, F4, I1-I3, and R1-R1.3, Wilkinson et al. 2016) with the intention to create an ecosystem in which all data - and in DCPPC, ultimately all digital research objects - have an appropriately persistent and interoperable GUID, with the required associated core metadata (Fenner et al. 2018). It thereby supports critical DCPPC interoperability and functionality requirements for the following four strategic objectives.
3.1 FAIR DCPPC-wide interoperability for data provided by the Data Stewards and other data producers: To support end-to-end FAIR interoperability of the datasets upon which computations are performed, all provider data must be uniquely identifiable, ecosystem-persistent and actionable across cloud platforms.
3.2 Intra-FS functionality: Each DCPPC Full Stack (FS) must preserve the ability to rapidly and scalably create, identify, register, and resolve objects and related FS-specific metadata, via persistent GUIDs, in the cloud provider systems in which they are implemented.
3.3 Inter-FS Interoperability: Each DCPPC Full Stack must be able to identify and resolve persistent GUIDs created by other Full Stacks for objects and related Core Metadata, in their own FS-native environments.
3.4 FAIR Ecosystem-wide Interoperability: DCPPC services must be provided to support long-term persistent identification and resolution of data (Uhlir et al. 2012, CODATA 2013, Martone et al. 2014, Altman et al. 2015, Starr et al. 2016, Cousijn et al. 2018), software (Smith et al. 2016), workflows, and other objects, in the global FAIR ecosystem, so that they can be robustly cited, and tied to the literature, with long-term resolution to the identified objects and their metadata.
Principles of Operation
DCPPC GUIDs are (a) persistent, (b) uniquely identifiable and (c) interoperable on the web, across DCPPC, DCPPC Data Stewards, all DCPPC Full Stack environments and cloud providers, and the broader FAIR data scientific communications ecosystem.
a. Ecosystem-level persistence. GUIDs for DCPPC primary data supplied by the data stewards, and for citable computed results data, MUST support ecosystem-level persistence, with long-term stability that MUST extend beyond DCPPC’s scope and lifetime.
b. Project-level persistence. Temporary GUIDs for intermediate results data are not required to support long-term ecosystem-level persistence, but MUST support persistence within the scope of DCPPC, that is, potentially up to the lifespan of the project. These may be timestamped with a discrete lifespan defined in metadata, or the lifespan may be indefinite.
c. Terms of stability. GUIDs requiring ecosystem-level long-term persistence MUST be registered with long-term stable registration services, with a demonstrated scope beyond the DCPPC project, such as EZID/N2T, DataCite/DOI.org or Identifiers.org.
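The difference between a discrete and an indefinite lifespan can be sketched as a simple check against a timestamp carried in the GUID's metadata. This is an illustration only: the field name `expires` and its ISO 8601 encoding are assumptions made for the example and are not taken from the DCPPC Core Metadata.

```python
from datetime import datetime, timezone
from typing import Optional


def is_expired(metadata: dict, now: Optional[datetime] = None) -> bool:
    """Return True if the metadata declares a lifespan that has already ended.

    `expires` is a hypothetical field holding an ISO 8601 timestamp with an
    explicit UTC offset. A missing field is read as an indefinite lifespan.
    """
    expires = metadata.get("expires")
    if expires is None:
        return False  # indefinite lifespan: never expires
    now = now or datetime.now(timezone.utc)
    return datetime.fromisoformat(expires) <= now
```

A temporary GUID for intermediate results would carry an `expires` value; a GUID with ecosystem-level persistence would carry none.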
Any DCPPC-compliant GUID service may register and resolve objects associated with any DCPPC-supported GUID type - i.e. APIs and functional descriptions for DCPPC GUID services will be fully open and not owned by any specific provider. Where needed, secondary GUIDs (alternative identifiers) MAY be assigned to any object in addition to its primary GUID. All of the roles and restrictions of primary GUIDs apply to secondary GUIDs.
4.3 Multiple GUID Types are Supported
DCPPC currently supports the following GUID types:
DataCite Digital Object Identifiers (DOIs);
CDL Archival Resource Keys (ARKs);
Minids, when registered as ARKs (Foster 2015, Chard et al. 2016);
DataGUIDs (https://dataguids.org); and
Compact Identifiers (Wimalaratne et al. 2018, McMurry et al. 2017).
While in some contexts GUIDs are defined as synonyms of UUIDs, in the DCPPC context this is not the case (see Section 5, Definitions, below).
To assure interoperability, any proposed new DCPPC GUID types and associated services require evaluation and approval by KC2; this process remains to be defined but should ensure cross-DCPPC representation.
Persistence requires association of each GUID with (a) accessible and (b) mutable object-level metadata, by registration with robustly maintained stable REST services; as well as the ability to resolve the GUID to an appropriate Landing Service, which provides metadata to requestors.
a. Metadata accessibility preserves the ability for clients accessing the GUID services to make appropriate decisions about how to access the object endpoints.
b. Metadata mutability preserves the ability to resolve objects to their endpoints under changing deployment conditions.
c. Registration may be at the object level or the namespace level. Namespace-level registration is supported only for Compact Identifiers. Where the data provider has established a practice of using Compact Identifiers, as is done in the MODs, namespace-level registration is preferred. Otherwise, object-level registration is preferred.
a. Object-level resolution
Object-level resolution is provided for DOIs (doi.org), ARKs (n2t.net), minids (registered as ARKs, at n2t.net), and DataGUIDs.
These GUID types MUST resolve by default to a DCPPC-compliant public Landing Service. They may not resolve directly to the object contents.
b. Namespace-level resolution
Namespace-level resolution is provided for Compact Identifiers to support specific use cases on MODs datasets, and for stable referencing to objects outside the control of DCPPC participants.
Compact identifiers for objects provided by DCPPC participants (MODs datasets) SHOULD provide resolution to a DCPPC-compliant public Landing Service supported by the Namespace owner.
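Namespace-level resolution can be sketched as splitting a Compact Identifier into its namespace prefix and local identifier, then handing the pair to a meta-resolver. The sketch below assumes the `<namespace_prefix>:<local_identifier>` pattern given in the Definitions and the two meta-resolvers named there; the function names are hypothetical, and the URL layout (resolver base followed by the unchanged Compact Identifier) is an assumption for illustration.

```python
from typing import Tuple

# Meta-resolvers named in this document; the URL layout below is an assumption.
META_RESOLVERS = ("https://identifiers.org", "https://n2t.net")


def split_compact_identifier(cid: str) -> Tuple[str, str]:
    """Split 'prefix:local_id' at the first colon.

    Splitting at the first colon keeps any further colons inside the
    local identifier intact.
    """
    prefix, sep, local_id = cid.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a compact identifier: {cid!r}")
    return prefix, local_id


def meta_resolver_url(cid: str, resolver: str = META_RESOLVERS[0]) -> str:
    """Build the URL at which a meta-resolver would be asked to resolve `cid`."""
    prefix, local_id = split_compact_identifier(cid)
    return f"{resolver}/{prefix}:{local_id}"
```

For the minid used later in Example 1, `meta_resolver_url("minid:b9040f", "https://n2t.net")` yields `https://n2t.net/minid:b9040f`. The meta-resolver, not the client, maps the prefix to the namespace-associated resolver.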
4.6 Landing Services
DCPPC-compliant Landing Services MUST provide the full set of Core Metadata (Fenner et al. 2018) for the referenced object (see more details below).
A DCPPC Landing Service MAY also provide additional metadata, for example, FS-specific local and extended metadata.
A DCPPC Landing Service MUST provide cloud-specific object resolution endpoints for access to the bit representation of the object. These MAY be restricted to authorized users.
A DCPPC Landing Service MUST provide all its metadata in JSON-LD. Core metadata MUST be provided as elements of schema.org.
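As a rough illustration of what a JSON-LD response built from schema.org elements might look like, the sketch below assembles a small document for one object. The selection of properties (`identifier`, `name`, `distribution`, `contentUrl`) is an assumption made for the example; the authoritative list of required elements is the Core Metadata document (Fenner et al. 2018). The endpoint URLs are placeholders.

```python
import json
from typing import List


def landing_metadata(guid_url: str, name: str, endpoints: List[str]) -> dict:
    """Assemble a minimal schema.org description of one identified object.

    The property choice is illustrative, not the DCPPC Core Metadata set.
    """
    return {
        "@context": "http://schema.org",
        "@type": "Dataset",
        "@id": guid_url,
        "identifier": guid_url,
        "name": name,
        # One DataDownload entry per object resolution endpoint.
        "distribution": [
            {"@type": "DataDownload", "contentUrl": url} for url in endpoints
        ],
    }


doc = landing_metadata(
    "https://n2t.net/ark:/57799/b9040f",
    "Identifier Interoperability",
    ["https://example.org/cloud-a/object", "https://example.org/cloud-b/object"],
)
body = json.dumps(doc, indent=2)  # JSON-LD serialization of the response body
```

Listing one `distribution` entry per endpoint keeps multiple cloud-specific object resolution endpoints side by side in a single response.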
Actionable identifier - An identifier that is resolvable on the Internet as a URI (Berners-Lee et al 2005, IETF RFC 3986), so that a human or machine user can discover information about a resource given its identifier. In DCPPC, the preferred form of actionable identifier is an HTTP URI.
Archival Resource Key (ARK) - Persistent identifiers designed to support long-term access to information objects, as specified in (Kunze & Rodgers 2008 https://n2t.net/ark:/13030/c7cv4br18). ARK registration and resolution is decentralized, and for this pilot is supported by the EZID and N2T systems at the California Digital Library, a unit of the University of California. Approximately 24 million objects are registered with N2T ARKs.
Compact Identifier - A GUID expressed in the pattern <namespace_prefix>:<local_identifier>, which can be resolved to a URI through a meta-resolver, such as http://identifiers.org or http://n2t.net, that maps the namespace_prefix to a namespace-associated resolver, to which it presents the local_identifier.
Core Metadata - A small subset of GUID type-specific object-management metadata, which enable GUID interoperability; as defined in DCPPC-DRAFT-7_KC2 Core Metadata for GUIDs (Fenner et al 2018).
Data Catalog - A collection of datasets, as defined in the W3C Data Catalog Vocabulary specification (https://www.w3.org/TR/vocab-dcat/#vocabulary-overview).
DataCite - DataCite (http://DataCite.org) is one of ten DOI Registration Agencies. DataCite registers DOIs and associated metadata for datasets, providing resolution and associated metadata access services targeted to support data citation. DOI as used in this document means DataCite DOI.
DataCite DOI - DataCite DOIs are DOIs registered with DataCite (http://DataCite.org), a DOI Registration Agency (https://www.doi.org/registration_agencies.html). DataCite DOIs identify long-term persistent, citable data, and have metadata elements specific to data. Approximately 5 million datasets are registered with DataCite DOIs.
Digital Object Identifier (DOI) - A persistent identifier for digital objects, as specified in ISO 26324:2012 (https://www.iso.org/standard/43506.html). DOIs are implemented within the Handle system (http://handle.net), supported by the International DOI Foundation (https://www.doi.org) and its ten DOI Registration Agencies.
Ecosystem-persistent - GUIDs which interoperate with widely-accepted infrastructure, having demonstrated long-term persistence and support resources in multiple communities beyond the scope of DCPPC.
Globally Unique Identifier (GUID) - An identifier which follows certain conventions to make it unique within a global context. In DCPPC, GUIDs are meant to be globally unique and actionable on the Web - or actionable when prefixed by a resolver URI.
GUID Broker - A GUID registration and landing service, which also serves as a local metadata registry, and maps internal system identifiers to persistent GUIDs.
GUID Type - For persistent GUIDs, the GUID Type (DOI, ARK, DataGUID, Compact Identifier) is determined by the architecture and APIs of the supporting registration and resolver services, and by the metadata they support.
Identifier - A name (alphanumeric string) linked to an object, set of objects, or concept, meant to specify that object, set, or concept uniquely within some context.
Interoperable - Independent software implementations written to a common specification that are verifiably able to exchange and make use of a certain set of information as defined in the specification, are interoperable - within the limits of the specification.
Interoperable Persistent GUIDs - Persistent GUIDs whose APIs for creation and resolution to endpoints for metadata and objects, and other relevant behavior, are formally specified, implemented, and validated, as interoperable software.
Landing Page - An HTML page and sets of associated machine- and human-readable metadata provided by a landing service.
Landing Service - A web service integral to the resolution of persistent GUIDs, which given a GUID, returns a set of GUID-associated metadata, including the GUID’s object resolution endpoint(s) or URIs. A landing service also SHOULD return human-readable metadata in the form of a landing page.
Minid - A minid is a semi-persistent GUID formed according to the specifications in (Ian Foster 2015 https://bd2k.ini.usc.edu/assets/all-hands-meeting/minid_v0.1_Nov_2015.pdf) and registered as an ARK.
Object Resolution Endpoint - The resolvable endpoint of an object (such as a file) by which its content can be accessed; specified as a URI.
Persistent GUID - A GUID which is guaranteed to persist over a defined timespan, i.e. one which is a persistent identifier (q.v.).
Persistent Identifier - A persistent identifier (PI or PID) is a long-lasting reference to a document, file, web page, or other object. In DCPPC, persistent identifiers are equivalent to Persistent GUIDs and are actionable on the Web when represented as URLs by prefixing the persistent GUID with a resolver service URL such as http://doi.org or http://n2t.net.
Resolver service - A GUID resolver service makes persistent GUIDs actionable on the Web by redirecting “get” requests to the GUID’s landing service, which then provides a limited set of metadata, including the object resolution endpoint(s), to the client for action.
Semi-Persistent GUID - A GUID with an explicit expiration datetime in its metadata, meant to identify temporary objects such as intermediate computational results.
Universally Unique Identifier (UUID) - A type of GUID defined using the conventions in RFC 4122 (Leach et al. 2005, https://www.ietf.org/rfc/rfc4122.txt). In some other contexts, GUIDs and UUIDs are defined as synonyms. However, in the DCPPC context, they are not synonyms. UUIDs alone are not persistent and web-resolvable, and are therefore not supported in DCPPC as independent GUID types. However, UUIDs may be employed as a qualifying suffix, e.g. in constructing a DataGUID.
DataCite DOIs, ARKs, Compact Identifiers and DataGUIDs are supported by DCPPC as persistent GUIDs. The first three of these may be considered ecosystem-persistent and ecosystem-interoperable, because they are persistent, machine actionable, and widely used by several communities beyond DCPPC. They are therefore citable and can be expected to be robustly interoperable beyond the scope of this project. Their registration and resolution are managed by widely used, well supported public services, not limited to the DCPPC (Data Citation Synthesis Group 2014, Altman et al. 2015).
DataGUIDs are important to a significant constituency within DCPPC and can today be considered DCPPC-interoperable, i.e., interoperable and persistent within DCPPC scope. This may change in future with wider adoption.
Within the DCPPC, DataCite DOIs (Digital Object Identifiers) are supported to identify citable long-term persistent and stable primary and analytic results data, and citable software. Within the DCPPC, ARKs (Archival Resource Keys) and DataGUIDs are supported to identify temporary time-limited data such as intermediate results data in workflows. The California Digital Library (CDL), a unit of the University of California, manages the ARK specification and N2T resolver. Minids are supported and registered in DCPPC as ARKs, conforming to the Minid metadata profile (Foster 2015, Chard et al. 2016).
DataCite DOIs are registered with DataCite.org and resolved via the doi.org resolver. Underlying infrastructure for DOI resolution, assisting DOI.org, is the distributed Handle system, maintained by the Corporation for National Research Initiatives (CNRI) in Reston VA. ARKs (and minids as a type of ARK) are registered with the CDL’s EZID service and resolved via the N2T.net resolver. DataGUIDs are registered and resolved with the DataGUID service.
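The pairing of GUID types with resolvers described above can be sketched as a small function that turns a DOI or ARK into an actionable HTTP URI. The function name and the `doi:` / `ark:` label handling are assumptions for illustration. DataGUIDs are left out because this document does not spell out their string form.

```python
def actionable_uri(guid: str) -> str:
    """Prefix a DOI or ARK with its type-specific resolver.

    Accepts 'doi:<name>' or 'ark:<name>'. The DOI label is dropped in the
    resolver URL, while the ARK label is kept, following the forms used
    in this document.
    """
    label, sep, rest = guid.partition(":")
    if not sep or not rest:
        raise ValueError(f"missing identifier type label: {guid!r}")
    label = label.lower()
    if label == "doi":
        return "https://doi.org/" + rest
    if label == "ark":
        return "https://n2t.net/ark:" + rest
    raise ValueError(f"no resolver configured for type {label!r}")
```

For example, `actionable_uri("ark:/57799/b9040f")` gives `https://n2t.net/ark:/57799/b9040f`, and `actionable_uri("doi:10.1038/sdata.2016.18")` gives `https://doi.org/10.1038/sdata.2016.18`.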
Data catalogs are currently treated as labels for grouping datasets from various providers, and may be associated with Identifiers.org Namespaces.
The Landing Service Model
Landing services are used to access core metadata and object endpoints associated with persistent Globally Unique Identifiers (GUIDs).
DOIs, ARKs and DataGUIDs utilize a Landing Service (LS) resolution model, in which the GUID resolver redirects “get” requests to a landing service, which then publicly provides a limited set of metadata to the client, including one or more object resolution endpoints, each specified as a URI. This metadata SHOULD be both human- and machine-readable (Starr et al. 2015). Human-readable metadata is presented as a Landing Page. Determining how to act upon this metadata, including access to the object endpoints, is a client responsibility. Security on object endpoints is the responsibility of the object provider.
This indirect model
- allows the endpoints to be updated as the set of endpoint domains and URIs change,
- allows for multiple object resolution endpoints, e.g. for multiple cloud providers, and
- provides limited object description metadata to
  - associate objects with a data catalog context,
  - allow object contents to be verified (checksum), and
  - allow the client to determine other actions based on, for example, the object size and cloud provider location.
Metadata provided by DCPPC landing services conforms to the DCPPC Core Metadata Specification, and selection of human- or machine-readable metadata is via content negotiation from the same endpoint (Fenner et al. 2018).
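From the client side, content negotiation amounts to sending the same GUID URL with a different `Accept` header. The sketch below only constructs the request, so it runs without network access. The helper name is hypothetical, and the media types chosen (`application/ld+json` for machine-readable metadata, `text/html` for the Landing Page) are assumptions about what a given landing service honors.

```python
from urllib.request import Request


def landing_request(guid_url: str, machine_readable: bool = True) -> Request:
    """Build a GET request for a GUID URL with the desired Accept header."""
    accept = "application/ld+json" if machine_readable else "text/html"
    return Request(guid_url, headers={"Accept": accept})


req = landing_request("https://n2t.net/ark:/57799/b9040f")
```

Passing `req` to `urllib.request.urlopen` would send the request and follow the resolver's redirect to the landing service.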
- In the LS model, a client registering an object obtains a persistent GUID for it by posting the associated Core Metadata including a Landing Service URI to a well-known community-supported persistent GUID registration service (currently supported: DataCite.org for DOIs; EZID service for ARKs / Minids; DataGUID.org for DataGUIDs) as shown in Figure 1.
Figure 1. Object registration with a FAIR GUID Registry (DataCite, CDL EZID or DataGUID).
To use a GUID, an associated resolver service (DOI.org, N2T.net, or DataGUIDs.org) redirects to the Landing Service (Figure 2), which is responsible for making both the Core Metadata, and any additional locally cached (extended) metadata, available, with object resolution endpoints. Both human-readable and machine-readable metadata SHOULD be provided; human-readable metadata is presented as a Landing Page.
Landing Services are NOT intended to be centralized across the DCPPC. They are microservices under control of specific Object Brokers, such as the Argon Minid service, the Xenon GUID Broker, or the Sodium Object Registration Service (ORS).
An example of how a Landing Service works in practice is the landing page provided by Argon’s minid service for Team Calcium’s Identifier Interoperability document, as shown in Example 1.
Landing Services Federation
The DCPPC model of persistent GUID registration, resolution, and landing services is based on an interoperable GUID services model. By interoperable, we mean that all services of these types implement a type-specific interface that is verifiably able to exchange and make use of GUID Core Metadata for that object type.
Landing services and GUID management services associated with Full Stack (FS) environments need to conform to local requirements of their stacks, while adhering to common interoperability practices and data models. These are defined in (1) the GUID Core Metadata Model (Fenner et al. 2018) and (2) the GUID Services Overview (this document). GUIDs are federated across DCPPC full stacks, KCs, and the public FAIR ecosystem, following these seven guidelines.
- Persistent GUIDs and the full set of their Core Metadata, applicable to the object type, are registered with robust, sustainable, DCPPC-recognized GUID resolvers, by FS- or KC-specific GUID services supporting common APIs.
Figure 2. Object resolution to a landing service.
Team Calcium’s Identifier Interoperability document is identified by the ARK ark:/57799/b9040f and is resolvable via the N2T resolver, by prepending the resolver’s URI https://n2t.net/ to the ARK and doing a “get” on https://n2t.net/ark:/57799/b9040f.
ark:/57799/b9040f then resolves by redirection to a landing service endpoint on the Argon minid service, here: https://identifiers.globus.org/ark:/57799/b9040f.
This landing page on the minid service provides basic metadata, and the object resolution endpoint(s) for the Identifier Interoperability spec document itself.
Minid:b9040f can also be resolved at N2T.net because the minid prefix is registered in N2T.net (harvested from Identifiers.org), i.e., the prefix specifies a Data Catalog of minids.
The minid landing service for this document provides two object resolution endpoints at which the actual document is available:
- A checksum is also provided to verify the object contents.
Example 1: Landing service resolution of a document identified by an ARK.
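Once a client has fetched the object from one of the endpoints, the checksum from the landing service lets it confirm that the bytes are the ones that were registered. A minimal sketch follows. The function name is hypothetical, and SHA-256 as the default algorithm is an assumption; a client should use whichever algorithm the landing service metadata actually names.

```python
import hashlib


def verify_checksum(content: bytes, expected_hex: str,
                    algorithm: str = "sha256") -> bool:
    """Compare the digest of downloaded bytes with the registered checksum."""
    digest = hashlib.new(algorithm, content).hexdigest()
    # Hex digests are case-insensitive, so normalize before comparing.
    return digest == expected_hex.strip().lower()
```

A mismatch means the retrieved content should not be trusted as the identified object, regardless of which endpoint served it.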
GUIDs are resolved by these recognized resolvers, which then redirect to an interoperable DCPPC landing service, whose URI was defined at GUID registration.
Landing services provide the Core Metadata in response to Core API calls. The authoritative source for Core Metadata is the public resolver for the GUID being resolved. Core metadata SHOULD be provided in both human- and machine-readable form. In human-readable form, the metadata is presented as a Landing Page.
Landing services support mapping and translation of core metadata into common DCPPC vocabularies and format, currently schema.org vocabulary and JSON-LD serialization.
Landing services MAY provide additional metadata from local cache, or from real-time queries to other services, as illustrated in Figure 3.
Figure 3. Optional metadata supplementation from real-time queries to other services.
Content negotiation to provide machine-readable metadata in a variety of formats SHOULD be supported.
Currently supported public GUID registration services are DataCite for DOIs, CDL EZID for ARKs, and DataGUIDs.org for DataGUIDs. Currently supported type-specific resolvers are doi.org for DOIs, n2t.net for ARKs, and DataGUIDs.org for DataGUIDs.
The landing service model is intended to associate specific well-supported stable resolvers with defined GUID types, avoiding “get” requests across potentially many services to resolve a single GUID, while also allowing for distributed resolution and provision by FS-associated brokers of any extended metadata they have cached. It also harmonizes differences in treatment of metadata between EZID, DataCite, and DataGUIDs.org; and it enables a common API aligned on the DCPPC Core Metadata Model.
A standard API is specified for any DCPPC-compliant landing service so that regardless of which landing service is specified as the metadata resolution endpoint at object registration, all will be responsive to the same standard REST calls.
All landing service Core Metadata is meant to be registered with a set of DCPPC-recognized long-term-stable resolvers strongly associated with specific GUID types; currently these are DataCite and DOI.org for DOIs, CDL EZID and N2T services for ARKs, and DataGUIDs.org for DataGUIDs. The specific elements registered will vary depending upon the type of object being registered. Importantly, however, object or GUID brokers or registration services may register additional metadata of any type locally, and operate on it as extensions to the service specifications given in subsequent documents.
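One way a landing service might combine publicly registered Core Metadata with locally held extensions is sketched below. It assumes both are available as flat dictionaries, and it lets the registered Core Metadata win on any conflicting key, since the public resolver is described above as the authoritative source. The function name and the key names in the usage note are hypothetical.

```python
from typing import List, Tuple


def merge_metadata(core: dict, extended: dict) -> Tuple[dict, List[str]]:
    """Merge local extended metadata under the registered Core Metadata.

    Returns the merged record and the sorted list of keys where a local
    value was overridden by the authoritative core value.
    """
    merged = {**extended, **core}  # later entries win, so core overrides
    overridden = sorted(
        key for key in extended if key in core and extended[key] != core[key]
    )
    return merged, overridden
```

For instance, merging core `{"name": "A"}` with local `{"name": "B", "fs:project": "x"}` returns `{"name": "A", "fs:project": "x"}` and reports `["name"]` as overridden, which a broker could log as a sign that its cache is stale.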
Issue: Should “public registration” of all DCPPC-supported GUIDs be required, e.g. must minids be registered as ARKs (per Foster 2015) and their metadata posted to the public service maintained at CDL?
Proposed Resolution: All DCPPC GUIDs MUST be resolvable [see Issue 7.4] via a DCPPC-declared service [see Issue 7.3] to a public Landing Service, which SHOULD provide a set of FAIR metadata as defined in the DCPPC Core Metadata Specification.
A Landing Service SHOULD provide metadata in both human and machine readable form, as defined in the Core Metadata Specification.
In some cases (e.g., embargo) not all metadata will be immediately available, but the Landing Service SHOULD at least confirm the existence of the identifier.
Status: Closed. Agreed and documented via changes to 4.3 above.
Provision of Object Endpoints
Issue: Should the object endpoints of GUIDs [see Issue 7.4] be registered with their resolver, or may they be kept opaque and only accessible from a landing service to which the resolver redirects?
Proposed Resolution: A Landing Service MUST provide object endpoints for access to the bit representation of the object, e.g. actual data, software, etc. These MAY be restricted to authorized users.
Status: Closed. Agreed and documented via changes to 4.5 above.
GUID Types and Services
Issue: Should any and all FS and KC services conforming to the DCPPC model be allowed to register and resolve all types of GUIDs?
Proposed Resolution: KC2 recognizes DataCite DOIs, ARKs, Minids (registered as ARKs), and DataGUIDs as DCPPC-supported persistent GUID types.
DataCite DOIs are resolved (to an associated landing service) at https://doi.org.
ARKs and Minids are resolved at https://N2T.net.
DataGUIDs are resolved at https://dataguids.org.
Full DCPPC compliance for GUID services will be defined in a separate API specification.
Status: Closed. Agreed and documented (changes throughout).
Resolvable Object Endpoints
Issue: Define “resolvable” as used in 7.1 and 7.2 above. The definition should accommodate the various types of object endpoints used by DCPPC participants, e.g. HTTP URIs, iRODS, etc.
Proposed Resolution: Resolvable object endpoints are defined as Internet-resolvable URIs.
Status: Closed. Agreed and resolved via added definition of Object Resolution Endpoint.
Fourth Use Case: Primary Datasets Originating with Data Stewards
Issue: Sarala has pointed out that we are missing a key use case regarding Data Stewards.
Proposed Resolution: Fourth use case added prior to the initial three.
Status: Closed. Use case added with agreed text.
Include Compact Identifiers and their Implications
Issue: Sarala has pointed out that the prior version of this document focuses on landing services for GUIDs registered at the object level, but does not really discuss Compact Identifiers in an adequate way.
Proposed Resolution: Rewrite section 4 - Principles of Operation, to adequately represent Compact Identifier services. Rewrite section 6 - Landing Services, as a Discussion section to narrate how the principles of operation are orchestrated in practice to register and resolve the different categories of GUIDs.
Status: Closed. Agreed and resolved via Section 4 rewrite.
Altman, M., C. Borgman, M. Crosas and M. Martone (2015). "An introduction to the joint principles for data citation." Bulletin of the Association for Information Science and Technology 41(3): 43-45. https://doi.org/10.1002/bult.2015.1720410313
Berners-Lee et al. (2005) “Uniform Resource Identifier (URI): Generic Syntax”. IETF RFC 3986. https://tools.ietf.org/html/rfc3986
Chard, K., M. D'Arcy, B. Heavner, I. Foster, C. Kesselman, R. Madduri, A. Rodriguez, S. Soiland-Reyes, C. Goble, K. Clark, E.W. Deutsch, I. Dinov, N. Price, A. Toga (2016). “I'll take that to go: Big data bags and minimal identifiers for exchange of large, complex datasets”. Proceedings of the 2016 IEEE International Conference on Big Data, Washington DC, December 5-8, 2016. http://bd2k.ini.usc.edu/pdf/publications/bagminid.pdf
CODATA/ITSCI Task Force on Data Citation (2013). "Out of cite, out of mind: The Current State of Practice, Policy and Technology for Data Citation." Data Science Journal 12: 1-75. https://doi.org/10.2481/dsj.OSOM13-043
Cousijn, H., et al. (2018) A Data Citation Roadmap for Scientific Publishers. bioRxiv Jan. 19, 2017. DOI: https://doi.org/10.1101/100784
Data Citation Synthesis Group (2014). "Joint Declaration of Data Citation Principles." FORCE11. https://doi.org/10.25490/a97f-egyk
Fenner et al. (2018). “Core Metadata for GUIDs.” NIH Data Commons Pilot Program Consortium, DRAFT Document, 31 May 2018. https://tinyurl.com/y8dbuzut
Foster, I. (2015). “Minid: A BD2K Minimal Viable Identifier Pilot: Draft of November 7, 2015”. https://tinyurl.com/y8a6pxm4
Leach, P. et al. (2005) A Universally Unique IDentifier (UUID) URN Namespace. RFC 4122, Internet Engineering Task Force. https://www.ietf.org/rfc/rfc4122.txt
McMurry, J. et al. (2017) “Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data”. PLOS Biology 15(6): e2001414. https://doi.org/10.1371/journal.pbio.2001414
Royal Society (2012). Science as an Open Enterprise. London, The Royal Society Science Policy Center. https://tinyurl.com/y7y6q7o4
Smith, A. M., Katz, D. S. and Niemeyer, K. E. (2016). "Software citation principles." PeerJ Computer Science 2: e86. https://doi.org/10.7717/peerj-cs.86
Starr, J., et al. (2015). "Achieving human and machine accessibility of cited data in scholarly publications." PeerJ Computer Science 1:e1. https://doi.org/10.7717/peerj-cs.1
Uhlir, P. et al. (2012). Developing Data Attribution and Citation Practices and Standards. Washington DC: National Academies Press.
Wilkinson et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3:160018. PMID: 26978244; PMCID: PMC4792175; DOI: https://doi.org/10.1038/sdata.2016.18
Wimalaratne, S.M., et al. (2018) Uniform resolution of compact identifiers for biomedical data. Scientific Data 2018, 5:180029. DOI: http://doi.org/10.1038/sdata.2018.29
- However, if not all mandatory Core Metadata elements are posted to the native resolver, the scope of interoperability is limited to that of the referenced landing service.