The Team Argon approach for user identification, authentication, and authorization is Globus Auth: a fully compliant implementation of OpenID Connect 1.0 (OIDC) and OAuth 2.0 (OAuth2), integrated with the U.S. InCommon Federation and international eduGAIN identity providers, and easily extensible to support other federated identity providers. The Globus Auth service, hosted on Amazon Web Services (AWS) and professionally operated by Globus at the University of Chicago, has a user base that includes 90,000 users (37,000 new users in the past year) from 475 institutions, with 1,000+ registered applications and services: a base that is rapidly growing. Here we review its features and propose ways in which it can be used in the Data Commons.
To fully realize the FAIR principles, the Data Commons must invite use by researchers and encourage consistent adoption of its security model by contributing development teams. Ease of use, familiarity, and interoperability with other systems and services are vital to the success of the Data Commons and must be fully realized in its security mechanisms. The more complex and onerous a security model becomes, the less value it provides. Globus Auth provides the following key features that we believe are essential to the success of research systems like the Data Commons. Details on each, along with references for further exploration, follow.
Widespread use in Web, mobile, and command line apps, with availability of myriad plugins, modules, SDKs, and adapters for development frameworks
Immediate ability to use the InCommon/eduGAIN identity providers at >450 academic/research institutions, plus ORCID, Google, and national facility providers such as XSEDE
Level-of-assurance features for both identity providers and authentication methods, to ensure timely, multi-factor authentication where appropriate
A strong support base (including subscription-based funding) from participating academic/research institutions, national labs, and NSF via XSEDE
Identity sets - exploit organizational affiliations to locate, reference, and recognize peers and colleagues
Single sign-on across distributed systems integrating services from multiple providers
OAuth2 authorization - create new services whose REST APIs are secured using OAuth2 tokens
Widespread use, familiarity, developer support¶
OIDC and OAuth2 are in wide use throughout the Web, which means that developers using nearly any development environment or framework have access to plugins, SDKs, and drop-in modules that can be used with Globus Auth. Integrating a Web application with Globus Auth is the same as adding "Login with Google" or other social login services. (In fact, Globus Auth works with off-the-shelf social login plugins in commercial and open source services like Atlassian JIRA and Confluence, Drupal-based websites, and WordPress blogs.) Applications that are downloaded and installed on the user's own device (including mobile and command line apps) can use Globus Auth's "native application" support.
The Globus Auth service provides a self-service registration interface for applications and services. OAuth2 requires applications and services to be registered with their OAuth2 provider, so in order to lower barriers to use, Globus Auth's registration is fully automated and can be used by anyone on the Internet. This kind of support for developers--providing online interfaces that can be used 24/7 with immediate responses--will be essential to encouraging consistent use of Globus Auth among service providers in the Data Commons.
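As a minimal sketch of what a "native application" login request might look like, the following builds a standard OAuth2 authorization-code URL with PKCE (RFC 7636) using only the Python standard library. The client ID and redirect URI are hypothetical placeholders for values obtained through application registration, and the endpoint URL is the Globus Auth authorization endpoint as we understand it at the time of writing; treat both as assumptions to verify against the current documentation.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Assumed Globus Auth authorization endpoint; verify against current docs.
AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize"


def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method of RFC 7636."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_login_url(client_id, redirect_uri, scopes, state, code_challenge):
    """Build the URL that the user's browser is sent to for login and consent."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(scopes),
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


verifier, challenge = make_pkce_pair()
login_url = build_login_url(
    client_id="00000000-0000-0000-0000-000000000000",  # hypothetical client ID
    redirect_uri="https://example.org/callback",        # hypothetical redirect URI
    scopes=["openid", "profile", "email"],              # standard OIDC scopes
    state=secrets.token_urlsafe(16),
    code_challenge=challenge,
)
print(login_url)
```

After the user logs in, the application would exchange the returned code, together with `verifier`, at the token endpoint. That step requires a network call and is omitted here.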
Academic and research identity providers¶
A key benefit of Globus Auth over other social login services is its ready-to-use integration with hundreds of academic and research institutions. Most of the Data Commons' target users are at institutions whose authentication services are available via Globus Auth. Those who are not can create identities at national or academic service providers (e.g., XSEDE, ORCID) or at commercial providers (e.g., Google), all of which can be used with Globus Auth. Globus already supports NIH Trust logins and is adding the eRA Commons as an identity provider, allowing eRA Commons identities to be used throughout services that leverage Globus Auth. Data Commons users will authenticate (log in) using an organization to which they already belong, and then--on first use--register additional information to establish a profile with the Data Commons.
Identity assurance and policy support¶
Of course, supporting many existing identity providers introduces the need to differentiate their diverse levels of assurance and enforce minimum standards. For example, a data provider may not allow their data to be accessed by users who are authenticated solely via Google. By the end of the pilot period, Globus Auth will allow applications and services to verify the timeliness and forms of authentication used (e.g., to require multi-factor authentication, authentication within the last hour, authentication in the same web browser session) and to limit acceptable identity providers, either for authentication (e.g., require use of one or more providers for authentication) or for identification (e.g., require that the user has previously registered with one or more providers). While these assurance features are vital for enabling access to restricted services and data and otherwise supporting community policies, we are taking pains to ensure that application developers who use Globus Auth can enable these features in their applications without adding significant complexity to their code.
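To illustrate how little code such checks might require on the application side, here is a sketch of an assurance check over already-verified token claims. The `auth_time` and `amr` claim names come from OpenID Connect Core, and the `mfa` value from RFC 8176. The `identity_provider` claim, the function name, and its parameters are hypothetical and are not part of any Globus Auth interface.

```python
def meets_assurance(claims, *, now, max_auth_age_s=None,
                    require_mfa=False, allowed_idps=None):
    """Return (ok, reasons) for a dict of claims from a verified token.

    `identity_provider` is a hypothetical claim name used for illustration.
    """
    reasons = []
    if max_auth_age_s is not None:
        auth_time = claims.get("auth_time")
        if auth_time is None or now - auth_time > max_auth_age_s:
            reasons.append("authentication too old")
    if require_mfa and "mfa" not in claims.get("amr", []):
        reasons.append("multi-factor authentication required")
    if allowed_idps is not None and claims.get("identity_provider") not in allowed_idps:
        reasons.append("identity provider not accepted")
    return (not reasons, reasons)


now = 1_700_000_000  # fixed timestamp so the example is reproducible
claims = {
    "sub": "user-123",
    "auth_time": now - 600,            # authenticated ten minutes ago
    "amr": ["pwd", "mfa"],
    "identity_provider": "university.example.edu",
}
ok, reasons = meets_assurance(
    claims,
    now=now,
    max_auth_age_s=3600,               # "authentication within the last hour"
    require_mfa=True,
    allowed_idps={"university.example.edu"},
)
print(ok, reasons)
```

Returning the list of unmet requirements, rather than a bare boolean, lets the application tell the user which step (for example, re-authenticating with a second factor) would satisfy the policy.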
Professional hosting & broad base of support¶
The Globus Auth service is professionally hosted on Amazon Web Services (AWS) by Globus at the University of Chicago. It has been in operation for two years in its current form and there has been only one downtime (a scheduled interruption for upgrade, lasting 68 minutes) during that period, giving Globus Auth a 99.994% uptime since its launch.
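The uptime figure can be checked with a few lines of arithmetic. This sketch assumes the two-year period is exactly two 365-day years; the exact launch date is not given here.

```python
MINUTES_PER_YEAR = 365 * 24 * 60           # assumes 365-day years
total_minutes = 2 * MINUTES_PER_YEAR       # two years of operation
downtime_minutes = 68                      # the single scheduled interruption

uptime_pct = 100 * (1 - downtime_minutes / total_minutes)
print(f"{uptime_pct:.3f}%")
```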
Globus Auth is a free service, available for use by the academic and research community without fees or subscriptions. Support for Globus Auth is funded by academic and research institutions (and other initiatives) that subscribe to the Globus research data service and that receive access to priority support and other premium features. Research systems that use Globus Auth in their applications include XSEDE, Compute Canada, NCAR, NERSC, University of Michigan, and University of Exeter. Globus subscribers like these understand the advantages of supporting Globus to the benefit of the entire research community.
Leveraging organizational affiliation¶
Designed for use by researchers, Globus Auth offers a feature called identity sets. Most researchers have affiliations with multiple institutions and organizations (e.g., University of Southern California, ORCID, NIH, Google) and research colleagues know each other through different affiliations. Globus Auth allows users to link their various identities into an identity set. Applications can use these identity sets to help users find each other using each other's full set of affiliations and related identities.
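The idea of an identity set can be sketched as a toy in-memory structure in which linking two identities merges the sets they belong to. This is only an illustration of the concept; the class, its methods, and the example identities are hypothetical and do not reflect how Globus Auth stores or exposes identity sets.

```python
class IdentitySets:
    """Toy model: every identity belongs to exactly one set of linked identities."""

    def __init__(self):
        self._set_of = {}  # identity -> set shared by all of its linked identities

    def link(self, a, b):
        """Merge the identity sets containing a and b."""
        merged = self._set_of.get(a, {a}) | self._set_of.get(b, {b})
        for identity in merged:
            self._set_of[identity] = merged

    def identities_of(self, identity):
        """Return a copy of the full identity set for one identity."""
        return set(self._set_of.get(identity, {identity}))

    def same_person(self, a, b):
        return b in self.identities_of(a)


sets = IdentitySets()
sets.link("jane@university.example", "jane@orcid.example")
sets.link("jane@orcid.example", "jane@mail.example")

# A colleague who only knows Jane's university identity can still match
# a record created under her other linked identity.
print(sets.same_person("jane@university.example", "jane@mail.example"))
```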
Single Sign-On across many services and applications
Consistent use of Globus Auth throughout an enterprise results in a Single Sign-On (SSO) experience for users. After authenticating to one Web application, subsequent access to other Globus Auth-enabled services and Web interfaces can reuse the user's identity information, subject to the user's consent. This SSO experience is an important benefit to the Data Commons because it spans both internal and external applications. The SSO experience is consistent across all applications and services that use Globus Auth: Data Commons interfaces and services, services developed and hosted by researchers, or services provided by other systems such as XSEDE or Compute Canada, all of which may--and often already do--use Globus Auth.
A successful Data Commons will involve many groups creating services that can be leveraged by applications and other services. This ecosystem of services requires a flexible authorization framework that can work across and among services provided by multiple teams. The OAuth2 authorization framework on which OIDC and Globus Auth are built is a good base for this framework, but the Data Commons has already posed use cases that OAuth2 alone cannot handle cleanly. Globus Auth can handle more of these use cases than some other OAuth2 services can, but even it cannot handle them all. Consider the following three generic use cases.
A Data Commons service provider (e.g., a data collection service) needs to authorize access to a REST API so that the API may be used within other applications (e.g., research tools that need to access the data collection).
A Data Commons service (e.g., a data transfer service) needs to obtain OAuth2-based access to another service (e.g., a data collection service) so that it can act as an agent of an individual (e.g., a researcher) to do something on the individual's behalf (e.g., move data from one location to another). (This is often referred to as delegated access.) Note: The fact that the intermediary service is taking an action on an individual's behalf is critical for auditing purposes!
A research application (e.g., analysis tool) needs to obtain access to multiple Data Commons services, but each service is authorized by a different OAuth2 authorization service. (For a simple case, consider two data collections services: one authorized by an OAuth2 service at Harvard, another authorized by an OAuth2 service at Johns Hopkins.)
The first use case is something that Globus Auth--and other popular OAuth2 implementations, including Auth0--can do today. Globus's OAuth2 authorization features can already be used to secure both RESTful APIs (based on HTTPS) and non-HTTPS protocols (e.g., bulk data transfer, SSH). Applications and services that use APIs for integration with larger systems can authenticate and authorize use via OAuth2 tokens. Applications can define their own "scopes" to provide a least-privilege model of authorization: tokens received on initial authentication provide access to a specific set of privileges, and access beyond those privileges requires explicit user and/or application permission. The flexibility and generality of OAuth2 will be needed in the Data Commons as we bring together research communities, data producers, institutions, and the policies they inevitably bring with them.
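A least-privilege scope check on the service side might look like the following sketch. It assumes the service has already obtained a token introspection response in the shape defined by RFC 7662 (an `active` flag and a space-delimited `scope` string). The scope names and the function are hypothetical.

```python
def authorize_request(introspection, required_scope):
    """Decide whether an introspected access token permits an operation."""
    if not introspection.get("active", False):
        return False  # expired, revoked, or unknown token
    granted = set(introspection.get("scope", "").split())
    return required_scope in granted


# Hypothetical scopes defined by a data collection service.
READ_SCOPE = "urn:example:datacommons:collection:read"
WRITE_SCOPE = "urn:example:datacommons:collection:write"

introspection = {"active": True, "scope": f"openid {READ_SCOPE}"}

print(authorize_request(introspection, READ_SCOPE))   # token was granted read access
print(authorize_request(introspection, WRITE_SCOPE))  # write needs further consent
```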
The second use case is something that Globus Auth can do today but, as far as we know, other OAuth2 implementations do not support. To handle this use case, Globus Auth has added the notion of dependent tokens to the OAuth2 flow. This enables cases where services invoke other services, with the same rich least-privilege security model as described above. This "delegation" use case has already become common in today's research and education systems (it's an important feature of the Globus data transfer service) and is certain to be important in the Data Commons as well.
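The delegation pattern can be illustrated with a toy in-memory model. This is not the Globus Auth API: the classes, field names, and service names below are all hypothetical, and real tokens would be issued over HTTPS by the authorization service. The sketch only shows the property that matters for auditing, namely that each downstream token records both the service holding it and the individual on whose behalf it acts.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    value: str
    user: str        # individual the token ultimately acts for
    issued_to: str   # application or service holding the token
    audience: str    # service the token may be presented to
    scope: str
    parent: "Token | None" = None  # token this one was derived from


class ToyAuthServer:
    def __init__(self, dependencies):
        # dependencies: service -> {downstream service: scope it may request}
        self._dependencies = dependencies

    def issue(self, user, client, audience, scope):
        return Token(secrets.token_urlsafe(16), user, client, audience, scope)

    def dependent_tokens(self, service, presented):
        """Exchange a token presented to `service` for tokens to its downstream services."""
        if presented.audience != service:
            raise PermissionError("token was not issued for this service")
        return {
            downstream: Token(secrets.token_urlsafe(16), presented.user,
                              service, downstream, scope, parent=presented)
            for downstream, scope in self._dependencies.get(service, {}).items()
        }


def audit_chain(token):
    """List every hop from the original application to the final service."""
    chain = []
    while token is not None:
        chain.append(f"{token.issued_to} -> {token.audience} for {token.user}")
        token = token.parent
    return list(reversed(chain))


auth = ToyAuthServer({"transfer": {"collection": "collection:read"}})
user_token = auth.issue("researcher@example.edu", "analysis-app", "transfer", "transfer:all")
downstream = auth.dependent_tokens("transfer", user_token)["collection"]
print(audit_chain(downstream))
```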
The third use case is something we believe no OAuth2 service currently supports, even though it appears--when stated simply, as above--to be of obvious importance to the Data Commons. This is not as much a weakness of OAuth2 as it is a testament to the ambitiousness of the Data Commons vision of a FAIR (and federated) environment. As used commonly today, OAuth2 requires application developers to register their applications with each of the authorization services that they need to use. As the number of authorization services increases, of course, this presents a larger and larger burden on the developer. Depending on how those authorization services are configured, it might even require researchers using the application to authenticate more than once when using the application. The user authentication issue could potentially be mitigated by OIDC and identity federation, but not necessarily. The application registration issue cannot be mitigated with known OAuth2 authorization services.
What the application developer really wants for the third use case is a single OAuth2 authorization service with which the application can register, and that can issue access tokens for use with any of the Data Commons services, even those that use their own unique OAuth2 authorization servers.
We propose to do just that with Globus Auth, by making Globus Auth's OAuth2 authorization service support an "OAuth2 token proxy" mode. In this mode, applications will register with Globus Auth and request OAuth2 access tokens for services authorized by other OAuth2 authorization services. The Globus Auth token proxy service will already be registered with the other Data Commons OAuth2 authorization servers and will broker requests for tokens from applications to the other OAuth2 authorization services as needed. We reiterate that this isn't something that Globus Auth supports today. We propose to add this functionality to Globus Auth, sharing the risks and costs with Data Commons and the rest of our user community, to produce a service that becomes a part of the sustainable Globus platform.
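Because the token proxy mode is proposed rather than implemented, the following is purely a conceptual sketch of the brokering idea: an application registers once with the proxy, and the proxy, which is itself registered with each upstream authorization server, obtains tokens on request. Every name here is hypothetical, and the upstream authorization servers are stand-in functions rather than real OAuth2 exchanges.

```python
class ToyTokenProxy:
    """Conceptual model of a proxy that brokers tokens from upstream OAuth2 servers."""

    def __init__(self, upstreams):
        # upstreams: resource server -> callable(user, scope) returning an access token.
        # Each callable stands in for an OAuth2 exchange with that server's
        # authorization service, with which the proxy is already registered.
        self._upstreams = upstreams
        self._registered_apps = set()

    def register_app(self, client_id):
        """Applications register once, with the proxy only."""
        self._registered_apps.add(client_id)

    def get_token(self, client_id, user, resource_server, scope):
        if client_id not in self._registered_apps:
            raise PermissionError("application is not registered with the proxy")
        try:
            issue = self._upstreams[resource_server]
        except KeyError:
            raise LookupError(f"no authorization server known for {resource_server}") from None
        return issue(user, scope)


# Two data collections, each authorized by a different (simulated) OAuth2 service.
proxy = ToyTokenProxy({
    "collection-a": lambda user, scope: f"a-token:{user}:{scope}",
    "collection-b": lambda user, scope: f"b-token:{user}:{scope}",
})
proxy.register_app("analysis-tool")

token_a = proxy.get_token("analysis-tool", "researcher@example.edu", "collection-a", "read")
token_b = proxy.get_token("analysis-tool", "researcher@example.edu", "collection-b", "read")
print(token_a, token_b)
```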
Access management service¶
The Data Commons' vision and FAIR principles clearly point to the need for an access management service that can be used to encode and communicate policies among many--otherwise independent--data services and applications, regarding who has rights to access various resources. This access management service must be flexible and powerful enough to express the real-world policies introduced by data stewards, Data Commons services and applications, laws and regulations, and those who consume and add value to the collective data environment. The access management service must also be easy for the developers of the Data Commons and other research and academic systems to understand and to interface with. For all of these reasons, it most likely can only succeed if it serves as a third party to the data providers, to the data consumers, and to their identity providers.
As service providers to the research community for almost 25 years, we can say with confidence that there currently is no such access management service that meets the criteria above.
Foremost among the deficits is the lack of widespread adoption and de facto standardization. Unlike OIDC and OAuth2, both of which have huge support in both commercial and open source communities, you will find no access management service that enjoys that level of familiarity and support.
We believe, however, that Globus Auth's benefits in other areas (its basis in OIDC and OAuth2, its connections to InCommon and eduGAIN identity providers, its large user base in research & education, a lightweight and approachable developer experience) offer a sound base on which the Data Commons can build the independent access management service its vision calls for. We will show these advantages during the pilot phase through our integrations with other stacks and demonstrations, and we trust that this evidence will allow consideration of the following approach to developing an access management service for the Data Commons based on Globus Auth.
To construct an access management service that satisfies and enables the FAIR principles, we will extend Globus Auth to be an attribute broker, enabling other Data Commons services to become third-party attribute providers. Attributes will be associated with OAuth2 access tokens and include such things as identity sets, group memberships, assurance information, and other attributes provided by upstream identity and attribute providers. (No one knows, this early in the Data Commons Pilot, what those may be, which is why we need flexibility and independence from existing frameworks.) For example, dbGaP could provide dataset authorization attributes, defined in reference to identities from eRA Commons, InCommon identity providers, and other appropriate identity providers. The Data Commons must either adopt or develop a common access policy language, which we believe should be based on ideas from AWS's Identity and Access Management (IAM), OASIS's XACML, and other present and past efforts. The access policy language would leverage the attributes (identity and other) flowing from Globus Auth. As has been done in AWS IAM, specific policies might be expressed for a given service or shared across services via a central clearinghouse.
As in Globus Auth, these two core elements--the attribute broker and the access policy language--would be realized as a suite of centrally hosted APIs (services) and libraries responsible both for storing access policies and for evaluating policies within applications and services. The ultimate success of these APIs and libraries will depend on how easily the developers of existing applications and services can understand them and interface with them.
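Since no policy language has been chosen, the following is only a sketch of how attribute-based policy evaluation might work. It borrows two ideas from AWS IAM: requests are denied by default, and an explicit deny overrides any allow. The policy format, attribute names, and dataset identifiers are all hypothetical.

```python
def evaluate(policies, attributes, action, resource):
    """Return "allow" or "deny" for a request.

    attributes: attribute name -> set of values brokered for the requester.
    A policy applies when the action and resource match and every condition's
    required values are a subset of the requester's values for that attribute.
    """
    decision = "deny"  # deny by default
    for policy in policies:
        if action not in policy["actions"] or resource not in policy["resources"]:
            continue
        conditions_met = all(
            required <= attributes.get(name, set())
            for name, required in policy["conditions"].items()
        )
        if not conditions_met:
            continue
        if policy["effect"] == "deny":
            return "deny"  # an explicit deny always wins
        decision = "allow"
    return decision


policies = [
    {"effect": "allow", "actions": {"read"}, "resources": {"dataset-1"},
     "conditions": {"dataset_approvals": {"dataset-1"}, "amr": {"mfa"}}},
    {"effect": "deny", "actions": {"read"}, "resources": {"dataset-1"},
     "conditions": {"groups": {"suspended"}}},
]

# Attributes as they might arrive from several third-party providers.
attributes = {
    "dataset_approvals": {"dataset-1", "dataset-7"},  # e.g., from a data steward
    "amr": {"pwd", "mfa"},                             # assurance information
    "groups": {"lab-42"},                              # group memberships
}
decision = evaluate(policies, attributes, "read", "dataset-1")
print(decision)
```

Keeping the evaluator a pure function of policies and attributes is one way to let the same logic run both in a central service and in a library embedded in an application.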
While we believe strongly that the Data Commons can only be successful with an access management service that satisfies the criteria identified above, and are confident that such an access management service must be built, we must note that building a new service carries with it costs, risks, and timelines that are different from those associated with integrating things that have already proven widespread success through adoption and use. However, we also reiterate that we are unaware of any credible alternative. Our proposition is that Globus will build the general authorization service described above to meet the needs of the Data Commons, sharing the risks and costs with the Data Commons and the rest of our user community. This service will then become a standard part of the sustainable Globus platform.
Most of the features mentioned in this document are already available via the Globus Auth platform. The lists below identify which features are available now and when and how the others will become available.
Features available now in Globus Auth:
Widespread use beyond the Data Commons
Widespread availability of developer support
Professional AWS hosting
Experienced user & developer support
Broad base of financial support
Integrated with >450 academic/research identity providers, also including national and commercial providers
Identity sets - linked institutional affiliations
Require registration with specific identity providers
Single sign-on (SSO)
Authorize access to REST APIs
Authorize access to non-REST protocols
Support for desktop and mobile applications
Delegation support (service A accesses service B on behalf of specific user) via dependent tokens
Features available in Globus Auth by the end of the 180-day Phase I period:
Require recent and same-session authentication
Require multi-factor authentication
Require authentication with specific identity providers
Features we would like to provide in Globus after the 180-day Phase I period, subject to negotiation and funding:
Attribute broker API (accept attributes from 3rd parties)
Access policy storage and evaluation API
OAuth2 token proxy mode (offer tokens from "federated" OAuth2 authorization servers)
Documentation for the Globus Auth API (including OIDC and OAuth2 implementation plus Globus additions) includes a Developer's Guide, the Auth API Reference, and the formal Globus Auth Specification.
The documentation for the Globus Python Software Development Kit (SDK) has a section on Globus Auth and also includes several basic examples that use the Auth API in Python. Note: This SDK is open source and hosted on GitHub.
S. Tuecke et al., "Globus Auth: A research identity and access management platform," IEEE 12th International Conference on e-Science, Baltimore, MD, 2016, pp. 203-212. doi: 10.1109/eScience.2016.7870901 https://www.globus.org/sites/default/files/GlobusAuth.pdf