Lessons learned: Project organization.
To identify the overall strategic and tactical objectives of the project, we generated a series of reports for each working group that listed the strategic goals across the groups and itemized the milestones required to achieve those goals. We also generated a Project Execution Plan containing our mission, overarching goals, initial timeline, project-wide milestones, and smaller tasks to be executed by each individual group. While this served as an initial framework, it was evident that, given the size and complexity of the overall consortium, we would need several forms of communication and interaction that were not originally contemplated in the OT announcement. The following improvements were rapidly deployed to address these concerns:
Intergroup communication. In addition to regular conference calls of all KC groups and stacks, monthly face-to-face meetings were used to increase KC/stack and KC/KC interactions. These internal workshops allowed consortium members to develop strong working relationships that solidified the notion that we are a community of diverse people working towards a common goal. This sense of community and trust allowed us to achieve a level of cooperation, collaboration, and sharing that would not have been possible if all our cross-team communication had been virtual. The workshops have been heavily subscribed (Fig. 1) and will continue throughout the course of the project.
Documentation of technical products. As an additional layer designed to improve project-wide communication and oversight, members of both stacks and KCs are creating a Request for Comment (RFC) collection, which also formally records the groups that have reviewed and commented on each document. We also anticipate that the Design Guidelines Working Group will help with oversight of how products are developed for the pilot. Finally, the working group face-to-face (F2F) meetings that have been initiated represent a significant improvement in technical communication and consensus building across the DCPPC.
Use case analysis. Perhaps one of the most remarkable aspects of this project is the broad diversity of objectives that it has been expected to accomplish, which reflects the broad-ranging needs of the research community. All awardees entered the DCPPC with their own specific aims for their components of the project, and consultation with NIH leadership has also revealed a broad, sometimes conflicting, set of expectations for the functionality of the Commons. A coherent plan has been established in part by employing user-centric design principles, under which the scientific objectives, analysis tasks, and workflows of the Commons have been carefully evaluated. This work has been carried out in part through the formation of a Design Guidelines Working Group composed of personnel from the KCs, stacks, and the Data Stewards. Its primary output has been to collect narratives for researchers seeking to analyze biomedical data and to translate these narratives into a use case library. These use cases have served, and will continue to serve, as drivers for all development of the Commons, and they are drawn upon heavily in the evaluation of monthly internal demos presented by each of the Commons working groups.
Commons Consortium Coordination Committee. Roughly halfway through the first phase of the pilot period, NIH staff assembled a Commons Consortium Coordination Committee (C4) to address several important concerns related to the DCPPC. The C4 overlays additional communication onto the DCPPC by formalizing important roles for several group leaders. The C4 has points of contact to oversee Project Management, Full Stacks, KC Interactions, DCPPC Architecture, Communications, and Data Management & Access. The C4 is meant in part to address the significant time investment in DCPPC organization that effective channeling of communication requires. Each point of contact can more easily serve as a member of ad hoc groups that respond rapidly to issues as they arise during the project. This enables them to work closely with all team members by assisting with review of DCPPC activities and operations, encouraging the use of the Request for Comment (RFC) process, assisting with policy development, and supporting other activities.
DCPPC community onboarding. Many people have joined this project (Fig. 2), and the community continues to grow daily. The onboarding process we developed for Phase 1 ties each new community member to a team, ensures they are signed up for the appropriate communications channels (e-mail, Slack, etc.), and enables access to Consortium resources. A separate whitelisting process provides access to PHI if the appropriate data access agreements have been filed.
Onboarding data from the stewards. Prior to the start of the project, the three Data Steward communities received resources to assist in harmonizing and adding value to their datasets for the Commons; however, many critically important issues had yet to be addressed. The Data Stewards have been actively involved in several areas related to data accessibility and interoperability. Members of the Data Stewards participate in working groups such as KC1-FAIR guidelines and metrics, KC2-Global Unique Identifiers, KC7-Indexing and Search, and KC8-Use Cases/Design Guidelines. They have been instrumental in submitting Use Cases and developing a Use Case Library. Members of the Data Stewards regularly attend Face-to-Face Workshops and have presented breakout sessions to outline their data, explain complexities, and examine issues. TOPMed has identified and made available a small number of datasets to full stacks and working groups for development purposes. GTEx, AGR, and the MODs provide access to datasets through FTP downloads, APIs, and S3 cloud buckets. AGR/MODs have worked with the KC1 group (FAIR guidelines and metrics) to review initial efforts to assess the degree of FAIRness of these resources for tools and datasets and to modify rubrics for greater applicability to such resources. RGD and WormBase are using the assessment results to modify aspects of their sites to increase FAIRness. These modifications include greater visibility of license and citation information, as well as the development of GitHub sites that make the code for tools accessible.
One long-term issue that we need to address is the social incentives for data resources to participate in the Data Commons. We have not yet established a sustainable long-term model for participation in the Commons. Review of the incentives for participation of the current Data Stewards, as well as future contributors to the Commons, is of critical importance going forward and will be among the issues addressed by the C4.