5M.3.DEMO: Data Analysis: User adds data, workflow, harmonize with repository
This deliverable is mainly a progress report and plan for the Data Analysis Demo. The demo itself is now estimated for completion in early October, so the progress report and plan were submitted in place of the full demo.
What they achieved
The general steps of the planned demo are:
1. Stacks onboard GTEx WGS data and GUIDs for files.
2. Full Stacks divide up GTEx WGS data. Each Stack will end up processing ~a quarter of the data through the alignment workflow.
3. Full Stacks implement a way of sharing data between stacks.
4. Full Stacks share auth tokens so they can all access the other 3 stacks.
5. Full Stacks actually run their subsets of GTEx CRAMs through TOPMed alignment workflow, then share the outputs with Team Calcium.
6. Team Calcium runs joint variant calling on all re-aligned CRAMs, then shares the results with Helium, Argon, and Xenon.
7. All four Full Stacks perform downstream analysis.
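As an illustration of step 2, the split of files across the four Full Stacks could be sketched as a simple round-robin assignment of file GUIDs. This is only a sketch under stated assumptions: the plan does not say how the data will actually be divided, the `partition_guids` helper is hypothetical, and the GUID values are made-up placeholders rather than real GTEx identifiers.

```python
from typing import Dict, List

# Stack names come from the plan above; everything else here is illustrative.
STACKS = ["Calcium", "Helium", "Argon", "Xenon"]


def partition_guids(guids: List[str], stacks: List[str]) -> Dict[str, List[str]]:
    """Assign GUIDs to stacks round-robin so subset sizes differ by at most one."""
    assignment: Dict[str, List[str]] = {stack: [] for stack in stacks}
    # Sort first so every stack computes the same assignment from the same input.
    for index, guid in enumerate(sorted(guids)):
        assignment[stacks[index % len(stacks)]].append(guid)
    return assignment


# Placeholder GUIDs (hypothetical), standing in for the onboarded file identifiers.
example_guids = [f"guid-{n:03d}" for n in range(10)]
subsets = partition_guids(example_guids, STACKS)

for stack, files in subsets.items():
    print(stack, len(files))
```

Sorting before assignment keeps the split deterministic, so each Stack could derive its own subset independently from the same GUID list without further coordination.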
Why is this valuable?
One of the Key Capabilities (KCs) of the Data Commons is multiple robust and sustainable software stacks implementing Commons standards. These platforms need to cooperate on a base layer of functionality, and they need to interoperate to achieve the core Commons goals.
This demo is important because it shows how the Stacks support KC standards, run common workflows on data provided by Data Stewards, support cross-platform workflows, and address the need for interoperability.