
5M.3.DEMO: Data Analysis: User adds data, workflow, harmonize with repository

Description

This deliverable is mainly a progress report and plan for the Data Analysis Demo. The demo itself is now estimated for completion in early October, so the progress report and plan were submitted for this deliverable in place of the full demo.

What they achieved

The general steps of the planned demo are:

1.   Stacks onboard GTEx WGS data and GUIDs for files.

2.   Full Stacks divide up the GTEx WGS data. Each Stack will end
     up processing roughly a quarter of the data through the
     alignment workflow.

3.   Full Stacks implement a way of sharing data between
     Stacks.

4.   Full Stacks share auth tokens so that each one can access
     the other three Stacks.

5.   Full Stacks run their subsets of GTEx CRAMs through the
     TOPMed alignment workflow, then share the outputs with
     Team Calcium.

6.   Team Calcium runs joint variant calling on all re-aligned
     CRAMs, then shares the results with Helium, Argon, and
     Xenon.

7.   All four Full Stacks perform downstream analysis.
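
As an illustration of step 2 only, the split of files across the four Full Stacks could look something like the sketch below. It is a minimal example under stated assumptions: the round-robin rule, the `partition_guids` helper, and the placeholder GUID strings are all hypothetical and are not taken from the teams' actual plan. Only the four team names come from the steps above.

```python
# Team names come from the steps above; the ordering is an assumption.
STACKS = ["Calcium", "Helium", "Argon", "Xenon"]


def partition_guids(guids, stacks=STACKS):
    """Assign file GUIDs to stacks round-robin (hypothetical rule).

    Sorting first makes the split deterministic, so every stack can
    compute the same assignment independently from the same GUID list.
    """
    assignment = {stack: [] for stack in stacks}
    for index, guid in enumerate(sorted(guids)):
        assignment[stacks[index % len(stacks)]].append(guid)
    return assignment


# Placeholder identifiers, not real GTEx GUIDs.
example_guids = [f"guid-{n:03d}" for n in range(10)]
shares = partition_guids(example_guids)

for stack, files in shares.items():
    print(stack, len(files))
```

With this rule the share sizes differ by at most one file, which matches the "roughly a quarter each" intent. A real split might instead balance by file size or storage location.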

Why is this valuable?

One of the Key Capabilities (KCs) of the Data Commons is multiple robust and sustainable software stacks implementing Commons standards. These platforms need to cooperate on a base layer of functionality, and interoperability among them is needed to achieve the core Commons goals.

This demo is important because it shows how the Stacks support KC standards, run common workflows on data provided by Data Stewards, support cross-platform workflows, and address the need for interoperability.