BioJupies: Automated Generation of Interactive Notebooks for RNA-seq Data Analysis in the Cloud

Authors: Denis Torre, Alexander Lachmann, Derek Wang, Avi Ma'ayan (Team Nitrogen, PI: Ma'ayan)

Contact point: Avi Ma'ayan

Links: bioRxiv pre-print

Tags: KC1, Jupyter, Workflows, RNA-seq, GTEx, Cloud Computing, Signatures, Data Visualization, Pipelines

Interactive notebooks, such as Jupyter Notebooks, can make bioinformatics data analyses more transparent, accessible, and reusable. They are customizable and can be packaged within Docker containers, which lets them run on many systems. However, creating notebooks requires computer programming expertise. The Data Commons aims to make tools such as interactive notebooks open to scientists with a broad range of skills. As part of this goal, the team developed BioJupies, a web server that enables automated creation, storage, and deployment of Jupyter Notebooks containing RNA-seq data analyses. Through an intuitive interface, novice users can rapidly generate tailored reports to analyze and visualize their own raw sequencing files or gene expression tables. In addition, >11,000 RNA-seq samples from the publicly available GTEx dataset, or >250,000 samples from >8,000 published RNA-seq studies from the Gene Expression Omnibus (GEO), can be fetched into BioJupies. Generated notebooks contain executable code for the entire pipeline, rich narrative text, interactive data visualizations, and differential expression and enrichment analyses. The notebooks are permanently stored in the cloud and made available online through a persistent URL for retrieval. By providing an intuitive user interface for generating notebooks for RNA-seq data analysis, from the raw reads all the way to a complete interactive and reproducible report, BioJupies is a useful resource for experimental and computational biologists. In addition to making Jupyter Notebooks more accessible, the BioJupies platform is significant to the Data Commons because it offers a new way to communicate research results. Instead of waiting for a publication to appear months or years after the data were collected, investigators who use BioJupies can communicate their newly acquired results in minutes.
BioJupies is freely available as a web-based application and as a Chrome extension from the Chrome Web Store.
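
To make the idea of automated notebook creation concrete, here is a minimal sketch in Python of how a program might assemble a Jupyter Notebook from a user's selections. It is not the BioJupies implementation. It relies only on the fact that a notebook file is a JSON document in the Jupyter nbformat 4 layout. The function names, the dataset file name, and the placeholder analysis code are all hypothetical and chosen for illustration.

```python
import json


def make_markdown_cell(text):
    """Return a narrative (markdown) cell in the Jupyter nbformat 4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def make_code_cell(code):
    """Return an unexecuted code cell in the Jupyter nbformat 4 layout."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def build_notebook(title, dataset, analyses):
    """Assemble a notebook dict: a title cell, then a heading cell and a
    code cell for each (heading, code) pair in `analyses`.

    All argument names are hypothetical, not taken from BioJupies.
    """
    cells = [make_markdown_cell(f"# {title}\n\nDataset: {dataset}")]
    for heading, code in analyses:
        cells.append(make_markdown_cell(f"## {heading}"))
        cells.append(make_code_cell(code))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Hypothetical user selections; the code strings are placeholders only.
notebook = build_notebook(
    title="RNA-seq analysis report",
    dataset="example_counts.tsv",
    analyses=[
        ("Load data", "# placeholder: read the expression table"),
        ("Differential expression", "# placeholder: compare two sample groups"),
    ],
)

# Serialize to the JSON text that would be saved as an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

Writing notebook_json to a file with an .ipynb extension should give a notebook that Jupyter can open, with narrative text and executable code interleaved in the way the generated reports are described above. A real generator would also need to execute the cells, render the visualizations, and store the result behind a persistent URL, none of which is sketched here.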

BioJupies Flowchart