Elysium: Free RNA-seq Alignment in the Cloud
Authors: Alexander Lachmann, Zhuorui Xie, Avi Ma'ayan (Team Nitrogen, PI: Ma'ayan)
Contact point: Avi Ma'ayan firstname.lastname@example.org
Tags: KC1, Workflows, RNA-seq, Cloud Computing, Signatures, Data Processing, Kallisto, Pipelines
RNA-sequencing (RNA-seq) is currently the leading technology for genome-wide transcript quantification. However, biomedical researchers who apply this method in their research projects are often stuck at the first step of data analysis, which is sequence alignment. Sequence alignment is the step that converts raw reads into transcript- and gene-level counts. This step requires significant computational resources and basic programming skills. Elysium enables users of all skill levels to perform uniform RNA-seq alignment in the cloud. The Elysium infrastructure comprises four components: a file upload API that enables storage of FASTQ files on Amazon S3 without Amazon credentials (yellow box); a web server that handles cloud alignment job scheduling for uploaded files (green box); containers that perform the alignment jobs (blue box); and a graphical user interface (GUI) that provides intuitive access for users who lack command-line skills. Elysium is significant to the Data Commons because it solves a critical bottleneck for investigators who collect RNA-seq data but need help processing the raw data so that it can be analyzed further. The Elysium service is free because we were able to reduce the compute cost per sample to less than 1 cent. The project demonstrates how cloud computing can be an efficient way to offer useful services to biomedical researchers while significantly reducing overall cost. The Elysium source code is available under the Apache License 2.0 on GitHub at: https://github.com/maayanlab/elysium. The cloud-based RNA-seq alignment service is freely accessible through the Elysium GUI at: [http://elysium.cloud](http://elysium.cloud).
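To make the distinction between transcript-level and gene-level counts concrete, the following is a minimal Python sketch of how per-transcript counts might be summed into per-gene counts. It is an illustration only and not taken from the Elysium code base: the function name `aggregate_to_gene_level`, the transcript and gene identifiers, and the count values are all hypothetical.

```python
from collections import defaultdict


def aggregate_to_gene_level(transcript_counts, transcript_to_gene):
    """Sum transcript-level counts into gene-level counts.

    Both arguments are plain dicts; transcripts without a gene
    annotation are skipped (an assumption made for this sketch).
    """
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene = transcript_to_gene.get(transcript_id)
        if gene is None:
            continue  # no annotation available for this transcript
        gene_counts[gene] += count
    return dict(gene_counts)


# Hypothetical transcript-level counts for one sample.
transcript_counts = {"TX1": 120.0, "TX2": 30.5, "TX3": 7.0, "TX4": 2.0}

# Hypothetical transcript-to-gene annotation; TX4 is deliberately unannotated.
transcript_to_gene = {"TX1": "GENE_A", "TX2": "GENE_A", "TX3": "GENE_B"}

gene_counts = aggregate_to_gene_level(transcript_counts, transcript_to_gene)
print(gene_counts)
```

In this sketch the two transcripts annotated to `GENE_A` are combined into a single gene-level value, while the unannotated transcript is dropped. A real pipeline would read the counts and the annotation from files rather than from inline dictionaries.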