St. Jude Cloud: A Pediatric Cancer Genomic Data Sharing Ecosystem

Abstract
Effective data sharing is key to accelerating research to improve diagnostic precision, treatment efficacy, and long-term survival of pediatric cancer and other childhood catastrophic diseases. We present St. Jude Cloud (https://www.stjude.cloud), a cloud-based data sharing ecosystem for accessing, analyzing, and visualizing genomic data from >10,000 pediatric cancer patients and long-term survivors, and >800 pediatric sickle cell patients. Harmonized genomic data totaling 1.25 petabytes are freely available, including 12,104 whole genomes, 7,697 whole exomes, and 2,202 transcriptomes. The resource is expanding rapidly with regular data uploads from St. Jude's prospective clinical genomics programs. Three interconnected apps within the ecosystem (Genomics Platform, Pediatric Cancer Knowledgebase, and Visualization Community) let users perform advanced data analysis in the cloud while also enhancing the pediatric cancer knowledgebase. We demonstrate the value of the ecosystem through use cases that classify 135 pediatric cancer subtypes by gene expression profiling and map mutational signatures across 35 pediatric cancer subtypes.
- Received August 21, 2020.
- Revision received November 17, 2020.
- Accepted December 14, 2020.
- Published first January 6, 2021.
- Copyright ©2021, American Association for Cancer Research.
This OnlineFirst version was published on January 8, 2021.
doi: 10.1158/2159-8290.CD-20-1230