The OASIS Team

Datasets

TCGA:
The Cancer Genome Atlas (TCGA) project is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. Please follow TCGA publication guidelines when using TCGA data in your publications.

CCLE [1]:
The Cancer Cell Line Encyclopedia (CCLE) project is a collaboration between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models. The CCLE provides public access to genomic data for about 1000 cell lines. Please follow the CCLE Terms of Usage when using CCLE data in your research.

GTEx [2]:
The Genotype-Tissue Expression (GTEx) project aims to provide a resource with which to study human gene expression and regulation and its relationship to genetic variation. This project will collect and analyze multiple human tissues from donors who are also densely genotyped, to assess genetic variation within their genomes.

ACRG-HCC [3]:
The ACRG-HCC dataset from the Asian Cancer Research Group contains array-based gene expression data, copy number variation data, and somatic mutation data based on whole genome sequencing (WGS) of a predominantly HBV-driven hepatocellular carcinoma (HCC) cohort.

Samsung-HCC [4]:
The Samsung-HCC dataset was generated through collaboration between Pfizer Oncology Research and the Samsung Medical Center (SMC) in South Korea. It contains array-based gene expression and copy number variation data from an HCC cohort.

HKU-GC [5]:
The HKU-GC dataset was generated through collaboration between Pfizer Oncology Research and the University of Hong Kong. It contains array-based gene expression and copy number variation data, as well as mutation data based on whole exome sequencing (WES) and whole genome sequencing (WGS), from a cohort of gastric cancer patients.

METABRIC [6]:
The METABRIC dataset contains clinical traits, expression, CNV profiles, and SNP genotypes derived from breast tumors collected from participants of the METABRIC trial.

Acknowledgements

BioMart [7]:
The OASIS web portal was developed based on a custom version of the BioMart framework designed for oncogenomics data analysis [8]. The BioMart project provides free software and data services to the international scientific community in order to foster scientific collaboration and facilitate the scientific discovery process. We sincerely thank Arek Kasprzyk and the ICGC data portal development team for help with implementation of the BioMart framework.

OncoLand:
The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) analyses for this portal were generated using the OncoLand data service provided by Omicsoft Corporation. OncoLand, Array Studio, Array Viewer, Array Server, and all other Omicsoft product or service names are registered trademarks or trademarks of Omicsoft Corporation, Cary, NC, USA.

cBio Portal for Cancer Genomics [9]:
The original version of the Oncoprint code was provided by the cBio Portal for Cancer Genomics group at the Memorial Sloan Kettering Cancer Center.

Pfizer colleagues:
We would like to acknowledge the contributions of former team members Heather Estrella, Sarathy Mattaparti, Ming Cui and Yu 'David' Liu. We sincerely thank our colleagues in Pfizer Oncology Research Computational Biology and Research Business Technology for their support and contributions: Kai Wang, Keith Ching, Shibing Deng, Ying Ding, Tao Xie, Zhou Zhu, Jack Yee, Tse Da, Susan Stephens and Paul Rejto.

References

1. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603-7 (2012).
2. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet 45, 580-5 (2013).
3. Kan, Z. et al. Whole-genome sequencing identifies recurrent mutations in hepatocellular carcinoma. Genome Res 23, 1422-33 (2013).
4. Wang, K. et al. Genomic landscape of copy number aberrations enables the identification of oncogenic drivers in hepatocellular carcinoma. Hepatology 58, 706-17 (2013).
5. Wang, K. et al. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer. Nat Genet 46, 573-82 (2014).
6. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346-52 (2012).
7. Kasprzyk, A. BioMart: driving a paradigm change in biological data management. Database (Oxford) 2011, bar049 (2011).
8. Zhang, J. et al. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data. Database (Oxford) 2011, bar026 (2011).
9. Cerami, E. et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2, 401-4 (2012).