Canadian and US Leaders in Cancer Research Announce a Big Data Challenge to Develop Robust Methodologies for Predicting Cancer Mutations

An open Challenge that merges the efforts of the world’s largest cancer genome sequencing consortia, the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), with those of Sage Bionetworks and DREAM.

RECOMB/ISCB Conference 2013

TORONTO--Cancer research leaders from the Ontario Institute for Cancer Research (OICR) and the University of California, Santa Cruz (UCSC), in collaboration with Sage Bionetworks and IBM’s DREAM, will announce tomorrow the opening of the ICGC-TCGA-DREAM Somatic Mutation Calling (SMC) Challenge (https://www.synapse.org/#!Challenges:DREAM) at the Sixth Annual RECOMB/ISCB conference (http://www.iscb.org/recomb-regsysgen2013).

Like previous DREAM Challenges in the series, this new Challenge will engage a diverse community of scientists to solve a specific problem in a given time period by placing scientific data, tools, scoreboards and the resulting predictive models into an open Commons.

The specific problem the SMC Challenge will address is the need for accurate methods to identify cancer-associated mutations from whole-genome sequencing data. Cancer is a disease of the genome, caused by disruptions in DNA that alter specific gene functions. Although today’s DNA sequencing instruments can amass great quantities of sequence data from a patient’s normal and tumor tissues, the ability to identify DNA mutations and rearrangements accurately on the basis of those data remains elusive; current studies agree in only about 20% of their predictions.

To address this need, the Challenge will post the raw DNA sequencing data of 10 human tumor-normal pairs (5 prostate, 5 pancreatic), comprising approximately 9 terabytes of data, to a high-speed distribution server. Contestants will have 6 months to optimize their predictive models. After the Challenge closes in July 2014, at least 5,000 candidate DNA mutations predicted by different participating teams will be prospectively validated on an independent sequencing platform by the Challenge organizers. Participants’ predictions will be ranked against the newly generated validation data on sensitivity, specificity and balanced accuracy, among other metrics.
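For readers unfamiliar with the ranking metrics named above, the following is a minimal sketch of how sensitivity, specificity and balanced accuracy are conventionally computed. It is not the Challenge's actual scoring code, which this release does not describe. The function name `score_calls`, the representation of a mutation as a (chromosome, position) pair, and the example sites are all assumptions made for illustration.

```python
def score_calls(predicted, validated_true, validated_false):
    """Score predicted mutation sites against validation results.

    Hypothetical helper: each site is a (chromosome, position) tuple.
    validated_true  -- candidate sites confirmed as real mutations
    validated_false -- candidate sites shown not to be mutations
    """
    predicted = set(predicted)
    tp = len(predicted & validated_true)    # real mutations that were called
    fn = len(validated_true - predicted)    # real mutations that were missed
    fp = len(predicted & validated_false)   # non-mutations that were called
    tn = len(validated_false - predicted)   # non-mutations correctly left out

    # Standard textbook definitions; guard against empty validation sets.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Made-up example sites, for illustration only.
validated_true = {("chr1", 100), ("chr1", 200), ("chr2", 300), ("chr3", 400)}
validated_false = {("chr1", 150), ("chr2", 350)}
predicted = [("chr1", 100), ("chr1", 200), ("chr2", 300), ("chr1", 150)]

result = score_calls(predicted, validated_true, validated_false)
print(result)
```

In this made-up example the team recovers 3 of the 4 confirmed mutations (sensitivity 0.75) and wrongly calls 1 of the 2 confirmed non-mutations (specificity 0.5), for a balanced accuracy of 0.625.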

To participate in the Challenge that opens tomorrow, individuals will need to register at https://www.synapse.org/#!Challenges:DREAM. In addition, they must be approved by OICR’s ICGC Data Access Compliance Office to access the data.

As Canadian OICR researcher and Challenge organizer Professor Paul Boutros puts it, “Governments around the world have committed hundreds of millions of dollars to sequence cancer genomes to find new drug targets and to develop treatments that are personalized to each person’s cancer genome. But realizing these goals is currently blocked by scientists’ inability to identify mutations in cancer genomes. It is really tremendous that ICGC and TCGA are coming together with Sage Bionetworks and DREAM to address this problem using a DREAM Challenge that will set a gold standard that groups around the world can use to understand the cancer genome!”

To help realize this Challenge, industrial partners have stepped up. Google is making their Google Cloud Platform available to OICR-approved participants, including free access to the contest data in Google Cloud Storage and discounted Google Compute Engine cycles. Cloud processing will open the door for a whole new set of participants who do not have access to large compute clusters at their own institutions. Hitachi has provided free storage to host the data on a 1PB disk donated for cancer genomics. Annai Systems (http://www.annaisystems.com/) is providing their Annai-GNOS™ data management platform to facilitate upload, hosting and access to the data in the Hitachi store. Annai’s GeneTorrent software will provide high-speed data transfer to the Challenge participants.

Challenge participants will use the Synapse infrastructure (http://www.synapse.org), built by Sage Bionetworks, that allows collaboration by Challenge teams on an open platform. Synapse’s tools and forum will allow Challenge participants to: (1) record what processing and analysis they’ve done on the data; (2) submit their predictive models to a real-time leaderboard for scoring; and (3) share their ideas, model code and analysis results with others in the Challenge.

Publications based on the highest-ranking predictive methods from the Challenge will be considered with Nature Publishing Group. Nature Genetics editor Myles Axton will advise the Challenge on publication strategy and work with Synapse to understand the scientific quality control that can be obtained via competitive collaboration. “The exciting thing about this exercise from an editorial standpoint is that we can analyze just how much the strategies are improved during the contest and how much peer review is then needed to obtain a useful research publication at the end. The beauty of doing this on an open platform is to see the rigor, transparency and detail of each group’s approach and to be able to replay each strategy in a robust way,” Myles says. “This is a good use of editorial time since peer review improves the strategies, it improves the resulting publications and it improves the databases and journals by preparing us for the future of knowledge production. I really hope the winners combine elements of the best strategies into fuller publishable units, in that way they will get the best out of the challenge as well as our involvement with it.”

Explains UC Santa Cruz Professor Josh Stuart, “The timing of this Challenge couldn’t be better. ICGC and TCGA recently announced that they plan to jointly analyze a dataset of approximately 2,000 pairs of tumor-normal whole genomes as part of a 2014-2015 Pan-Cancer effort to elucidate comprehensively the genomic changes present in many forms of cancers. Thus, the winning algorithms selected by this DREAM Challenge will help ICGC/TCGA researchers provide the largest unified view of cancer genome variation to date.”

Cancer researcher Dr. Stephen Friend founded Sage Bionetworks out of a conviction that “…the best approach towards developing robust and accurate predictions such as those needed for mutation calling is to enable an open diverse community where data access is simple and people are incentivized to share. Sage and DREAM have already shown that in the span of several months, DREAM Challenges can attract hundreds of teams who end up submitting thousands of predictive models to a Challenge. Sage and DREAM couldn’t be more delighted to be partnered with the ICGC and TCGA research communities to provide the largest public methodology assessment for the field of somatic mutation identification.”

ABOUT ONTARIO INSTITUTE FOR CANCER RESEARCH: www.oicr.on.ca

OICR is an innovative cancer research and development institute dedicated to prevention, early detection, diagnosis and treatment of cancer. The Institute is an independent, not-for-profit corporation, supported by the Government of Ontario. OICR research supports more than 1,600 investigators, clinician scientists, research staff and trainees located at its headquarters and in research institutes and academia across the Province of Ontario. OICR has key research efforts underway in small molecules, biologics, stem cells, imaging, genomics, informatics and bio-computing. For more information, please visit the website at www.oicr.on.ca.

ABOUT SAGE BIONETWORKS: http://sagebase.org/

Sage Bionetworks is a nonprofit biomedical research organization, founded in 2009, with a vision to promote innovations in personalized medicine by enabling a community-based approach to scientific inquiries and discoveries. Sage Bionetworks strives to activate patients and to incentivize scientists, funders and researchers to work in fundamentally new ways in order to shape research, accelerate access to knowledge and transform human health. Sage Bionetworks is located on the campus of the Fred Hutchinson Cancer Research Center in Seattle, Washington, and is supported through a portfolio of philanthropic donations, competitive research grants, and commercial partnerships.

ABOUT THE DREAM PROJECT: http://www.the-dream-project.org

The Dialogue on Reverse Engineering Assessment and Methods (DREAM) project is an initiative to advance the field of systems biology through the organization of Challenges to foster the development of predictive models of relevance in biomedicine. With the experience gathered by the launching of 27 successful DREAM challenges over the past seven years, the “Challenge” concept has reached a status of legitimacy and maturity. This success has triggered considerable interest by different government institutions and private organizations in working with DREAM to engage distributed teams to solve tough computational problems in biomedical research.

ABOUT ANNAI SYSTEMS INC.: http://www.annaisystems.com

Annai Systems Inc., located in California’s Silicon Valley, is a technology company with a core competency in big data management solutions. We are focused on applying our technology to the field of genomics to enable easier, more efficient access to genomic data. Our technology enables the producers and consumers of genomic data to achieve better results faster and at lower operating costs by providing products that overcome key data-related bottlenecks and roadblocks that are impeding progress in the growing fields of genomic research and medicine. Annai Systems offers a variety of products and services to address the “big data” challenges associated with using genomic data in personalized medicine and healthcare improvement.

Contacts

Sage Bionetworks
Thea Norman, 206-667-3192
friend@sagebase.eu
or
Ontario Institute for Cancer Research
Christopher Needles, 416-673-8505
christopher.needles@oicr.on.ca
or
UC Santa Cruz
Tim Stephens, 831-459-2495
stephens@ucsc.edu
Publicist, Science & Engineering
or
Nature Publishing Group
Neda Afsarmanesh
n.afsarmanesh@us.nature.com
Press Officer
or
Annai Systems Inc.
Jay Kaufman, 650-400-5812
jayk@annaisystems.com

Release Summary

Research leaders from Ontario Institute for Cancer Research and UC Santa Cruz, with Sage Bionetworks and IBM’s DREAM, announce an open challenge to develop methods for predicting cancer mutations.
