Simons Foundation Data Resources

Data sharing has always been an essential component of scientific research and is especially important as large, complex scientific data sets become commonplace.

The Simons Foundation is committed to facilitating collaborative sharing of all data that result from its various initiatives in a way that respects the rights of research subjects, as well as the intellectual investment of investigators.


Life Sciences Data

The Simons Genome Diversity Project, a Life Sciences project, will sequence some of the most diverse known human genomes, drawn from groups ranging from small, isolated and genetically distinct communities to large, accessible ones. The project will sequence two individuals per studied population.

Data are currently available by request via the Genome Diversity Project page, and will be available via Amazon’s Public Data Project in the near future.


Simons Foundation Autism Research Initiative (SFARI) Data

SFARI Base is a database of demographic, phenotypic, imaging and genetic data about families affected by autism and other neurodevelopmental disorders.

A central component of SFARI Base is the Simons Simplex Collection (SSC), which established a permanent repository of genetic samples from 2,700 families, each of which has one child affected with an autism spectrum disorder, along with unaffected parents and siblings.

Another component of SFARI Base is the Simons Variation in Individuals Project (Simons VIP), which complements the SSC’s phenotype-first approach to studying the genetics of autism by taking a genetics-first approach. Simons VIP collects demographic, phenotypic, medical and neurological data, as well as structural and functional imaging (MRI and MEG) data, on more than 200 individuals and families with deletions or duplications at the 16p11.2 locus, a genetic event highly associated with neurodevelopmental disorders, including autism. High-resolution genomic analyses are also underway, and their results will be made available.

All SSC and Simons VIP data are available by request after logging into SFARI Base.


SSC whole-exome sequence data analyzed in the following publications are also available from the National Database for Autism Research (NDAR) at NIH:

• Iossifov I, et al. The contribution of de novo coding mutations to autism spectrum disorder. Nature. 2014 Oct 29. doi: 10.1038/nature13908. Sequence data at NDAR.

• Iossifov I, et al. De novo gene disruptions in children on the autistic spectrum. Neuron. 2012 Apr 26;74(2):285-99. doi: 10.1016/j.neuron.2012.04.009. Sequence data at NDAR.

• O’Roak BJ, et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature. 2012 Apr 4;485(7397):246-50. doi: 10.1038/nature10989. Sequence data at NDAR.

• Sanders SJ, et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature. 2012 Apr 4;485(7397):237-41. doi: 10.1038/nature10945. Sequence data at NDAR.