Fifteen years ago, autism genetics research was at something of an impasse. Twin and family studies suggested that the disorder had a strong genetic component. But even though geneticists believed that there were likely dozens of different mutations responsible for autism, when researchers went looking for the mutated genes that were presumably passed down from parent to child, they largely came up dry.
Michael Wigler, a geneticist at Cold Spring Harbor Laboratory in New York, suspected that researchers were going about their search in the wrong way. Most autism genetics studies were being done on ‘multiplex’ families, in which more than one family member has the disorder, because those are the cases with the highest chance of having been caused by inherited mutations. But autism, in fact, appears more often in ‘simplex’ families, in which only one individual is affected, suggesting that the disorder may frequently result not from an inherited mutation but from a spontaneous, or de novo, mutation in a sperm or egg cell. “I had the hypothesis that people were failing [in their search for autism genes] because they were using the wrong tool,” Wigler says.
Together with Jonathan Sebat, then in Wigler’s lab and now at the University of California, San Diego, Wigler analyzed DNA from the Autism Genetic Resource Exchange, a gene bank consisting mostly of genetic material from multiplex autism families: a valuable dataset from many perspectives, but the opposite of what is needed to best isolate de novo mutations. Even so, the collection included some simplex families and, as Wigler had predicted, those families had a higher proportion of de novo mutations than the multiplex families did.
In 2003, Wigler broached to Jim Simons the idea of creating a large collection of simplex families. The Simons Foundation was already looking for ways to invigorate the field of autism research, and earlier that year it had convened a large meeting of autism experts who had concluded that it was imperative to lure more talented researchers into the field. A large, carefully curated collection of data from simplex families, freely accessible to all researchers, would be an ideal way to jump-start research, the foundation decided.
Today, the Simons Simplex Collection (SSC), which holds genetic, phenotypic and biological data from more than 2,600 simplex families, has helped lead the revolution in autism research. “I don’t know of another autism collection that compares to it,” says Evan Eichler, an autism researcher at the University of Washington in Seattle.
The logic behind the SSC has proved to be “profoundly right,” says Matthew State, a geneticist at the University of California, San Francisco. “It has made the field.”
Sequencing studies of the collection over the past five years are bringing the genetic landscape of autism into sharp focus. Instead of dozens of autism mutations, researchers now believe that 300 to 1,000 genes will eventually be implicated in the disorder. A recent analysis of the exomes — the protein-coding regions of the genome — of most of the SSC families, published November 13, 2014, in Nature, has identified 27 autism genes with high confidence, as well as hundreds more candidate genes worthy of further study.
The collection has helped to spark a new era in autism research, and its accessibility has drawn into the field many scientists who had not previously focused on autism.
“The fact that the SSC was out there without any strings attached — that it wasn’t wrapped up in someone’s empire — is a big part of the reason I moved into autism research,” Eichler says. “The collection allows an entirely new dimension of researchers to explore autism.”
“It’s hard to imagine where the genomics of autism would be without the SSC,” says State, who carried out the exome study together with Wigler, Eichler and Jay Shendure, Eichler’s colleague at the University of Washington. “It has absolutely transformed autism research.”
From the earliest days of the SSC, the emphasis was on creating a resource that could be used in many different ways by many different kinds of scientists. “No other research group has put so much effort into making sure their dataset would be widely usable,” says Catherine Lord of Weill Cornell Medical College in New York City, who oversaw the formation of the collection together with Gerald Fischbach, then director of the Simons Foundation Autism Research Initiative (SFARI) and now the foundation’s chief scientist. “Normally, people collect data for themselves, but we were thinking from the start about what data outside researchers would want.”
To be most useful to the autism research community, Lord and Fischbach decided, the collection must be not only large but also deep, with detailed phenotypes and biospecimens. “We wanted people to be able to use it to test their hypotheses even if they didn’t have access to anyone with autism,” says Lord. The meticulous data the collection acquired, Lord says, “cut out millions of steps for many researchers who otherwise wouldn’t have even gotten started.”
No one institution would have been able to collect data on as many simplex families as the collection needed, so Fischbach and Lord enlisted the aid of 12 clinics and universities in the U.S. and Canada: Baylor College of Medicine; the University of California, Los Angeles; Columbia University; Emory University; Harvard University/Boston Children’s Hospital; the University of Illinois at Chicago; the University of Michigan; McGill University; the University of Missouri; Vanderbilt University; the University of Washington; and Yale University. The different clinics brought in simplex families — not just the affected child, but also the parents and siblings — for a full day of diagnostic tests and blood draws.
“It’s a significant commitment of time and energy for the families,” says Casey White Lehman, the collection’s project manager at the Simons Foundation. “And they did it mainly out of a desire to contribute to research, which I’ve always found inspiring.”
Silvia Verga of New York City, whose family participated in the Simons Simplex Collection, says, “We wanted to be part of something that could be the beginning of discoveries in the future.” The family reaped some immediate benefit from the detailed assessment process: “The evaluation we received cleared up a lot of questions I had with regard to my son’s diagnosis, and helped us figure out what kinds of services would be best for him,” Verga says.
With 12 different sites collecting data, it was imperative that the tests be carried out consistently from one site to another. “If you’re going to produce a number, it should mean something,” Lord says. “We had to make sure the numbers meant the same thing at different places.”
Lord brought the various clinicians to the University of Michigan (where she was based at the time) for rigorous training on the diagnostic tests to be performed, some of which she herself had pioneered. The clinicians were later videotaped as they assessed families, to make sure they were carrying out the tests uniformly, and the teams also participated in site visits, monthly phone calls and biannual group meetings with the staff at the Simons Foundation.
This attention to consistency paid off. “When the scores were tallied at these sites scattered around the country, they all came out very close on a wide range of measures — of social cognition, or repetitive movements, or the severity of the disease,” Fischbach says. “Given what a heterogeneous disorder autism is, that seems like a miracle.”
Somewhat to her surprise, Lord found that the area of greatest variability among the different clinics was the name the clinicians assigned to each diagnosis — autistic disorder, Asperger syndrome or pervasive developmental disorder-not otherwise specified (PDD-NOS). Each clinic seemed to assign these names according to its own internal logic, which varied greatly from site to site. Lord’s statistical analysis of this variation, with Eva Petkova of New York University, played an influential role in the decision of the American Psychiatric Association in 2012 to replace the three labels with the umbrella term “Autism Spectrum Disorder” in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders.
By 2011, the collection had completed its data accumulation phase, with a total of 2,659 families — well beyond its initial target of 2,000. In the process, the teams at the 12 sites had become a community. “There has been a lot of back and forth since then, with people working together who met through the SSC,” Lord says.
Data from the collection are available to qualified researchers through a central database called SFARI Base. The foundation has also recently federated with the National Database for Autism Research (NDAR), maintained by the National Institute of Mental Health, so that a researcher who types a query into NDAR will receive results from both databases. To date, more than 200 projects have used data from the SSC, and more than 60 published papers have resulted from their analyses, appearing in publications such as Nature, Science, Neuron, Cell and Nature Genetics.
The first clear indication of the collection’s potential for elucidating autism’s genetic architecture came in 2010, when Wigler and State completed an analysis of de novo copy number variants (CNVs) — genetic aberrations in which a chunk of DNA is duplicated or deleted — in more than 1,000 SSC families. The study highlighted six genomic regions that appeared to be strongly linked to autism, and about 70 other candidate autism CNVs.
These findings supported the creation in 2010 of the Simons Variation in Individuals Project (Simons VIP), which has collected clinical information and blood samples from more than 200 carriers of 16p11.2 CNVs, to home in on the shared neurological and behavioral features of this group. The project’s long-term goal is to identify the features of different genetic subtypes of autism, which might respond to different therapeutic approaches.
In 2012, Wigler, State and Eichler’s labs sequenced the whole exomes of nearly 800 SSC families, identifying several high-confidence autism risk genes. More recently, the 2014 exome study of nearly the entire SSC by Wigler, State, Eichler and Shendure has suggested hundreds of candidate autism risk genes. “Many of these candidates will be confirmed in the coming years, by additional deep sequencing of autism collections,” predicts Alan Packer, a senior scientist at SFARI.
Seven of the genes identified in the 2014 study had mutations in three or more children with autism, establishing them unassailably as autism risk genes. Another 20 genes had mutations in two children, which translates into more than a 90 percent likelihood of their being genuine autism genes.
“Before the SSC was created, I was waiting for the one rare kid to walk into my lab that had a de novo mutation in a gene, and then we would still have to figure out how to prove that the mutation was related to the disorder,” State says. “It used to take us a decade to find one autism gene, so to publish a paper with 27 is amazing.”
While the studies so far have illuminated the incredible genetic diversity of autism, they also strongly suggest that the hundreds of autism genes likely converge on a much smaller set of biological pathways. Many of the candidate genes to emerge from the exome study, for instance, interact with targets of the gene responsible for fragile X syndrome, a cause of intellectual disability in boys. Other candidate genes are involved in the regulation of chromatin, a DNA-protein complex that helps package DNA in the cell nucleus and controls gene expression. Understanding these biological pathways, researchers hope, will eventually lead to targeted therapies for the different genetic types of autism.
The genetic studies have given rise to a burst of animal studies to try to decode the biological mechanisms of the strongest autism candidate genes. “In the long run, the neurobiology is going to be even more important than the genetics for understanding mechanisms,” Fischbach says. “But without the genes, we wouldn’t even know which animal models to create and study.”
Although genetics has been the primary thrust of SSC research so far, researchers are examining its data from a host of other perspectives as well. For example, a paper in the November 2013 issue of Molecular Psychiatry reported that mothers of children in the SSC were four times more likely than controls to harbor anti-brain antibodies, which might be pathogenic to the developing brain, and they also had an increased prevalence of autoimmune disorders such as rheumatoid arthritis and lupus. The SSC has also enabled scientists to study repetitive behaviors, the relationship between head circumference and IQ, and even the stigma of autism.
In addition, the collection has offered researchers a variety of ways to tackle the puzzling question of why autism in girls seems to be so different from autism in boys — simultaneously rarer and more severe. State, Wigler and Eichler’s genetic studies of the SSC indicate that girls with autism typically have more damaging mutations than boys do.
The Simons Foundation has created a way for researchers to engage many of the SSC families in future studies. SSC@IAN, administered by the foundation in partnership with the Interactive Autism Network of the Kennedy Krieger Institute in Baltimore, is an online platform that connects researchers whose projects have been approved by SFARI with a pool of SSC families. More than 1,500 of the original families in the SSC have agreed to take part. “Time and time again the families have shown us how engaged they are,” White Lehman says. The foundation itself is conducting the SSC@IAN Family Update Study, a set of online questionnaires to find out how the families have fared in the years since the original data collection.
The project may well become a multi-year study, White Lehman says. “It’s important to get information about people’s lives over the long term,” she says. “There hasn’t been a lot of research on what happens as individuals with autism transition to the adult world.”
Much more remains to be mined from the SSC’s genetic data, as the exome accounts for only 1.5 percent of the human genome. The Simons Foundation has launched a pilot study, to be carried out by the nonprofit New York Genome Center, to sequence the entire genomes of 40 families from the collection. If that goes well, larger studies will follow.
The exome studies of the SSC suggest that ultimately, at least 30 percent of simplex autism will be traceable to de novo mutations. “I still read in newspapers that we don’t know what causes autism, or that autism is thought to involve gene mutations, but that’s not really just a hypothesis anymore,” Packer says. “Autism is less mysterious than it used to be.”