data

We drive many collaborative studies that aggregate large numbers of samples for genetic association studies of schizophrenia, bipolar disorder, ADHD, and other psychiatric phenotypes. We're continually improving the workflow that handles processing, harmonization, QC, and analysis, and we are committed to sharing the data we produce as broadly and as promptly as possible.

 
 
 

Psychiatric Genomics Consortium

The Psychiatric Genomics Consortium (PGC) unites investigators around the world to conduct meta- and mega-analyses of genome-wide data for psychiatric disorders including schizophrenia, bipolar disorder, ADHD, PTSD and autism. Our group is responsible for data production, bringing together over 20,000 samples from more than 60 different sites worldwide. We also lead the analytical activities for the schizophrenia and bipolar disorder subgroups.

 
 
 

GWAS of Attention Deficit Hyperactivity Disorder

In collaboration with the ADHD Working Group of the Psychiatric Genomics Consortium (PGC) and The Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH), we’re working to unravel the genetics of Attention Deficit/Hyperactivity Disorder (ADHD). By identifying the variants, genes, and biological domains associated with ADHD and comorbid traits, we hope to advance our understanding of this highly heritable neurodevelopmental disorder.

Through international collaboration with the PGC and iPSYCH, we’ve assembled genotype data on >20,000 individuals with ADHD and >35,000 matched controls across 12 cohorts. Genome-wide meta-analysis of these cohorts has, for the first time, identified common genetic variants at 12 loci that are significantly associated with risk for ADHD, and has highlighted substantial correlation of genetic risk between ADHD, comorbid psychiatric disorders, and related behavioral traits in the population (manuscript on bioRxiv).

The results of the ADHD genome-wide meta-analysis are freely available from the PGC downloads page.

 
 
 

GWAS of Posttraumatic Stress Disorder

Posttraumatic stress disorder (PTSD) is a common mental illness that follows exposure to extreme traumatic events. The contribution of genetic variants to the risk of PTSD has long been established. However, the genetic architecture of PTSD is not fully understood and genes associated with PTSD are yet to be found. As part of the PTSD working group in the Psychiatric Genomics Consortium (PGC), we collaborate with researchers around the world to investigate the genetic basis of PTSD using genome-wide approaches.

The latest genome-wide association study (GWAS) on PTSD from the workgroup consists of 5,182 PTSD cases and 15,548 controls (PMID: 28439101). This large PTSD GWAS aggregated subjects from 11 contributing studies, with diverse representation of African American, European American, Latino, and African populations. We showed that the genome-wide heritability of PTSD is higher among females than males. We also showed, through polygenic score analysis, that PTSD and schizophrenia share genetic components.

The full PTSD GWAS results can be downloaded here.

 
 
 

GWAS of substance use disorders

The Substance Use Disorders Workgroup of the Psychiatric Genomics Consortium (PGC) aims to identify the contribution of genetics to the use and abuse of licit and illicit substances, including alcohol, cannabis, cocaine, opioids, and tobacco. Within the Neale Lab, Raymond Walters is currently leading analysis of the effects of common genetic variants on risk for alcohol dependence.

 
 
 

GWAS of non-heterosexuality

Non-heterosexual behaviour is associated with reduced fecundity, and twin studies have estimated the genetic component to be between 34% and 39%. However, the largest genome-wide association study (GWAS) of a similar trait, conducted by 23andMe on more than 23,000 participants (1,373 self-reported exclusively homosexual), did not identify any significant associations. Interestingly, twin studies have also suggested genetic correlation between non-heterosexual behaviour and depression, and phenotypic correlation between psychiatric disorders and homosexuality has also been observed.

In this project we aim to identify variants associated with non-heterosexual behaviour via GWAS in the UK Biobank and 23andMe data sets, and to evaluate the genetic correlation between non-heterosexual behaviour and several reproductive and neuropsychiatric phenotypes. We will explore the epidemiological associations between non-heterosexual behaviour and relevant traits, and empirically address the hypothesis that genetic factors predisposing to non-heterosexuality may increase mating success in heterosexuals.

 
 
 

Whole Genome Sequencing in Psychiatric Disorders Consortium

 
 

The Whole Genome Sequencing in Psychiatric Disorders (WGSPD) Consortium is a partnership between the United States National Institute of Mental Health (NIMH) and the Stanley Center for Psychiatric Research at the Broad Institute that aims to advance our understanding of the biological basis of severe mental illness through large-scale whole genome sequencing (WGS). The WGSPD consortium is targeting four major disorders: autism spectrum disorder (ASD), schizophrenia, bipolar disorder and major depressive disorder. The mechanisms of pathogenesis of these conditions remain unknown, and we hope that comprehensive genetic studies in large numbers of samples will begin to shed light on the causal factors of these diseases. By combining case-control and family-based studies with RNA, epigenetic and phenotypic data, we aim to inform our understanding of the genetic architecture of these disorders and to pinpoint the associated variants.

 
 
 

Meta-analysis of schizophrenia whole-exome sequencing data 

In previous studies, schizophrenia has been demonstrated to have a substantial genetic component, with common and rare variants contributing to individual risk. While genome-wide association studies have successfully implicated over a hundred common risk loci for schizophrenia, rare variant analyses from sequencing studies have had more limited success in identifying individual genes, presumably owing to power limitations. The Schizophrenia Exome Sequencing Meta-Analysis (SCHEMA) Consortium is a large multi-site collaboration to aggregate, generate, and analyze high-throughput sequencing data of schizophrenia to advance gene discovery. To date, over 20,000 individuals with schizophrenia and 40,000 matched controls have been sequenced, and efforts are currently underway to meta-analyze these data together to further our understanding of the genetic architecture of schizophrenia and pinpoint individual risk genes. 

 
 
 

Whole exome sequencing in bipolar disorder

Psychiatric disorders have long been known to have significant heritability; however, it has only recently become possible to actually find the genes that contribute to risk. Discovery of the precise DNA sequence variants that increase risk for a disease yields crucial molecular information about the disease mechanisms, and allows us to begin to learn important biology underlying these complex disorders. Whole exome sequencing (WES) provides a cost-effective way to examine variants within the protein-coding regions of the genome. We're employing WES in studies of schizophrenia, bipolar disorder, ADHD and epilepsy to learn more about the role that protein-coding variants play in each of these disorders. Through international collaboration between the Dalio Initiative in Bipolar Disorder, the Stanley Center, and iPSYCH, we have assembled whole exome sequencing data from >7,000 individuals with bipolar disorder and >10,000 matched controls from across Sweden, the UK and Ireland, and Denmark. Following the lead of the schizophrenia effort, we will double these numbers in the coming months and use the resultant boost in power to increase our understanding of the genetic contribution to bipolar disorder.

 
 
 

Whole exome sequencing in epilepsy: Epi25k Consortium

The international Epi25k consortium brings together previous national and multinational research groups to unite their efforts to discover new genetic risk factors for epilepsy. The Epi25k analysis is a joint effort from Columbia University, the Broad Institute, and a large number of clinicians and epilepsy research groups around the world, expanding on the Epi4k consortium effort. Our primary aim is to facilitate new genetic discoveries by aggregating larger patient cohorts, particularly for more common forms of epilepsy.

Within the Epi25k framework, whole exome sequence (WES) data have recently been generated at the Broad Institute for 6,011 epilepsy patients: 3,037 individuals diagnosed with genetic generalized epilepsy (GGE), 2,186 individuals diagnosed with familial or sporadic non-acquired focal epilepsy (NAFE), and 788 individuals diagnosed with epileptic encephalopathies (EE). These patients were compared against a pool of over 14,000 individuals not ascertained for epilepsy, drawn from multiple independent collections at the Broad Institute. All controls were screened negative for neurodevelopmental disorders. All WES data were jointly called, with extensive quality control to match for population structure, remove low-quality samples and variants, and restrict to shared exome capture targets.

Previous publication links:

Ultra-rare genetic variation in common epilepsies: a case-control sequencing study

Application of rare variant transmission disequilibrium tests to epileptic encephalopathy trio sequence data

De novo mutations in epileptic encephalopathies

 
 
 

Impact of ultra-rare coding variation across a range of phenotypes

Protein truncating variants (PTVs) are likely to modify gene function and have been linked to hundreds of Mendelian disorders. However, the study of the impact of PTVs on complex traits has been limited by the available sample size of whole-exome sequencing studies. In this project, we assemble whole-exome sequencing data from 100,304 individuals to quantify the impact of rare PTVs on 13 quantitative traits and 10 diseases. We focus on those PTVs that occur in PTV-intolerant (PI) genes, as these are more likely to be pathogenic. Our initial results show that individuals carrying PI-PTVs have an increased risk of autism, schizophrenia, bipolar disorder, intellectual disability and ADHD. In controls without these disorders, we found that this burden was associated with increased risk of mental, behavioral and neurodevelopmental disorders as captured by electronic health record information. Furthermore, carriers of PI-PTVs tended to be shorter, have fewer years of education and be younger, the latter observation possibly reflecting reduced survival or study participation. We further leveraged population health registries from 14,117 individuals to study the phenome-wide impact of PI-PTVs and identified an increase in the number of hospital visits among PI-PTV carriers. This project will provide the most thorough investigation to date of the impact of rare deleterious coding variants on complex traits.

Read the manuscript here:
http://www.biorxiv.org/content/early/2017/06/09/148247

 
 
 

Whole exome sequencing in amyotrophic lateral sclerosis

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease characterized by degeneration of motor neurons, which leads to progressive weakening of limb, bulbar, and respiratory muscles. It is ultimately fatal, typically within 3-5 years of onset. Although ALS is classified as a rare disease, affecting 1-2 per 100,000 individuals, it is the most common motor neuron disease in adults and has a lifetime risk of 1 in 350 for men and 1 in 500 for women. There are two forms of ALS: familial (FALS) and sporadic (SALS), which differ based on the known family history of disease but are clinically indistinguishable. FALS accounts for only 5-10% of all cases; however, the genes that have been identified through FALS pedigrees can also explain many SALS cases. Therefore, the application of genetic and genomic approaches in ALS has the potential to identify novel disease loci.

The FALS and ALSGEN Consortia are composed of international researchers who contribute sequencing data from ALS cases with the unifying goal of unveiling the complete ALS genetic landscape. We have aggregated sequencing data from >5,000 cases and >10,000 healthy individuals as controls. We have processed the data jointly using Hail, both to improve statistical power to discover novel genetic risk factors and to eliminate confounding biases. In addition to our own statistical analyses, we plan to replicate the tests from published reports to corroborate or refute previous associations. Upon novel genetic discoveries, we plan to study the defective gene product in stem cells from ALS patients included in our dataset.

Relevant publications:

(1) Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways. Cirulli et al., 2015. Science. PMID: 25700176.

(2) NEK1 variants confer susceptibility to amyotrophic lateral sclerosis. Kenna et al., 2016. Nature Genetics. PMID: 27455347.