.. _gwas_tutorial:

GWAS Tutorial
=============

This quickstart tutorial shows how to perform genome-wide association studies using Glow.

Glow implements a distributed version of the `Regenie `_ method. Regenie is most applicable to data with extreme case/control imbalances, rare variants, and/or diverse populations, which makes it well suited to population-scale biobank exome or genome sequencing data.

.. tip::
  Other bioinformatics libraries for GWAS can be distributed using the :ref:`Glow Pipe Transformer `.

You can view HTML versions of the notebooks and download them from the bottom of this page.

The notebooks are written in Python, with some visualization in R.

.. tip::
  We recommend running the :ref:`Data Simulation ` notebooks first to prepare data for this tutorial before trying it with your own data.

.. important::
  Please sort phenotypes and covariates by sample ID in the same order as genotypes.

1. Quality Control
------------------

The first notebook in this series prepares data by performing standard quality control procedures on simulated genotype data.

2. Glow Whole Genome Regression (GloWGR)
----------------------------------------

:ref:`GloWGR ` implements a distributed version of the Regenie method.
Please review the Regenie paper in `Nature Genetics `_ and the `Regenie GitHub `_ repo before applying this method to real data.

3. Regression
-------------

The GloWGR notebook calculates offsets that are used in the genetic association study below to control for population structure and relatedness.

.. notebook:: . tertiary/1_quality_control.html
  :title: Quality control

.. notebook:: . tertiary/2_quantitative_glowgr.html
  :title: Quantitative glow whole genome regression

.. notebook:: . tertiary/3_linear_gwas_glow.html
  :title: Linear regression

.. notebook:: . tertiary/4_binary_glowgr.html
  :title: Binary glow whole genome regression

.. notebook:: . tertiary/5_logistic_gwas_glow.html
  :title: Logistic regression
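
The sample-ordering requirement stated in the important note near the top of this page can be checked before running any of the notebooks. The following is a minimal sketch, assuming the phenotypes and covariates are held in pandas DataFrames indexed by sample ID. The sample IDs, column names, and the ``align_to_genotypes`` helper are hypothetical and are not part of the Glow API.

```python
import pandas as pd

# Hypothetical sample IDs, in the order they appear in the genotype data.
genotype_sample_ids = ["HG00096", "HG00097", "HG00099", "HG00100"]

# Hypothetical phenotype and covariate tables, indexed by sample ID but
# stored in a different order than the genotypes.
phenotypes = pd.DataFrame(
    {"trait_1": [0.3, -1.2, 0.8, 1.5]},
    index=pd.Index(["HG00100", "HG00096", "HG00099", "HG00097"], name="sample_id"),
)
covariates = pd.DataFrame(
    {"age": [61, 45, 52, 38], "sex": [1, 0, 0, 1]},
    index=pd.Index(["HG00099", "HG00100", "HG00097", "HG00096"], name="sample_id"),
)


def align_to_genotypes(df, sample_ids):
    """Reorder the rows of df to match sample_ids.

    Raises ValueError if df has duplicate sample IDs or lacks any of the
    requested samples, so a silent misalignment cannot slip through.
    """
    if not df.index.is_unique:
        raise ValueError("Duplicate sample IDs found in table index")
    missing = [s for s in sample_ids if s not in df.index]
    if missing:
        raise ValueError(f"Samples missing from table: {missing}")
    return df.loc[sample_ids]


phenotypes_aligned = align_to_genotypes(phenotypes, genotype_sample_ids)
covariates_aligned = align_to_genotypes(covariates, genotype_sample_ids)

# Both tables now follow the genotype sample order row for row.
assert list(phenotypes_aligned.index) == genotype_sample_ids
assert list(covariates_aligned.index) == genotype_sample_ids
```

Failing loudly on missing or duplicated samples is a deliberate choice in this sketch: a row-order mismatch would not raise an error later on, it would pair each genotype with the wrong individual's phenotype.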