Genome-Wide Association Studies (GWAS)
These materials are still under development
Overview
Genome-Wide Association Studies (GWAS) investigate the genetic basis of complex traits and diseases. These materials cover the bioinformatic and statistical methods required to identify associations between genetic variants and traits. You will learn to use essential software for processing genotype data, including the quality control steps that downstream analysis depends on. We discuss how population ancestry can affect association results and how to adjust for it in the analysis. We introduce key statistical concepts relevant to GWAS, with applications to both quantitative and binary traits. Finally, we introduce methods to assess potential biases in GWAS results and demonstrate how to generate effective visualisations.
Learning Objectives
- Describe key concepts, advantages and limitations of GWAS.
- Use PLINK to generate key metrics for quality control of samples and variants.
- Recognise the effect of population structure when performing association tests and how to adjust for it.
- Summarise the statistical methods used for association analysis and how to interpret their outcomes.
- Run a GWAS for quantitative and binary traits and assess the quality of the results.
- Visualise and report the findings of the association analysis.
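To make the association-testing objective above more concrete, here is a minimal sketch of a single-variant test for a quantitative trait: an ordinary least-squares regression of the trait on allele counts (0, 1 or 2). It is only an illustration. The function name `single_variant_association` and the toy data are made up for this example, and a real analysis would use dedicated software such as PLINK and would typically include covariates (for example, to adjust for population structure).

```python
import math


def single_variant_association(genotypes, phenotypes):
    """Regress a quantitative trait on allele counts (0, 1 or 2) for one variant.

    Hypothetical helper for illustration only: no covariates, no missing-data
    handling. Returns the effect size (slope), its standard error and the
    t-statistic for the null hypothesis of no association.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n

    # Sum of squares of the genotype and its cross-product with the trait.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))

    beta = sxy / sxx                    # estimated effect per allele copy
    intercept = mean_y - beta * mean_g

    # Residual variance (n - 2 degrees of freedom) gives the standard error.
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se if se > 0 else math.inf
    return beta, se, t_stat


# Toy data, invented for this example: six individuals, one variant.
genotypes = [0, 0, 1, 1, 2, 2]
phenotypes = [1.0, 1.2, 1.9, 2.1, 3.1, 2.9]

beta, se, t_stat = single_variant_association(genotypes, phenotypes)
print(f"beta={beta:.3f} se={se:.3f} t={t_stat:.2f}")
```

A genome-wide scan repeats a test like this for every variant, which is why the multiple-testing burden and quality control of the input genotypes matter so much for interpreting the results.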
Target Audience
Researchers and students interested in the genetics of complex traits.
Prerequisites
- Knowledge of key genetics concepts and terms, such as: gene, locus, allele, linkage, inheritance, homozygous and heterozygous genotypes.
  - See NIH’s genetics glossary for reference.
- Knowledge of basic statistical concepts, such as: linear regression, null hypothesis testing, p-value, effect size. Knowledge of logistic regression is also desirable.
  - See our Core Statistics and Generalised Linear Models materials as a reference.
- Basic usage of the Unix command line: listing files (`ls`), moving between directories (`cd`) and an understanding of using options/flags with commands (e.g. `command --input file.csv --output result.csv`).
  - See the “Basics” section of our Introduction to Unix command line materials.
- Using R and the `tidyverse` package for data exploration and visualisation.