4  PLINK basics

Caution

These materials are still under development

Learning objectives
  • Recognise how the PLINK software can be used for GWAS analysis.
  • List the file formats required by PLINK.
  • Recognise the general structure of a PLINK command and the structure of the output files it generates.
  • Use R to import, explore and visualise the results generated by PLINK.

4.5 Summary

Key Points
  • PLINK is widely used software for GWAS, covering a broad range of functions from quality control to downstream analysis.
  • PLINK requires three critical types of input:
    • Genotype data (.pgen/.bed).
    • Information about genetic variants (.pvar/.bim).
    • Information about the samples (.psam/.fam).
  • A typical PLINK command will include:
    • An option specifying the input files’ prefix: --pfile (or --bfile if using the older formats).
    • An option specifying the output files’ prefix: --out.
    • Other options specifying the task we want it to perform.
  • PLINK writes output files with a specific extension for each option that produces output. All file extensions are detailed in the documentation.
  • Most of PLINK’s output files are simple text files with tab-separated values (TSV), which can therefore be imported into standard data analysis software such as R.
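
As an illustration of the command structure summarised above, here is a minimal sketch of a PLINK 2 call. The file prefixes (data/mydata and results/mydata) are hypothetical placeholders, and --freq (allele frequencies) is just one example of a task option; we also assume the executable is named plink2 on your system. The command is printed first and only run if plink2 and the input files are actually available.

```shell
#!/bin/sh
# Hypothetical prefixes (assumptions for illustration): replace with your own.
input_prefix="data/mydata"      # expects data/mydata.pgen, .pvar and .psam
output_prefix="results/mydata"  # outputs are written as results/mydata.*

# The three parts of a typical command: input prefix, task option, output prefix.
plink_cmd="plink2 --pfile $input_prefix --freq --out $output_prefix"
echo "$plink_cmd"

# Run the command only if plink2 is installed and the input files exist.
if command -v plink2 >/dev/null 2>&1 && [ -f "$input_prefix.pgen" ]; then
  mkdir -p "$(dirname "$output_prefix")"
  plink2 --pfile "$input_prefix" --freq --out "$output_prefix"
  # With --freq, PLINK 2 should write a tab-separated .afreq file; peek at it.
  head -n 5 "$output_prefix.afreq"
fi
```

With the older file formats, the same sketch would use --bfile in place of --pfile and expect .bed, .bim and .fam files with the given prefix.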