Analysis of ChIP-seq Data
Overview
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a method used to identify, across the genome, where transcription factors and other DNA-associated proteins bind and where histone modifications occur. These materials cover the fundamentals of ChIP-seq data analysis, from raw data processing to downstream applications.
We will start with an introduction to ChIP-seq methods, including important considerations when designing your experiments. We will then go through the bioinformatic steps of a standard ChIP-seq analysis workflow: raw data quality control, trimming/filtering, mapping, duplicate removal, post-mapping quality control, peak calling and peak annotation. We will discuss metrics used to assess the quality of the called peaks when multiple replicates are available, as well as the analysis of differential binding across sample groups. Finally, we will cover tools and packages that can be used for visualising and exploring your results.
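As a rough orientation, the workflow steps listed above can be written down as an ordered list. This is only a minimal sketch: the stage names come from the outline, while the example tool attached to each stage is an assumption chosen for illustration (these are widely used programs, but not necessarily the ones used in this course), and `WORKFLOW` and `describe_workflow` are hypothetical names.

```python
# Ordered stages of a standard ChIP-seq analysis workflow, as listed above.
# ASSUMPTION: the example tools are illustrative only; the course materials
# may use different software for any of these stages.
WORKFLOW = [
    ("raw data quality control", "FastQC"),
    ("trimming/filtering", "Cutadapt"),
    ("mapping", "Bowtie2"),
    ("duplicate removal", "Picard MarkDuplicates"),
    ("post-mapping quality control", "deepTools"),
    ("peak calling", "MACS2"),
    ("peak annotation", "ChIPseeker"),
]


def describe_workflow(stages=WORKFLOW):
    """Return one numbered line per stage, in processing order."""
    return [
        f"{number}. {step} (e.g. {tool})"
        for number, (step, tool) in enumerate(stages, start=1)
    ]


if __name__ == "__main__":
    # Print the stages in the order in which they are normally run.
    print("\n".join(describe_workflow()))
```

Each stage consumes the output of the previous one (for example, peak calling runs on the de-duplicated alignments), which is why the order matters.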
Target Audience
This course is aimed at researchers with no prior experience in the analysis of ChIP-seq data who would like to get started with processing their data using a standardised pipeline, and with performing downstream analysis and visualisation of their results.
Prerequisites
- Basic understanding of high-throughput sequencing technologies.
  - Watch this iBiology video for an excellent overview.
- A working knowledge of the UNIX command line (course registration page).
  - If you are not able to attend this prerequisite course, please work through our Unix command line materials ahead of the course (up to section 7).
- A working knowledge of R (course registration page).
  - If you are not able to attend this prerequisite course, please work through our R materials ahead of the course.
Citation
Please cite these materials if:
- You adapted or used any of them in your own teaching.
- These materials were useful for your research work. For example, you can cite us in the methods section of your paper: “We carried out our analyses based on the recommendations in Cortijo S et al. (2023).”
You can cite these materials as:
Cortijo S, Martinez Cuesta S, Nagarajan S, Sawle A, Seyres D, Tavares H (2023) “cambiotraining/chipseq: Analysis of ChIP-seq Data”, https://cambiotraining.github.io/chipseq/
Or in BibTeX format:
@Misc{,
author = {Cortijo, Sandra AND Martinez Cuesta, Sergio AND Nagarajan, Sankari AND Sawle, Ashley AND Seyres, Denis AND Tavares, Hugo},
title = {cambiotraining/chipseq: Analysis of ChIP-seq Data},
month = {July},
year = {2023},
url = {https://cambiotraining.github.io/chipseq/}
}
Acknowledgements
There are many online resources that inspired our own materials (e.g. package vignettes) and we cite them where relevant.
We also recommend the following training materials:
- Understanding chromatin biology using high throughput sequencing from the Harvard Chan Bioinformatics Core
- Introduction to ChIPseq using HPC from the Harvard Chan Bioinformatics Core
- ChIP-seq analysis from the Babraham Institute