WGBS pipeline

Preparing the VM

sudo bash prepare_vm.sh

Running a sample

The reference genome is GRCh38 no alt (from ENCODE), to which I added the Lambda phage DNA to capture methylation spike-ins. It is included in the repository, together with the indices built with the "bismark_genome_preparation" utility, so there is no need to re-run that step.

To run the pipeline on a sample, pass the R1 and R2 FASTQ files as the two arguments:

bash wgbs-pipeline.sh /folder/with/fastqfiles/sample_id.R1.fastq.gz /folder/with/fastqfiles/sample_id.R2.fastq.gz
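
For more than one sample, the call above can be wrapped in a loop. The following is only a sketch: r2_for_r1 and run_all_samples are hypothetical helper names that are not part of the repository. It assumes every sample follows the sample_id.R1.fastq.gz / sample_id.R2.fastq.gz naming shown above, and that it is run from the repository root.

```shell
#!/usr/bin/env bash

# Hypothetical helper: derive the R2 path from an R1 path, assuming the
# sample_id.R1.fastq.gz / sample_id.R2.fastq.gz naming convention.
r2_for_r1() {
    local r1="$1"
    printf '%s\n' "${r1%.R1.fastq.gz}.R2.fastq.gz"
}

# Hypothetical wrapper: run the pipeline once per R1/R2 pair in a folder.
run_all_samples() {
    local dir="$1" r1 r2
    for r1 in "$dir"/*.R1.fastq.gz; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$r1" ] || continue
        r2="$(r2_for_r1 "$r1")"
        if [ ! -f "$r2" ]; then
            echo "Skipping $r1: missing mate $r2" >&2
            continue
        fi
        bash wgbs-pipeline.sh "$r1" "$r2"
    done
}
```

Calling run_all_samples /folder/with/fastqfiles would then process the samples one after another. R1 files without a matching R2 file are reported and skipped instead of being passed to the pipeline.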

Creating references

The wgbs-pipeline.sh pipeline downloads and indexes the references itself if the appropriate files cannot be found in the ./ref folder. This can also be done beforehand with the generate_bs_indices.sh script, which takes as its argument the folder in which the reference files will be stored. To use the default location, so that nothing in the pipeline needs to change, run the following:

bash ./apps/generate_bs_indices.sh ./ref
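
If you want to avoid rebuilding references that are already in place, the call can be guarded. This is a rough sketch: ref_dir_is_empty and ensure_references are hypothetical names, and checking for an empty folder is only a crude stand-in for the pipeline's own check for the appropriate files, which is not described here.

```shell
#!/usr/bin/env bash

# Hypothetical helper: return 0 when the folder is missing or has no entries.
ref_dir_is_empty() {
    local dir="$1"
    [ ! -d "$dir" ] || [ -z "$(ls -A "$dir")" ]
}

# Hypothetical wrapper: build the references only when the folder looks
# unpopulated. Defaults to ./ref, the location the pipeline expects.
ensure_references() {
    local dir="${1:-./ref}"
    if ref_dir_is_empty "$dir"; then
        bash ./apps/generate_bs_indices.sh "$dir"
    else
        echo "References found in $dir, skipping generation."
    fi
}
```

Running ensure_references with no argument from the repository root targets ./ref. A partially populated folder counts as present in this sketch, so an interrupted earlier run would still need a manual re-run of generate_bs_indices.sh.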

The script requires samtools, bismark and bowtie2 to be installed in specific directories, so run the prepare_vm.sh script first, as described above.
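
As a quick sanity check after preparing the VM, you can at least confirm that the three tools are reachable. The exact directories the script expects are not listed here, so this sketch only looks on PATH, which may differ from what the script actually uses. missing_tools is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name of each given tool that is not on PATH.
# Note: this checks PATH only, not the specific install directories that
# generate_bs_indices.sh expects.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Calling missing_tools samtools bismark bowtie2 prints nothing when all three are found, and otherwise prints one missing tool name per line.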