ktmeaton/plague-phylogeography

An open-source pipeline to construct a global phylogeny of the plague pathogen Yersinia pestis.

License: MIT

Pipeline Overview

  1. Create a metadata database of NCBI genomic assemblies and SRA data (NCBImeta)
  2. Download assemblies and SRA fastq files (sra-tools)
  3. Build SnpEff database from reference (SnpEff)
  4. Align to reference genome (snippy, eager)
  5. Mask problematic regions (dustmasker, mummer, vcftools)
  6. Evaluate statistics (qualimap, multiqc)
  7. Construct a Maximum Likelihood phylogeny (iqtree)
  8. Optimize time-scaled phylogeny (augur, treetime)
  9. Web-based narrative visualization (auspice)

Showcase

DHSI2020 NextStrain Exhibit
SCDS2020 NextStrain Exhibit

Install

All install options start by cloning the pipeline repo.

git clone https://github.com/ktmeaton/plague-phylogeography.git
cd plague-phylogeography
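Before choosing one of the install options below, it can help to see which of the three tools is already on your `PATH`. The following is a minimal sketch only; `available_tools` is a hypothetical helper name and is not part of the pipeline.

```shell
# available_tools is a hypothetical helper, not shipped with the pipeline.
# Prints each named tool that is found on PATH, one per line.
# Returns 0 if at least one tool was found, 1 otherwise.
available_tools() {
  # With no arguments, check the three install routes from this README.
  [ "$#" -gt 0 ] || set -- conda docker singularity
  status=1
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=0
    fi
  done
  return "$status"
}
```

Calling `available_tools` with no arguments lists whichever of conda, docker, and singularity are installed. It checks only that the command exists, not that it is a working or recent enough version.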

1. Conda (Laptop)

conda install -c conda-forge mamba
mamba env create -f workflow/envs/merge/environment.yaml
conda activate plague-phylogeography
snakemake --profile profiles/laptop help

(mamba is not strictly necessary, but it is strongly recommended.)

2. Docker (Laptop)

docker pull ktmeaton/plague-phylogeography:dev
docker run \
  -v "$PWD":/pipeline \
  -w /pipeline \
  ktmeaton/plague-phylogeography:dev \
  snakemake --profile profiles/laptop help

3. Singularity (HPC - Compute Canada)

singularity pull plague_phylogeography_dev.sif docker://docker.io/ktmeaton/plague-phylogeography:dev
singularity exec plague_phylogeography_dev.sif \
  snakemake --profile profiles/compute-canada help

If you will download data from the SRA with Singularity, the SRA Toolkit must first be configured:

mkdir -p ~/.ncbi/
printf '/LIBS/GUID = "%s"\n' "$(uuidgen)" > ~/.ncbi/user-settings.mkfg
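Because the redirect above overwrites `~/.ncbi/user-settings.mkfg` on every run, a repeat-safe variant may be preferable on shared HPC accounts. The sketch below is one possible approach, not the pipeline's own method. `configure_sra_guid` is a hypothetical helper name, and the `/proc/sys/kernel/random/uuid` fallback is a Linux-only assumption for hosts without `uuidgen`.

```shell
# configure_sra_guid is a hypothetical helper, not part of the pipeline.
# Usage: configure_sra_guid [config_dir]   (defaults to ~/.ncbi)
configure_sra_guid() {
  ncbi_dir="${1:-$HOME/.ncbi}"
  settings="$ncbi_dir/user-settings.mkfg"

  mkdir -p "$ncbi_dir" || return 1

  # Leave an existing GUID untouched so repeated runs do not change it.
  if grep -q '^/LIBS/GUID' "$settings" 2>/dev/null; then
    return 0
  fi

  # Prefer uuidgen, as in the command above; the /proc fallback is a
  # Linux-only assumption for hosts where uuidgen is not installed.
  if command -v uuidgen >/dev/null 2>&1; then
    guid="$(uuidgen)"
  else
    guid="$(cat /proc/sys/kernel/random/uuid)" || return 1
  fi

  # Append rather than overwrite, so other settings in the file survive.
  printf '/LIBS/GUID = "%s"\n' "$guid" >> "$settings"
}
```

Running `configure_sra_guid` with no arguments targets `~/.ncbi`. Passing a directory, for example `configure_sra_guid /tmp/ncbi-test`, lets you try it without touching your real settings.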

Credits

Author: Katherine Eaton
Logo: Emil Karpinski, Katherine Eaton