This report was automatically generated on November 7, 2022.
Katherine Eaton
| National Microbiology Laboratory, PHAC
| katherine.eaton@phac-aspc.gc.ca
The ncov-recombinant update from v0.5.1 to v0.6.0 has two major changes.
The first change is a Nextclade upgrade to the sars-cov-2 2022-10-27 dataset, which introduces recombinant sublineages for the first time (e.g. XBB.1) and two new lineages: XBD and XBE.
The second major change is the calculation and visualization of immune-related statistics. In v0.6.0, the number of key receptor binding domain (RBD) mutations is calculated for every sample by comparing the aaSubstitutions column (amino acid substitutions) produced by Nextclade to the list of 12 key RBD mutations provided by Nextstrain. In addition, the Nextclade statistics immune_escape and ace2_binding are included in the final linelists.
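As an illustration of the RBD mutation count described above, the following Python sketch parses a Nextclade-style aaSubstitutions string (comma-separated entries such as S:N501Y) and counts substitutions at key spike positions. The function name count_key_rbd_mutations and the example positions are hypothetical placeholders, not the actual list of 12 key RBD mutations from Nextstrain. Matching by spike position is also an assumption, and the pipeline's own implementation may differ.

```python
import re

# Matches entries such as "S:N501Y" (gene, reference residue, position, new residue).
SUB_PATTERN = re.compile(
    r"^(?P<gene>[^:]+):(?P<ref>[A-Z*-])(?P<pos>\d+)(?P<alt>[A-Z*-])$"
)


def count_key_rbd_mutations(aa_substitutions, key_positions):
    """Count distinct key spike positions that carry a substitution.

    aa_substitutions: comma-separated string, e.g. "S:R346T,S:N501Y".
    key_positions: set of spike positions treated as key RBD sites
                   (hypothetical input; supply the real list yourself).
    """
    if not aa_substitutions:
        return 0
    hits = set()
    for sub in aa_substitutions.split(","):
        match = SUB_PATTERN.match(sub.strip())
        if match is None:
            continue  # skip entries that do not look like substitutions
        position = int(match.group("pos"))
        if match.group("gene") == "S" and position in key_positions:
            hits.add(position)
    return len(hits)


# Illustrative values only, not the Nextstrain list.
example_positions = {346, 452, 486}
example_row = "ORF1a:T3255I,S:R346T,S:L452R,S:N501Y"
print(count_key_rbd_mutations(example_row, example_positions))  # prints 2
```

Counting distinct positions rather than raw entries means a malformed row with a repeated substitution is not counted twice.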
Between v0.5.1 and v0.6.0, 14.2% of sequences in the controls-gisaid dataset had different detection results. 4.0% of sequences were newly classified (NA → X*) and represent lineages not present in the v0.5.1 model. 10.2% of sequences had sublineage assignment changes as a result of the Nextclade dataset upgrade. 0.0% of positive controls were dropped (X* → NA), indicating no observed loss in sensitivity.
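The categories above can be expressed as a small classification over per-strain lineage calls. The sketch below is illustrative only: the function names, strain names, and lineage calls are hypothetical, None stands in for NA, and it is not the pipeline's compare_positives.py script.

```python
from collections import Counter

CATEGORIES = ("new", "dropped", "changed", "unchanged")


def classify_change(old, new):
    """Classify one strain's lineage call between two pipeline versions."""
    if old is None and new is None:
        return "unchanged"  # negative in both versions
    if old is None:
        return "new"        # NA -> X*
    if new is None:
        return "dropped"    # X* -> NA
    return "unchanged" if old == new else "changed"


def summarize(old_calls, new_calls):
    """Return the percentage of strains in each category, to one decimal."""
    strains = sorted(set(old_calls) | set(new_calls))
    if not strains:
        return {category: 0.0 for category in CATEGORIES}
    counts = Counter(
        classify_change(old_calls.get(strain), new_calls.get(strain))
        for strain in strains
    )
    return {
        category: round(100 * counts[category] / len(strains), 1)
        for category in CATEGORIES
    }


# Toy data (hypothetical strains and calls).
old_calls = {"strain_a": "XBB", "strain_b": None, "strain_c": "XE", "strain_d": None}
new_calls = {"strain_a": "XBB.1", "strain_b": "XBD", "strain_c": "XE", "strain_d": None}
print(summarize(old_calls, new_calls))
# {'new': 25.0, 'dropped': 0.0, 'changed': 25.0, 'unchanged': 50.0}
```

In this framing, the share of sequences with different detection results is the sum of the new, dropped, and changed percentages.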
ncov-recombinant v0.6.0 is a recommended upgrade for recombinant surveillance, as it enables sublineage classification and provides enhanced statistics on immune escape.
For a comprehensive summary of the methodological changes, please see the release notes for v0.6.0.
Verify that the update of the ncov-recombinant pipeline from version 0.5.1 to 0.6.0:
controls-gisaid)
This dataset includes SARS-CoV-2 genomes from GISAID that reflect the known diversity of recombinant sequences to date. These include 501 positive controls (recombinants) representing lineages XA to XBE, and 186 negative controls (non-recombinants) selected from the Nextstrain Reference Phylogeny.
In total, 687 control sequences were used as input and a strain list is available here.
The snakemake pipelines for v0.5.1 and v0.6.0 were run independently on the same dataset (controls-gisaid). Please see the Procedure section in the Supplementary for detailed command-line instructions.
controls-gisaid)
XBB → XBB.1).
NA).
Note: Lineage assignments in v0.6.0 are identical to those in pango-designation and are the expected values.
New detections (NA → X*) result from the following changes in v0.6.0:
Sublineage changes result from the following updates in v0.6.0:
The following plots report recombinant sequences over the last 16 weeks.
Note: Download the GISAID sequences and metadata in the strain list to data/controls-gisaid/.
Download the pipeline.
git clone https://github.com/ktmeaton/ncov-recombinant.git 0.5.1
cd 0.5.1
git checkout v0.5.1
Symlink the controls-gisaid data.
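rm -rf data/controls-gisaid
# The link target is resolved relative to the link's own directory (data/),
# so ../../ is needed to reach the data/ directory beside the 0.5.1 clone.
ln -s ../../data/controls-gisaid data/controls-gisaid
Create a version-controlled conda environment.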
# Local
mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.5.1
# HPC
sbatch -J conda-ncov-recombinant-0.5.1 --wrap="mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.5.1"
Run the pipeline.
# Local
conda activate ncov-recombinant-0.5.1
snakemake --profile profiles/controls-gisaid-hpc
# HPC
scripts/slurm.sh --profile profiles/controls-gisaid-hpc --conda-env ncov-recombinant-0.5.1
Download the pipeline.
git clone https://github.com/ktmeaton/ncov-recombinant.git 0.6.0
cd 0.6.0
git checkout v0.6.0
Symlink the controls-gisaid data.
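rm -rf data/controls-gisaid
# The link target is resolved relative to the link's own directory (data/),
# so ../../ is needed to reach the data/ directory beside the 0.6.0 clone.
ln -s ../../data/controls-gisaid data/controls-gisaid
Create a version-controlled conda environment.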
# Local
mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.6.0
# HPC
sbatch -J conda-ncov-recombinant-0.6.0 --wrap="mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.6.0"
Run the pipeline.
# Local
conda activate ncov-recombinant-0.6.0
snakemake --profile profiles/controls-gisaid-hpc
# HPC
scripts/slurm.sh --profile profiles/controls-gisaid-hpc --conda-env ncov-recombinant-0.6.0
After the pipelines are complete for each version, run the following to compare lineage assignments.
python3 0.6.0/scripts/compare_positives.py \
--positives-1 0.5.1/results/controls-gisaid/linelists/positives.tsv \
--positives-2 0.6.0/results/controls-gisaid/linelists/positives.tsv \
--ver-1 "v0.5.1" \
--ver-2 "v0.6.0" \
--outdir compare/controls-gisaid \
--node-order alphabetical \
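--min-link-size 1
# List strains that are positive in v0.6.0 but not in v0.5.1,
# then show their records from the v0.5.1 linelist.
csvtk cut -t -f "strain" 0.5.1/results/controls-gisaid/linelists/positives.tsv \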
| tail -n+2 \
| csvtk grep -t -f "strain" -P - -v 0.6.0/results/controls-gisaid/linelists/positives.tsv \
| csvtk cut -t -f "strain" \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - 0.5.1/results/controls-gisaid/linelists/linelist.tsv \
| csvtk pretty -t \
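| less -S
# List strains that are positive in v0.5.1 but not in v0.6.0,
# then show their records from the v0.6.0 linelist.
csvtk cut -t -f "strain" 0.6.0/results/controls-gisaid/linelists/positives.tsv \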
| tail -n+2 \
| csvtk grep -t -f "strain" -P - -v 0.5.1/results/controls-gisaid/linelists/positives.tsv \
| csvtk cut -t -f "strain" \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - 0.6.0/results/controls-gisaid/linelists/linelist.tsv \
| csvtk pretty -t \
| less -S
The following plots report all recombinant sequences.
controls-gisaid)