Intro
Our lab uses phylodynamic approaches to study how evolutionary, ecological, and epidemiologic processes shape viral evolution. Phylodynamics is a broad field that uses genomes to build phylogenies reflecting these processes and, in turn, uses the shape of those trees to learn about viral transmission and the processes behind it. For most beginners in the lab, I recommend reading a series of papers and blog posts that introduce the basic concepts of phylodynamics, and then getting hands-on training with Nextstrain, a user-friendly phylodynamics toolkit that offers rapid inference and interactive visualizations. If you are new, you can absolutely work through the theory and the practice at the same time.
Phylodynamics background reading
- This is a quick, nice overview of how to read a tree: https://artic.network/how-to-read-a-tree.html
- A seminal paper that introduced phylodynamics: https://pubmed.ncbi.nlm.nih.gov/14726583/
- A great review on phylodynamics by some of the leaders in the field: https://www.nature.com/articles/nrg2583
- Another useful review: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002947
- A review of genomic epidemiology approaches: https://www.nature.com/articles/s41564-018-0296-2
- Genomic Epidemiology Handbook: This handbook was written by Alli Black and Gytis Dudas for people working in public health practice. It is a really fantastic reference that covers the basic concepts of genomic epidemiology (the subfield of phylodynamics that focuses on epidemiologic questions). You may not need to read the entire thing, but it has very useful background on concepts and on how trees are used and interpreted in practice.
- A review of viral evolution and population dynamics, with helpful background on coalescent theory, which is the foundation for phylodynamics: https://www.nature.com/articles/nrmicro863
- Felsenstein pruning algorithm: If there is one paper that is important for understanding phylogenetics, it is this one. This is Joe Felsenstein's original paper on the maximum likelihood approach used to infer trees. The basic algorithm is used in most phylogenetics software today and has been adapted for many other applications. It's also a very well-written paper that I personally found quite useful for learning about phylogenetics. It's probably not the easiest read if you're new to phylo, but it's a crucial one to master if you join the lab.
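
To make the pruning idea concrete before you read the paper, here is a minimal sketch of the likelihood calculation for a single alignment column on a fixed tree. It is an illustration only, not the code from any particular package. The Jukes-Cantor (JC69) substitution model, the nested-list tree encoding, the function names, and the example branch lengths are all assumptions chosen for brevity. The key step is that each internal node combines the partial likelihoods of its children, so the tree is traversed only once from tips to root.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability of base i becoming base j over branch length t
    (t in expected substitutions per site). JC69 is an assumed model."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial_likelihoods(node, site):
    """Return [P(tips below node | node has base x) for x in BASES].

    A tip is a string (its name); an internal node is a list of
    (child, branch_length) pairs. This encoding is hypothetical.
    """
    if isinstance(node, str):
        observed = site[node]
        return [1.0 if b == observed else 0.0 for b in BASES]
    result = [1.0] * len(BASES)
    for child, branch_length in node:
        child_l = partial_likelihoods(child, site)
        for xi, x in enumerate(BASES):
            # Sum over every base y the child could have, weighted by
            # the probability of x -> y along the connecting branch.
            result[xi] *= sum(
                jc69_prob(x, y, branch_length) * child_l[yi]
                for yi, y in enumerate(BASES)
            )
    return result

def site_likelihood(tree, site):
    """Likelihood of one alignment column, with uniform root frequencies."""
    return sum(0.25 * l for l in partial_likelihoods(tree, site))

# Example tree ((A:0.1,B:0.1):0.05,C:0.2) with made-up branch lengths.
example_tree = [([("A", 0.1), ("B", 0.1)], 0.05), ("C", 0.2)]
example_site = {"A": "G", "B": "G", "C": "T"}
print(site_likelihood(example_tree, example_site))
```

In practice, tools multiply these per-site likelihoods across the whole alignment (usually as a sum of logs) and search over trees and branch lengths to maximize the total.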
Nextstrain materials
Nextstrain is one of the main phylogenetic toolkits we use in the lab, and it is great for learning phylodynamics for the first time. Nextstrain changes constantly because it is maintained by a full team of developers, but it also has wonderful documentation.
- I'd get started by installing the Nextstrain tools. There is extensive documentation here: https://docs.nextstrain.org/en/latest/index.html. Once you've installed everything, read through the documentation that describes how the software works and how it is organized.
- Next, I would go through a few tutorials to get familiar with the software and to make sure that you have everything installed correctly. Definitely go through “Running a phylogenetic workflow” and “Creating a phylogenetic workflow”.
- Once you've run through those tutorials, take a look at the Snakefiles they use and play around a bit by changing the rules within them.
- Once you've done that, take a stab at working through this avian-flu quickstart build: https://github.com/nextstrain/avian-flu/tree/master/quickstart-build. This is a template for avian influenza phylogenies, and it is a good starting point for any work you would be doing in the lab. I've added tons of information as comments in the Snakefile, so please read them to learn what each rule does.
- Finally, I developed a set of slides and a tutorial on Nextstrain for a workshop a few years ago. The tutorial assumes that you have Nextstrain correctly installed, and then goes step by step through each rule, how to edit it, and how to chain rules together into an increasingly complex Snakefile. Give it a whirl! https://github.com/lmoncla/csu-genomics-disease-wildlife-workshop/tree/main/nextstrain-lab
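
If the way rules fit together in a Snakefile feels opaque at first, this toy sketch may help. Snakemake decides which rules to run by matching the file you ask for against each rule's outputs and then working backwards through the inputs. The code below imitates that idea in plain Python. It is not how Snakemake is implemented, and the rule names and file paths are hypothetical placeholders loosely modeled on typical workflow steps.

```python
# Hypothetical rules: name -> (input files, output files).
# They are listed out of order on purpose; the order in the file
# does not decide the order of execution.
RULES = {
    "export": (["results/tree.nwk"], ["auspice/build.json"]),
    "tree": (["results/aligned.fasta"], ["results/tree.nwk"]),
    "align": (["results/filtered.fasta"], ["results/aligned.fasta"]),
    "filter": (["data/sequences.fasta"], ["results/filtered.fasta"]),
}

def plan(target, rules):
    """Return the rule names needed to build `target`, in run order."""
    # Map every output file to the rule that produces it.
    producers = {
        out: name for name, (_, outs) in rules.items() for out in outs
    }
    order = []

    def visit(path):
        name = producers.get(path)
        if name is None or name in order:
            # Either a raw input no rule produces, or already scheduled.
            return
        for dep in rules[name][0]:
            visit(dep)  # schedule whatever this rule depends on first
        order.append(name)

    visit(target)
    return order

print(plan("auspice/build.json", RULES))
# ['filter', 'align', 'tree', 'export']
```

This is why editing one rule's output filename can break a workflow: the next rule's input no longer matches anything, so the chain cannot be resolved.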