Resources for learning about the terminal, shell, and Python
Below are some resources for learning to code in Python, which is the primary language that we use in the lab. The first step in learning to program is learning how your computer stores information and files, and how to access those files through something called the command line. Once you understand how to use the command line (usually through a program called a shell, run inside a terminal), you can start to learn the fundamentals of Python programming. Python was designed to be human-readable and easy to learn, and is probably the primary programming language used by biologists today. While other languages run faster (Julia, C, Java), Python’s benefit is that it is easy to learn, intuitive to use, and its commands read much like something a person might say. Unlike R, Python is a general-purpose, object-oriented programming language, built to take advantage of the classic features and benefits of that style of programming. While many students learn R first, R is rather unusual: it relies heavily on dataframes rather than classes and objects, which I think makes it a strange language to learn first. I think that Python is an excellent first language, and it now has so many features and packages that you can do almost anything in Python that you could do in R.
For someone starting from scratch, meaning that you have never programmed anything before, I recommend working through a book, Practical Computing for Biologists, which introduces the terminal, shell scripting, and all the basic concepts of programming before moving into Python. It’s a great book, and we have a copy in the lab. If you have used the terminal before and are comfortable navigating it, you can skip the early chapters of the book and move on to the Python chapters. If for some reason you dislike the book, I would instead start with item 2 on the list, Codecademy’s Learn how to code, which will cover the shell and terminal. Beyond that, there are a number of free tutorials and courses online that can probably accomplish the same goal.
Free options:
- Practical Computing for Biologists: this book is how I learned to code, and it really helped me. I think it does a wonderful job of introducing the terminal, explaining concepts in scripting and programming, and teaching Python for biologists. We have a copy in the lab that you can use. The only downside of this book is that it is written in Python 2, which is no longer supported. In truth, very little differs between Python 2 and Python 3 at this level, so the book still mostly works for concepts and fundamental syntax.
  - I recommend everyone start with chapters 1 and 4, and then move on to chapters 7-10.
  - Scripts converted to Python 3 are available from the book’s website: https://practicalcomputing.org/index.html
- Codecademy Learn how to code (free): this free Codecademy course looks like it covers variables, loops, and functions broadly (what they mean, what they represent, etc.). It would be a good thing to work through as a complement to the book above.
- [University of Waterloo, Python from scratch](https://open.cs.uwaterloo.ca/python-from-scratch/): This website has a set of free tutorials for learning Python, which my husband, who is a professional software developer, recommended to me.
- Codecademy Python for data science (free): this looks like an intro to using Jupyter notebooks and pandas to construct and filter dataframes. In our lab we often use Jupyter notebooks for scripting and exploratory data analysis. Pandas is a Python library for generating and manipulating dataframes, and its dataframes are modeled on those in R. This skill is very useful, and essential for large-scale data manipulation and plotting. However, dataframes can also be a bit tricky, and will often behave oddly if you don’t know exactly how to specify a particular manipulation. For that reason, it can be quite easy to accidentally delete or alter data without realizing it when using dataframes, particularly when cleaning large files that can’t be visually checked. So I recommend that new programmers work through and get very comfortable with the basics of Python (variables, if/else statements, loops, functions, etc.) before they get into dataframes.
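
To give a sense of what the basics mentioned above (variables, if/else statements, loops, functions) look like in practice, here is a minimal Python 3 sketch. The function name `gc_content` and the example sequences are made up for illustration and do not come from any of the resources listed. The comments also flag two of the small places where the Python 2 code in the book differs from Python 3.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    sequence = sequence.upper()  # treat "g" and "G" the same
    if len(sequence) == 0:
        return 0.0  # avoid dividing by zero for an empty sequence

    gc_count = 0  # a variable that we update inside the loop
    for base in sequence:  # loop over each character in the string
        if base == "G" or base == "C":
            gc_count += 1

    # In Python 3, "/" always gives a decimal result.
    # In Python 2, dividing two integers would drop the remainder.
    return gc_count / len(sequence)


# Hypothetical example data: sequence names mapped to DNA strings
sequences = {"seq1": "ATGCGC", "seq2": "ATATAT", "seq3": "ggcc"}

for name, seq in sequences.items():
    fraction = gc_content(seq)
    if fraction > 0.5:
        label = "GC-rich"
    else:
        label = "not GC-rich"
    # print is a function in Python 3 (Python 2 wrote: print name, label)
    print(name, round(fraction, 2), label)
```

If you can read this and predict what it prints before running it, you are in good shape to move on to notebooks and dataframes.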
Paid options:
- Codecademy Learn Python 3 (paid): this goes over all the basics, from variables to classes! If none of the above options work for you, we can buy a subscription to Codecademy. It requires making an in-person purchase with Deb Argento’s procard, which is a bit of a hassle, but it can be done.