Foundations
Computer Science
Suggested Materials
Basic Computer Science & Your First Programming Language
Khan Academy – Digital Information
To get started, you should understand the basics of digital information. Khan Academy has put together a nice module on this topic with exercises; completing it will give you a firm grasp of how computers process information. This is useful background for computational biology, for reasons that will become apparent in future courses (working with massive datasets, parallelization, and more).
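As a small taste of what "digital information" means in practice, the sketch below shows how a short piece of text and a number can be written as bits. The sample text and number are made up for illustration and are not taken from the Khan Academy module.

```python
# Illustrative only: the sample text and number are invented for this sketch.
text = "DNA"

# Encode the string into raw bytes using UTF-8.
raw = text.encode("utf-8")

# Show each byte as an 8-bit binary string.
bits = [format(byte, "08b") for byte in raw]

# An integer can likewise be written in binary and parsed back again.
number = 42
number_bits = format(number, "b")
round_trip = int(number_bits, 2)

print(bits)
print(number_bits, round_trip)
```

Each character of plain ASCII text takes one byte here, which is one reason file sizes grow so quickly with large sequence datasets.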
W3Schools – Python
This W3Schools tutorial is a learn-by-doing guide to Python. You will move from simple concepts through to advanced ones such as error catching and package building. The minimum recommendation is to complete all successive modules up to “Classes/Objects”, then move to Python Modules (NumPy, Pandas, SciPy tutorials), and finally the Matplotlib tutorial. All of these will prepare you to apply what you have learned in the next course on edX.
UCSC – Python for Data Science
Bash & Version Control (Git)
GitHub
GitHub is the de facto code-sharing platform. We will be using Git and GitHub throughout the OSCB, so it’s of utmost importance to learn the basics now using GitHub’s Quick Start Guide.
Data Science at the Command Line – Jeroen Janssens
Though you’ve likely already done a little command line work in the GitHub tutorial and the Python for Data Science course, this free book by Jeroen Janssens is a wonderful introduction to using the command line for data science. Reading through and coding along with this book will make you a star at using the Unix command line for all things data science. Moving forward, you will also find it easier to follow the documentation for much of the software you will use, since computational biologists commonly use the command line for basic scripting. For the purposes of foundation building, we recommend completing up to Chapter 8, though of course you can always do the entire book!
SQL
W3Schools – SQL
SQL (Structured Query Language) is the standard language for querying relational databases. Understanding the basics of SQL will help you retrieve data from various bioinformatics databases, like the CTTI for clinical trials! We recommend doing the initial tutorial on W3Schools to learn the basics of SQL, and keeping it as a reference throughout the course.
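To give a feel for what a basic query looks like, here is a minimal sketch that runs SQL from Python using the standard library's sqlite3 module and an in-memory database. The `trials` table, its columns, and its rows are hypothetical and invented for illustration; they do not reflect the schema of any real clinical trials database.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trials (id INTEGER PRIMARY KEY, disease TEXT, phase INTEGER)"
)
conn.executemany(
    "INSERT INTO trials (disease, phase) VALUES (?, ?)",
    [("asthma", 2), ("diabetes", 3), ("asthma", 3)],
)

# Select the trials for one disease, using a parameter placeholder
# rather than building the query with string formatting.
rows = conn.execute(
    "SELECT id, phase FROM trials WHERE disease = ? ORDER BY phase",
    ("asthma",),
).fetchall()

# Count trials per disease with GROUP BY.
counts = conn.execute(
    "SELECT disease, COUNT(*) FROM trials GROUP BY disease ORDER BY disease"
).fetchall()

print(rows)
print(counts)
conn.close()
```

The same SELECT, WHERE, and GROUP BY ideas carry over to larger database systems, though connection details and some syntax will differ.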
Containerization
Docker Tutorial
Containerization is the packaging of software code with just the operating system (OS) libraries and dependencies required to run it, creating a single lightweight executable that runs consistently on any infrastructure. This is the current industry standard for packaging and deploying software for others to use. It solves the problem of inconsistent infrastructure, making your software more portable. We recommend using Docker and going through their tutorial to get started, so that when Docker is used throughout the course you will know what is going on and how to work with containers.
Additional Resources
Git Cheatsheet
GitHub CLI Manual
Docker CLI Manual
Math
Suggested Materials
Khan Academy – Calculus 1
Quite simply, computational biology uses a lot of math, especially calculus. Here we recommend using Khan Academy’s Calculus I course to begin your calculus journey. Working through this course is more than enough to get started with concepts in computational biology, but we will continuously reference Khan Academy for math explanations in advanced topics.
Khan Academy – Calculus 2
Here we recommend using Khan Academy’s Calculus II course to continue your calculus journey after Calculus I. Together, these two courses are more than enough to get started with concepts in computational biology, and we will continuously reference Khan Academy for math explanations in advanced topics.
Khan Academy – Linear Algebra
Linear algebra is an extremely useful way of thinking about very common computational problems, specifically linear systems of equations, vector spaces, determinants, eigenvalues, similarity, and positive definite matrices. If you’re not familiar with these concepts, the Khan Academy course is a perfect starting place for a novice mathematician and a great reference! This course will be enough for now, as we’ll introduce more concepts later as needed.
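To make a few of these terms concrete, the sketch below works through a tiny 2x2 example in plain Python: a determinant, the solution of a linear system by Cramer's rule, and the eigenvalues from the characteristic polynomial. The matrix, the right-hand side, and the helper names (`det2`, `solve2`, `eigenvalues2`) are made up for illustration; in practice you would normally use a library such as NumPy instead.

```python
import math

# A small made-up system: 2x + y = 3 and x + 2y = 3.
A = [[2, 1], [1, 2]]
b = [3, 3]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def solve2(m, rhs):
    """Solve a 2x2 linear system with Cramer's rule."""
    d = det2(m)
    if d == 0:
        raise ValueError("matrix is singular; no unique solution")
    x = det2([[rhs[0], m[0][1]], [rhs[1], m[1][1]]]) / d
    y = det2([[m[0][0], rhs[0]], [m[1][0], rhs[1]]]) / d
    return x, y


def eigenvalues2(m):
    """Real eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    disc = trace * trace - 4 * det2(m)
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2


determinant = det2(A)
solution = solve2(A, b)
eigs = eigenvalues2(A)

print(determinant, solution, eigs)
```

Because this example matrix is symmetric and both of its eigenvalues are positive, it is also a small instance of a positive definite matrix.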
Practical Statistics by Peter Bruce
Get started with some basic coding in R and Python while learning statistics! This book is a widely used reference on statistics for data science and will give you a feel for how to do statistical computing.
Additional Resources
Infinite Powers by Steven Strogatz
Get inspired to do math by reading this amazing book by Steven Strogatz! The book goes over the history of calculus, from its inception in intuitive human understanding, to the development of calculus as a practice, to the formal field that it is today. Understanding why we need calculus, and just how useful it is both as a practice and as a way of approaching problems, will have you seeing the world as a great mathematical symphony.
Biology
Suggested Courses
Khan Academy – Biology
Here we recommend the Khan Academy biology series as a primer for those who need a biology background. As with our other recommendations, Khan Academy has done a great job of outlining the bare essentials, giving you enough to understand future concepts when they are introduced. Here at OSCB we don’t believe in reinventing the wheel, so use Khan Academy to your advantage and learn basic biology with them!
Suggested Books
Concepts of Biology
This OpenStax biology book will give you a great, free reference for introductory biology concepts!
Additional Resources
MIT OpenCourseWare – Introduction to Biology
This 2018 course will give you another great resource for understanding basic biology. In this course, MIT Professor Barbara Imperiali highlights key developments in therapeutics, as well as tools for advancing research.