As biological datasets have rapidly grown larger, coding has been recognized as an increasingly important skill for biologists. This is especially true in “omics” research (e.g., genomics, transcriptomics, and microbiomics), where data usually cannot be analyzed on a desktop computer, most software has a command-line interface, and workflows can include many steps that need to be coordinated.
In this course, students will gain hands-on experience with a set of general and versatile tools for day-to-day data-intensive work. The course will focus on foundational skills such as working in the Unix shell and writing shell scripts, installing software and submitting jobs on a compute cluster (the Ohio Supercomputer Center), coding in R, and building flexible, automated workflows, as well as on organizing, documenting, and version-controlling research projects. Taken together, these skills will allow students to reproduce their own research, and enable others to reproduce it, with as little as a single command.
This course is designed to provide students with foundational training in computing skills for reproducible research. At the end of this course, students will be able to start applying these skills and the associated tools in their own research, and will have a firm understanding of how doing so makes their research more robust, reproducible, and efficient.