Now that we have a basic grasp of concepts surrounding data management, manipulation, and visualization, we’re ready to start focusing on some of the more specialized data encountered in computational biology research. Sequencing of nucleic acids is almost ubiquitous in biological research. In this lecture, we will introduce some common resources for depositing and retrieving sequence data generated by consortium efforts and independent laboratories. We will introduce concepts and practical steps of querying, inspecting, and visualizing sequence data. Then, we will cover the types of genomic variation and common tools used to predict these from sequencing data.
This lecture focuses on concepts surrounding genome sequence data and their associated workflows, and includes demonstrations and student exercises. We will dive into the details of sequencing data and file formats, as well as the outputs of specific sequence analysis commands. Additional materials are included as a resource for your future reference.
Outline of content from the slides:
Please be able to locate the files for the in-class exercises. Data and examples shown in lecture 15 and lecture 16 are available on the Fred Hutch filesystem at /fh/fast/subramaniam_a/tfcb and on Dropbox. For both lecture 15 and lecture 16, you will need to download these files onto your laptop.
Please install the Integrative Genomics Viewer (IGV).