This lecture will unite the last lecture’s content on genomic analysis with our previous coding in R. The packages we’ll use this week are from Bioconductor, a collection of software specifically designed for genomic analysis in R.
Genome variant analysis (Background)
Genomic Data (hands-on tutorials)
We will be working through some tutorials directly on your laptop using RStudio.
## start R session ##
R
## run this command within R session ##
source("../../software/genomic_data.R")
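Once the script has finished, one way to confirm that the Bioconductor packages used in this lecture are available is sketched below. The helper name missing_packages is hypothetical and is not part of genomic_data.R; it relies only on base R's requireNamespace().

```r
# Packages used in this lecture
lecture_packages <- c("Rsamtools", "VariantAnnotation", "GenomicRanges", "plyranges")

# Hypothetical helper (not from the course script):
# return the names of packages that cannot be loaded in this R installation
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

not_installed <- missing_packages(lecture_packages)
if (length(not_installed) == 0) {
  message("All lecture packages are installed.")
} else {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}
```

If any package is reported as not installed, re-running the source() command above is a reasonable first step.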
Rsamtools: querying BAM files
VariantAnnotation: reading VCF files
GenomicRanges: manipulating genomic data
plyranges: fast & easy tool for manipulating GRanges
A directory (lecture16) containing the following three RMarkdown tutorials:
Extensions (on left panel) > Type in search bar: "R Extension"
> Select R Extension for Visual Studio Code by Yuki Ueda
pandoc: pandoc can be installed outside of VScode by downloading the installer here: https://pandoc.org/installing.html
The data files go in the lecture16 directory. The files should have the following filenames:
BRCA.genome_wide_snp_6_broad_Level_3_scna.seg
BRCA_IDC_cfDNA.bam
BRCA_IDC_cfDNA.bam.bai
GIAB_highconf_v.3.3.2.vcf.gz (if this file was automatically uncompressed on your computer, resulting in a file named GIAB_highconf_v.3.3.2.vcf, look in your Trash folder to find the original file ending in gz)
GIAB_highconf_v.3.3.2.vcf.gz.tbi
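A quick way to check that all five files are in place before starting the tutorials is sketched below. It assumes the R session's working directory is the lecture16 directory; the helper name missing_files is hypothetical, and the check uses only base R.

```r
# File names as listed above
expected_files <- c(
  "BRCA.genome_wide_snp_6_broad_Level_3_scna.seg",
  "BRCA_IDC_cfDNA.bam",
  "BRCA_IDC_cfDNA.bam.bai",
  "GIAB_highconf_v.3.3.2.vcf.gz",
  "GIAB_highconf_v.3.3.2.vcf.gz.tbi"
)

# Hypothetical helper: return the expected file names not found in `dir`
missing_files <- function(dir, files = expected_files) {
  files[!file.exists(file.path(dir, files))]
}

# Assumption: the working directory is the lecture16 directory
not_found <- missing_files(".")
if (length(not_found) == 0) {
  message("All data files found.")
} else {
  message("Missing: ", paste(not_found, collapse = ", "))
}
```

If GIAB_highconf_v.3.3.2.vcf.gz is reported missing but a file ending in .vcf is present, the file was probably uncompressed automatically, as described in the note above.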