Setup and installation

Author

Kim Dill-McFarland

Published

December 11, 2023

Data introduction

We will be working with one of the Hackday data sets.

Darrah PA et al. Airway T-cells are a correlate of i.v. Bacille Calmette-Guerin-mediated protection against tuberculosis in rhesus macaques. Cell Host Microbe. 2023 Jun 14;31(6):962-977.e8. doi: 10.1016/j.chom.2023.05.006. PMID: 37267955; PMCID: PMC10355173.

These data are single-cell RNAseq from rhesus macaques with and without BCG vaccination as well as before and after M. tuberculosis (Mtb) challenge. Vaccination groups include aerosol (AE), intradermal high-dose (IDhigh), intradermal low-dose (IDlow), intravenous (IV), and naïve controls. Bronchoalveolar lavage (BAL) was collected at weeks 13 and 25.

You can explore the data at https://singlecell.broadinstitute.org/single_cell/study/SCP796/prevention-of-mycobacterium-tuberculosis-infection-and-disease-in-nonhuman-primates-following-intravenous-bcg-vaccination?scpbr=the-alexandria-project

Download and install

At the Hackday

An RStudio server with the necessary data and R packages has been set up for the event. Sign-in instructions will be given at the event.

You can find a copy of all the tutorial data and scripts at /home/seatrac-hackday-2023/. In R, you can view these files with list.files("/home/seatrac-hackday-2023").
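As a small sketch, assuming you are on the event server where this directory exists, the following lists the tutorial files including those in subdirectories, and prints a message instead of failing if the directory is missing. The variable names hackday_dir and hackday_files are illustrative and not part of the tutorial scripts.

```r
# Directory given above; it only exists on the Hackday RStudio server
hackday_dir <- "/home/seatrac-hackday-2023"

if (dir.exists(hackday_dir)) {
  # recursive = TRUE also lists files inside subdirectories
  hackday_files <- list.files(hackday_dir, recursive = TRUE)
  print(head(hackday_files))
} else {
  # Off the server, report the missing directory rather than erroring
  hackday_files <- character(0)
  message("Directory not found: ", hackday_dir)
}
```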

On your own

Download the data

You can find all the data used in this tutorial at https://github.com/FredHutch/seatrac-hackday-2023/tree/main/1.rnaseq_tutorial/data. Download each file and save it in the directory where you created your R project.
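If you prefer to download from within R, one possible approach is to fetch the whole repository as a zip archive rather than each file individually. This is only a sketch: the archive URL assumes GitHub's standard branch-archive pattern for the main branch, and the helper name download_tutorial_repo is hypothetical. After unzipping, the data should be under a folder such as seatrac-hackday-2023-main/1.rnaseq_tutorial/data, though the exact folder name is an assumption.

```r
# ASSUMPTION: GitHub's standard archive URL for the repository's main branch
repo_zip_url <- "https://github.com/FredHutch/seatrac-hackday-2023/archive/refs/heads/main.zip"

# Hypothetical helper: download the repository zip and unpack it into dest_dir
download_tutorial_repo <- function(dest_dir = ".", url = repo_zip_url) {
  zip_path <- file.path(dest_dir, "seatrac-hackday-2023.zip")
  # mode = "wb" keeps the binary zip file intact on Windows
  download.file(url, destfile = zip_path, mode = "wb")
  unzip(zip_path, exdir = dest_dir)
  invisible(zip_path)
}

# Not run automatically; call this from within your R project directory:
# download_tutorial_repo()
```

If the download stops partway, raising R's download timeout first, for example with options(timeout = 600), may help.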

Install R and RStudio

When you open RStudio, you should see multiple panels. If you see only one panel, then you are likely in R, not RStudio.

Install R packages

Install R packages by running the following script in the R console in RStudio (the left panel).

If prompted, answer a to “Update all/some/none? [a/s/n]” and no to “Do you want to install from sources the packages which need compilation? (Yes/no/cancel)”.

This can take several minutes.

#CRAN packages
install.packages("tidyverse", Ncpus=4)
install.packages("Seurat", Ncpus=4)
install.packages(c("usethis","statmod"))
#Bioconductor packages
install.packages("BiocManager")
BiocManager::install("limma")
#GitHub packages
install.packages("devtools")
devtools::install_github("BIGslu/kimma")

Optional install data cleaning R packages

These packages are not used in the actual tutorial but are part of the data cleaning scripts used to prepare the data. If you would like to explore the data cleaning steps, please also install the following.

#CRAN packages
install.packages(c("patchwork", "data.table", "janitor"))
#Bioconductor packages
BiocManager::install("edgeR")
#GitHub packages
devtools::install_github("BIGslu/RNAetc")

Check R package install

To make sure packages are correctly installed, load each of them individually into R with library().

For example, the tidyverse is a meta-package containing multiple packages. It gives the following message when loaded into R. Your exact version numbers may vary slightly.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

In contrast, packages such as limma load silently with no messages.

library(limma)

The key is to look for any messages that contain “ERROR” or “there is no package called X”. Either one means the package was not installed correctly.
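The same check can be run over several packages in one pass. The sketch below is not part of the tutorial scripts: check_packages is a hypothetical helper that uses requireNamespace(), which tests whether a package can be loaded without attaching it, so you would still call library() before using a package.

```r
# Packages installed in the required steps above
tutorial_pkgs <- c("tidyverse", "Seurat", "usethis", "statmod",
                   "limma", "kimma")

# Hypothetical helper: TRUE if a package's namespace loads, FALSE otherwise
check_packages <- function(pkgs) {
  vapply(pkgs, function(p) requireNamespace(p, quietly = TRUE), logical(1))
}

pkg_status <- check_packages(tutorial_pkgs)
print(pkg_status)

# Names of any packages that still need to be installed or reinstalled
names(pkg_status)[!pkg_status]
```

Any package reported as FALSE can then be loaded on its own with library() to see the full error message.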