This function annotates a `gimap_dataset`, marking which genes should be used as controls.

gimap_annotate(
  .data = NULL,
  gimap_dataset,
  annotation_file = NULL,
  control_genes = NULL,
  cell_line_annotate = TRUE,
  custom_tpm = NULL,
  cell_line = NULL
)

Arguments

.data

Data can be piped in with tidyverse pipes from function to function, but the data must still be a `gimap_dataset`.

gimap_dataset

A special dataset structure that is set up using the `setup_data()` function.

annotation_file

If no file is given, the function will attempt to use the design file from https://media.addgene.org/cms/filer_public/a9/9a/a99a9328-324b-42ff-8ccc-30c544b899e4/pgrna_library.xlsx

control_genes

A vector of gene symbols (e.g. AAMP) that should be labeled as control genes. These will be used for log fold change calculations. If no list is given, the DepMap Public 23Q4 Achilles_common_essentials.csv file is used; see https://depmap.org/portal/download/all/

cell_line_annotate

(Optional) TRUE or FALSE: whether you'd also like to have cell line annotation from DepMap.

custom_tpm

(Optional) You may supply your own data frame of transcripts per million (TPM) expression to be used for this calculation if you can't or don't want to use DepMap data annotation for your cell line. This data frame needs to have two columns: 'log2_tpm', which holds the log2 TPM expression data for this cell line, and 'genes', which holds gene symbols that match those in the data, e.g. "NDL1". Note that you can use custom_tpm together with cell_line_annotate, in which case your custom_tpm will be used instead of the TPM data from DepMap. Other data from DepMap, such as CN, will still be added.
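As a minimal sketch of the shape described above, the following builds a `custom_tpm` data frame with the two required columns and checks the column names before use. The expression values are made up for illustration, and the column check is a convenience added here, not something the function is documented to need.

```r
# Hypothetical expression values, for illustration only; not real measurements.
custom_tpm <- data.frame(
  genes = c("AAMP", "NDL1"),   # gene symbols matching those in your data
  log2_tpm = c(5.2, 3.1),      # log2 TPM expression for your cell line
  stringsAsFactors = FALSE
)

# Confirm both required columns are present before passing the data frame on.
required_cols <- c("genes", "log2_tpm")
missing_cols <- setdiff(required_cols, colnames(custom_tpm))
if (length(missing_cols) > 0) {
  stop("custom_tpm is missing column(s): ", paste(missing_cols, collapse = ", "))
}
```

A data frame built this way can then be supplied through the `custom_tpm` argument.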

cell_line

Which cell line are you using (e.g., HELA, PC9)? Required if cell_line_annotate is TRUE.
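The sketch below shows one way the arguments above might be combined. It assumes the gimap package is installed and that you already have a `gimap_dataset` object created with `setup_data()`. The wrapper name `annotate_for_cell_line` and its parameter `my_dataset` are hypothetical, introduced only for this illustration. Only arguments listed in the usage above are passed.

```r
# Hypothetical wrapper (not part of the package): annotate a dataset for one
# cell line, optionally overriding the default control genes.
# `my_dataset` is assumed to be a gimap_dataset created with setup_data().
annotate_for_cell_line <- function(my_dataset,
                                   cell_line = "HELA",
                                   control_genes = NULL) {
  gimap::gimap_annotate(
    gimap_dataset = my_dataset,
    control_genes = control_genes,  # NULL falls back to the documented default
    cell_line_annotate = TRUE,      # cell_line is required when this is TRUE
    cell_line = cell_line
  )
}
```

Defining the wrapper does not call the package; `gimap::gimap_annotate()` only runs once the wrapper is invoked with a real dataset.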

Examples