Data Deep Dive: Building Computational Workflows

Course material for the Data Deep Dive workshop — learn to find, run, and customize WDL workflows using the WILDS WDL Library, Sprocket, and PROOF.


Course Overview

This course guides you through the basics of WDL computational workflows, why they can be advantageous in research, and how to use them effectively within the Fred Hutch ecosystem.

Who This Course Is For

  • No previous workflow experience is required; basic command-line experience is enough.
  • The specifics of WDL execution will be most useful for researchers who run the analyses themselves (e.g. postdocs, grad students), but PIs will also benefit from the broad concepts and potential advantages of WDL workflows.

Learning Objectives

By the end of this course, you will be able to:

  • Understand the structure of a WDL workflow
  • Specify inputs and execute a workflow via PROOF
  • Navigate the WILDS WDL Library to find relevant workflows
  • Interpret workflow outputs and troubleshoot common issues
  • Customize existing workflows by adding or swapping modules

Course Modules

#  Module                   Description
0  Pre-Work                 What to prepare before diving in
1  Introduction             Why workflows? Motivation and context
2  WDL Concepts             Core WDL concepts learned through a Hello World walkthrough
3  A Real-World Workflow    End-to-end example with ww-sra-salmon from the WILDS WDL Library
4  Customizing Workflows    Modifying and extending existing pipelines
5  Running Workflows        Executing workflows with PROOF
6  Common Pitfalls          Diagnosing errors and fixing common issues
7  Resources & Next Steps   Where to go from here

Reference

  • Glossary — Key terms and definitions