Data Deep Dive: Building Computational Workflows
Course material for the Data Deep Dive workshop — learn to find, run, and customize WDL workflows using the WILDS WDL Library, Sprocket, and PROOF.
Course Overview
This course introduces WDL computational workflows: what they are, why they can be advantageous in research, and how to use them effectively within the Fred Hutch ecosystem.
Who This Course Is For
- No previous workflow experience is required, just basic command-line experience.
- The specifics of WDL execution will be most useful for those doing the hands-on analysis (e.g., postdocs and grad students), but PIs will also benefit from the broad concepts and potential advantages of WDL workflows.
Learning Objectives
By the end of this course, you will be able to:
- Understand the structure of a WDL workflow
- Specify inputs and execute a workflow via PROOF
- Navigate the WILDS WDL Library to find relevant workflows
- Interpret workflow outputs and troubleshoot common issues
- Customize existing workflows by adding or swapping modules
Course Modules
| # | Module | Description |
|---|---|---|
| 0 | Pre-Work | What to prepare before diving in |
| 1 | Introduction | Why workflows? Motivation and context |
| 2 | WDL Concepts | Core WDL concepts learned through a Hello World walkthrough |
| 3 | A Real-World Workflow | End-to-end example with ww-sra-salmon from the WILDS WDL Library |
| 4 | Customizing Workflows | Modifying and extending existing pipelines |
| 5 | Running Workflows | Executing workflows with PROOF |
| 6 | Common Pitfalls | Diagnosing errors and fixing common issues |
| 7 | Resources & Next Steps | Where to go from here |
Reference
- Glossary — Key terms and definitions