Overview of Scientific Workflows
Callaghan is a research programmer at the Southern California Earthquake Center (SCEC), which is based in Los Angeles, though he works remotely from St. Louis, Missouri. He first joined SCEC as an undergraduate intern, continued as a graduate student, and is now on staff. He is the project lead on a software application that performs physics-based probabilistic seismic hazard analysis for California. This software typically runs on large HPC systems such as Blue Waters at NCSA and Titan at the Oak Ridge Leadership Computing Facility. His research interests include scientific workflows and high-throughput computing.
I will present an overview of scientific workflows. I'll discuss what the community means by "workflows" and what elements make up a workflow. We'll talk about common challenges that users face, such as automation, job management, data staging, resource provisioning, and provenance tracking, and explain how workflow tools can help address them. I'll present a brief example from my own work with a series of seismic codes, showing how workflow tools can improve scientific applications. I'll finish with an overview of high-level workflow concepts, with the aim of preparing users to get the most out of discussions of specific workflow tools and to identify which tools would be best for them.