Workflow

The term workflow is now very widely used. Generically, it refers to a research or business process in which multiple steps are performed, some concurrently and some sequentially, to achieve an objective.

Scientific workflow systems bring these concepts into the scientific research domain. They simplify the design, execution, modification and reuse of analyses comprising many elements. Each step or element takes defined inputs and generates outputs. The inputs and outputs might be files, but need not be. Prominent systems include Taverna, VisTrails and Kepler.
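The idea of steps with defined inputs and outputs can be illustrated with a minimal sketch. It is not how Taverna, VisTrails or Kepler are implemented; the step names, the `steps` table and the `run_workflow` function are hypothetical, and the only library used is Python's standard `graphlib` module (Python 3.9 or later), which orders the steps so that each runs after the steps it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step names its inputs and gives a function
# that turns those inputs into one output stored under the step's name.
steps = {
    "raw": ((), lambda: [3, 1, 2]),
    "sorted": (("raw",), lambda raw: sorted(raw)),
    "total": (("raw",), lambda raw: sum(raw)),
    "report": (("sorted", "total"), lambda s, t: f"{s} sums to {t}"),
}


def run_workflow(steps):
    """Run every step once, after all the steps that supply its inputs."""
    # Map each step to the set of steps it depends on.
    graph = {name: set(inputs) for name, (inputs, _) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        inputs, func = steps[name]
        results[name] = func(*(results[i] for i in inputs))
    return results


results = run_workflow(steps)
print(results["report"])  # prints "[1, 2, 3] sums to 6"
```

In this sketch the "sorted" and "total" steps depend only on "raw", so a real system would be free to run them concurrently, while "report" must wait for both.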

A scientific workflow system permits the whole analysis to be defined prior to execution; the analysis comprises elements that can be reused in different workflows. Workflows can be configured graphically or by scripting. Such systems are widely used in some research communities. Other research groups have more fluid modes of work in which many pre-existing tools and software libraries are used, and their scripts might be written for a single use or extensively modified for each use. Some of these researchers prefer hands-on programming to the abstraction of the workflow system. For them that abstraction can be an obstacle, unless it is offset by simpler data management and by flexible development of the computational elements.

One common mode of work is that researchers execute one program, pause, decide the next step, and execute another, with a lot of trial and error. The effect is that a workflow can be inferred only after an analysis has been completed, from the chain of input and output data used by each element. Capturing provenance about such steps is required to allow that inference, and to allow what was learned through trial and error to persist where helpful.
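As a rough sketch of that inference, assume each executed program leaves a provenance record listing the data it read and wrote. The `Record` class, the `infer_workflow` function, and the program and file names below are all hypothetical, chosen only to show how the chain of inputs and outputs yields a workflow after the fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One hypothetical provenance entry: a program run and the data it touched."""
    program: str
    inputs: tuple
    outputs: tuple


def infer_workflow(records):
    """Link each input to the most recent earlier run that wrote it.

    Records are assumed to be in execution order. Returns a list of
    (producer, consumer, data item) edges.
    """
    last_writer = {}
    edges = []
    for rec in records:
        for item in rec.inputs:
            if item in last_writer:
                edges.append((last_writer[item], rec.program, item))
        for item in rec.outputs:
            last_writer[item] = rec.program
    return edges


# Hypothetical log captured while a researcher worked step by step.
log = [
    Record("align", ("reads.fq",), ("aligned.bam",)),
    Record("filter", ("aligned.bam",), ("filtered.bam",)),
    Record("plot", ("filtered.bam",), ("coverage.png",)),
]

edges = infer_workflow(log)
for producer, consumer, item in edges:
    print(f"{producer} -> {consumer} via {item}")
```

Inputs that no recorded run produced, such as `reads.fq` here, generate no edge; they mark the external starting data of the inferred workflow.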

Text prepared by Mike Mineter, University of Edinburgh
