Some notes about open source workflow engines.
OpenWFE (WorkFlow Engine)
OpenWFE (http://www.openwfe.org/) describes itself as a system for business processes. It defines several workflow patterns in its own XML format. Several programming languages are used across the project, e.g. Java, Perl, C#, Python and Ruby, although the main development effort has moved from Java to Ruby (OpenWFEru, an open source Ruby workflow engine).
The workflow model is not clearly documented, but the workflow patterns are expressed as "task-based" workflow definitions in XML rather than as a graphical model (i.e. a state-transition approach).
How the workflow engine works is not clear.
wftk: Open Source Workflow Tool Kit
The wftk stores custom XML definitions of Processes in a repository. When Processes or Tasks are activated from the repository, the Task Engine interprets the Process/Task definitions to determine what happens next as the execution of each Process "moves" on. The system consists of two main components, the Repository (with its Repository Manager) and the Task Engine, plus other associated components.
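The repository/engine split can be sketched roughly as below. This is only an illustration: the XML format, the class names (Repository, TaskEngine) and the "case" dictionary are made up for the example and are not the real wftk schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition format for illustration; NOT the real wftk schema.
PROCESS_XML = """
<process name="review">
  <task name="draft"/>
  <task name="approve"/>
  <task name="publish"/>
</process>
"""


class Repository:
    """Keeps XML process definitions, keyed by process name."""

    def __init__(self):
        self._definitions = {}

    def store(self, xml_text):
        name = ET.fromstring(xml_text).get("name")
        self._definitions[name] = xml_text
        return name

    def load(self, name):
        return ET.fromstring(self._definitions[name])


class TaskEngine:
    """Interprets a stored definition one task at a time."""

    def __init__(self, repository):
        self.repository = repository

    def activate(self, name):
        # A "case" only records which definition it follows and how far it got.
        return {"process": name, "position": 0}

    def step(self, case):
        # Re-read the definition on every step: what happens next is decided
        # by interpreting the XML, not by code generated ahead of time.
        tasks = self.repository.load(case["process"]).findall("task")
        if case["position"] >= len(tasks):
            return None  # the process has finished
        task_name = tasks[case["position"]].get("name")
        case["position"] += 1
        return task_name


repository = Repository()
engine = TaskEngine(repository)
case = engine.activate(repository.store(PROCESS_XML))

executed = []
task = engine.step(case)
while task is not None:
    executed.append(task)
    task = engine.step(case)
print(executed)
```

The point of the sketch is that the engine holds no process-specific code; everything it does is read out of the stored definition as the case advances.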
Apache ODE (Orchestration Director Engine)
Apache ODE is written in Java and executes processes written in WS-BPEL 2.0 and the older BPEL4WS 1.1. It runs as an Axis2 application, which deploys BPEL Processes and exposes them as Endpoints.
Processes are described in BPEL, a Process Deployment Descriptor, WSDL and multiple related XSDs, and they are "deployed" into the WEB-INF/processes directory. (ODE does not provide editors for BPEL and WSDL; they are authored with other applications.) The ODE-AXIS servlet reads the "deployed" Processes and "compiles" them (i.e. builds Java Process models), which may be executed in memory or persisted externally. When a Process is instantiated, it is issued an "instance ID" to "correlate" the Process instance (sometimes called a "case") with the particular runtime instance of the Process model inside the system. A Process may be "long-running", and when the server reaches its maximum number of Processes or a Process has not been used for a long time, it is "dehydrated" (i.e. its in-memory references are released). When the Process instance is required again, it is "hydrated" from the persistence mechanism (an SQL database).
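The hydration/dehydration idea can be sketched as follows. This is not ODE code: the InstanceStore class, the max_in_memory limit, the table layout and the use of JSON with an in-memory SQLite database are all assumptions chosen to keep the example self-contained.

```python
import json
import sqlite3
from collections import OrderedDict


class InstanceStore:
    """Hypothetical store: keeps a few instances live, the rest in SQL."""

    def __init__(self, max_in_memory=2):
        self.max_in_memory = max_in_memory
        self.live = OrderedDict()  # instance ID -> state, least recently used first
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE instances (id INTEGER PRIMARY KEY, state TEXT)")
        self._next_id = 1

    def instantiate(self, state):
        instance_id = self._next_id  # the "instance ID" used for correlation
        self._next_id += 1
        self._admit(instance_id, state)
        return instance_id

    def get(self, instance_id):
        if instance_id in self.live:
            self.live.move_to_end(instance_id)  # mark as recently used
            return self.live[instance_id]
        # Hydrate: rebuild the in-memory state from the database row.
        row = self.db.execute(
            "SELECT state FROM instances WHERE id = ?",
            (instance_id,)).fetchone()
        if row is None:
            raise KeyError(instance_id)
        self.db.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
        state = json.loads(row[0])
        self._admit(instance_id, state)
        return state

    def _admit(self, instance_id, state):
        self.live[instance_id] = state
        # Dehydrate the least recently used instances once over the limit.
        while len(self.live) > self.max_in_memory:
            old_id, old_state = self.live.popitem(last=False)
            self.db.execute(
                "INSERT INTO instances (id, state) VALUES (?, ?)",
                (old_id, json.dumps(old_state)))


store = InstanceStore(max_in_memory=2)
a = store.instantiate({"step": 1})
b = store.instantiate({"step": 5})
c = store.instantiate({"step": 9})  # pushes instance a out of memory
```

Here eviction is triggered only by the instance count; a real engine would presumably also dehydrate on an idle timeout, as the notes above mention.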
Because Processes may be "long-running", assigning an ordinary Java Thread to each Process instance is not a suitable way to handle concurrency. Instead, concurrent behaviour is provided by a special type of pseudo-thread Class, named Jacob Object. It mimics the "wait", "lock" (or "monitor") and "release" behaviours of a Java Thread without actually using one.
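The general pseudo-thread idea can be illustrated with generators, each of which gives up control whenever it has to wait. This is a sketch of the technique only and not the Jacob API; the names run_all, sender and receiver are invented for the example, and a plain deque stands in for a message channel.

```python
from collections import deque


def run_all(tasks):
    """Run generator-based pseudo-threads round-robin on one real thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
            ready.append(task)  # still alive: queue it for another turn
        except StopIteration:
            pass                # task finished


def sender(channel, text):
    yield                 # let other pseudo-threads run first
    channel.append(text)  # message out


def receiver(channel, results):
    while not channel:
        yield             # "wait": suspended without holding a thread
    results.append(channel.popleft().upper())  # message in, then process it


channel = deque()
results = []
run_all([receiver(channel, results), sender(channel, "hello")])
print(results)
```

A waiting receiver costs only a suspended generator object, which is why this style suits instances that may sit idle for a long time. Note that the loop never ends if a task waits for a message that nobody sends.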
ODE also provides a "versioning" mechanism for Processes, which may be revised and redeployed over older versions during their life-cycle.
The "compiler" treats BPEL "nodes" as different types of "Message Exchanges" (i.e. a message in and then a message out). This is suitable for a data-driven workflow like BPEL, where messages are passed around, but it may not be the best way to handle workflow in other situations.
In this Process model, Processes are deployed to the system as the equivalent of Java Classes. They become the Objects and Methods that make up the system and form the logic for processing data (messages). A Process is run as follows:
- The Process is invoked as a Web Service with the given input data.
- This creates a Jacob thread that goes through a series of message exchanges in the compiled Process.
- When the Process is "completed", the output data is returned as the Web Service response.
The role of the Workflow Engine seems to be to manage the loading/unloading (hydration/dehydration) of Processes. There does not seem to be a need to handle transitions of Activities or execution tokens as in other workflow models. Although it has to instantiate a new Jacob thread for each "case" in which a Process is used, which may be regarded as having some "moving parts", the Process seems to work pretty much as "solid-state" code, so to speak.
Taverna Workbench for Grid Workflow
"The Taverna Workbench provides a desktop authoring environment and enactment engine for scientific workflows expressed in Scufl (Simple Conceptual Unified Flow language) ... developed and maintained by the myGrid". (http://taverna.sourceforge.net/)
Taverna actually uses XScufl, an XML representation of Scufl, which is Taverna's own workflow language roughly based on IBM's WSFL language (Note: WS-BPEL superseded WSFL). It is more domain-specific than WS-BPEL.
Taverna Workbench shows pre-configured Grid "Services" elements, executable Grid RPCs and other local or external workflow or Grid components from NCBI, EBI, DDBJ, SoapLab, BioMOBY and EMBOSS. These appear as several different types of Processors, and Taverna can handle the different types of service definition files supplied by the providers. Mostly they seem to be either RPCs or WSDL descriptions of RPC Web Services. Workflows are composed by placing inputs, outputs, Processors and other composition elements and linking them together with data and control links. Taverna is therefore a bioinformatics-specific Grid Service workflow tool.
Taverna Workbench sends a "job" request to a remote Grid service and listens for the response from the server. When the response arrives, it moves the execution point to the next step in the workflow.
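The enactment loop described above can be sketched as a small data-driven scheduler. This is not Taverna code or Scufl: the enact function, the dictionary layout and the processor names are invented for illustration, and plain local functions stand in for the remote Grid services.

```python
def enact(processors, data_links, workflow_inputs):
    """Run each processor once all of its upstream data is available.

    processors: name -> callable standing in for a remote service call
    data_links: name -> list of upstream names feeding that processor
    workflow_inputs: name -> value supplied when the workflow starts
    """
    values = dict(workflow_inputs)  # all data produced so far
    pending = dict(processors)
    while pending:
        runnable = [name for name in pending
                    if all(src in values for src in data_links.get(name, []))]
        if not runnable:
            raise ValueError("unsatisfied data links: %s" % sorted(pending))
        for name in runnable:
            args = [values[src] for src in data_links.get(name, [])]
            # Stand-in for "send the job, wait for the response"; only then
            # does the execution point move on to downstream processors.
            values[name] = pending.pop(name)(*args)
    return values


# Hypothetical processors; a real workflow would call remote services here.
processors = {
    "reverse": lambda sequence: sequence[::-1],
    "length": lambda sequence: len(sequence),
}
data_links = {"reverse": ["sequence"], "length": ["reverse"]}

outputs = enact(processors, data_links, {"sequence": "ACGT"})
print(outputs["reverse"], outputs["length"])
```

Only data links are modelled here; control links, which order Processors that share no data, are left out to keep the sketch short.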
Some links
Open Source Workflow Engines in Java
Open Source Workflow Engines Written in Java
