WORKS'11 was the sixth edition of the WORKS workshop series. The call for papers attracted 23 submissions from Europe, North America, and South America. The program committee accepted 15 papers covering a variety of topics, ranging from large-scale workflow execution management (scalability, fault tolerance, performance, optimization, etc.) to workflow exploitation (reuse, portability, interoperability, traceability, etc.). The workshop also featured an inspiring keynote by former WORKS chair Ewa Deelman, who surveyed past editions and future trends. Attendance exceeded expectations, with more than 80 participants.
To foster discussion and encourage exchanges between researchers working on closely related topics, we organized mini-panel discussions at the end of each thematic session this year. The five sessions confirmed the keen interest in distributed-computing workflows across several communities: experts in distributed computing systems as well as end users in need of high-end, accessible computing infrastructures. Traditional research topics from distributed computing were well represented, with papers on the scalability, fault tolerance, performance, and optimization of workflow management systems. The now well-established theme of scientific data provenance in workflows was also well covered. In addition, we observed growing interest in workflow reuse and workflow system interoperability, a sign of the maturity of the scientific workflows community. With extended usage of existing solutions, users are increasingly looking for online workflow resources. Beyond mere computational capability, workflows are also used for knowledge and know-how transfer, giving rise to new needs such as community distribution of workflows, high-level representation, and automated transformation.
Proceeding Downloads
Scientific workflow reuse through conceptual workflows on the virtual imaging platform
An increasing number of scientific experiments are "in-silico": carried out at least partially using computers. Scientific Workflows have become a key tool to model and implement such experiments, but they tangle domain knowledge, technical know-how and ...
Workflow overhead analysis and optimizations
The execution of scientific workflows often suffers from a variety of overheads in distributed environments. It is essential to identify the different overheads and to evaluate how optimization methods help reduce overheads and improve runtime ...
Provenance for MapReduce-based data-intensive workflows
MapReduce has been widely adopted by many business and scientific applications for data-intensive processing of large datasets. There are increasing efforts for workflows and systems to work with the MapReduce programming model and the Hadoop ...
Supporting dynamic parameter sweep in adaptive and user-steered workflow
Large-scale experiments in computational science are complex to manage. Due to their exploratory nature, several iterations are needed to evaluate a large space of parameter combinations. Scientists analyze partial results and dynamically intervene in the next steps of ...
Optimizing bioinformatics workflows for data analysis using cloud management techniques
With the rapid development in recent years of high-throughput technologies in the life sciences, huge amounts of data are being generated and stored in databases. Despite significant advances in computing capacity and performance, an analysis of these ...
A new approach for publishing workflows: abstractions, standards, and linked data
In recent years, a variety of systems have been developed that export the workflows used to analyze data and make them part of published articles. We argue that the workflows that are published in current approaches are dependent on the specific codes ...
Provenance opportunities for WS-VLAM: an exploration of an e-science and an e-business approach
Scientific applications are frequently modeled as a workflow that is executed under the control of a workflow management system. One crucial requirement during the execution is the validation of the generated results and the traceability of the ...
Object reuse and exchange for publishing and sharing workflows
The workflow paradigm can provide the means to describe the complete functional pipeline for a scientific experiment and therefore expose the underlying scientific processes for enabling the reproducibility of results. However, current means for ...
Making data analysis expertise broadly accessible through workflows
The demand for advanced skills in data analysis spans many areas of science, computing, and business analytics. This paper discusses how non-expert users reuse workflows created by experts and representing complex data mining processes for text ...
Exploring workflow interoperability tools for neuroimaging data analysis
- Vladimir Korkhov,
- Dagmar Krefting,
- Tamas Kukla,
- Gabor Z. Terstyanszky,
- Matthan Caan,
- Silvia D. Olabarriaga
Neuroimaging is a field that benefits from distributed computing infrastructures (DCIs) to perform data processing and analysis, which is often achieved using grid workflow systems. Collaborative research in neuroimaging requires ways to facilitate ...
IWIR: a language enabling portability across grid workflow systems
Today there are many different scientific Grid workflow management systems using a wide array of custom workflow languages. Some of them are geared towards a data-based view, some are geared towards a control-flow based view and others try to be as ...
Failure prediction and localization in large scientific workflows
Scientific workflows provide a portable representation for scientific applications' coordinated input, output, and execution management for highly parallel executions of interdependent computations, as well as support for sharing and validating the ...
Characterizing quality of resilience in scientific workflows
The enactment of scientific workflows involves the distribution of tasks to distributed resources that exist in different administrative domains. Such resources can range in granularity from a single machine to one or more clusters and file systems. The ...
Achieving reproducibility by combining provenance with service and workflow versioning
Capturing and exploiting provenance information is considered to be important across a range of scientific, medical, commercial and Web applications, including recent trends towards publishing provenance-rich, executable papers. This article shows how ...
AME: an anyscale many-task computing engine
Many-Task Computing (MTC) is a new application category that encompasses increasingly popular applications in biology, economics, and statistics. The high inter-task parallelism and data-intensive processing capabilities of these applications pose new ...
Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science