Automatic running of bioinformatical tools on remote servers

Audrius Meškauskas, Frank Lehmann-Horn, Karin Jurkat-Rott

Motivation: Bioinformatics tools on remote servers can be used more effectively by creating a group of specialized internet robots. A universal package that includes specialized code generators and a reusable library significantly facilitates this task.

Results: We created and tested the Java-based package Sight, which includes a set of code generators and a library. Sight builds the entire application without programming, implementing the requested data flow diagram. The generated web robots can also work as parts of a user-written program. The library provides date-sensitive databases of previously received responses, strategies for connecting to the remote server, a security system that blocks multiple parallel submissions, and an organizing system that provides a real-time view of the running processes.
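The date-sensitive response database and the blocking of parallel submissions can be illustrated with a minimal sketch. This is not Sight's actual library code: the class `ResponseCache`, its method names, and the use of modern Java APIs are assumptions made for illustration only.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: caches server responses and expires them by age. */
public class ResponseCache {
    /** A stored response together with the time it was received. */
    private static final class Entry {
        final String response;
        final Instant received;

        Entry(String response, Instant received) {
            this.response = response;
            this.received = received;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Duration maxAge;

    public ResponseCache(Duration maxAge) {
        this.maxAge = maxAge;
    }

    /**
     * Returns the stored response if it is not older than maxAge; otherwise
     * calls the remote server and stores the new response. compute() runs
     * atomically per key, so identical requests are not submitted in parallel.
     */
    public String get(String request, Instant now, Function<String, String> remoteCall) {
        return entries.compute(request, (key, old) -> {
            boolean fresh = old != null
                    && Duration.between(old.received, now).compareTo(maxAge) <= 0;
            return fresh ? old : new Entry(remoteCall.apply(key), now);
        }).response;
    }
}
```

In this sketch the caller passes the current time explicitly, which keeps the expiry rule easy to test; a real robot would more likely read the clock itself.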

The Sight workflow is based on the following concepts:

  1. The reusable elementary unit (agent) executes a single remote or local algorithm.
  2. Each agent contains explanatory data structures, defining its request and response format.
  3. New agents are created in a programming-free way by agent generators.
  4. The Sight application generator connects agents into a workflow, performing the required type conversions in a user-defined way.
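
The concepts above can be sketched in a few lines of Java. The interface `Agent`, the two example agents, and the `chain` helper are hypothetical names chosen for illustration; they are not the classes that Sight generates. The sketch assumes that requests and responses are simple maps of named string fields.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AgentSketch {
    /** Hypothetical agent contract: one algorithm plus self-describing formats. */
    public interface Agent {
        List<String> requestFields();
        List<String> responseFields();
        Map<String, String> execute(Map<String, String> request);
    }

    /** Local agent that removes whitespace and upper-cases a raw sequence. */
    public static final class CleanSequenceAgent implements Agent {
        public List<String> requestFields() { return List.of("raw"); }
        public List<String> responseFields() { return List.of("cleaned"); }
        public Map<String, String> execute(Map<String, String> request) {
            String cleaned = request.get("raw").replaceAll("\\s+", "").toUpperCase();
            return Map.of("cleaned", cleaned);
        }
    }

    /** Local agent that reports the length of a sequence. */
    public static final class SequenceLengthAgent implements Agent {
        public List<String> requestFields() { return List.of("sequence"); }
        public List<String> responseFields() { return List.of("length"); }
        public Map<String, String> execute(Map<String, String> request) {
            return Map.of("length", String.valueOf(request.get("sequence").length()));
        }
    }

    /**
     * Connects two agents. fieldMapping names, for each request field of the
     * second agent, the response field of the first agent that feeds it.
     */
    public static Map<String, String> chain(Agent first, Agent second,
                                            Map<String, String> fieldMapping,
                                            Map<String, String> request) {
        Map<String, String> intermediate = first.execute(request);
        Map<String, String> nextRequest = new LinkedHashMap<>();
        for (String field : second.requestFields()) {
            nextRequest.put(field, intermediate.get(fieldMapping.get(field)));
        }
        return second.execute(nextRequest);
    }
}
```

For example, chaining `CleanSequenceAgent` into `SequenceLengthAgent` with the mapping `sequence -> cleaned` turns the raw input `"ac gt\n"` into a length of 4.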

Please cite: Meškauskas A., Lehmann-Horn F., Jurkat-Rott K. (2004). Sight: automating genomic data-mining without programming skills. Bioinformatics, 20: 1718-1720.



Important installation note: Under Windows, recent versions need a Java 1.4 runtime environment that is accessible through a path without spaces. For instance, C:\Program files\java is not an appropriate location, whereas C:\Programme is fine. This minor bug will be fixed in a subsequent release.
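
Whether a given installation is affected can be checked with a few lines of Java. This is only an illustrative sketch; the class `JavaPathCheck` is a hypothetical helper and not part of Sight.

```java
/** Hypothetical helper: warns when the Java runtime path contains spaces. */
public class JavaPathCheck {
    /** Returns true if the path contains no space characters. */
    public static boolean isSpaceFree(String path) {
        return path.indexOf(' ') < 0;
    }

    public static void main(String[] args) {
        // "java.home" is a standard system property naming the runtime directory.
        String javaHome = System.getProperty("java.home");
        if (isSpaceFree(javaHome)) {
            System.out.println("OK: " + javaHome);
        } else {
            System.out.println("Path contains spaces: " + javaHome);
        }
    }
}
```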


Sight 3.1.0 alpha

In Sight 3.1.0 alpha, WSDL support is upgraded from 1.0 to 1.1, implementing the import statement and more complete support of XML Schema constructs. We also extended the XSLT transform generator that parses WSDL responses. A good example of such a WSDL service is NCBI E-tools.
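
To illustrate what parsing a web service response with an XSLT transform means, the following sketch applies a small stylesheet to an XML document using the standard `javax.xml.transform` API. The stylesheet and the element name `Id` are assumptions made for this example; they are not the transforms that Sight generates and do not claim to match any real service's response format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ResponseTransformSketch {
    // Hypothetical stylesheet: prints the text of every <Id> element, one per line.
    public static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//Id'>"
        + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML text and returns the transformed output. */
    public static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }
}
```

Keeping the extraction logic in a stylesheet rather than in Java code means it can be regenerated when the service description changes, without recompiling the program.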

We also implemented an auto-update system that upgrades only modified files and the changed parts of archives (including changed parts of the source code). This makes it possible to add minor enhancements and fix smaller bugs immediately, without waiting for the next regular release.
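
One common way to find the files that need upgrading is to compare checksums of the installed files with those published for the new version. The sketch below assumes that approach; how Sight actually detects modified files is not described here, and the class `UpdatePlanSketch` is hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UpdatePlanSketch {
    /**
     * Returns the names of files that are new on the server or whose
     * remote checksum differs from the local one. Both maps go from
     * file name to checksum string (an assumed representation).
     */
    public static Set<String> filesToDownload(Map<String, String> local,
                                              Map<String, String> remote) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> entry : remote.entrySet()) {
            // local.get() returns null for files that are not installed yet.
            if (!entry.getValue().equals(local.get(entry.getKey()))) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```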

Sight 3.0.0 alpha and subsequent versions are developed by Audrius Meškauskas as the second stage of the GPL-based Sight project, initially started at Ulm University. This release has several important new features:

  1. Sight now supports workflows containing loops and confluences. This allows much more varied strategies of bioinformatics research to be implemented.
  2. A built-in Smith-Waterman search allows searching the local database for related sequences.
  3. A cross-platform implementation of a WSDL-compatible protocol allows web services that communicate this way to be integrated.
  4. A whole Sight workflow can now be converted into a single Sight agent. Such combined agents are valuable not just as “elementary units” in more complicated workflows, but also as easy-to-use modules for Java programmers.
  5. Sight now has agents for reading and writing the SwissProt and FASTA formats and can be used as a system to search or classify data in flat files in these formats.
  6. Sight provides a “mini environment” for creating and testing data filters based on Java regular expressions.
  7. An improved user environment (for example, the ability to copy and paste part of a workflow) makes creating workflows easier.
  8. The XML agent generator supports namespaces.
  9. The table agent generator supports turned tables and some cases in which the response must be composed from several tables in the document.
  10. The manual landmark marking agent generator now uses regular expressions to specify landmarks.
  11. Sight can now work with workflows where the final result is a graph rather than a table. The annotation event listener registers events when two named nodes must be connected by a named link. Generated graphs can be viewed using Cytoscape.
  12. The new version also includes a large number of other small improvements and bug fixes.
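
For readers unfamiliar with the Smith-Waterman search mentioned in item 2, the following is a minimal sketch of the local alignment score it is based on. It is not Sight's implementation: it assumes a simple match/mismatch score and a linear gap penalty, whereas practical protein searches typically use a substitution matrix and affine gaps.

```java
public class SmithWatermanSketch {
    /**
     * Returns the best local alignment score of a and b.
     * match is added for equal characters; mismatch and gap
     * are expected to be negative penalties.
     */
    public static int score(String a, String b, int match, int mismatch, int gap) {
        int[][] h = new int[a.length() + 1][b.length() + 1]; // first row and column stay 0
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int diagonal = h[i - 1][j - 1]
                        + (a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch);
                int up = h[i - 1][j] + gap;   // gap in b
                int left = h[i][j - 1] + gap; // gap in a
                // Scores never drop below zero, which makes the alignment local.
                h[i][j] = Math.max(0, Math.max(diagonal, Math.max(up, left)));
                best = Math.max(best, h[i][j]);
            }
        }
        return best;
    }
}
```

A database search in this style would compute the score of the query against every stored sequence and report the highest-scoring entries.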

This alpha pre-release is not yet properly documented. While the user manual is in preparation, we suggest reading the button tool tips and trying to understand the general logic of the program. Audrius Meškauskas will also be glad to answer any questions.

Sight is released under the GPL and makes use of other GPL-based projects.
