Code Structure of OpenBabel

From Open Babel
Revision as of 09:04, 17 April 2006 by Chrismorl (Talk | contribs)

Since version 2.0, OpenBabel has had the modular structure shown in the diagram. This structure, designed particularly for the use of OpenBabel as a chemical file format converter, aims to:

  • separate the chemistry, the conversion process and the user interfaces, reducing, as far as possible, the dependency of one on another.
  • put all the code for each chemical format in one place (usually a single cpp file) and make the addition of new formats simple.
  • allow the format conversion of not just molecules, but also any other chemical objects, such as reactions.

[Diagram: Structure of OpenBabel]

The separate parts of the OpenBabel program are:

  • The Chemical core, which contains OBMol etc. and holds all the code for describing and manipulating chemical structures. This part is the heart of the application, and its API can be used as a chemical toolbox. It has no input/output capabilities.
  • The Formats, which read from and write to files of different types. These classes are derived from a common base class, OBFormat, which is in the Conversion control module. They also make use of the chemical routines in the Chemical core module. Each format file contains a global object of the format class. When the format is loaded, the class constructor registers the presence of the class with OBConversion. This means the formats are plugins: new formats can be added without changing any framework code.
  • The Conversion control, which keeps track of the available formats, the conversion options and the input and output streams. It can be compiled without reference to any other parts of the program. In particular, it knows nothing of the Chemical core: mol.h is not included.
  • The User interface, which may be a command line (in main.cpp), a Graphical User Interface (GUI), especially suited to Windows users and novices, or part of another program which uses OpenBabel's input and output facilities. It depends only on the Conversion control module (obconversion.h is included), not on the Chemical core or on any of the Formats.
  • The Fingerprints, which are bit arrays that describe an object and facilitate fast searching. They are also built as plugins, registering themselves with their base class OBFingerprint, which is in the Chemical core.
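
The self-registration idea used by the Formats and Fingerprints can be sketched in a few lines of standalone C++. This is only an illustration of the pattern, not OpenBabel's own code: the names Format, Registry, XyzFormat and theXyzFormat are hypothetical stand-ins for OBFormat, OBConversion and a real format class.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the format base class that lives in the
// conversion-control layer. It contains no chemistry.
class Format {
public:
    virtual ~Format() = default;
    virtual std::string Description() const = 0;
};

// Hypothetical registry, playing the role the text assigns to OBConversion.
class Registry {
public:
    static void Register(const std::string& id, Format* format) {
        assert(format != nullptr);
        Formats()[id] = format;
    }
    static Format* Find(const std::string& id) {
        auto& formats = Formats();
        auto it = formats.find(id);
        return it == formats.end() ? nullptr : it->second;
    }
private:
    // A function-local static is constructed on first use, so the map
    // already exists when the first global format object registers itself.
    static std::map<std::string, Format*>& Formats() {
        static std::map<std::string, Format*> formats;
        return formats;
    }
};

// Everything below would live in the format's own source file.
class XyzFormat : public Format {
public:
    XyzFormat() { Registry::Register("xyz", this); }
    std::string Description() const override {
        return "XYZ cartesian coordinates";
    }
};

// The single global instance: constructing it is what makes the
// format known to the framework.
XyzFormat theXyzFormat;
```

Adding another format in this sketch means adding another source file with its own class and global object; neither Format nor Registry has to change, and a call such as Registry::Find("xyz") locates the format by its identifier alone.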

It is possible to build each box in the diagram as a separate DLL or shared library, and the restricted dependencies help to limit the amount of recompilation. For formats or fingerprints built in this way, it may be possible to use just those whose DLL or .so files are present when the program starts. Several formats or fingerprints may be present in a single dynamic library.

Alternatively, and most commonly, the same source code can be built into a single executable. The restricted dependencies still make the program easier to maintain.

This separation of function requires some discipline when adding new code, and sometimes non-obvious work-arounds are necessary. For instance:

  • The error reporting facility OBError cannot be used in the Conversion control or the User interface sections because its code is in the Chemical core.
  • In order to set the level of warning/error messages that are displayed from the command line, it was necessary to use a (pseudo) format, OBAPIInterface. This could be extended to do a similar job in tweaking parameters for any of the functions in the core API from a User interface module.
  • Sometimes one format needs to use code from another format; for example, rxnformat needs to read mol files with code from mdlformat. The calling format should not use the code directly but should go through an OBConversion object configured with the appropriate helper format.
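
The last work-around, in which one format reaches another only through a conversion object, might look roughly like the following standalone sketch. It does not use the OpenBabel headers: Base, Molecule, Reaction, Format, Conversion, MolFormat, RxnFormat and ReadReaction are all hypothetical names, and the "file formats" are toy ones in which each line of text is one molecule title.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct Base { virtual ~Base() = default; };          // stand-in for OBBase
struct Molecule : Base { std::string title; };
struct Reaction : Base { std::vector<Molecule> reactants; };

class Format {                                        // stand-in for OBFormat
public:
    virtual ~Format() = default;
    virtual bool Read(Base* object, std::istream& in) = 0;
};

class Conversion {                                    // stand-in for OBConversion
public:
    static std::map<std::string, Format*>& Formats() {
        static std::map<std::string, Format*> formats;
        return formats;
    }
    bool SetInFormat(const std::string& id) {
        auto it = Formats().find(id);
        inFormat_ = (it == Formats().end()) ? nullptr : it->second;
        return inFormat_ != nullptr;
    }
    bool Read(Base* object, std::istream& in) {
        assert(object != nullptr);
        return inFormat_ != nullptr && inFormat_->Read(object, in);
    }
private:
    Format* inFormat_ = nullptr;
};

class MolFormat : public Format {                     // toy "mol" reader
public:
    MolFormat() { Conversion::Formats()["mol"] = this; }
    bool Read(Base* object, std::istream& in) override {
        auto* mol = dynamic_cast<Molecule*>(object);
        return mol != nullptr &&
               static_cast<bool>(std::getline(in, mol->title));
    }
};

class RxnFormat : public Format {                     // toy "rxn" reader
public:
    RxnFormat() { Conversion::Formats()["rxn"] = this; }
    bool Read(Base* object, std::istream& in) override {
        auto* rxn = dynamic_cast<Reaction*>(object);
        if (rxn == nullptr) return false;
        // No reference to MolFormat here: the helper is found by its id.
        Conversion helper;
        if (!helper.SetInFormat("mol")) return false; // helper not loaded
        Molecule mol;
        while (helper.Read(&mol, in)) rxn->reactants.push_back(mol);
        return !rxn->reactants.empty();
    }
};

MolFormat theMolFormat;
RxnFormat theRxnFormat;

// Usage: read a reaction from text through the conversion layer only.
bool ReadReaction(const std::string& text, Reaction& rxn) {
    std::istringstream in(text);
    Conversion conv;
    return conv.SetInFormat("rxn") && conv.Read(&rxn, in);
}
```

Because RxnFormat names the helper only by the string "mol", the two format files stay independent of each other at compile time; the cost is that the read fails at run time if the helper format has not been loaded.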

The objects passed between the modules in the diagram above are polymorphic OBBase pointers. This means that the conversion framework can be used by any object derived from OBBase (which essentially means anything, chemical or not). Most commonly these pointers refer to OBMol objects, less commonly to OBReaction objects, but the framework could be extended to anything else without needing to change any existing code.
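
To illustrate why passing base-class pointers keeps the framework independent of the chemistry, here is a minimal standalone sketch of a generic conversion loop. All names (Base, Format, Convert, Molecule, LineFormat, TagFormat, LinesToTags) are hypothetical and the two formats are toys; the real OBConversion and OBFormat interfaces differ in detail.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <sstream>
#include <string>

// Stand-in for OBBase: the only object type the conversion layer sees.
struct Base { virtual ~Base() = default; };

// Conversion-layer interface. Nothing here mentions molecules.
class Format {
public:
    virtual ~Format() = default;
    virtual std::unique_ptr<Base> NewObject() const = 0;
    virtual bool Read(Base& object, std::istream& in) const = 0;
    virtual bool Write(const Base& object, std::ostream& out) const = 0;
};

// Generic loop: read objects until the input is exhausted and write each
// one out. Returns the number of objects converted.
int Convert(const Format& inFormat, const Format& outFormat,
            std::istream& in, std::ostream& out) {
    int count = 0;
    for (;;) {
        std::unique_ptr<Base> object = inFormat.NewObject();
        assert(object != nullptr);
        if (!inFormat.Read(*object, in)) break;
        if (!outFormat.Write(*object, out)) break;
        ++count;
    }
    return count;
}

// ---- Chemistry side: only the formats know the concrete type. ----
struct Molecule : Base { std::string name; };

class LineFormat : public Format {        // toy format: one name per line
public:
    std::unique_ptr<Base> NewObject() const override {
        return std::make_unique<Molecule>();
    }
    bool Read(Base& object, std::istream& in) const override {
        auto* mol = dynamic_cast<Molecule*>(&object);
        return mol != nullptr && static_cast<bool>(std::getline(in, mol->name));
    }
    bool Write(const Base& object, std::ostream& out) const override {
        auto* mol = dynamic_cast<const Molecule*>(&object);
        if (mol == nullptr) return false;
        out << mol->name << '\n';
        return true;
    }
};

class TagFormat : public Format {         // toy write-only format
public:
    std::unique_ptr<Base> NewObject() const override {
        return std::make_unique<Molecule>();
    }
    bool Read(Base&, std::istream&) const override { return false; }
    bool Write(const Base& object, std::ostream& out) const override {
        auto* mol = dynamic_cast<const Molecule*>(&object);
        if (mol == nullptr) return false;
        out << "<mol>" << mol->name << "</mol>\n";
        return true;
    }
};

// Usage: convert text in the line format to the tag format.
std::string LinesToTags(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    Convert(LineFormat(), TagFormat(), in, out);
    return out.str();
}
```

In this sketch Convert would compile unchanged if a Reaction type and a pair of reaction formats were added later, since it only ever handles Base objects; the downcast to the concrete type happens inside each format.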