Python

From Open Babel
Revision as of 11:09, 22 June 2006 by Ghutchis (Talk | contribs)

Accessing Open Babel with Python

There are two ways to access the Open Babel library using Python:

  1. The openbabel module, a direct binding of the Open Babel C++ library, created using the SWIG package.
  2. The pyopenbabel module, a Pythonic shell around the openbabel module.

For more examples of using Open Babel inside Python, see the developer Python tutorial.

The openbabel module

The openbabel module provides direct access to the C++ Open Babel library from Python. This binding is generated using the SWIG package and provides access to almost all of the Open Babel interfaces via Python, including the base classes OBMol, OBAtom, OBBond, and OBResidue, as well as the conversion framework OBConversion. As such, essentially any call in the C++ API is available to Python scripts with very little difference in syntax.

This guide is designed to give examples of common Python syntax for the openbabel module and pointers to the appropriate sections of the API documentation.

The example script below creates atoms and bonds one-by-one using the OBMol, OBAtom, and OBBond classes.

import openbabel

mol = openbabel.OBMol()
print('Should print 0 (atoms)')
print(mol.NumAtoms())

a = mol.NewAtom()
a.SetAtomicNum(6)            # carbon atom
a.SetVector(0.0, 1.0, 2.0)   # coordinates

b = mol.NewAtom()            # second atom; its atomic number is left unset
mol.AddBond(1, 2, 1)         # single bond between atoms 1 and 2 (atoms indexed from 1)
print('Should print 2 (atoms)')
print(mol.NumAtoms())
print('Should print 1 (bond)')
print(mol.NumBonds())

mol.Clear()
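
The same calls extend naturally to building a molecule from plain Python data. The following is a minimal sketch rather than part of the Open Babel API: build_molecule is a hypothetical helper, and its ob parameter is assumed to be the already-imported openbabel module, passed in by the caller. It uses only the calls shown in the example above (OBMol, NewAtom, SetAtomicNum, SetVector, and AddBond).

```python
def build_molecule(ob, atoms, bonds):
    """Build an OBMol from plain Python data (hypothetical helper).

    ob    -- the imported openbabel module (assumption: supplied by the caller)
    atoms -- sequence of (atomic_number, (x, y, z)) tuples
    bonds -- sequence of (begin_index, end_index, bond_order) tuples,
             with atoms indexed from 1 as in the example above
    """
    mol = ob.OBMol()
    for atomic_number, (x, y, z) in atoms:
        atom = mol.NewAtom()
        atom.SetAtomicNum(atomic_number)
        atom.SetVector(x, y, z)
    for begin, end, order in bonds:
        mol.AddBond(begin, end, order)
    return mol
```

With the module imported as above, a call such as build_molecule(openbabel, [(6, (0.0, 1.0, 2.0)), (6, (1.5, 1.0, 2.0))], [(1, 2, 1)]) would be expected to return a molecule with two carbon atoms joined by a single bond. The coordinates here are arbitrary illustration values.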

More commonly, Open Babel is used to read in molecules through the OBConversion framework. The following script reads a molecule from a SMILES string, adds hydrogens, and writes the result out in MDL format, again as a string.

import openbabel

obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("smi", "mdl")
 
mol = openbabel.OBMol()
obConversion.ReadString(mol, "C1=CC=CS1")

print('Should print 5 (atoms)')
print(mol.NumAtoms())

mol.AddHydrogens()
print('Should print 9 (atoms) after adding hydrogens')
print(mol.NumAtoms())

outMDL = obConversion.WriteString(mol)
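
The script above ignores the return values of the conversion calls. The sketch below wraps the same steps in a function that checks them, assuming that SetInAndOutFormats and ReadString return a false value on failure, as they do in the C++ API. convert_string is a hypothetical helper, not part of the openbabel module, and its ob parameter is assumed to be the imported openbabel module.

```python
def convert_string(ob, text, in_format, out_format, add_hydrogens=False):
    """Convert a molecule given as a string (hypothetical helper).

    ob -- the imported openbabel module (assumption: supplied by the caller)
    Raises ValueError if a format is not recognised or the input cannot be read.
    """
    conv = ob.OBConversion()
    if not conv.SetInAndOutFormats(in_format, out_format):
        raise ValueError("unknown input or output format: %r, %r"
                         % (in_format, out_format))

    mol = ob.OBMol()
    if not conv.ReadString(mol, text):
        raise ValueError("could not read %r as %s" % (text, in_format))

    if add_hydrogens:
        mol.AddHydrogens()
    return conv.WriteString(mol)
```

For instance, convert_string(openbabel, "C1=CC=CS1", "smi", "mdl", add_hydrogens=True) would be expected to produce the same MDL text as the script above.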

The following script reads a molecule from one file and writes it to another, using filenames rather than Python strings.

import openbabel

obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("pdb", "mol2")

mol = openbabel.OBMol()
obConversion.ReadFile(mol, "1ABC.pdb.gz")   # Open Babel will uncompress automatically

mol.AddHydrogens()

print(mol.NumAtoms())
print(mol.NumBonds())
print(mol.NumResidues())

obConversion.WriteFile(mol, '1abc.mol2')

That's it! There's more information on particular calls in the library API. Feel free to address questions to the openbabel-scripting mailing list.

Note on Compiling the SWIG wrapper

If you need to recompile the SWIG wrapper for the library, make sure you have the latest version of SWIG installed and run 'configure' as follows:

 ./configure --enable-maintainer-mode

Running 'make' as usual should recreate the SWIG wrappers.

The pyopenbabel module

The pyopenbabel module aims to provide more Pythonic access to the Open Babel C++ library. 'More Pythonic' here means that it should behave like a typical Python library. That is, it should:

  1. provide easy access to attributes and methods (no need for GetVar())
  2. keep things simple (only one good way to do things)
  3. work 'as expected' in Python (iterators, etc.)

More information is available in the API for pyopenbabel.
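
The three goals above can be illustrated with a small, self-contained sketch. It does not use or reproduce pyopenbabel itself: RawAtom and RawMol are hypothetical stand-ins for C++-style objects with getter methods, and Atom and Molecule are hypothetical wrappers that expose the same data through a plain attribute and an iterator.

```python
class RawAtom(object):
    """Stand-in for a C++-style atom with a getter method (hypothetical)."""
    def __init__(self, atomic_number):
        self._atomic_number = atomic_number

    def GetAtomicNum(self):
        return self._atomic_number


class RawMol(object):
    """Stand-in for a C++-style molecule with index-based access (hypothetical)."""
    def __init__(self, atomic_numbers):
        self._atoms = [RawAtom(n) for n in atomic_numbers]

    def NumAtoms(self):
        return len(self._atoms)

    def GetAtom(self, index):
        return self._atoms[index - 1]   # atoms are indexed from 1


class Atom(object):
    """Wrapper exposing the getter as a read-only attribute."""
    def __init__(self, raw):
        self._raw = raw

    @property
    def atomicnum(self):
        return self._raw.GetAtomicNum()


class Molecule(object):
    """Wrapper that supports len(mol.atoms) and 'for atom in mol'."""
    def __init__(self, raw):
        self._raw = raw

    @property
    def atoms(self):
        return [Atom(self._raw.GetAtom(i))
                for i in range(1, self._raw.NumAtoms() + 1)]

    def __iter__(self):
        return iter(self.atoms)


# Atomic numbers of the heavy atoms of thiophene (C1=CC=CS1).
thiophene = Molecule(RawMol([6, 6, 6, 6, 16]))
atomic_numbers = [atom.atomicnum for atom in thiophene]
print(atomic_numbers)   # prints [6, 6, 6, 6, 16]
```

The wrapper keeps the underlying object untouched and only changes how it is reached from Python, which is the general idea behind a 'Pythonic shell'.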

Examples

The following example shows how to:

  • read a SMILES string
  • display the number of atoms and the atomic number of each atom
  • write an InChI string

>>> from pyopenbabel import *

>>> mymol = readstring("smi","C1=CC=CS1")

>>> print("The number of atoms is %d" % len(mymol.atoms))
The number of atoms is 5

>>> for atom in mymol:
...    print("Atomic number:%d" % atom.atomicnum)
...
Atomic number:6
Atomic number:6
Atomic number:6
Atomic number:6
Atomic number:16

>>> print("This molecule's InChI representation is %s" % mymol.write("inchi"))
This molecule's InChI representation is InChI=1/C4H4S/c1-2-4-5-3-1/h1-4H

pyopenbabel also provides access to the Open Babel SMARTS pattern matcher:

>>> mol = readstring("smi","CCN(CC)CC") # triethylamine
>>> smarts = Smarts("[#6][#6]") # Matches an ethyl group
>>> print smarts.findall(mol) 
[(1, 2), (4, 5), (6, 7)]
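
Each tuple in the result lists the atoms of one match. The sketch below post-processes the output shown above in plain Python, without calling pyopenbabel. It assumes that the numbers are atom indices counted from 1 in the order the atoms appear in the SMILES string, and that the molecule has 7 heavy atoms. Both are assumptions read off the example rather than documented guarantees.

```python
# Result copied from the session above: matches of [#6][#6] in CCN(CC)CC.
matches = [(1, 2), (4, 5), (6, 7)]
num_heavy_atoms = 7   # assumption: C, C, N, C, C, C, C in SMILES order

# Every atom index that takes part in at least one match.
matched = set(index for match in matches for index in match)
print(sorted(matched))   # prints [1, 2, 4, 5, 6, 7]

# Atoms that no match touches; under the assumptions above, 3 is the nitrogen.
unmatched = [i for i in range(1, num_heavy_atoms + 1) if i not in matched]
print(unmatched)         # prints [3]

# The same matches shifted to 0-based indices, e.g. for indexing a Python list.
zero_based = [tuple(i - 1 for i in match) for match in matches]
print(zero_based)        # prints [(0, 1), (3, 4), (5, 6)]
```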