Difference between revisions of "Python"

From Open Babel

Revision as of 04:46, 16 November 2006

Accessing Open Babel with Python

There are two ways to access the Open Babel library using Python:

  1. The openbabel module, a direct binding of the Open Babel C++ library, created using the SWIG package.
  2. The pyopenbabel module, a set of convenience functions and classes that uses the openbabel module (only available with Open Babel >= 2.0.3).

For more examples of using Open Babel from Python, see the developer Python tutorial.

The openbabel module

The openbabel module provides direct access to the C++ Open Babel library from Python. This binding is generated using the SWIG package and provides access to almost all of the Open Babel interfaces via Python, including the base classes OBMol, OBAtom, OBBond, and OBResidue, as well as the conversion framework OBConversion. As such, essentially any call in the C++ API is available to Python scripts with very little difference in syntax.

This guide is designed to give examples of common Python syntax for the openbabel module and pointers to the appropriate sections of the API documentation.

The example script below creates atoms and bonds one-by-one using the OBMol, OBAtom, and OBBond classes.

 import openbabel

 mol = openbabel.OBMol()
 print 'Should print 0 (atoms)'
 print mol.NumAtoms()

 a = mol.NewAtom()
 a.SetAtomicNum(6)   # carbon atom
 a.SetVector(0.0, 1.0, 2.0) # coordinates

 b = mol.NewAtom()
 mol.AddBond(1, 2, 1)   # atoms indexed from 1
 print 'Should print 2 (atoms)'
 print mol.NumAtoms()
 print 'Should print 1 (bond)'
 print mol.NumBonds()

 mol.Clear()

More commonly, Open Babel is used to read in molecules through the OBConversion framework. The following script reads a molecule from a string in SMILES format ("smi"), adds hydrogens, and writes the result out as a string in MDL format.

import openbabel

obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("smi", "mdl")
 
mol = openbabel.OBMol()
obConversion.ReadString(mol, "C1=CC=CS1")

print 'Should print 5 (atoms)'
print mol.NumAtoms()

mol.AddHydrogens()
print 'Should print 9 (atoms) after adding hydrogens'
print mol.NumAtoms()

outMDL = obConversion.WriteString(mol)

The following script reads a molecule from one file and writes it to another, using filenames rather than Python strings.

import openbabel

obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("pdb", "mol2")

mol = openbabel.OBMol()
obConversion.ReadFile(mol, "1ABC.pdb.gz")   # Open Babel will uncompress automatically

mol.AddHydrogens()

print mol.NumAtoms()
print mol.NumBonds()
print mol.NumResidues()

obConversion.WriteFile(mol, '1abc.mol2')

That's it! There's more information on particular calls in the library API. Feel free to address questions to the openbabel-scripting mailing list.

The pyopenbabel module

The pyopenbabel module provides convenience functions and classes that make it simpler to use the Open Babel libraries from Python, especially for file input/output and for accessing the attributes of atoms and molecules. The Atom and Molecule classes used by pyopenbabel can be converted to and from the OBAtom and OBMol used by the openbabel module. These features are discussed in more detail below.

Information on the pyopenbabel API can be found at the interactive Python prompt using the help() function, and is also available here: pyopenbabel API.

Atoms and Molecules

A Molecule can be created in any of four ways:

  1. From an OBMol, using Molecule(myOBMol)
  2. By reading from a file (see Input/Output below)
  3. By reading from a string (see Input/Output below)
  4. An empty Molecule can be created using Molecule()

An Atom can be created in three different ways:

  1. From an OBAtom, using Atom(myOBAtom)
  2. By accessing the .atoms attribute of a Molecule
  3. An empty Atom can be created using Atom()

It is always possible to access the OBMol or OBAtom on which a Molecule or Atom is based, by accessing the appropriate attribute, either .OBMol or .OBAtom. In this way, it is easy to combine the convenience of pyopenbabel with the many additional capabilities present in openbabel.

Molecules have the following attributes: atoms, charge, dim, energy, exactmass, flags, formula, mod, molwt, spin, sssr, title. The .atoms attribute provides a list of the Atoms in a Molecule. The remaining attributes correspond directly to attributes of OBMols: e.g. Molecule.formula is equivalent to OBMol.GetFormula(). For more information on what these attributes are, please see the Open Babel Library documentation for OBMol.
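The correspondence between attributes and OBMol getters can be pictured with a small stand-in. The sketch below is not the pyopenbabel source: StubOBMol is a hypothetical placeholder for a real OBMol (so the snippet runs without Open Babel installed, with hard-coded values), and MoleculeSketch and its _getters table are illustrative names. It shows one way a wrapper could forward .formula and .molwt to GetFormula() and GetMolWt() while keeping the wrapped object reachable through .OBMol.

```python
class StubOBMol(object):
    """Hypothetical stand-in for openbabel.OBMol; values are hard-coded."""

    def GetFormula(self):
        return "C4H10"

    def GetMolWt(self):
        return 58.12

    def GetTitle(self):
        return "butane"


class MoleculeSketch(object):
    """Illustrative wrapper that exposes getter methods as attributes."""

    # attribute name -> name of the getter on the wrapped object
    _getters = {"formula": "GetFormula", "molwt": "GetMolWt", "title": "GetTitle"}

    def __init__(self, obmol):
        self.OBMol = obmol  # the wrapped object stays accessible

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name in self._getters:
            return getattr(self.OBMol, self._getters[name])()
        raise AttributeError(name)


mol = MoleculeSketch(StubOBMol())
print(mol.formula)             # goes through __getattr__
print(mol.OBMol.GetFormula())  # same call made directly on the wrapped object
```

Looking the getter up at access time means the attribute always reflects the current state of the underlying object, which is why the wrapper does not cache the values.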

Molecules have a .write() method that writes a representation of a Molecule to a file or to a string. See Input/Output below.

For convenience, a Molecule provides an iterator over its Atoms. This is used as follows:

for atom in myMolecule:
   # do something with atom

Atoms have the following attributes: atomicmass, atomicnum, cidx, coords, coordidx, exactmass, formalcharge, heavyvalence, heterovalence, hyb, idx, implicitvalence, index, isotope, partialcharge, spin, type, valence, vector. The .coords attribute provides a tuple (x, y, z) of the atom's coordinates. The remaining attributes are as for OBAtom, and more information can be found in the Open Babel Library documentation.

Input/Output

pyopenbabel greatly simplifies the process of reading and writing molecules to and from strings or files. There are two functions for reading Molecules:

  1. readstring(format, string) reads a Molecule from a string
  2. readfile(format, filename) provides an iterator over the Molecules in a file

Here are some examples of their use. Note in particular the use of .next() to access the first (and possibly only) molecule in a file:

>>> mymol = readstring("smi", "CCCC")
>>> print mymol.molwt   # molwt is an attribute; about 58.12 for butane (C4H10)
>>> for mymol in readfile("sdf", "largeSDfile.sdf"):
...     print mymol.molwt
...
>>> singlemol = readfile("pdb", "1CRN.pdb").next()
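
Because readfile returns an iterator, molecules are produced one at a time rather than being loaded all at once. The sketch below mimics that behaviour with a plain generator so that it runs without Open Babel. read_records is a hypothetical stand-in, not part of pyopenbabel, and it yields raw text records (split on the "$$$$" line that ends each entry in an SD file) instead of Molecules. It uses the built-in next(), available from Python 2.6 onwards, which does the same job as calling .next() on the iterator.

```python
def read_records(text, delimiter="$$$$"):
    """Yield one record at a time from SD-style text (hypothetical helper)."""
    record = []
    for line in text.splitlines():
        if line.strip() == delimiter:
            yield "\n".join(record)
            record = []
        else:
            record.append(line)
    if record:  # trailing record without a closing delimiter
        yield "\n".join(record)


sd_text = "first\n$$$$\nsecond\n$$$$\n"

# Take only the first record, as .next() does in the example above.
first = next(read_records(sd_text))
print(first)

# Or loop over every record in turn.
records = [rec for rec in read_records(sd_text)]
print(records)
```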

If a single molecule is to be written to a file or string, the .write() method of the Molecule should be used:

  1. mymol.write(format) returns a string
  2. mymol.write(format, filename) writes the Molecule to a file. An optional additional parameter, overwrite, should be set to True if you wish to overwrite an existing file.

For files containing multiple molecules, the Outputfile class should be used instead. This is initialised with a format and filename (and optional overwrite parameter). To write a Molecule to the file, the .write() method of the Outputfile is called with the Molecule as a parameter.

Here are some examples of output using the pyopenbabel methods and classes:

>>> print mymol.write("smi")
CCCC
>>> mymol.write("smi", "outputfile.txt")
>>> largeSDfile = Outputfile("sdf", "multipleSD.sdf")
>>> largeSDfile.write(mymol)
>>> largeSDfile.write(myothermol)
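
The overwrite parameter guards against clobbering an existing file. What happens when it is left unset is not spelled out here, so the sketch below assumes that an error is raised. OutputfileSketch is a hypothetical stand-in that writes plain strings and takes no format argument; it is not the pyopenbabel class.

```python
import os
import tempfile


class OutputfileSketch(object):
    """Hypothetical multi-record writer that refuses to overwrite by default."""

    def __init__(self, filename, overwrite=False):
        if os.path.exists(filename) and not overwrite:
            # Assumed behaviour: fail loudly instead of silently overwriting.
            raise IOError("%s already exists; pass overwrite=True" % filename)
        self.filename = filename
        self._handle = open(filename, "w")

    def write(self, record):
        self._handle.write(record + "\n")

    def close(self):
        self._handle.close()


path = os.path.join(tempfile.mkdtemp(), "multiple.txt")

out = OutputfileSketch(path)
out.write("CCCC")
out.write("CCN(CC)CC")
out.close()

try:
    OutputfileSketch(path)  # file now exists, overwrite not requested
    refused = False
except IOError:
    refused = True
print(refused)
```

Checking for the file in the constructor, rather than on the first write, means the problem surfaces before any molecules have been processed.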

SMARTS matching

pyopenbabel also provides a simplified API to the Open Babel SMARTS pattern matcher. A Smarts object is created, and the .findall() method is then used to return a list of the matches to a given Molecule.

Here is an example of its use:

>>> mol = readstring("smi","CCN(CC)CC") # triethylamine
>>> smarts = Smarts("[#6][#6]") # Matches an ethyl group
>>> print smarts.findall(mol) 
[(1, 2), (4, 5), (6, 7)]
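
Each tuple in the result holds the indices of the atoms in one match, counted from 1 as in the openbabel module. As a small illustration in plain Python (no Open Babel calls; the matches list is copied from the output above and num_atoms is filled in by hand), the matched indices can be collected to find which atoms the pattern did not cover:

```python
matches = [(1, 2), (4, 5), (6, 7)]  # output of smarts.findall(mol) above
num_atoms = 7                       # heavy atoms in CCN(CC)CC

# Flatten the tuples into a set of matched atom indices.
matched = set(idx for match in matches for idx in match)

# Atom indices start at 1, so check 1..num_atoms inclusive.
unmatched = [idx for idx in range(1, num_atoms + 1) if idx not in matched]
print(unmatched)
```

This leaves index 3, the nitrogen, as the only atom that is not part of an ethyl match.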