Let’s say we want to print out the molecular weight of every molecule in an SD file. Why? Well, we might want to plot a histogram of the distribution, or see whether the mean of the distribution differs significantly (in the statistical sense) from that of another SD file.
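    # Using the openbabel module directly
    from openbabel import *

    obconversion = OBConversion()
    obconversion.SetInFormat("sdf")
    obmol = OBMol()

    # ReadFile() opens the file and reads the first molecule;
    # Read() fetches each subsequent one and returns False at the end.
    notatend = obconversion.ReadFile(obmol, "../xsaa.sdf")
    while notatend:
        print(obmol.GetMolWt())
        obmol = OBMol()
        notatend = obconversion.Read(obmol)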
    # The same task using Pybel
    from pybel import *

    for molecule in readfile("sdf", "../xsaa.sdf"):
        print(molecule.molwt)
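Once the weights have been collected, the histogram and the comparison between two SD files mentioned above can be sketched with the standard library alone. This is a minimal sketch, not part of Open Babel: the helper names `read_molwts`, `histogram` and `welch_t` are hypothetical, the bin width of 50 is an arbitrary choice, and Welch's t statistic is just one possible way to compare two means (turning it into a p-value would need a statistics package).

```python
import math
import statistics
from collections import Counter


def read_molwts(path, fmt="sdf"):
    """Collect molecular weights with Pybel.

    Pybel is imported lazily so that the helpers below can be used
    (and tested) without Open Babel installed.
    """
    try:
        from openbabel import pybel  # module layout in Open Babel 3.x
    except ImportError:
        import pybel  # module layout in Open Babel 2.x
    return [mol.molwt for mol in pybel.readfile(fmt, path)]


def histogram(weights, bin_width=50.0):
    """Count weights per bin; each key is the lower edge of its bin."""
    return Counter(math.floor(w / bin_width) * bin_width for w in weights)


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        var_a / len(a) + var_b / len(b)
    )
```

A typical use would be `histogram(read_molwts("../xsaa.sdf"))`, followed by `welch_t(...)` on the weight lists read from two different files.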
First of all, look at all of the classes in the Open Babel API that end with “Iter”. You should use these whenever you need to do something like iterate over all of the atoms or bonds connected to a particular atom, iterate over all the atoms in a molecule, iterate over all of the residues in a protein, and so on.
As an example, let’s say we want to find information on all of the bond orders and atoms connected to a particular OBAtom called ‘obatom’. The idea is that we iterate over the neighbouring atoms using OBAtomAtomIter, and then find the bond between the neighbouring atom and ‘obatom’. Alternatively, we could have iterated over the bonds (OBAtomBondIter), but we would need to look at the indices of the two atoms at the ends of the bond to find out which is the neighbouring atom:
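    import openbabel  # Open Babel 3.x: from openbabel import openbabel

    # obatom is an existing OBAtom
    for neighbour_atom in openbabel.OBAtomAtomIter(obatom):
        print(neighbour_atom.GetAtomicNum())
        # Look up the bond joining obatom to this neighbour
        bond = obatom.GetBond(neighbour_atom)
        print(bond.GetBondOrder())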
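For comparison, the bond-based alternative described above might look like the following sketch. The helper name `neighbours_from_bonds` is hypothetical; it only assumes that each bond offers `GetBeginAtomIdx()`, `GetEndAtomIdx()` and `GetBondOrder()`, as `OBBond` does, and it reports the neighbour by index rather than as an atom object.

```python
def neighbours_from_bonds(bonds, atom_idx):
    """Yield (neighbour_index, bond_order) for each bond touching atom_idx."""
    for bond in bonds:
        begin, end = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        # The neighbour is whichever end of the bond is not our atom.
        neighbour_idx = end if begin == atom_idx else begin
        yield neighbour_idx, bond.GetBondOrder()


# With Open Babel available and obatom an existing OBAtom:
# bonds = openbabel.OBAtomBondIter(obatom)
# for idx, order in neighbours_from_bonds(bonds, obatom.GetIdx()):
#     print(idx, order)
```

The extra index comparison is the reason the atom-based loop is usually the shorter of the two.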
The following was a request on the CCL.net list:
Hi all, Does anyone have a script to split an SDF file into single SDFs named after each individual molecule as specified in first line of parent multi file?
The solution is simple...
    import pybel

    for mol in pybel.readfile("sdf", "bigmol.sdf"):
        # Write each molecule to its own file, named after its title line
        mol.write("sdf", "%s.sdf" % mol.title)
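The one-liner relies on every title being a usable, unique filename. Where titles may be empty, repeated, or contain characters such as "/", a small helper along these lines could be used to build the names. It is only a sketch: `safe_filename` is a hypothetical function, not part of Pybel, and the set of characters it keeps is an arbitrary conservative choice.

```python
import re


def safe_filename(title, index, seen):
    """Return a unique, filesystem-friendly .sdf name for a molecule title.

    title: the molecule's title line (may be empty)
    index: position of the molecule in the input, used as a fallback name
    seen:  set of stems already handed out, updated in place
    """
    # Replace runs of anything other than letters, digits, '.', '_' or '-'.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", title.strip()).strip("._")
    if not stem:
        stem = "molecule_%d" % index
    # Append a counter until the name has not been used before.
    name, n = stem, 1
    while name in seen:
        n += 1
        name = "%s_%d" % (stem, n)
    seen.add(name)
    return name + ".sdf"


# With Pybel available:
# seen = set()
# for i, mol in enumerate(pybel.readfile("sdf", "bigmol.sdf"), start=1):
#     mol.write("sdf", safe_filename(mol.title, i, seen))
```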
TJ’s book, “Design and Use of Relational Databases in Chemistry”, also contains examples of Python code using Open Babel to create and query molecular databases (see for example the link to Open Babel code in the Appendix).