OBSmartsPattern Class Reference

SMARTS (SMiles ARbitrary Target Specification) substructure searching.

#include <openbabel/parsmart.h>

Public Member Functions

 OBSmartsPattern ()
virtual ~OBSmartsPattern ()
 OBSmartsPattern (const OBSmartsPattern &cp)
OBSmartsPattern & operator= (const OBSmartsPattern &cp)
void WriteMapList (std::ostream &)
Initialization Methods
bool Init (const char *pattern)
bool Init (const std::string &pattern)
Pattern Properties
const std::string & GetSMARTS () const
std::string & GetSMARTS ()
bool Empty () const
bool IsValid () const
unsigned int NumAtoms () const
unsigned int NumBonds () const
void GetBond (int &src, int &dst, int &ord, int idx)
int GetAtomicNum (int idx)
int GetCharge (int idx)
int GetVectorBinding (int idx) const
Matching methods (SMARTS on a specific OBMol)
bool Match (OBMol &mol, bool single=false)
bool RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single=false)
bool RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single=false)
unsigned int NumMatches () const
std::vector< std::vector< int > > & GetMapList ()
std::vector< std::vector< int > >::iterator BeginMList ()
std::vector< std::vector< int > >::iterator EndMList ()
std::vector< std::vector< int > > & GetUMapList ()

Protected Attributes

std::vector< bool > _growbond
std::vector< std::vector< int > > _mlist
Pattern * _pat
std::string _str


Detailed Description

SMARTS (SMiles ARbitrary Target Specification) substructure searching.

Substructure search is a very useful tool in a small-molecule programming library. An efficient substructure search engine reduces the amount of hard-coded logic needed for molecule perception and makes certain operations more flexible. For instance, atom typing can be performed with hard-coded rules based on element type and bond order (or hybridization). Alternatively, atom typing can be done by matching a set of substructure rules read at run time. In the latter case, customization for a particular application (such as changing the pH) becomes a simple operation. Fortunately for Open Babel and its users, Roger Sayle donated a SMARTS parser which became the basis for SMARTS matching in Open Babel.

For more information on the SMARTS support in Open Babel, see the wiki page: http://openbabel.sourceforge.net/wiki/SMARTS

The SMARTS matcher, or OBSmartsPattern, is a separate object which can match patterns in the OBMol class. The following code demonstrates how to use the OBSmartsPattern class:

  OBMol mol;
  ...
  OBSmartsPattern sp;
  sp.Init("CC");
  sp.Match(mol);
  // the matches are stored in sp; copy them out
  vector<vector<int> > maplist;
  maplist = sp.GetMapList();
  //or maplist = sp.GetUMapList();
  //print out the results, one match per line
  vector<vector<int> >::iterator i;
  vector<int>::iterator j;
  for (i = maplist.begin();i != maplist.end();++i)
  {
     for (j = i->begin();j != i->end();++j)
        cout << *j << ' ';
     cout << endl;
  }

The preceding code assumes a molecule has already been read into mol, initializes a SMARTS pattern of two single-bonded carbons, and locates all instances of the pattern in the molecule. Note that Match() does not return the matches themselves; it returns only whether any match occurred. The results are stored in the OBSmartsPattern, and a call to GetMapList() or GetUMapList() must be made to extract them. GetMapList() returns all matches of a pattern, while GetUMapList() returns only the unique matches. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern matches both atom-number permutations of the carboxylate, so GetMapList() returns both matches, whereas GetUMapList() returns only one of them. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
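The definition of a unique match can be illustrated without Open Babel at all. The following is a minimal sketch, assuming a match list shaped like the one GetMapList() returns (one vector of atom indices per match). The helper name UniqueMatches is hypothetical and is not part of the library; this is not necessarily how GetUMapList() is implemented internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not an Open Babel function): keep only the first
// match seen for each distinct set of atom indices.
std::vector<std::vector<int> > UniqueMatches(
    const std::vector<std::vector<int> > &maplist)
{
  std::set<std::vector<int> > seen;   // sorted atom sets already covered
  std::vector<std::vector<int> > unique;
  for (std::size_t i = 0; i < maplist.size(); ++i)
  {
    std::vector<int> key(maplist[i]);
    std::sort(key.begin(), key.end()); // make the key independent of atom order
    if (seen.insert(key).second)       // true only if this atom set is new
      unique.push_back(maplist[i]);    // keep the match in its original order
  }
  return unique;
}
```

With the carboxylate example, the two permutations {1, 2, 3} and {3, 2, 1} cover the same atoms, so only the first would be kept by this sketch.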


Constructor & Destructor Documentation

OBSmartsPattern (  )  [inline]

~OBSmartsPattern (  )  [virtual]

OBSmartsPattern ( const OBSmartsPattern &  cp  )  [inline]


Member Function Documentation

OBSmartsPattern& operator= ( const OBSmartsPattern &  cp  )  [inline]

bool Init ( const char *  pattern  ) 

bool Init ( const std::string &  pattern  ) 

Parse the pattern SMARTS string.

Returns:
Whether the pattern is a valid SMARTS expression

const std::string& GetSMARTS (  )  const [inline]

Returns:
the SMARTS string which is currently used

std::string& GetSMARTS (  )  [inline]

Returns:
the SMARTS string which is currently used

bool Empty (  )  const [inline]

Returns:
Whether the SMARTS pattern is an empty expression (e.g., invalid)

bool IsValid (  )  const [inline]

Returns:
Whether the SMARTS pattern is a valid expression

unsigned int NumAtoms (  )  const [inline]

Returns:
the number of atoms in the SMARTS pattern

Referenced by OpenBabel::EvalAtomExpr(), OBChemTsfm::Init(), OBChemTsfm::IsAcid(), OBChemTsfm::IsBase(), and OBPhModel::ParseLine().

unsigned int NumBonds (  )  const [inline]

Returns:
the number of bonds in the SMARTS pattern

Referenced by OBChemTsfm::Init().

void GetBond ( int &  src,
int &  dst,
int &  ord,
int  idx 
)

Access the bond idx in the internal pattern

Parameters:
src The index of the beginning atom
dst The index of the end atom
ord The bond order of this bond
idx The index of the bond in the SMARTS pattern

Referenced by OBChemTsfm::Init().

int GetAtomicNum ( int  idx  ) 

Returns:
the atomic number of the atom idx in the internal pattern

Referenced by OBChemTsfm::Init().

int GetCharge ( int  idx  ) 

Returns:
the formal charge of the atom idx in the internal pattern

Referenced by OBChemTsfm::Init(), OBChemTsfm::IsAcid(), and OBChemTsfm::IsBase().

int GetVectorBinding ( int  idx  )  const [inline]

Returns:
the vector binding of the atom idx in the internal pattern

Referenced by OBChemTsfm::Init().

bool Match ( OBMol &  mol,
bool  single = false 
)

Perform SMARTS matching for the pattern specified using Init().

Parameters:
mol The molecule to use for matching
single Whether only a single match is required (faster). Default is false.
Returns:
Whether matches occurred

Referenced by OBChemTsfm::Apply(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), OBMol::DoTransformations(), and OBAtom::MatchesSMARTS().

bool RestrictedMatch ( OBMol &  mol,
std::vector< std::pair< int, int > > &  pairs,
bool  single = false 
)

bool RestrictedMatch ( OBMol &  mol,
OBBitVec &  bv,
bool  single = false 
)

unsigned int NumMatches (  )  const [inline]

Returns:
the number of non-unique SMARTS matches. To get the number of unique SMARTS matches, query GetUMapList().size()

std::vector<std::vector<int> >& GetMapList (  )  [inline]

Returns:
the entire list of non-unique matches for this pattern
See also:
GetUMapList()

Referenced by OBRotorRules::GetRotorIncrements().

std::vector<std::vector<int> >::iterator BeginMList (  )  [inline]

Returns:
An iterator over the (non-unique) match list, starting at the beginning

std::vector<std::vector<int> >::iterator EndMList (  )  [inline]

Returns:
An iterator over the non-unique match list, set to the end

std::vector< std::vector< int > > & GetUMapList (  ) 

Returns:
the entire list of unique matches for this pattern
A unique match is defined as one which does not cover the identical atoms that a previous match has covered. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern will match both atom-number permutations of the carboxylate, and if GetMapList() is called, both matches will be returned. If GetUMapList() is called, only unique matches of the pattern will be returned.

Referenced by OBChemTsfm::Apply(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), and OBAtom::MatchesSMARTS().

void WriteMapList ( std::ostream &  ofs  ) 

Debugging -- write a list of matches to the output stream.


Member Data Documentation

std::vector<bool> _growbond [protected]

Deprecated:
(Not used)

std::vector<std::vector<int> > _mlist [protected]

Pattern* _pat [protected]

std::string _str [protected]

The string of the SMARTS expression.

Referenced by OBSmartsPattern::Init(), and OBSmartsPattern::operator=().


The documentation for this class was generated from the following files: