Open Babel  3.0
OBSmartsPattern Class Reference

#include <openbabel/parsmart.h>

Public Types

enum  MatchType { All, Single, AllUnique }
 

Public Member Functions

 OBSmartsPattern ()
 
virtual ~OBSmartsPattern ()
 
 OBSmartsPattern (const OBSmartsPattern &cp)
 
OBSmartsPattern & operator= (const OBSmartsPattern &cp)
 
void WriteMapList (std::ostream &)
 
Initialization Methods
bool Init (const char *pattern)
 
bool Init (const std::string &pattern)
 
Pattern Properties
const std::string & GetSMARTS () const
 
std::string & GetSMARTS ()
 
bool Empty () const
 
bool IsValid () const
 
unsigned int NumAtoms () const
 
unsigned int NumBonds () const
 
void GetBond (int &src, int &dst, int &ord, int idx)
 
int GetAtomicNum (int idx)
 
int GetCharge (int idx)
 
int GetVectorBinding (int idx) const
 
Matching methods (SMARTS on a specific OBMol)
bool Match (OBMol &mol, bool single=false)
 
bool Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype=All) const
 
bool HasMatch (OBMol &mol) const
 
bool RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single=false)
 
bool RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single=false)
 
unsigned int NumMatches () const
 
std::vector< std::vector< int > > & GetMapList ()
 
std::vector< std::vector< int > >::iterator BeginMList ()
 
std::vector< std::vector< int > >::iterator EndMList ()
 
std::vector< std::vector< int > > & GetUMapList ()
 

Protected Member Functions

Pattern * ParseSMARTSPattern (void)
 
Pattern * ParseSMARTSPart (Pattern *, int)
 
Pattern * SMARTSError (Pattern *pat)
 
Pattern * ParseSMARTSError (Pattern *pat, BondExpr *expr)
 
AtomExpr * ParseSimpleAtomPrimitive (void)
 
AtomExpr * ParseComplexAtomPrimitive (void)
 
AtomExpr * ParseAtomExpr (int level)
 
BondExpr * ParseBondPrimitive (void)
 
BondExpr * ParseBondExpr (int level)
 
Pattern * ParseSMARTSString (char *ptr)
 
Pattern * ParseSMARTSRecord (char *ptr)
 
int GetVectorBinding ()
 
Pattern * SMARTSParser (Pattern *pat, ParseState *stat, int prev, int part)
 

Protected Attributes

OBSmartsPrivate * _d
 
std::vector< bool > _growbond
 
std::vector< std::vector< int > > _mlist
 
Pattern * _pat
 
std::string _str
 
char * _buffer
 
char * LexPtr
 
char * MainPtr
 

Detailed Description

SMARTS (SMiles ARbitrary Target Specification) substructure searching.

Substructure search is an incredibly useful tool in the context of a small molecule programming library. Having an efficient substructure search engine reduces the amount of hard code needed for molecule perception, as well as increases the flexibility of certain operations. For instance, atom typing can be easily performed based on hard coded rules of element type and bond orders (or hybridization). Alternatively, atom typing can also be done by matching a set of substructure rules read at run time. In the latter case customization based on application (such as changing the pH) becomes a facile operation. Fortunately for Open Babel and its users, Roger Sayle donated a SMARTS parser which became the basis for SMARTS matching in Open Babel.

For more information on the SMARTS support in Open Babel, see the wiki page: http://openbabel.org/wiki/SMARTS

The SMARTS matcher, or OBSmartsPattern, is a separate object which can match patterns in the OBMol class. The following code demonstrates how to use the OBSmartsPattern class:

OBMol mol;
...
OBSmartsPattern sp;
sp.Init("CC");   // parse the SMARTS pattern
sp.Match(mol);   // the matches are stored inside sp
vector<vector<int> > maplist;
maplist = sp.GetMapList();
//or maplist = sp.GetUMapList();
//print out the results
vector<vector<int> >::iterator i;
vector<int>::iterator j;
for (i = maplist.begin(); i != maplist.end(); ++i)
{
  for (j = i->begin(); j != i->end(); ++j)
    cout << *j << ' ';   // dereference the iterator to print the atom index
  cout << endl;
}

The preceding code reads in a molecule, initializes a SMARTS pattern of two single-bonded carbons, and locates all instances of the pattern in the molecule. Note that Match() returns only whether any match occurred, not the matches themselves. The results are stored in the OBSmartsPattern, and a call to GetMapList() or GetUMapList() must be made to extract them. GetMapList() returns all matches of a pattern, while GetUMapList() returns only the unique matches. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern will match both atom number permutations of the carboxylate, and if GetMapList() is called, both matches will be returned. If GetUMapList() is called, only unique matches of the pattern will be returned. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
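
The definition of a unique match can be illustrated without Open Babel at all. The following is a minimal sketch, assuming each match is a list of atom indices as returned by GetMapList(). The helper UniqueMatches is hypothetical and is not part of Open Babel, whose own implementation may differ:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not an Open Babel function): keep only the first
// match for each distinct set of atom indices.
std::vector<std::vector<int> > UniqueMatches(
    const std::vector<std::vector<int> > &all)
{
  std::set<std::vector<int> > seen;      // sorted atom sets already covered
  std::vector<std::vector<int> > unique;
  for (std::size_t i = 0; i < all.size(); ++i)
  {
    std::vector<int> key(all[i]);
    std::sort(key.begin(), key.end());   // make the comparison order-independent
    if (seen.insert(key).second)         // true only for a new atom set
      unique.push_back(all[i]);          // keep the original atom order
  }
  return unique;
}
```

For a carboxylate whose carbon is atom 2 and whose oxygens are atoms 3 and 4, the two permutations {3, 2, 4} and {4, 2, 3} cover the same atoms, so only the first one survives the filter.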

Member Enumeration Documentation

◆ MatchType

enum MatchType
Enumerator
All 
Single 
AllUnique 

Constructor & Destructor Documentation

◆ OBSmartsPattern() [1/2]

OBSmartsPattern ( )
inline

◆ ~OBSmartsPattern()

~OBSmartsPattern ( )
virtual

◆ OBSmartsPattern() [2/2]

OBSmartsPattern ( const OBSmartsPattern &  cp)
inline

Member Function Documentation

◆ ParseSMARTSPattern()

Pattern * ParseSMARTSPattern ( void  )
protected

◆ ParseSMARTSPart()

Pattern * ParseSMARTSPart ( Pattern *  result,
int  part 
)
protected

◆ SMARTSError()

Pattern * SMARTSError ( Pattern *  pat)
protected

◆ ParseSMARTSError()

Pattern * ParseSMARTSError ( Pattern *  pat,
BondExpr *  expr 
)
protected

◆ ParseSimpleAtomPrimitive()

AtomExpr * ParseSimpleAtomPrimitive ( void  )
protected

◆ ParseComplexAtomPrimitive()

AtomExpr * ParseComplexAtomPrimitive ( void  )
protected

◆ ParseAtomExpr()

AtomExpr * ParseAtomExpr ( int  level)
protected

◆ ParseBondPrimitive()

BondExpr * ParseBondPrimitive ( void  )
protected

◆ ParseBondExpr()

BondExpr * ParseBondExpr ( int  level)
protected

◆ ParseSMARTSString()

Pattern * ParseSMARTSString ( char *  ptr)
protected

◆ ParseSMARTSRecord()

Pattern * ParseSMARTSRecord ( char *  ptr)
protected

◆ GetVectorBinding() [1/2]

int GetVectorBinding ( )
protected

◆ SMARTSParser()

Pattern * SMARTSParser ( Pattern *  pat,
ParseState *  stat,
int  prev,
int  part 
)
protected

◆ operator=()

OBSmartsPattern& operator= ( const OBSmartsPattern &  cp)
inline

◆ Init() [1/2]

bool Init ( const char *  pattern)

◆ Init() [2/2]

bool Init ( const std::string &  pattern)

Parse the pattern SMARTS string.

Returns
Whether the pattern is a valid SMARTS expression

◆ GetSMARTS() [1/2]

const std::string& GetSMARTS ( ) const
inline
Returns
the SMARTS string which is currently used

◆ GetSMARTS() [2/2]

std::string& GetSMARTS ( )
inline
Returns
the SMARTS string which is currently used

◆ Empty()

bool Empty ( ) const
inline
Returns
Whether the SMARTS pattern is an empty expression (e.g., invalid)

◆ IsValid()

bool IsValid ( ) const
inline
Returns
Whether the SMARTS pattern is a valid expression

Referenced by OBRotorRule::IsValid().

◆ NumAtoms()

unsigned int NumAtoms ( ) const
inline
Returns
the number of atoms in the SMARTS pattern

Referenced by OBPhModel::ParseLine().

◆ NumBonds()

unsigned int NumBonds ( ) const
inline
Returns
the number of bonds in the SMARTS pattern

◆ GetBond()

void GetBond ( int &  src,
int &  dst,
int &  ord,
int  idx 
)

Access the bond idx in the internal pattern

Parameters
src  The index of the beginning atom
dst  The index of the end atom
ord  The bond order of this bond
idx  The index of the bond in the SMARTS pattern

◆ GetAtomicNum()

int GetAtomicNum ( int  idx)
Returns
the atomic number of the atom idx in the internal pattern

◆ GetCharge()

int GetCharge ( int  idx)
Returns
the formal charge of the atom idx in the internal pattern

◆ GetVectorBinding() [2/2]

int GetVectorBinding ( int  idx) const
inline
Returns
the vector binding of the atom idx in the internal pattern

◆ Match() [1/2]

bool Match ( OBMol &  mol,
bool  single = false 
)

Perform SMARTS matching for the pattern specified using Init().

Parameters
mol  The molecule to use for matching
single  Whether only a single match is required (faster). Default is false.
Returns
Whether matches occurred

Referenced by OBBondTyper::AssignFunctionalGroupBonds(), OBBuilder::Build(), OBMol::DoTransformations(), and OBAtom::MatchesSMARTS().

◆ Match() [2/2]

bool Match ( OBMol &  mol,
std::vector< std::vector< int > > &  mlist,
MatchType  mtype = All 
) const

Perform SMARTS matching for the pattern specified using Init(). This version is (more) thread safe.

Parameters
mol  The molecule to use for matching
mlist  The resulting match list
mtype  The match type to use. Default is All.
Returns
Whether matches occurred

◆ HasMatch()

bool HasMatch ( OBMol &  mol) const

Thread safe check for any SMARTS match

Parameters
mol  The molecule to use for matching
Returns
Whether there exists any match

◆ RestrictedMatch() [1/2]

bool RestrictedMatch ( OBMol &  mol,
std::vector< std::pair< int, int > > &  pairs,
bool  single = false 
)

◆ RestrictedMatch() [2/2]

bool RestrictedMatch ( OBMol &  mol,
OBBitVec &  bv,
bool  single = false 
)

◆ NumMatches()

unsigned int NumMatches ( ) const
inline
Returns
the number of non-unique SMARTS matches. To get the number of unique SMARTS matches, query GetUMapList().size()

◆ GetMapList()

std::vector<std::vector<int> >& GetMapList ( )
inline
Returns
the entire list of non-unique matches for this pattern
See also
GetUMapList()

Referenced by OBRotorRules::GetRotorIncrements().

◆ BeginMList()

std::vector<std::vector<int> >::iterator BeginMList ( )
inline
Returns
An iterator over the (non-unique) match list, starting at the beginning

◆ EndMList()

std::vector<std::vector<int> >::iterator EndMList ( )
inline
Returns
An iterator over the non-unique match list, set to the end

◆ GetUMapList()

std::vector< std::vector< int > > & GetUMapList ( )
Returns
the entire list of unique matches for this pattern. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.

For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern will match both atom number permutations of the carboxylate, and if GetMapList() is called, both matches will be returned. If GetUMapList() is called, only unique matches of the pattern will be returned.

Referenced by OBBondTyper::AssignFunctionalGroupBonds(), OBBuilder::Build(), and OBAtom::MatchesSMARTS().

◆ WriteMapList()

void WriteMapList ( std::ostream &  ofs)

Debugging – write a list of matches to the output stream.

Member Data Documentation

◆ _d

OBSmartsPrivate* _d
protected

Internal data storage for future expansion.

◆ _growbond

std::vector<bool> _growbond
protected
Deprecated:
(Not used)

◆ _mlist

std::vector<std::vector<int> > _mlist
protected

The list of matches.

◆ _pat

Pattern* _pat
protected

The parsed SMARTS pattern.

◆ _str

std::string _str
protected

The string of the SMARTS expression.

Referenced by OBSmartsPattern::operator=().

◆ _buffer

char* _buffer
protected

◆ LexPtr

char* LexPtr
protected

◆ MainPtr

char* MainPtr
protected

The documentation for this class was generated from the following files: