OBSmartsPattern Class Reference
[Substructure Searching]
SMARTS (SMiles ARbitrary Target Specification) substructure searching.
#include <openbabel/parsmart.h>
Public Types | |
enum | MatchType { All, Single, AllUnique } |
Public Member Functions | |
OBSmartsPattern () | |
virtual | ~OBSmartsPattern () |
OBSmartsPattern (const OBSmartsPattern &cp) | |
OBSmartsPattern & | operator= (const OBSmartsPattern &cp) |
void | WriteMapList (std::ostream &) |
Initialization Methods | |
bool | Init (const char *pattern) |
bool | Init (const std::string &pattern) |
Pattern Properties | |
const std::string & | GetSMARTS () const |
std::string & | GetSMARTS () |
bool | Empty () const |
bool | IsValid () const |
unsigned int | NumAtoms () const |
unsigned int | NumBonds () const |
void | GetBond (int &src, int &dst, int &ord, int idx) |
int | GetAtomicNum (int idx) |
int | GetCharge (int idx) |
int | GetVectorBinding (int idx) const |
Matching methods (SMARTS on a specific OBMol) | |
bool | Match (OBMol &mol, bool single=false) |
bool | Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype=All) const |
bool | HasMatch (OBMol &mol) const |
bool | RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single=false) |
bool | RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single=false) |
unsigned int | NumMatches () const |
std::vector< std::vector< int > > & | GetMapList () |
std::vector< std::vector< int > >::iterator | BeginMList () |
std::vector< std::vector< int > >::iterator | EndMList () |
std::vector< std::vector< int > > & | GetUMapList () |
Protected Member Functions | |
Pattern * | ParseSMARTSPattern (void) |
Pattern * | ParseSMARTSPart (Pattern *, int) |
Pattern * | SMARTSError (Pattern *pat) |
Pattern * | ParseSMARTSError (Pattern *pat, BondExpr *expr) |
AtomExpr * | ParseSimpleAtomPrimitive (void) |
AtomExpr * | ParseComplexAtomPrimitive (void) |
AtomExpr * | ParseAtomExpr (int level) |
BondExpr * | ParseBondPrimitive (void) |
BondExpr * | ParseBondExpr (int level) |
Pattern * | ParseSMARTSString (char *ptr) |
Pattern * | ParseSMARTSRecord (char *ptr) |
int | GetVectorBinding () |
Pattern * | SMARTSParser (Pattern *pat, ParseState *stat, int prev, int part) |
Protected Attributes | |
OBSmartsPrivate * | _d |
std::vector< bool > | _growbond |
std::vector< std::vector< int > > | _mlist |
Pattern * | _pat |
std::string | _str |
char * | _buffer |
char * | LexPtr |
char * | MainPtr |
Detailed Description
SMARTS (SMiles ARbitrary Target Specification) substructure searching.
Substructure search is an essential tool in a small-molecule programming library. An efficient substructure search engine reduces the amount of hard-coded logic needed for molecule perception and increases the flexibility of certain operations. For instance, atom typing can be performed with hard-coded rules based on element type and bond order (or hybridization). Alternatively, atom typing can be done by matching a set of substructure rules read at run time. In the latter case, customization for a particular application (such as changing the pH) becomes straightforward. Fortunately for Open Babel and its users, Roger Sayle donated a SMARTS parser, which became the basis for SMARTS matching in Open Babel.
For more information on the SMARTS support in Open Babel, see the wiki page: http://openbabel.org/wiki/SMARTS
The SMARTS matcher, or OBSmartsPattern, is a separate object which can match patterns in the OBMol class. The following code demonstrates how to use the OBSmartsPattern class:
OBMol mol;
...
OBSmartsPattern sp;
sp.Init("CC");
sp.Match(mol);
vector<vector<int> > maplist;
maplist = sp.GetMapList();
// or maplist = sp.GetUMapList();
// print out the results
vector<vector<int> >::iterator i;
vector<int>::iterator j;
for (i = maplist.begin(); i != maplist.end(); ++i)
{
  for (j = i->begin(); j != i->end(); ++j)
    cout << *j << ' ';  // dereference the iterator to print the atom index
  cout << endl;
}
The preceding code reads in a molecule, initializes a SMARTS pattern of two single-bonded carbons, and locates all instances of the pattern in the molecule. Note that Match() returns only whether any match was found, not the matches themselves. The results of a match are stored in the OBSmartsPattern, and a call to GetMapList() or GetUMapList() must be made to extract them. GetMapList() returns all matches of a pattern, while GetUMapList() returns only the unique matches. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern matches both atom number permutations of the carboxylate, so GetMapList() returns both matches, whereas GetUMapList() returns only one. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
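The definition of a unique match can be illustrated without Open Babel at all. The sketch below is a stand-alone illustration, not the library's implementation: the helper name FilterUniqueMatches is hypothetical, and it assumes that two matches "cover the identical atoms" when they contain the same set of atom indices, regardless of order.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not part of Open Babel): keep only the first match
// seen for each distinct set of atom indices.
std::vector<std::vector<int> > FilterUniqueMatches(
    const std::vector<std::vector<int> > &matches)
{
  std::set<std::vector<int> > seen;          // sorted atom sets already kept
  std::vector<std::vector<int> > unique;

  for (std::size_t i = 0; i < matches.size(); ++i) {
    std::vector<int> key(matches[i]);
    std::sort(key.begin(), key.end());       // order-independent key
    if (seen.insert(key).second)             // true only for a new atom set
      unique.push_back(matches[i]);          // keep the original atom order
  }
  return unique;
}
```

Given two carboxylate permutations such as {3, 2, 4} and {4, 2, 3} (illustrative indices), this helper keeps only the first, which mirrors the difference between GetMapList() and GetUMapList() described above.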
Member Enumeration Documentation
enum MatchType |
Constructor & Destructor Documentation
OBSmartsPattern | ( | ) | [inline] |
~OBSmartsPattern | ( | ) | [virtual] |
OBSmartsPattern | ( | const OBSmartsPattern & | cp | ) | [inline] |
Member Function Documentation
Pattern * ParseSMARTSPattern | ( | void | ) | [protected] |
Referenced by OBSmartsPattern::ParseComplexAtomPrimitive(), and OBSmartsPattern::ParseSMARTSString().
Referenced by OBSmartsPattern::ParseSMARTSPattern().
Referenced by OBSmartsPattern::SMARTSParser().
AtomExpr * ParseSimpleAtomPrimitive | ( | void | ) | [protected] |
Referenced by OBSmartsPattern::SMARTSParser().
AtomExpr * ParseComplexAtomPrimitive | ( | void | ) | [protected] |
Referenced by OBSmartsPattern::ParseAtomExpr().
AtomExpr * ParseAtomExpr | ( | int | level | ) | [protected] |
Referenced by OBSmartsPattern::SMARTSParser().
BondExpr * ParseBondPrimitive | ( | void | ) | [protected] |
Referenced by OBSmartsPattern::ParseBondExpr().
BondExpr * ParseBondExpr | ( | int | level | ) | [protected] |
Referenced by OBSmartsPattern::SMARTSParser().
Pattern * ParseSMARTSString | ( | char * | ptr | ) | [protected] |
Referenced by OBSmartsPattern::ParseSMARTSRecord().
Pattern * ParseSMARTSRecord | ( | char * | ptr | ) | [protected] |
Referenced by OBSmartsPattern::Init().
int GetVectorBinding | ( | ) | [protected] |
Referenced by OBChemTsfm::Init(), and OBSmartsPattern::SMARTSParser().
Pattern * SMARTSParser | ( | Pattern * | pat, | |
ParseState * | stat, | |||
int | prev, | |||
int | part | |||
) | [protected] |
Referenced by OBSmartsPattern::ParseSMARTSPart().
OBSmartsPattern& operator= | ( | const OBSmartsPattern & | cp | ) | [inline] |
bool Init | ( | const char * | pattern | ) |
Parse the pattern SMARTS string.
- Returns:
- Whether the pattern is a valid SMARTS expression
Referenced by patty::assign_rules(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), OBMol::DoTransformations(), OBChemTsfm::Init(), OBBuilder::LoadFragments(), OBAtom::MatchesSMARTS(), OBMol::NewPerceiveKekuleBonds(), OBRotorList::OBRotorList(), OBAromaticTyper::ParseLine(), OBRingTyper::ParseLine(), OBAtomTyper::ParseLine(), OBPhModel::ParseLine(), OBBondTyper::ParseLine(), and patty::read_rules().
bool Init | ( | const std::string & | pattern | ) |
Parse the pattern SMARTS string.
- Returns:
- Whether the pattern is a valid SMARTS expression
const std::string& GetSMARTS | ( | ) | const [inline] |
- Returns:
- the SMARTS string which is currently used
std::string& GetSMARTS | ( | ) | [inline] |
- Returns:
- the SMARTS string which is currently used
bool Empty | ( | ) | const [inline] |
- Returns:
- If the SMARTS pattern is an empty expression (e.g., invalid)
bool IsValid | ( | ) | const [inline] |
- Returns:
- If the SMARTS pattern is a valid expression
unsigned int NumAtoms | ( | ) | const [inline] |
- Returns:
- the number of atoms in the SMARTS pattern
Referenced by OBChemTsfm::Init(), OBChemTsfm::IsAcid(), OBChemTsfm::IsBase(), and OBPhModel::ParseLine().
unsigned int NumBonds | ( | ) | const [inline] |
- Returns:
- the number of bonds in the SMARTS pattern
Referenced by OBChemTsfm::Init().
void GetBond | ( | int & | src, | |
int & | dst, | |||
int & | ord, | |||
int | idx | |||
) |
Access the bond idx in the internal pattern
- Parameters:
- src: The index of the beginning atom
- dst: The index of the end atom
- ord: The bond order of this bond
- idx: The index of the bond in the SMARTS pattern
Referenced by OBChemTsfm::Init().
int GetAtomicNum | ( | int | idx | ) |
- Returns:
- the atomic number of the atom idx in the internal pattern
Referenced by OBChemTsfm::Init().
int GetCharge | ( | int | idx | ) |
- Returns:
- the formal charge of the atom idx in the internal pattern
Referenced by OBChemTsfm::Init(), OBChemTsfm::IsAcid(), and OBChemTsfm::IsBase().
int GetVectorBinding | ( | int | idx | ) | const [inline] |
- Returns:
- the vector binding of the atom idx in the internal pattern
bool Match | ( | OBMol & | mol, | |
bool | single = false | |||
) |
Perform SMARTS matching for the pattern specified using Init().
- Parameters:
- mol: The molecule to use for matching
- single: Whether only a single match is required (faster). Default is false.
- Returns:
- Whether matches occurred
Referenced by OBChemTsfm::Apply(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), OBMol::DoTransformations(), OBSmartsPattern::HasMatch(), OBAtom::MatchesSMARTS(), and OBMol::NewPerceiveKekuleBonds().
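bool Match | ( | OBMol & | mol, | |
std::vector< std::vector< int > > & | mlist, | |||
MatchType | mtype = All | |||
) | const |
Perform SMARTS matching for the pattern specified using Init(). This version is (more) thread safe.
- Parameters:
- mol: The molecule to use for matching
- mlist: The resulting match list
- mtype: The match type to use. Default is All.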
- Returns:
- Whether matches occurred
bool HasMatch | ( | OBMol & | mol | ) | const |
Thread safe check for any SMARTS match
- Parameters:
- mol: The molecule to use for matching
- Returns:
- Whether there exists any match
bool RestrictedMatch | ( | OBMol & | mol, | |
std::vector< std::pair< int, int > > & | pairs, | |||
bool | single = false | |||
) |
Referenced by OBRotorRules::GetRotorIncrements().
unsigned int NumMatches | ( | ) | const [inline] |
- Returns:
- the number of non-unique SMARTS matches. To get the number of unique SMARTS matches, query GetUMapList().size()
std::vector<std::vector<int> >& GetMapList | ( | ) | [inline] |
- Returns:
- the entire list of non-unique matches for this pattern
- See also:
- GetUMapList()
Referenced by OBRotorRules::GetRotorIncrements().
std::vector<std::vector<int> >::iterator BeginMList | ( | ) | [inline] |
- Returns:
- An iterator over the (non-unique) match list, starting at the beginning
std::vector<std::vector<int> >::iterator EndMList | ( | ) | [inline] |
- Returns:
- An iterator over the non-unique match list, set to the end
std::vector< std::vector< int > > & GetUMapList | ( | ) |
- Returns:
- the entire list of unique matches for this pattern. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern will match both atom number permutations of the carboxylate, and if GetMapList() is called, both matches will be returned. If GetUMapList() is called only unique matches of the pattern will be returned.
Referenced by OBChemTsfm::Apply(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), and OBAtom::MatchesSMARTS().
void WriteMapList | ( | std::ostream & | ofs | ) |
Debugging -- write a list of matches to the output stream.
Member Data Documentation
OBSmartsPrivate* _d [protected] |
Internal data storage for future expansion.
std::vector<bool> _growbond [protected] |
- Deprecated:
- (Not used)
std::vector<std::vector<int> > _mlist [protected] |
The list of matches.
Referenced by OBSmartsPattern::GetUMapList(), OBSmartsPattern::Match(), OBSmartsPattern::RestrictedMatch(), and OBSmartsPattern::WriteMapList().
Pattern* _pat [protected] |
The parsed SMARTS pattern.
Referenced by OBSmartsPattern::GetAtomicNum(), OBSmartsPattern::GetBond(), OBSmartsPattern::GetCharge(), OBSmartsPattern::Init(), OBSmartsPattern::Match(), OBSmartsPattern::RestrictedMatch(), and OBSmartsPattern::~OBSmartsPattern().
std::string _str [protected] |
The string of the SMARTS expression.
Referenced by OBSmartsPattern::Init(), and OBSmartsPattern::operator=().
char* _buffer [protected] |
Referenced by OBSmartsPattern::Init(), and OBSmartsPattern::~OBSmartsPattern().
char* LexPtr [protected] |
Referenced by OBSmartsPattern::GetVectorBinding(), OBSmartsPattern::ParseAtomExpr(), OBSmartsPattern::ParseBondExpr(), OBSmartsPattern::ParseBondPrimitive(), OBSmartsPattern::ParseComplexAtomPrimitive(), OBSmartsPattern::ParseSimpleAtomPrimitive(), OBSmartsPattern::ParseSMARTSPattern(), OBSmartsPattern::ParseSMARTSString(), OBSmartsPattern::SMARTSError(), and OBSmartsPattern::SMARTSParser().
char* MainPtr [protected] |
Referenced by OBSmartsPattern::ParseSMARTSString(), and OBSmartsPattern::SMARTSError().
The documentation for this class was generated from the following files: