Open Babel
3.0
#include <openbabel/parsmart.h>
Public Types | |
enum | MatchType { All, Single, AllUnique } |
Public Member Functions | |
OBSmartsPattern () | |
virtual | ~OBSmartsPattern () |
OBSmartsPattern (const OBSmartsPattern &cp) | |
OBSmartsPattern & | operator= (const OBSmartsPattern &cp) |
void | WriteMapList (std::ostream &) |
Initialization Methods | |
bool | Init (const char *pattern) |
bool | Init (const std::string &pattern) |
Pattern Properties | |
const std::string & | GetSMARTS () const |
std::string & | GetSMARTS () |
bool | Empty () const |
bool | IsValid () const |
unsigned int | NumAtoms () const |
unsigned int | NumBonds () const |
void | GetBond (int &src, int &dst, int &ord, int idx) |
int | GetAtomicNum (int idx) |
int | GetCharge (int idx) |
int | GetVectorBinding (int idx) const |
Matching methods (SMARTS on a specific OBMol) | |
bool | Match (OBMol &mol, bool single=false) |
bool | Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype=All) const |
bool | HasMatch (OBMol &mol) const |
bool | RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single=false) |
bool | RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single=false) |
unsigned int | NumMatches () const |
std::vector< std::vector< int > > & | GetMapList () |
std::vector< std::vector< int > >::iterator | BeginMList () |
std::vector< std::vector< int > >::iterator | EndMList () |
std::vector< std::vector< int > > & | GetUMapList () |
Protected Member Functions | |
Pattern * | ParseSMARTSPattern (void) |
Pattern * | ParseSMARTSPart (Pattern *, int) |
Pattern * | SMARTSError (Pattern *pat) |
Pattern * | ParseSMARTSError (Pattern *pat, BondExpr *expr) |
AtomExpr * | ParseSimpleAtomPrimitive (void) |
AtomExpr * | ParseComplexAtomPrimitive (void) |
AtomExpr * | ParseAtomExpr (int level) |
BondExpr * | ParseBondPrimitive (void) |
BondExpr * | ParseBondExpr (int level) |
Pattern * | ParseSMARTSString (char *ptr) |
Pattern * | ParseSMARTSRecord (char *ptr) |
int | GetVectorBinding () |
Pattern * | SMARTSParser (Pattern *pat, ParseState *stat, int prev, int part) |
Protected Attributes | |
OBSmartsPrivate * | _d |
std::vector< bool > | _growbond |
std::vector< std::vector< int > > | _mlist |
Pattern * | _pat |
std::string | _str |
char * | _buffer |
char * | LexPtr |
char * | MainPtr |
SMARTS (SMiles ARbitrary Target Specification) substructure searching.
Substructure search is an extremely useful tool in a small-molecule programming library. An efficient substructure search engine reduces the amount of hard-coded logic needed for molecule perception and makes certain operations more flexible. For instance, atom typing can be performed with hard-coded rules based on element type and bond order (or hybridization). Alternatively, atom typing can be done by matching a set of substructure rules read at run time. In the latter case, customizing the rules for a particular application (such as changing the pH) becomes straightforward. Roger Sayle donated a SMARTS parser, which became the basis for SMARTS matching in Open Babel.
For more information on the SMARTS support in Open Babel, see the wiki page: http://openbabel.org/wiki/SMARTS
The SMARTS matcher, or OBSmartsPattern, is a separate object which can match patterns in the OBMol class. The following code demonstrates how to use the OBSmartsPattern class:
The preceding code reads in a molecule, initializes a SMARTS pattern of two single-bonded carbons, and locates all instances of the pattern in the molecule. Note that Match() does not return the results of the substructure match; it reports only whether a match was found. The results are stored in the OBSmartsPattern, and a call to GetMapList() or GetUMapList() must be made to extract them.
GetMapList() returns all matches of a pattern, while GetUMapList() returns only the unique matches. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern matches both atom-number permutations of the carboxylate, so GetMapList() returns both matches, whereas GetUMapList() returns only one of them. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
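The definition of a unique match can be illustrated without Open Babel at all. The sketch below is not the library's implementation; it is one plausible way to apply the definition, using a hypothetical FilterUnique function that keeps a match only if no earlier match covered the same set of atoms.

```cpp
#include <algorithm>
#include <set>
#include <vector>

typedef std::vector<std::vector<int> > MatchList;

// Hypothetical illustration of the "unique match" definition: two matches are
// duplicates when they cover the same atoms, regardless of atom order.
MatchList FilterUnique(const MatchList &all)
{
  std::set<std::vector<int> > seen;   // sorted atom sets already covered
  MatchList unique;
  for (MatchList::const_iterator m = all.begin(); m != all.end(); ++m) {
    std::vector<int> key(*m);
    std::sort(key.begin(), key.end()); // order-independent key
    if (seen.insert(key).second)       // true only for a new atom set
      unique.push_back(*m);            // keep the first match seen
  }
  return unique;
}
```

Given the two carboxylate permutations {3, 2, 4} and {4, 2, 3}, this function keeps only the first, because both cover atoms 2, 3 and 4.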
bool Init (const char *pattern)
Parse the pattern SMARTS string.
Referenced by patty::assign_rules(), OBBondTyper::AssignFunctionalGroupBonds(), OBBuilder::Build(), OBMol::DoTransformations(), OBBuilder::LoadFragments(), OBAtom::MatchesSMARTS(), OBRotorRule::OBRotorRule(), OBBondTyper::ParseLine(), OBAtomTyper::ParseLine(), OBRingTyper::ParseLine(), OBPhModel::ParseLine(), and patty::read_rules().
bool Init (const std::string &pattern)
Parse the pattern SMARTS string.
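const std::string & GetSMARTS () const [inline]
std::string & GetSMARTS () [inline]
bool Empty () const [inline]
bool IsValid () const [inline]
Referenced by OBRotorRule::IsValid().
unsigned int NumAtoms () const [inline]
Referenced by OBPhModel::ParseLine().
unsigned int NumBonds () const [inline]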
void GetBond (int &src, int &dst, int &ord, int idx)
Access the bond idx in the internal pattern.
src | The index of the beginning atom |
dst | The index of the end atom |
ord | The bond order of this bond |
idx | The index of the bond in the SMARTS pattern |
int GetAtomicNum (int idx)
Returns the atomic number of the atom idx in the internal pattern.
int GetCharge (int idx)
Returns the charge of the atom idx in the internal pattern.
int GetVectorBinding (int idx) const [inline]
Returns the vector binding of the atom idx in the internal pattern.
bool Match (OBMol &mol, bool single = false)
Perform SMARTS matching for the pattern specified using Init().
mol | The molecule to use for matching |
single | Whether only a single match is required (faster). Default is false. |
Referenced by OBBondTyper::AssignFunctionalGroupBonds(), OBBuilder::Build(), OBMol::DoTransformations(), and OBAtom::MatchesSMARTS().
bool Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype = All) const
Perform SMARTS matching for the pattern specified using Init(). This version is (more) thread-safe.
mol | The molecule to use for matching |
mlist | The resulting match list |
mtype | The match type to use. Default is All. |
bool HasMatch (OBMol &mol) const
Thread-safe check for any SMARTS match.
mol | The molecule to use for matching |
bool RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single = false)
Referenced by OBRotorRules::GetRotorIncrements().
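unsigned int NumMatches () const [inline]
std::vector< std::vector< int > > & GetMapList () [inline]
Referenced by OBRotorRules::GetRotorIncrements().
std::vector< std::vector< int > >::iterator BeginMList () [inline]
std::vector< std::vector< int > >::iterator EndMList () [inline]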
std::vector< std::vector< int > > & GetUMapList ()
Returns the list of unique matches for this pattern. A unique match is defined as one which does not cover the identical atoms that a previous match has covered. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern matches both atom-number permutations of the carboxylate, so GetMapList() returns both matches, whereas GetUMapList() returns only one of them.
Referenced by OBBondTyper::AssignFunctionalGroupBonds(), OBBuilder::Build(), and OBAtom::MatchesSMARTS().
void WriteMapList (std::ostream &ofs)
Debugging: write a list of matches to the output stream.
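OBSmartsPrivate * _d [protected]
Internal data storage for future expansion.
std::vector< bool > _growbond [protected]
std::vector< std::vector< int > > _mlist [protected]
The list of matches.
Pattern * _pat [protected]
The parsed SMARTS pattern.
std::string _str [protected]
The string of the SMARTS expression.
Referenced by OBSmartsPattern::operator=().
char * _buffer [protected]
char * LexPtr [protected]
char * MainPtr [protected]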