SMARTS (SMiles ARbitrary Target Specification) substructure searching.
#include <openbabel/parsmart.h>
Public Types |
enum | MatchType { All, Single, AllUnique } |
Public Member Functions |
| OBSmartsPattern () |
virtual | ~OBSmartsPattern () |
| OBSmartsPattern (const OBSmartsPattern &cp) |
OBSmartsPattern & | operator= (const OBSmartsPattern &cp) |
void | WriteMapList (std::ostream &) |
|
bool | Init (const char *pattern) |
bool | Init (const std::string &pattern) |
|
const std::string & | GetSMARTS () const |
std::string & | GetSMARTS () |
bool | Empty () const |
bool | IsValid () const |
unsigned int | NumAtoms () const |
unsigned int | NumBonds () const |
void | GetBond (int &src, int &dst, int &ord, int idx) |
int | GetAtomicNum (int idx) |
int | GetCharge (int idx) |
int | GetVectorBinding (int idx) const |
|
bool | Match (OBMol &mol, bool single=false) |
bool | Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype=All) const |
bool | HasMatch (OBMol &mol) const |
bool | RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single=false) |
bool | RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single=false) |
unsigned int | NumMatches () const |
std::vector< std::vector< int > > & | GetMapList () |
std::vector< std::vector< int > >::iterator | BeginMList () |
std::vector< std::vector< int > >::iterator | EndMList () |
std::vector< std::vector< int > > & | GetUMapList () |
Protected Member Functions |
Pattern * | ParseSMARTSPattern (void) |
Pattern * | ParseSMARTSPart (Pattern *, int) |
Pattern * | SMARTSError (Pattern *pat) |
Pattern * | ParseSMARTSError (Pattern *pat, BondExpr *expr) |
AtomExpr * | ParseSimpleAtomPrimitive (void) |
AtomExpr * | ParseComplexAtomPrimitive (void) |
AtomExpr * | ParseAtomExpr (int level) |
BondExpr * | ParseBondPrimitive (void) |
BondExpr * | ParseBondExpr (int level) |
Pattern * | ParseSMARTSString (char *ptr) |
Pattern * | ParseSMARTSRecord (char *ptr) |
int | GetVectorBinding () |
Pattern * | SMARTSParser (Pattern *pat, ParseState *stat, int prev, int part) |
Protected Attributes |
OBSmartsPrivate * | _d |
std::vector< bool > | _growbond |
std::vector< std::vector< int > > | _mlist |
Pattern * | _pat |
std::string | _str |
char * | _buffer |
char * | LexPtr |
char * | MainPtr |
Detailed Description
Substructure search is an incredibly useful tool in the context of a small molecule programming library. Having an efficient substructure search engine reduces the amount of hard-coded logic needed for molecule perception and increases the flexibility of certain operations. For instance, atom typing can be performed with hard-coded rules based on element type and bond orders (or hybridization). Alternatively, atom typing can be done by matching a set of substructure rules read at run time. In the latter case, customization based on the application (such as changing the pH) becomes a facile operation. Fortunately for Open Babel and its users, Roger Sayle donated a SMARTS parser which became the basis for SMARTS matching in Open Babel.
For more information on the SMARTS support in Open Babel, see the wiki page: http://openbabel.org/wiki/SMARTS
The SMARTS matcher, or OBSmartsPattern, is a separate object which can match patterns in the OBMol class. The following code demonstrates how to use the OBSmartsPattern class:
OBMol mol;
...
OBSmartsPattern sp;
sp.Init("CC");              // parse the SMARTS pattern
sp.Match(mol);              // results are stored inside sp
vector<vector<int> > maplist;
maplist = sp.GetMapList();  // copy out all matches
vector<vector<int> >::iterator i;
vector<int>::iterator j;
for (i = maplist.begin(); i != maplist.end(); ++i)
{
  // print the atom indices of one match on a single line
  for (j = i->begin(); j != i->end(); ++j)
    cout << *j << ' ';
  cout << endl;
}
The preceding code reads in a molecule, initializes a SMARTS pattern of two single-bonded carbons, and locates all instances of the pattern in the molecule. Note that calling the Match() function does not return the results of the substructure match. The results from a match are stored in the OBSmartsPattern, and a call to GetMapList() or GetUMapList() must be made to extract the results. The function GetMapList() returns all matches of a particular pattern while GetUMapList() returns only the unique matches. For instance, the pattern [OD1]~C~[OD1] describes a carboxylate group. This pattern will match both atom number permutations of the carboxylate, and if GetMapList() is called, both matches will be returned. If GetUMapList() is called only unique matches of the pattern will be returned. A unique match is defined as one which does not cover the identical atoms that a previous match has covered.
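The uniqueness rule described above can be sketched without Open Babel. The helper below is hypothetical (UniqueMatches is not part of the library, and this is not how GetUMapList() is implemented); it only illustrates the stated definition by treating two matches as identical when they cover the same set of atom indices, regardless of order.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not part of Open Babel): keep only the matches
// whose set of atom indices was not covered by an earlier match.
std::vector<std::vector<int> > UniqueMatches(
    const std::vector<std::vector<int> > &all)
{
  std::set<std::vector<int> > seen;   // sorted atom sets already covered
  std::vector<std::vector<int> > unique;
  for (std::size_t i = 0; i < all.size(); ++i) {
    std::vector<int> key(all[i]);
    std::sort(key.begin(), key.end()); // make the comparison order-independent
    if (seen.insert(key).second)       // true only for a new atom set
      unique.push_back(all[i]);        // keep the original atom order
  }
  return unique;
}
```

With made-up indices, a carboxylate matched as {1, 2, 3} and again as {3, 2, 1} would be reduced to the first of the two, while a second carboxylate at {4, 5, 6} would be kept.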
Member Enumeration Documentation
Constructor & Destructor Documentation
Member Function Documentation
Pattern * ParseSMARTSPattern (void) [protected]
AtomExpr * ParseSimpleAtomPrimitive (void) [protected]
AtomExpr * ParseComplexAtomPrimitive (void) [protected]
AtomExpr * ParseAtomExpr (int level) [protected]
BondExpr * ParseBondPrimitive (void) [protected]
BondExpr * ParseBondExpr (int level) [protected]
Pattern * ParseSMARTSString (char *ptr) [protected]
Pattern * ParseSMARTSRecord (char *ptr) [protected]
int GetVectorBinding () [protected]
bool Init (const char *pattern)
Parse the pattern SMARTS string.
- Returns:
- Whether the pattern is a valid SMARTS expression
Referenced by patty::assign_rules(), OBBondTyper::AssignFunctionalGroupBonds(), OpenBabel::CorrectBadResonanceForm(), OBMol::DoTransformations(), OBBuilder::LoadFragments(), OBAtom::MatchesSMARTS(), OBMol::NewPerceiveKekuleBonds(), OBRotorList::OBRotorList(), OBRotorRule::OBRotorRule(), OBAromaticTyper::ParseLine(), OBRingTyper::ParseLine(), OBAtomTyper::ParseLine(), OBPhModel::ParseLine(), OBBondTyper::ParseLine(), and patty::read_rules().
bool Init (const std::string &pattern)
Parse the pattern SMARTS string.
- Returns:
- Whether the pattern is a valid SMARTS expression
const std::string& GetSMARTS () const [inline]
- Returns:
- the SMARTS string which is currently used
std::string& GetSMARTS () [inline]
- Returns:
- the SMARTS string which is currently used
bool Empty () const [inline]
- Returns:
- Whether the SMARTS pattern is an empty expression (e.g., invalid)
bool IsValid () const [inline]
- Returns:
- Whether the SMARTS pattern is a valid expression
unsigned int NumAtoms () const [inline]
unsigned int NumBonds () const [inline]
- Returns:
- the number of bonds in the SMARTS pattern
void GetBond (int &src, int &dst, int &ord, int idx)
Access the bond idx in the internal pattern
- Parameters:
-
src | The index of the beginning atom |
dst | The index of the end atom |
ord | The bond order of this bond |
idx | The index of the bond in the SMARTS pattern |
int GetAtomicNum (int idx)
- Returns:
- the atomic number of the atom idx in the internal pattern
int GetCharge (int idx)
- Returns:
- the formal charge of the atom idx in the internal pattern
int GetVectorBinding (int idx) const [inline]
- Returns:
- the vector binding of the atom idx in the internal pattern
bool Match (OBMol &mol, bool single = false)
bool Match (OBMol &mol, std::vector< std::vector< int > > &mlist, MatchType mtype = All) const
Perform SMARTS matching for the pattern specified using Init(). This version is (more) thread safe.
- Parameters:
-
mol | The molecule to use for matching |
mlist | The resulting match list |
mtype | The match type to use. Default is All. |
- Returns:
- Whether matches occurred
bool HasMatch (OBMol &mol) const
Thread safe check for any SMARTS match
- Parameters:
-
mol | The molecule to use for matching |
- Returns:
- Whether there exists any match
bool RestrictedMatch (OBMol &mol, std::vector< std::pair< int, int > > &pairs, bool single = false)
bool RestrictedMatch (OBMol &mol, OBBitVec &bv, bool single = false)
unsigned int NumMatches () const [inline]
- Returns:
- the number of non-unique SMARTS matches. To get the number of unique SMARTS matches, query GetUMapList().size()
std::vector<std::vector<int> >& GetMapList () [inline]
std::vector<std::vector<int> >::iterator BeginMList () [inline]
- Returns:
- An iterator over the (non-unique) match list, starting at the beginning
std::vector<std::vector<int> >::iterator EndMList () [inline]
- Returns:
- An iterator over the non-unique match list, set to the end
std::vector< std::vector< int > > & GetUMapList ()
void WriteMapList (std::ostream &ofs)
Debugging -- write a list of matches to the output stream.
Member Data Documentation
OBSmartsPrivate* _d [protected] |
Internal data storage for future expansion.
std::vector<std::vector<int> > _mlist [protected] |
The list of matches.
Pattern* _pat [protected] |
The parsed SMARTS pattern.
std::string _str [protected] |