OBDescriptor Class Reference

Base class for molecular properties, descriptors or features.

#include <openbabel/descriptor.h>

Public Types

typedef std::map< const char *, OBPlugin *, CharPtrLess > PluginMapType
typedef PluginMapType::const_iterator PluginIterator

Public Member Functions

const char * TypeID ()
virtual double Predict (OBBase *pOb)
double PredictAndSave (OBBase *pOb)
virtual double GetStringValue (OBBase *pOb, std::string &svalue)
virtual bool Compare (OBBase *pOb, std::istream &ss, bool noEval)
virtual bool Display (std::string &txt, const char *param, const char *ID=NULL)
virtual const char * Description ()
virtual OBPlugin * MakeInstance (const std::vector< std::string > &)
const char * GetID () const
virtual PluginMapType & GetMap () const =0

Static Public Member Functions

static bool FilterCompare (OBBase *pOb, std::istream &ss, bool noEval)
static void AddProperties (OBBase *pOb, const std::string &DescrList)
static void DeleteProperties (OBBase *pOb, const std::string &DescrList)
static std::string GetValues (OBBase *pOb, const std::string &DescrList)
static OBPlugin * GetPlugin (const char *Type, const char *ID)
static bool ListAsVector (const char *PluginID, const char *param, std::vector< std::string > &vlist)
static void List (const char *PluginID, const char *param=NULL, std::ostream *os=&std::cout)
static std::string ListAsString (const char *PluginID, const char *param=NULL)
static std::string FirstLine (const char *txt)
static PluginIterator Begin (const char *PluginID)
static PluginIterator End (const char *PluginID)

Static Protected Member Functions

static std::string GetIdentifier (std::istream &optionText)
static double ParsePredicate (std::istream &optionText, char &ch1, char &ch2, std::string &svalue)
static bool ReadStringFromFilter (std::istream &ss, std::string &result)
static bool CompareStringWithFilter (std::istream &optionText, std::string &s, bool noEval, bool NoCompOK=false)
static bool ispunctU (char ch)
static bool MatchPairData (OBBase *pOb, std::string &s)
static PluginMapType & PluginMap ()
static PluginMapType & GetTypeMap (const char *PluginID)
static OBPlugin * BaseFindType (PluginMapType &Map, const char *ID)

Protected Attributes

const char * _id


Detailed Description

Base class for molecular properties, descriptors or features.

Since:
version 2.2

OBDescriptor and Filtering

On the command line, the option --filter filter-string converts only those molecules that meet the criteria specified in the filter-string. This is useful for selecting particular molecules from a set. It is used like this:

  babel dataset.sdf outfile.smi --filter "MW>200 SMARTS!=c1ccccc1 PUBCHEM_CACTVS_ROTATABLE_BOND<5"

The identifier "PUBCHEM_CACTVS_ROTATABLE_BOND" is the name of an attribute of an OBPairData, which has probably been imported from a property in an SDF or CML file. The identifier names are (currently) case-sensitive. A comparison is made with the value in the OBPairData. This is a numeric comparison if both operands can be converted to numbers (as in the example). If the 5 had been enclosed in single or double quotes, the comparison would have been a string comparison, which gives a different result in some cases. OBPairData is searched first to match an identifier.
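The difference between a numeric and a string comparison can be illustrated with a minimal standalone sketch. The helpers ToNumber() and LessThan() are hypothetical names introduced here for illustration; they are not part of the Open Babel API and do not reproduce its parsing rules.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true if the whole of s parses as a number.
static bool ToNumber(const std::string& s, double& out)
{
  if (s.empty())
    return false;
  char* end = nullptr;
  out = std::strtod(s.c_str(), &end);
  return end == s.c_str() + s.size();  // no unparsed characters left over
}

// Hypothetical "less than": numeric when both operands are numbers,
// otherwise a lexical string comparison.
bool LessThan(const std::string& lhs, const std::string& rhs)
{
  double a = 0.0, b = 0.0;
  if (ToNumber(lhs, a) && ToNumber(rhs, b))
    return a < b;
  return lhs < rhs;
}
```

With this sketch, "9" is less than "10" because both operands convert to numbers, whereas a plain lexical comparison of the same two strings orders "10" before "9".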

If no OBPairData attribute matches, the identifier is taken to be the ID of an OBDescriptor class object. The class OBDescriptor is the base class for classes that wrap molecular properties, descriptors or features. In the example, "MW" and "SMARTS" are OBDescriptor IDs and are case-insensitive. They are plugin classes, like fingerprints, forcefields and formats, so that new molecular features can be added or old ones removed (to prevent code bloat) without altering old code. A list of available descriptors can be obtained from the command line with babel -L descriptors, or from the functions OBPlugin::List, OBPlugin::ListAsString and OBPlugin::ListAsVector.

The filter-string is interpreted by a static function of OBDescriptor, FilterCompare(). This identifies the descriptor IDs and then calls a virtual function, Compare(), of each OBDescriptor class to interpret the rest of the relational expression, for example ">200" or "=c1ccccc1". The default version of Compare() is suitable for descriptors, like MW or logP, which return a double from their Predict() method. Classes like SMARTS, which need different semantics, provide their own.

By default, as in the example, OBDescriptor::FilterCompare() ANDs the comparisons, so that all of them must be true for the test to succeed. However, the filter-string can also be a full boolean expression, with &, |, ! and parentheses, allowing any combination of features to be selected. FilterCompare() calls itself recursively to give AND precedence over OR, and evaluation is not carried out if it is not needed.

The aim has been to make interpretation of the filter-string as liberal as possible, so that AND can be written as &&, and there can be spaces or commas in places where they are reasonable.

The base class, OBDescriptor, uses pointers to OBBase in its functions, as OBFormat does, to improve extensibility: reactions could have features too. It does mean that a dynamic_cast is needed at the start of the Predict(OBBase* pOb) functions.
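The dynamic_cast pattern mentioned above can be sketched without the library. The types Base, Molecule, Reaction and WeightDescriptor below are hypothetical stand-ins, not the Open Babel classes, and returning NaN for an unsupported object is a choice made for this sketch only.

```cpp
#include <cmath>  // std::nan

// Hypothetical stand-ins for OBBase and two classes derived from it.
struct Base { virtual ~Base() {} };
struct Molecule : Base
{
  double weight;
  explicit Molecule(double w) : weight(w) {}
};
struct Reaction : Base {};

// A descriptor that only knows how to handle Molecule objects.
struct WeightDescriptor
{
  double Predict(Base* pOb) const
  {
    // The interface takes the base type, so recover the concrete type first.
    Molecule* pMol = dynamic_cast<Molecule*>(pOb);
    if (!pMol)
      return std::nan("");  // not a Molecule: no value available (sketch's choice)
    return pMol->weight;
  }
};
```

Because the cast yields a null pointer for any other type, a descriptor written this way can be handed a Reaction, or a null pointer, without undefined behaviour.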

To use a particular descriptor, like logP, when programming with the API, use code like the following:

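  // FindType() returns NULL if no descriptor with this ID is registered
  OBDescriptor* pDescr = OBDescriptor::FindType("logP");
  double val = 0.0;
  if(pDescr)
    val = pDescr->Predict(mol); // mol must be a pointer, e.g. an OBMol*

To add the descriptor ID and the predicted data to OBPairData attached to the object, use PredictAndSave().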

This facility can be called from the command line. Use the option --add "descriptor list", which adds the requested descriptors to the molecule. They are then visible as properties in SDF and CML formats. The IDs in the list can be separated by spaces or punctuation characters. All descriptors provide an output value as a string through a virtual function, GetStringValue(OBBase* pOb, string& svalue), which assigns to svalue either the value of a string descriptor (like inchi) or a string representation of a numerical property like logP.

The classes MWFilter and TitleFilter illustrate the code that has to be provided for numerical and non-numerical descriptors.


Member Typedef Documentation

typedef std::map<const char*, OBPlugin*, CharPtrLess> PluginMapType [inherited]

typedef PluginMapType::const_iterator PluginIterator [inherited]


Member Function Documentation

const char* TypeID (  )  [inline]

virtual double Predict ( OBBase * pOb  )  [inline, virtual]

Returns:
the value of a numeric descriptor

Reimplemented in OBGroupContrib.

Referenced by OBDescriptor::Compare(), and OBDescriptor::GetStringValue().

double PredictAndSave ( OBBase * pOb  ) 

Returns:
the value of the descriptor and adds it to the object's OBPairData

Referenced by OBDescriptor::AddProperties().

double GetStringValue ( OBBase * pOb,
std::string &  svalue 
) [virtual]

For non-numeric descriptors, provides the string value in svalue and returns NaN. For numeric descriptors, provides a string representation in svalue and returns the numeric value.

This default version provides a string representation of the numeric value.

Referenced by OBDescriptor::GetValues(), and OBDescriptor::PredictAndSave().

bool Compare ( OBBase * pOb,
std::istream &  ss,
bool  noEval 
) [virtual]

Parses the filter stream for a relational expression and returns its result when applied to the chemical object.

Compare() is a virtual function and can be overridden to allow different comparison behaviour. The default implementation here is suitable for OBDescriptor classes which return a double value. The stringstream is parsed to retrieve a comparison operator, one of > < >= <= = == !=, and a numerical value. The function compares this with the value returned by Predict() and returns the result. The stringstream is left positioned after the number, and its state reflects whether any errors have occurred. If noEval is true, the parsing is as normal but Predict() is not called and the function returns false.

Referenced by OBDescriptor::FilterCompare().
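The behaviour described for the default Compare() can be approximated by the following standalone sketch. CompareNumber() is a hypothetical function, not the library's implementation. It takes the already-computed descriptor value instead of calling Predict(), and signals a parse error with failbit, which is an assumption of this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sketch: reads an operator and a number from ss and applies
// them to 'value'. Sets failbit on ss if the predicate cannot be parsed.
bool CompareNumber(double value, std::istream& ss, bool noEval)
{
  ss >> std::ws;
  std::string op;
  // Collect one or two operator characters from the set > < = !
  while (op.size() < 2 && ss.good())
  {
    int c = ss.peek();
    if (c != '>' && c != '<' && c != '=' && c != '!')
      break;
    op += static_cast<char>(ss.get());
  }
  double filterVal = 0.0;
  if (op.empty() || !(ss >> filterVal))
  {
    ss.setstate(std::ios::failbit);
    return false;
  }
  if (noEval)
    return false;  // parsed as normal, but the result is not wanted

  if (op == ">")  return value >  filterVal;
  if (op == "<")  return value <  filterVal;
  if (op == ">=") return value >= filterVal;
  if (op == "<=") return value <= filterVal;
  if (op == "=" || op == "==") return value == filterVal;
  if (op == "!=") return value != filterVal;
  ss.setstate(std::ios::failbit);  // e.g. "!" on its own, or "=>"
  return false;
}
```

After a successful call the stream is positioned just after the number, so a caller can continue parsing the next identifier from the same stream.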

bool Display ( std::string &  txt,
const char *  param,
const char *  ID = NULL 
) [virtual]

Writes information on a plugin class to the string txt. If param is a descriptor ID, displays the verbose description for that descriptor only, e.g. babel -L descriptors HBA1

Reimplemented from OBPlugin.

bool FilterCompare ( OBBase * pOb,
std::istream &  optionText,
bool  noEval 
) [static]

Interprets the --filter option string and returns the combined result of all the comparisons it contains.

The string has the form:

  PropertyID1 predicate1 [booleanOp] PropertyID2 predicate2 ...

The property IDs are the IDs of instances of an OBDescriptor class or the attributes of OBPairData, and contain only letters, numbers and underscores. The predicates must start with a punctuation character and are interpreted by the Compare() function of the OBDescriptor class. The default implementation expects a comparison operator and a number, e.g. >=1.3. Whitespace is optional and is ignored. Each predicate and this OBBase object (usually an OBMol) are passed to the Compare() function of an OBDescriptor.

The results of the comparisons are combined in a boolean expression (which can include parentheses) in the normal way. The AND operator can be & or &&, the OR operator can be | or ||, and the unary NOT is !. The expected operator precedence is achieved using recursive calls of the function. If there is no boolean operator, all the tests have to return true for the function to return true, i.e. the default is AND. If the first operand of an AND is false, or of an OR is true, the parsing of the second operand continues but no comparisons are done, since the result does not matter.

Referenced by OBMol::DoTransformations().
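The precedence and short-circuit rules described for FilterCompare() can be illustrated with a small recursive-descent evaluator. This is a sketch under stated assumptions, not the library's code: the class FilterSketch and its members are hypothetical, each test is reduced to a bare name looked up in a table rather than a descriptor ID with a predicate, and the input is a std::string rather than a stream.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

class FilterSketch
{
public:
  std::map<std::string, bool> facts;  // outcome of each named test
  int evaluations = 0;                // tests that were really evaluated

  bool Evaluate(const std::string& filter)
  {
    text_ = filter;
    pos_ = 0;
    evaluations = 0;
    return Expr(false);
  }

private:
  std::string text_;
  std::size_t pos_ = 0;

  // Next character after skipping whitespace, or '\0' at the end.
  char Peek()
  {
    while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_])))
      ++pos_;
    return pos_ < text_.size() ? text_[pos_] : '\0';
  }

  // Consumes ch if it is next; & and | may be doubled.
  bool Accept(char ch)
  {
    if (Peek() != ch)
      return false;
    ++pos_;
    if ((ch == '&' || ch == '|') && pos_ < text_.size() && text_[pos_] == ch)
      ++pos_;
    return true;
  }

  // expr := term { ("|" | "||") term }
  bool Expr(bool noEval)
  {
    bool result = Term(noEval);
    while (Accept('|'))
    {
      bool rhs = Term(noEval || result);  // left side true: parse only
      result = result || rhs;
    }
    return result;
  }

  // term := factor { ["&" | "&&"] factor }, so AND binds tighter than OR
  bool Term(bool noEval)
  {
    bool result = Factor(noEval);
    for (;;)
    {
      Accept('&');  // the AND operator is optional
      char c = Peek();
      if (c == '\0' || c == '|' || c == ')')
        break;
      std::size_t before = pos_;
      bool rhs = Factor(noEval || !result);  // left side false: parse only
      if (pos_ == before)
        break;  // unparsable input: stop rather than loop forever
      result = result && rhs;
    }
    return result;
  }

  // factor := "!" factor | "(" expr ")" | name
  bool Factor(bool noEval)
  {
    if (Accept('!'))
      return !Factor(noEval);
    if (Accept('('))
    {
      bool inner = Expr(noEval);
      Accept(')');
      return inner;
    }
    Peek();  // skip leading whitespace
    std::string name;
    while (pos_ < text_.size() &&
           (std::isalnum(static_cast<unsigned char>(text_[pos_])) || text_[pos_] == '_'))
      name += text_[pos_++];
    if (noEval || name.empty())
      return false;
    ++evaluations;
    return facts[name];
  }
};
```

Splitting the grammar into Expr() and Term() is what gives AND its higher precedence. The evaluations counter makes the skipped comparisons visible: with a true, "a | b & c" is fully parsed but only a is evaluated.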

void AddProperties ( OBBase * pOb,
const std::string &  DescrList 
) [static]

Reads list of descriptor IDs and calls PredictAndSave() for each.

Referenced by OBMol::DoTransformations().

void DeleteProperties ( OBBase * pOb,
const std::string &  DescrList 
) [static]

Deletes all the OBPairDatas whose attribute names are in the list (if they exist).

Referenced by OBMol::DoTransformations().

string GetValues ( OBBase * pOb,
const std::string &  DescrList 
) [static]

Reads a list of descriptor IDs and OBPairData names and returns a list of values, each preceded by a space or by the first character in the list if it is whitespace or punctuation.

Referenced by OBMol::DoTransformations().

string GetIdentifier ( std::istream &  optionText  )  [static, protected]

Read an identifier from the filter string.

Referenced by OBDescriptor::FilterCompare().

double ParsePredicate ( std::istream &  optionText,
char &  ch1,
char &  ch2,
std::string &  svalue 
) [static, protected]

Reads a comparison operator and the following string. Returns the numeric value of the string if it can be converted, otherwise NaN.

Referenced by OBDescriptor::CompareStringWithFilter(), and OBDescriptor::FilterCompare().

bool ReadStringFromFilter ( std::istream &  ss,
std::string &  result 
) [static, protected]

Reads a string from the filter stream, optionally preceded by = or !=.

Returns:
false if != operator found, and true otherwise.

On entry the stringstream position should be just after the ID. On exit it is after the string. If there is an error, the stringstream badbit is set. The string can be of any of the following forms:

  mystring  =mystring  ==mystring                      [must be terminated by a space or tab]
  "mystring"  'mystring'  ="mystring"  ='mystring'     [mystring can contain spaces or tabs]
  !=mystring  !="mystring"                             [returns false, indicating negation]

There can be spaces or tabs after the operator (=, == or !=).

Referenced by OBDescriptor::ParsePredicate().
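The accepted forms for ReadStringFromFilter() can be explored with a simplified standalone reader. ReadFilterString() below is a hypothetical sketch, not the library's implementation; in particular, treating a lone ! as an error that sets badbit is an assumption of this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sketch. Returns false if the string was preceded by !=.
bool ReadFilterString(std::istream& ss, std::string& result)
{
  result.clear();
  bool negate = false;
  ss >> std::ws;
  if (ss.peek() == '!')
  {
    ss.get();
    if (ss.peek() != '=')
    {
      ss.setstate(std::ios::badbit);  // a lone ! is not a valid operator here
      return false;
    }
    negate = true;
  }
  while (ss.peek() == '=')  // accepts =, == and the = of !=
    ss.get();
  ss >> std::ws;            // spaces or tabs may follow the operator
  int c = ss.peek();
  if (c == '"' || c == '\'')
  {
    ss.get();                                        // opening quote
    std::getline(ss, result, static_cast<char>(c));  // up to the closing quote
  }
  else
    ss >> result;                                    // up to the next whitespace
  return !negate;
}
```

Quoted strings are read up to the matching quote and so may contain spaces, while unquoted strings end at the first whitespace character.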

bool CompareStringWithFilter ( std::istream &  optionText,
std::string &  s,
bool  noEval,
bool  NoCompOK = false 
) [static, protected]

Makes a comparison using the operator and a string read from the filter stream with a provided string.

Returns:
the result of the comparison, or true if NoCompOK==true and there is no comparison operator.

Referenced by OBDescriptor::FilterCompare().

static bool ispunctU ( char  ch  )  [inline, static, protected]

bool MatchPairData ( OBBase * pOb,
std::string &  s 
) [static, protected]

Returns:
true if s (with or without _ replaced by spaces) is a PairData attribute. On return s is the form which matches.

Referenced by OBDescriptor::DeleteProperties(), OBDescriptor::FilterCompare(), and OBDescriptor::GetValues().
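The underscore rule described for MatchPairData() can be sketched with a plain std::map standing in for the data attached to the object. The PairData typedef and MatchAttribute() are hypothetical names used for illustration only.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical stand-in for the attribute/value pairs attached to an object.
typedef std::map<std::string, std::string> PairData;

// Returns true if s, or s with underscores replaced by spaces, is a key.
// On success s holds the form that matched.
bool MatchAttribute(const PairData& data, std::string& s)
{
  if (data.count(s))
    return true;  // exact match, s is left unchanged
  std::string spaced(s);
  std::replace(spaced.begin(), spaced.end(), '_', ' ');
  if (data.count(spaced))
  {
    s = spaced;   // report the form that is actually stored
    return true;
  }
  return false;
}
```

This lets an attribute whose name contains spaces, such as "Melting Point", be written as Melting_Point in a filter string, where identifiers cannot contain spaces.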

virtual const char* Description (  )  [inline, virtual, inherited]

Required description of a sub-type.

Reimplemented in OBFormat, OBGroupContrib, and OpTransform.

Referenced by OBPlugin::Display(), and OBOp::OpOptions().

virtual OBPlugin* MakeInstance ( const std::vector< std::string > &   )  [inline, virtual, inherited]

Make a new instance of the class. See the OpTransform, OBGroupContrib and SmartsDescriptor classes for derived versions. Usually, the first parameter is the class name, the next three are parameters (ID, filename, description) for a constructor, and the rest are data.

Reimplemented in OBGroupContrib, and OpTransform.

Referenced by OBConversion::LoadFormatFiles().

static OBPlugin* GetPlugin ( const char *  Type,
const char *  ID 
) [inline, static, inherited]

Get a pointer to a plugin from its type and ID. Return NULL if not found. Not cast to Type*.

Referenced by OBConversion::LoadFormatFiles().

const char* GetID (  )  const [inline, inherited]

Return the ID of the sub-type instance.

Referenced by OBPlugin::Display(), OBFormat::Display(), and OBDescriptor::PredictAndSave().

bool ListAsVector ( const char *  PluginID,
const char *  param,
std::vector< std::string > &  vlist 
) [static, inherited]

Output a list of sub-type classes of the type PluginID, or, if PluginID is "plugins" or empty, a list of the base types. If PluginID is not recognized or is NULL, the base types are output and the return is false.

Referenced by OBConversion::GetSupportedInputFormat(), OBConversion::GetSupportedOutputFormat(), and OBPlugin::List().

void List ( const char *  PluginID,
const char *  param = NULL,
std::ostream *  os = &std::cout 
) [static, inherited]

As ListAsVector but sent to an ostream with a default of cout if not specified.

Referenced by OBPlugin::ListAsString().

string ListAsString ( const char *  PluginID,
const char *  param = NULL 
) [static, inherited]

As ListAsVector but returns a string containing the list.

string FirstLine ( const char *  txt  )  [static, inherited]

Utility function to return only the first line of a string.

Referenced by OBPlugin::Display(), OBFormat::Display(), and OBOp::OpOptions().

static PluginIterator Begin ( const char *  PluginID  )  [inline, static, inherited]

Return an iterator at the start of the map of the plugin types PluginID or, if there is no such map, the end of the top level plugin map.

Referenced by OBConversion::GetNextFormat(), and OBOp::OpOptions().

static PluginIterator End ( const char *  PluginID  )  [inline, static, inherited]

virtual PluginMapType& GetMap (  )  const [pure virtual, inherited]

Returns the map of the subtypes.

Referenced by OBFormat::RegisterFormat().

static PluginMapType& PluginMap (  )  [inline, static, protected, inherited]

Returns a reference to the map of the plugin types. Is a function rather than a static member variable to avoid initialization problems.

Referenced by OBPlugin::GetTypeMap(), OBPlugin::ListAsVector(), and OBFormat::RegisterFormat().
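The reason for wrapping the map in a function, as PluginMap() does, can be shown with a generic sketch of the idiom. Plugin, CStrLess, Registry() and SelfRegistering are hypothetical names, and the comparator here is a simple case-sensitive one chosen for the sketch, not necessarily how CharPtrLess behaves.

```cpp
#include <cstring>
#include <map>

struct Plugin { const char* id; };  // hypothetical stand-in for OBPlugin

// Hypothetical comparator: orders C strings by content, not by address.
struct CStrLess
{
  bool operator()(const char* a, const char* b) const
  {
    return std::strcmp(a, b) < 0;
  }
};

typedef std::map<const char*, Plugin*, CStrLess> PluginMap;

// The map is constructed on first use, so it already exists when a global
// object in another translation unit registers itself during static
// initialization.
PluginMap& Registry()
{
  static PluginMap theMap;
  return theMap;
}

// A global instance of this type registers itself in its constructor.
struct SelfRegistering : Plugin
{
  explicit SelfRegistering(const char* name)
  {
    id = name;
    Registry()[name] = this;
  }
};

static SelfRegistering theExample("example");
```

Had the map been a namespace-scope static variable, the order in which it and theExample are constructed across translation units would be unspecified. The function-local static avoids that dependency.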

OBPlugin::PluginMapType & GetTypeMap ( const char *  PluginID  )  [static, protected, inherited]

Returns the map of a particular plugin type, e.g. GetTypeMap("fingerprints").

static OBPlugin* BaseFindType ( PluginMapType & Map,
const char *  ID 
) [inline, static, protected, inherited]

Returns the type with the specified ID, or NULL if not found. Will be cast to the appropriate class in the calling routine.


Member Data Documentation

const char* _id [protected, inherited]


The documentation for this class was generated from the following files: