OBConversion Class Reference

Class to convert from one format to another.

#include <obconversion.h>

Option handling

enum  Option_type { INOPTIONS, OUTOPTIONS, GENOPTIONS }
 Three types of options set on the command line by -a? , -x? , or -?
const char * IsOption (const char *opt, Option_type opttyp=OUTOPTIONS)
 Determine whether an option is set. Returns NULL if the option is not set, and a pointer to the associated text if it is.
const std::map< std::string, std::string > * GetOptions (Option_type opttyp)
 Access the map with option name as key and any associated text as value.
void AddOption (const char *opt, Option_type opttyp, const char *txt=NULL)
 Set an option of specified type, with optional text.
bool RemoveOption (const char *opt, Option_type optype)
void SetOptions (const char *options, Option_type opttyp)
 Set several single character options of specified type from string like ab"btext"c"ctext".
static void RegisterOptionParam (std::string name, OBFormat *pFormat, int numberParams=0, Option_type typ=OUTOPTIONS)
 For example -h takes 0 parameters; -f takes 1. Call in a format constructor.
static int GetOptionParams (std::string name, Option_type typ)
 Returns the number of parameters registered for the option, or 0 if not found.

Convenience functions

bool Write (OBBase *pOb, std::ostream *pout=NULL)
 Outputs an object of a class derived from OBBase.
std::string WriteString (OBBase *pOb)
 Outputs an object of a class derived from OBBase as a string.
bool WriteFile (OBBase *pOb, std::string filePath)
 Outputs an object of a class derived from OBBase as a file (with the supplied path).
bool Read (OBBase *pOb, std::istream *pin=NULL)
 Reads an object of a class derived from OBBase into pOb.
bool ReadString (OBBase *pOb, std::string input)
 Reads an object of a class derived from OBBase into pOb from the supplied string.
bool ReadFile (OBBase *pOb, std::string filePath)
 Reads an object of a class derived from OBBase into pOb from the file specified.
static OBFormat * GetDefaultFormat ()
 The default format is set in a single OBFormat class (generally it is OBMol).
static std::string BatchFileName (std::string &BaseName, std::string &InFile)
 Replaces * in BaseName by InFile without extension and path.
static std::string IncrementedFileName (std::string &BaseName, const int Count)
 Replaces * in BaseName by Count.

Public Member Functions

Parameter get and set
std::istream * GetInStream () const
std::ostream * GetOutStream () const
void SetInStream (std::istream *pIn)
void SetOutStream (std::ostream *pOut)
bool SetInAndOutFormats (const char *inID, const char *outID)
 Sets the formats from their ids, e.g. CML.
bool SetInAndOutFormats (OBFormat *pIn, OBFormat *pOut)
bool SetInFormat (const char *inID)
bool SetInFormat (OBFormat *pIn)
bool SetOutFormat (const char *outID)
bool SetOutFormat (OBFormat *pOut)
OBFormat * GetInFormat () const
OBFormat * GetOutFormat () const
std::string GetInFilename () const
std::streampos GetInPos () const
 Get the position in the input stream of the object being read.
size_t GetInLen () const
 Get the length in the input stream of the object being read.
const char * GetTitle () const
 Returns a default title which is the filename.
OBConversion * GetAuxConv () const
 Extension method: deleted in ~OBConversion().
void SetAuxConv (OBConversion *pConv)

Conversion
int Convert (std::istream *is, std::ostream *os)
 Conversion for single input and output stream.
int Convert ()
 Conversion with existing streams.
int FullConvert (std::vector< std::string > &FileList, std::string &OutputFileName, std::vector< std::string > &OutputFileList)
 Conversion with multiple input/output files: makes input and output streams, and carries out normal, batch, aggregation, and splitting conversion.

Conversion loop control
int AddChemObject (OBBase *pOb)
 Adds to internal array during input.
OBBase * GetChemObject ()
 Retrieve from internal array during output.
bool IsLast ()
 True if no more objects to be output.
bool IsFirstInput ()
 True if the first input object is being processed.
int GetOutputIndex () const
 Retrieves number of ChemObjects that have been actually output.
void SetOutputIndex (int indx)
 Sets output index (maybe to control whether seen as first object).
void SetMoreFilesToCome ()
 Used with multiple input files. Off by default.
void SetOneObjectOnly ()
 Used with multiple input files. Off by default.

Static Public Member Functions

Collection of formats
static int RegisterFormat (const char *ID, OBFormat *pFormat, const char *MIME=NULL)
 Called once by each format class.
static OBFormat * FindFormat (const char *ID)
 Searches registered formats.
static OBFormat * FormatFromExt (const char *filename)
 Searches registered formats for an ID the same as the file extension.
static OBFormat * FormatFromMIME (const char *MIME)
 Searches registered formats for a MIME the same as the chemical MIME type passed.
static bool GetNextFormat (Formatpos &itr, const char *&str, OBFormat *&pFormat)
 Repeatedly called to recover available Formats.

Information
static const char * Description ()

Protected Types

typedef std::map< std::string, int > OPAMapType

Protected Member Functions

bool SetStartAndEnd ()
bool OpenAndSetFormat (bool SetFormat, std::ifstream *is)

Static Protected Member Functions

static FMapType & FormatsMap ()
 contains ID and pointer to all OBFormat classes
static FMapType & FormatsMIMEMap ()
 contains MIME and pointer to all OBFormat classes
static OPAMapType & OptionParamArray (Option_type typ)
static int LoadFormatFiles ()

Protected Attributes

std::string InFilename
std::istream * pInStream
std::ostream * pOutStream
OBFormat * pInFormat
OBFormat * pOutFormat
std::map< std::string, std::string > OptionsArray [3]
int Index
unsigned int StartNumber
unsigned int EndNumber
int Count
bool m_IsLast
bool MoreFilesToCome
bool OneObjectOnly
bool ReadyToInput
bool CheckedForGzip
 input stream was already checked if it is gzip-encoded
OBBase * pOb1
std::streampos wInpos
 position in the input stream of the object being written
std::streampos rInpos
 position in the input stream of the object being read
size_t wInlen
 length in the input stream of the object being written
size_t rInlen
 length in the input stream of the object being read
OBConversion * pAuxConv
 Way to extend OBConversion.

Static Protected Attributes

static OBFormat * pDefaultFormat
static int FormatFilesLoaded


Detailed Description

Class to convert from one format to another.

OBConversion maintains a list of the available formats, provides information on them, and controls the conversion process.

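A conversion is carried out by the calling routine, usually in a user interface or an application program, making an instance of OBConversion. It is loaded with the in and out formats, any options and (usually) the default streams for input and output. Then either the Convert() function is called, which allows a single input file to be converted, or the extended functionality of FullConvert() is used. FullConvert() accepts multiple input and output files, supporting aggregation (all the objects from several input files sent to a single output file), splitting (each object in one input file sent to a separate output file) and batch conversion (each input file converted to its own output file).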

These procedures constitute the "Convert" interface. OBConversion and the user interface or application program do not need to be aware of any other part of OpenBabel - mol.h is not #included. This allows any chemical object derived from OBBase to be converted; the type of object is decided by the input format class. However, currently, almost all the conversions are for molecules of class OBMol.

OBConversion can also be used with an "API" interface called from programs which manipulate chemical objects. Input/output is done with the Read() and Write() functions, which work with any chemical object but need to have its type specified. (The ReadMolecule() and WriteMolecule() functions of the format classes can also be used directly.)

Example code using OBConversion

To read in a molecule, manipulate it and write it out.

Set up an istream and an ostream, to and from files or elsewhere. (cin and cout are used in the example). Specify the file formats.

      OBConversion conv(&cin,&cout);
      if(conv.SetInAndOutFormats("SMI","MOL"))
      {
        OBMol mol;
        if(conv.Read(&mol))
        {
          // ...manipulate molecule

          conv.Write(&mol); // conv is an object, not a pointer
        }
      }

A two stage construction is used to allow error handling if the format ID is not recognized. This is necessary now that the formats are dynamic and errors are not caught at compile time. OBConversion::Read() is a templated function so that objects derived from OBBase can also be handled, in addition to OBMol, if the format routines are written appropriately.

To make a molecule from a SMILES string.

      std::string SmilesString;
      OBMol mol;
      stringstream ss(SmilesString);
      OBConversion conv(&ss);
      if(conv.SetInFormat("smi") && conv.Read(&mol))
      ...

To do a file conversion without manipulating the molecule.

      #include "obconversion.h" //mol.h is not needed
      ...set up an istream is and an ostream os 
      OBConversion conv(&is,&os);
      if(conv.SetInAndOutFormats("SMI","MOL"))
      {
      conv.SetOptions("h", OBConversion::GENOPTIONS); //Optional; (h adds explicit hydrogens)
      conv.Convert();
      }

To add automatic format conversion to an existing program.

The existing program inputs from the file identified by the const char* filename into the istream is. The file is assumed to have a format ORIG, but other formats, identified by their file extensions, can now be used.

      ifstream ifs(filename); //Original code

      OBConversion conv;
      OBFormat* inFormat = conv.FormatFromExt(filename);
      OBFormat* outFormat = conv.FindFormat("ORIG");
      istream* pIn = &ifs; 
      stringstream newstream;
      if(inFormat && outFormat)
      {
      conv.SetInAndOutFormats(inFormat,outFormat);
      conv.Convert(pIn,&newstream);
      pIn=&newstream;
      }
      //else error; new features not available; fallback to original functionality 

      ...Carry on with original code using pIn

In Windows a degree of independence from OpenBabel can be achieved using DLLs. This code would be linked with obconv.lib. At runtime the following DLLs would be in the executable directory: obconv.dll, obdll.dll, one or more *.obf format files.


Member Typedef Documentation

typedef std::map<std::string,int> OPAMapType [protected]


Member Enumeration Documentation

enum Option_type

Three types of options set on the command line by -a? , -x? , or -?

Enumerator:
INOPTIONS 
OUTOPTIONS 
GENOPTIONS 


Constructor & Destructor Documentation

OBConversion ( std::istream *  is = NULL,
std::ostream *  os = NULL 
)

OBConversion ( const OBConversion &  o  ) 

Copy constructor.

~OBConversion (  )  [virtual]


Member Function Documentation

int RegisterFormat ( const char *  ID,
OBFormat *  pFormat,
const char *  MIME = NULL 
) [static]

Called once by each format class.

Class information on formats is collected by making an instance of the class derived from OBFormat (only one is usually required). RegisterFormat() is called from its constructor.

If the compiled format is stored separately, like in a DLL or shared library, the initialization code makes an instance of the imported OBFormat class.

OBFormat * FindFormat ( const char *  ID  )  [static]

Searches registered formats.

OBFormat * FormatFromExt ( const char *  filename  )  [static]

Searches registered formats for an ID the same as the file extension.

OBFormat * FormatFromMIME ( const char *  MIME  )  [static]

Searches registered formats for a MIME the same as the chemical MIME type passed.

bool GetNextFormat ( Formatpos &  itr,
const char *&  str,
OBFormat *&  pFormat 
) [static]

Repeatedly called to recover available Formats.

Returns the ID + the first line of the description in str and a pointer to the format in pFormat. If called with str==NULL the first format is returned; subsequent formats are returned by calling with str!=NULL and the previous value of itr. Returns false, with str and pFormat set to NULL, when there are no more formats. Use like:

        const char* str=NULL;
        OBFormat* pFormat;
        Formatpos pos;
        OBConversion conv; // dummy to make sure static data is available
        while(OBConversion::GetNextFormat(pos,str,pFormat))
        {
                // use str and pFormat
        }

NOTE: Because of dynamic loading problems, it is usually necessary to declare a "dummy" OBConversion object to access this static method. (Not elegant, but will hopefully be fixed in the future.)

const char * Description (  )  [static]

std::istream* GetInStream (  )  const [inline]

std::ostream* GetOutStream (  )  const [inline]

void SetInStream ( std::istream *  pIn  )  [inline]

void SetOutStream ( std::ostream *  pOut  )  [inline]

bool SetInAndOutFormats ( const char *  inID,
const char *  outID 
)

Sets the formats from their ids, e.g. CML.

If inID is NULL, the input format is left unchanged, and similarly for outID. Returns true if both formats have been successfully set at some time.

bool SetInAndOutFormats ( OBFormat *  pIn,
OBFormat *  pOut 
)

bool SetInFormat ( const char *  inID  ) 

bool SetInFormat ( OBFormat *  pIn  ) 

bool SetOutFormat ( const char *  outID  ) 

bool SetOutFormat ( OBFormat *  pOut  ) 

OBFormat* GetInFormat (  )  const [inline]

OBFormat* GetOutFormat (  )  const [inline]

std::string GetInFilename (  )  const [inline]

std::streampos GetInPos (  )  const [inline]

Get the position in the input stream of the object being read.

size_t GetInLen (  )  const [inline]

Get the length in the input stream of the object being read.

const char * GetTitle (  )  const

Returns a default title which is the filename.

OBConversion* GetAuxConv (  )  const [inline]

Extension method: deleted in ~OBConversion().

void SetAuxConv ( OBConversion *  pConv  )  [inline]

const char * IsOption ( const char *  opt,
Option_type  opttyp = OUTOPTIONS 
)

Determine whether an option is set. Returns NULL if the option is not set, and a pointer to the associated text if it is.

const std::map<std::string,std::string>* GetOptions ( Option_type  opttyp  )  [inline]

Access the map with option name as key and any associated text as value.

void AddOption ( const char *  opt,
Option_type  opttyp,
const char *  txt = NULL 
)

Set an option of specified type, with optional text.

bool RemoveOption ( const char *  opt,
Option_type  optype 
)

void SetOptions ( const char *  options,
Option_type  opttyp 
)

Set several single character options of specified type from string like ab"btext"c"ctext".
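
The option string format can be illustrated with a small standalone parser. This is only a sketch under an assumed grammar (each option is one character, optionally followed by its text in double quotes); parseOptionStringSketch is a hypothetical helper, not part of OBConversion, and the library's own parsing may differ in details such as whitespace handling.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not an Open Babel function): splits a string such as
// ab"btext"c"ctext" into option names and their associated text.
std::map<std::string, std::string> parseOptionStringSketch(const std::string& options)
{
    std::map<std::string, std::string> parsed;
    std::size_t i = 0;
    while (i < options.size()) {
        // Every option name is a single character.
        std::string name(1, options[i++]);
        std::string text;
        // A double quote directly after the name introduces the option text.
        if (i < options.size() && options[i] == '"') {
            std::size_t close = options.find('"', i + 1);
            if (close == std::string::npos) {
                // Unterminated quote: take the rest of the string (an assumption).
                text = options.substr(i + 1);
                i = options.size();
            } else {
                text = options.substr(i + 1, close - i - 1);
                i = close + 1;
            }
        }
        parsed[name] = text;
    }
    return parsed;
}
```

With the string from the description, the sketch yields three entries: a with empty text, b with btext and c with ctext, which mirrors the name-to-text map returned by GetOptions().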

void RegisterOptionParam ( std::string  name,
OBFormat *  pFormat,
int  numberParams = 0,
Option_type  typ = OUTOPTIONS 
) [static]

For example -h takes 0 parameters; -f takes 1. Call in a format constructor.

int GetOptionParams ( std::string  name,
Option_type  typ 
) [static]

Returns the number of parameters registered for the option, or 0 if not found.
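
The relationship between RegisterOptionParam() and GetOptionParams() can be pictured as a lookup table from option name to parameter count. The following is a minimal standalone sketch, not the library's implementation: OptionParamTableSketch is a hypothetical class, the OBFormat pointer and the Option_type argument are left out, and only the OPAMapType typedef is taken from this reference.

```cpp
#include <map>
#include <string>

// Same typedef as the protected OPAMapType documented in this class.
typedef std::map<std::string, int> OPAMapType;

// Hypothetical stand-in for the option parameter registry.
class OptionParamTableSketch
{
public:
    // Records how many parameters an option takes, e.g. 0 for -h and 1 for -f.
    void registerOptionParam(const std::string& name, int numberParams = 0)
    {
        params_[name] = numberParams;
    }

    // Returns the registered count, or 0 when the option is unknown.
    int getOptionParams(const std::string& name) const
    {
        OPAMapType::const_iterator it = params_.find(name);
        return it == params_.end() ? 0 : it->second;
    }

private:
    OPAMapType params_;
};
```

One consequence of the documented return value is that a result of 0 does not distinguish an unregistered option from one registered with no parameters.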

int Convert ( std::istream *  is,
std::ostream *  os 
)

Conversion for single input and output stream.

int Convert (  ) 

Conversion with existing streams.

Actions the "convert" interface. Calls the OBFormat class's ReadMolecule() which

AddChemObject does not save the object passed to it if it is NULL (as a result of a DoTransformation()) or if the number of the object is outside the range defined by StartNumber and EndNumber. This means the start and end counts apply to all chemical objects found whether or not they are output.

If ReadMolecule returns false the input conversion loop is exited.
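
The counting rule described above (every object found advances the count, even one that is not saved) can be sketched in isolation. RangeCollectorSketch below is a hypothetical class that stores strings instead of chemical objects; its return value and inclusive range check are choices made for this sketch, not documented behaviour of AddChemObject().

```cpp
#include <string>
#include <vector>

// Hypothetical illustration of the StartNumber/EndNumber rule.
class RangeCollectorSketch
{
public:
    RangeCollectorSketch(unsigned int start, unsigned int end)
        : start_(start), end_(end), count_(0) {}

    // Counts every object delivered, but keeps it only when it is non-null
    // and its number lies inside [start, end]. Returns the running count.
    unsigned int add(const std::string* ob)
    {
        ++count_;  // a NULL (filtered-out) object still uses up a number
        if (ob != nullptr && count_ >= start_ && count_ <= end_)
            kept_.push_back(*ob);
        return count_;
    }

    const std::vector<std::string>& kept() const { return kept_; }

private:
    unsigned int start_;
    unsigned int end_;
    unsigned int count_;
    std::vector<std::string> kept_;
};
```

For a range of 2 to 3, delivering an object, then a null pointer, then two more objects keeps only the third: the first is before the range, the second is in range but null, and the fourth is past the end.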

int FullConvert ( std::vector< std::string > &  FileList,
std::string &  OutputFileName,
std::vector< std::string > &  OutputFileList 
)

Conversion with multiple input/output files: makes input and output streams, and carries out normal, batch, aggregation, and splitting conversion.

Normal: done if FileList contains a single file name and OutputFileName does not contain a * .

Aggregation: done if FileList has more than one file name and OutputFileName does not contain a * . All the chemical objects are converted and sent to the single output file.

Splitting: done if FileList contains a single file name and OutputFileName contains a * . Each chemical object in the input file is converted and sent to a separate file whose name is OutputFileName with the * replaced by 1, 2, 3, etc. For example, if OutputFileName is NEW*.smi then the output files are NEW1.smi, NEW2.smi, etc.

Batch conversion: done if FileList has more than one file name and OutputFileName contains a * . Each input file is converted to an output file whose name is OutputFileName with the * replaced by the input file name without its path and extension. So if the input files were inpath/First.cml, inpath/Second.cml and OutputFileName was NEW*.mol, the output files would be NEWFirst.mol, NEWSecond.mol.

If FileList is empty, the input stream that has already been set (usually in the constructor) is used. If OutputFileName is empty, the output stream already set is used.

On exit, OutputFileList contains the names of the output files.

Returns the number of Chemical objects converted.
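
The choice between the four kinds of conversion depends only on the number of input files and on whether the output name contains a * . A minimal standalone sketch of that decision follows; ConversionModeSketch and chooseModeSketch are hypothetical names, and treating an empty FileList like a single input is an assumption of the sketch rather than something stated here.

```cpp
#include <string>
#include <vector>

// Hypothetical names for the four conversion modes described above.
enum class ConversionModeSketch { Normal, Aggregation, Splitting, Batch };

// Picks the mode from the number of input files and the presence of '*'
// in the output file name. An empty list is treated like a single input.
ConversionModeSketch chooseModeSketch(const std::vector<std::string>& fileList,
                                      const std::string& outputFileName)
{
    const bool manyInputs = fileList.size() > 1;
    const bool wildcard = outputFileName.find('*') != std::string::npos;

    if (manyInputs)
        return wildcard ? ConversionModeSketch::Batch
                        : ConversionModeSketch::Aggregation;
    return wildcard ? ConversionModeSketch::Splitting
                    : ConversionModeSketch::Normal;
}
```

For example, two input files with an output name of NEW*.mol select batch conversion, while the same two files with out.mol select aggregation.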

int AddChemObject ( OBBase *  pOb  ) 

Adds to internal array during input.

Called by ReadMolecule() to deliver an object it has read from an input stream. Used in two modes:

OBBase * GetChemObject (  ) 

Retrieve from internal array during output.

Retrieves an object stored by AddChemObject() during output.

bool IsLast (  ) 

True if no more objects to be output.

bool IsFirstInput (  ) 

True if the first input object is being processed.

int GetOutputIndex (  )  const

Retrieves number of ChemObjects that have been actually output.

void SetOutputIndex ( int  indx  ) 

Sets output index (maybe to control whether seen as first object).

void SetMoreFilesToCome (  ) 

Used with multiple input files. Off by default.

void SetOneObjectOnly (  ) 

Used with multiple input files. Off by default.

static OBFormat* GetDefaultFormat (  )  [inline, static]

The default format is set in a single OBFormat class (generally it is OBMol).

bool Write ( OBBase *  pOb,
std::ostream *  pout = NULL 
)

Outputs an object of a class derived from OBBase.

Part of "API" interface. The output stream can be specified and the change is retained in the OBConversion instance.

std::string WriteString ( OBBase *  pOb  ) 

Outputs an object of a class derived from OBBase as a string.

Part of "API" interface. The output stream is temporarily changed to the string and then restored. This method is primarily intended for scripting languages without "stream" classes.

bool WriteFile ( OBBase *  pOb,
std::string  filePath 
)

Outputs an object of a class derived from OBBase as a file (with the supplied path).

Part of "API" interface. The output stream is changed to the supplied file and the change is retained in the OBConversion instance. This method is primarily intended for scripting languages without "stream" classes.

bool Read ( OBBase *  pOb,
std::istream *  pin = NULL 
)

Reads an object of a class derived from OBBase into pOb.

Part of "API" interface. The input stream can be specified and the change is retained in the OBConversion instance. Returns false and pOb=NULL on error.

bool ReadString ( OBBase *  pOb,
std::string  input 
)

Reads an object of a class derived from OBBase into pOb from the supplied string.

Part of "API" interface. Returns false and pOb=NULL on error. This method is primarily intended for scripting languages without "stream" classes.

bool ReadFile ( OBBase *  pOb,
std::string  filePath 
)

Reads an object of a class derived from OBBase into pOb from the file specified.

Part of "API" interface. The input stream is changed to the supplied file and the change is retained in the OBConversion instance. Returns false and pOb=NULL on error. This method is primarily intended for scripting languages without "stream" classes.

string BatchFileName ( std::string &  BaseName,
std::string &  InFile 
) [static]

Replaces * in BaseName by InFile without extension and path.

string IncrementedFileName ( std::string &  BaseName,
const int  Count 
) [static]

Replaces * in BaseName by Count.
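
The two wildcard substitutions can be sketched as standalone functions. These are hypothetical re-implementations written for illustration (batchFileNameSketch and incrementedFileNameSketch are not Open Babel names); they assume only the first * is replaced and that both / and \ act as path separators, which the library may handle differently.

```cpp
#include <string>

// Hypothetical stand-in for BatchFileName: replaces the first '*' in baseName
// with inFile stripped of its directory and extension.
std::string batchFileNameSketch(const std::string& baseName, const std::string& inFile)
{
    // Drop everything up to and including the last path separator.
    std::string::size_type slash = inFile.find_last_of("/\\");
    std::string stem = (slash == std::string::npos) ? inFile : inFile.substr(slash + 1);

    // Drop the extension, if there is one.
    std::string::size_type dot = stem.rfind('.');
    if (dot != std::string::npos)
        stem.erase(dot);

    std::string result = baseName;
    std::string::size_type star = result.find('*');
    if (star != std::string::npos)
        result.replace(star, 1, stem);
    return result;
}

// Hypothetical stand-in for IncrementedFileName: replaces the first '*' in
// baseName with the decimal representation of count.
std::string incrementedFileNameSketch(const std::string& baseName, int count)
{
    std::string result = baseName;
    std::string::size_type star = result.find('*');
    if (star != std::string::npos)
        result.replace(star, 1, std::to_string(count));
    return result;
}
```

Using the names from the FullConvert() description, NEW*.mol with inpath/First.cml gives NEWFirst.mol, and NEW*.smi with a count of 2 gives NEW2.smi.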

bool SetStartAndEnd (  )  [protected]

FMapType & FormatsMap (  )  [static, protected]

contains ID and pointer to all OBFormat classes

This static function returns a reference to the FormatsMap which, because it is a static local variable, is constructed only once. This fiddle is to avoid the "static initialization order fiasco". See Marshall Cline's C++ FAQ Lite document, www.parashift.com/c++-faq-lite/.
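
The static-local idiom mentioned above can be shown in a few lines. This is a generic sketch of the technique with hypothetical types (FormatSketch, formatsMapSketch and the other names are invented for the example), not the Open Babel source: the map lives inside a function, so it is constructed on first use even when that first use happens in the constructor of another static object.

```cpp
#include <map>
#include <string>

// Hypothetical minimal format type for the illustration.
struct FormatSketch
{
    std::string description;
};

typedef std::map<std::string, FormatSketch*> FormatMapSketch;

// The map is a function-local static: it is constructed the first time this
// function is called, so it is always ready before anything is inserted.
FormatMapSketch& formatsMapSketch()
{
    static FormatMapSketch formats;
    return formats;
}

// Adds a format under its ID and returns the number of registered formats.
int registerFormatSketch(const std::string& id, FormatSketch* format)
{
    formatsMapSketch()[id] = format;
    return static_cast<int>(formatsMapSketch().size());
}

// Returns the format registered under id, or a null pointer if there is none.
FormatSketch* findFormatSketch(const std::string& id)
{
    FormatMapSketch::const_iterator it = formatsMapSketch().find(id);
    return it == formatsMapSketch().end() ? nullptr : it->second;
}

// A format that registers itself from its constructor, as a single global
// instance. This is safe regardless of initialization order across files.
struct SmilesFormatSketch : FormatSketch
{
    SmilesFormatSketch()
    {
        description = "SMILES (sketch)";
        registerFormatSketch("smi", this);
    }
};

SmilesFormatSketch theSmilesFormatSketch;
```

Had the map been an ordinary global defined in another source file, the global format instance could run its constructor before the map existed; routing every access through the function removes that dependency.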

FMapType & FormatsMIMEMap (  )  [static, protected]

contains MIME and pointer to all OBFormat classes

This static function returns a reference to the FormatsMIMEMap which, because it is a static local variable, is constructed only once. This fiddle is to avoid the "static initialization order fiasco". See Marshall Cline's C++ FAQ Lite document, www.parashift.com/c++-faq-lite/.

OPAMapType & OptionParamArray ( Option_type  typ  )  [static, protected]

int LoadFormatFiles (  )  [static, protected]

bool OpenAndSetFormat ( bool  SetFormat,
std::ifstream *  is 
) [protected]


Member Data Documentation

std::string InFilename [protected]

std::istream* pInStream [protected]

std::ostream* pOutStream [protected]

OBFormat * pDefaultFormat [static, protected]

OBFormat* pInFormat [protected]

OBFormat* pOutFormat [protected]

std::map<std::string,std::string> OptionsArray[3] [protected]

int Index [protected]

unsigned int StartNumber [protected]

unsigned int EndNumber [protected]

int Count [protected]

bool m_IsLast [protected]

bool MoreFilesToCome [protected]

bool OneObjectOnly [protected]

bool ReadyToInput [protected]

bool CheckedForGzip [protected]

input stream was already checked if it is gzip-encoded

int FormatFilesLoaded [static, protected]

OBBase* pOb1 [protected]

std::streampos wInpos [protected]

position in the input stream of the object being written

std::streampos rInpos [protected]

position in the input stream of the object being read

size_t wInlen [protected]

length in the input stream of the object being written

size_t rInlen [protected]

length in the input stream of the object being read

OBConversion* pAuxConv [protected]

Way to extend OBConversion.

