Chemical Markup Language

From Open Babel

Latest revision as of 14:38, 16 March 2006

Filename Extensions: cml
Chemical MIME Type: chemical/x-cml
Specification URL: http://wwmm.ch.cam.ac.uk/moin/ChemicalMarkupLanguage
Import: Yes
Export: Yes
Open Babel Version: 1.100 and later

Options

 XML format. This implementation uses libxml2.
 Write options for CML: -x[flags] (e.g. -x1ac)
 1  output CML1 (rather than CML2)
 a  output array format for atoms and bonds
 h  use hydrogenCount for all hydrogens
 m  output metadata
 x  omit XML and namespace declarations
 N<prefix> add namespace prefix to elements

Additional Comments

General XML Notes

The XML formats require the XML text to be well formed but otherwise interpret it fairly tolerantly. Unrecognised elements and attributes are ignored, and few error messages are issued when required structures are not found. This laxity allows, for instance, the reactant and product molecules to be picked out of a CML React file using the CML format.

Each format has one element that is regarded as defining the object that OpenBabel will convert. For CML this is <molecule>. Files can contain multiple objects, which are treated in the same way as in other multiple-object formats such as SMILES and MDL Molfile: conversion can start at the nth object using the -fn option and finish before the end of the file using the -ln option. Multiple-object XML files can also be indexed and searched using FastSearch, although this has not yet been extensively tested.
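
The tolerant reading described above can be illustrated with a minimal Python sketch. This is not Open Babel's implementation (which is built on libxml2); the sample document, the namespace URI, and the names local_name and iter_molecules are assumptions made for illustration only. The first and last parameters loosely mimic what the -f and -l options select.

```python
import xml.etree.ElementTree as ET
from itertools import islice

# Hypothetical CML React-like input; only the <molecule> elements matter here.
SAMPLE = """<reaction xmlns="http://www.xml-cml.org/schema">
  <reactantList>
    <reactant><molecule id="m1" title="ethene"/></reactant>
    <reactant><molecule id="m2" title="hydrogen"/></reactant>
  </reactantList>
  <unknownElement note="ignored"/>
  <productList>
    <product><molecule id="m3" title="ethane"/></product>
  </productList>
</reaction>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def iter_molecules(xml_text, first=1, last=None):
    """Return an iterator over <molecule> elements number first..last (1-based, inclusive)."""
    # fromstring raises ET.ParseError if the text is not well formed.
    root = ET.fromstring(xml_text)
    # Everything that is not a <molecule> is simply skipped, whatever wraps it.
    found = (el for el in root.iter() if local_name(el.tag) == "molecule")
    return islice(found, first - 1, last)


if __name__ == "__main__":
    for mol in iter_molecules(SAMPLE, first=2):
        print(mol.get("id"), mol.get("title"))
```

Matching on the local element name rather than the full path is what lets the same reader pull molecules out of a reaction file without knowing anything about the reaction elements around them.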

CML Notes

This format reads and writes CML XML files. To write CML1 rather than the default CML2, use the -x1 option. To write the array form, use -xa; to specify all hydrogens using the hydrogenCount attribute on atoms, use -xh.

Crystal structures are written using the <crystal>, <xfract>, etc. elements if the OBMol has OBGenericDataType::UnitCell data.

If the OBMol has no bonds, a <formula> element is written instead of the normal <atomArray> and <atom> elements.

All these forms are handled transparently during reading. Only a subset of CML elements and attributes is recognised, but it includes most of those that define chemical structure.
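
As a rough sketch of what handling both forms transparently can look like, the Python below reads atoms from either one <atom> element per atom or a single <atomArray> carrying whitespace-separated attribute values. The two sample fragments and the function read_atoms are hypothetical, the array layout shown is an assumption about how the array form is laid out, and the samples omit namespace declarations (as the x write option does) to keep the lookup simple. It also assumes that, in the array form, all three attributes are present.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one <atom> element per atom.
ELEMENT_FORM = """<molecule id="m1">
  <atomArray>
    <atom id="a1" elementType="C" hydrogenCount="3"/>
    <atom id="a2" elementType="O" hydrogenCount="1"/>
  </atomArray>
</molecule>"""

# Hypothetical fragment: array form, values packed into <atomArray> attributes.
ARRAY_FORM = """<molecule id="m1">
  <atomArray atomID="a1 a2" elementType="C O" hydrogenCount="3 1"/>
</molecule>"""


def read_atoms(xml_text):
    """Return [(id, elementType, hydrogenCount), ...] from either form."""
    atom_array = ET.fromstring(xml_text).find("atomArray")
    if atom_array is None:
        return []
    children = atom_array.findall("atom")
    if children:
        # Element form: one row per <atom> child.
        rows = [(a.get("id"), a.get("elementType"), a.get("hydrogenCount"))
                for a in children]
    else:
        # Array form: split each attribute into a column, then transpose.
        names = ("atomID", "elementType", "hydrogenCount")
        columns = [atom_array.get(name, "").split() for name in names]
        rows = list(zip(*columns))
    # hydrogenCount is optional in the element form, so tolerate its absence.
    return [(i, e, int(h) if h is not None else None) for i, e, h in rows]


if __name__ == "__main__":
    print(read_atoms(ELEMENT_FORM))
    print(read_atoms(ARRAY_FORM))
```

Both calls produce the same list of tuples, so code further down the line does not need to know which form the file used.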

The following are read:

  • Elements:
    • molecule, atomArray, atom, bondArray, bond, atomParity, bondStereo
    • name, formula, crystal, scalar (contains crystal data)
    • string, stringArray, integer, integerArray, float, floatArray, builtin
  • Attributes:
    • On <molecule>: id, title, ref (in CMLReact)
    • On <atom>: id, atomId, atomID, elementType, x2, y2, x3, y3, z3, xy2, xyz3, xFract, yFract, zFract, xyzFract, hydrogenCount, formalCharge, isotope, isotopeNumber, spinMultiplicity, radical (from Marvin), atomRefs4 (for atomParity)
    • On <bond>: atomRefs2, order, CML1: atomRef, atomRef1, atomRef2
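
To show how a few of the attributes listed above fit together, here is a small Python sketch that reads id, title, elementType, formalCharge, x3/y3/z3, atomRefs2 and order from a CML2-style fragment. The sample molecule and the function read_molecule are assumptions for illustration; the sketch covers only these attributes and does not reflect how Open Babel itself builds an OBMol.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML2-style fragment without namespace declarations.
SAMPLE = """<molecule id="m1" title="hydroxide">
  <atomArray>
    <atom id="a1" elementType="O" formalCharge="-1" x3="0.0" y3="0.0" z3="0.0"/>
    <atom id="a2" elementType="H" x3="0.96" y3="0.0" z3="0.0"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
  </bondArray>
</molecule>"""


def read_molecule(xml_text):
    """Return a dict with the molecule title, its atoms by id, and its bonds."""
    mol = ET.fromstring(xml_text)
    atoms = {}
    for atom in mol.iter("atom"):
        coords = [atom.get(axis) for axis in ("x3", "y3", "z3")]
        atoms[atom.get("id")] = {
            "element": atom.get("elementType"),
            # A missing formalCharge is treated as neutral in this sketch.
            "charge": int(atom.get("formalCharge", "0")),
            # 3D coordinates are kept only when all three are present.
            "xyz": tuple(float(c) for c in coords) if None not in coords else None,
        }
    bonds = []
    for bond in mol.iter("bond"):
        # atomRefs2 holds the two atom ids separated by whitespace.
        begin, end = bond.get("atomRefs2").split()
        # The order is kept as the raw attribute string.
        bonds.append((begin, end, bond.get("order")))
    return {"title": mol.get("title"), "atoms": atoms, "bonds": bonds}


if __name__ == "__main__":
    print(read_molecule(SAMPLE))
```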

References