Difference between revisions of "HowTo:Add A New File Format"
From Open Babel
Latest revision as of 14:54, 19 October 2006
Adding support for a new file format is a relatively easy process, particularly with Open Babel 2.0 and later. This guide outlines several important steps to remember when developing a format translator.
- Create a file for your format in src/formats/ or src/formats/xml/ (for XML-based formats). Ideally, this file is self-contained, although several format modules are compiled from multiple source files.
- Take a look at other file format code, particularly exampleformat.cpp, which contains a heavily annotated description of writing a new format. XML formats need to take a different approach; see the code in xcmlformat.cpp or pubchemformat.cpp.
- When reading in molecules (and thus performing a lot of molecular modifications), call OBMol::BeginModify() at the beginning and OBMol::EndModify() at the end. This ensures that perception routines do not run while you read in a molecule and are reset after your code finishes.
- Currently, lazy perception does not include connectivity and bond order assignment. If your format does not include bonds, make sure to call OBMol::ConnectTheDots() and OBMol::PerceiveBondOrders() after OBMol::EndModify() to ensure bonds are assigned.
- Consider various input and output options that users can set from the command line or GUI. For example, many quantum mechanics and other formats that do not recognize bonds offer the following options (GUI programs often offer these as a sub-menu that appears when choosing the appropriate format):
  - -as Call only OBMol::ConnectTheDots() (single bonds only)
  - -ab No bond perception
- Make sure to use generic data classes like OBUnitCell and others as appropriate. If your format stores any sort of common data type, consider adding a subclass of OBGenericData for use by other formats and user code.
- Please make sure to add several example files to the test set repository. Ideally, these should exercise several areas of your import code; in the end, the more robust the test set, the more stable and useful Open Babel will be. The test files should include at least one example of a correct file and one example of an invalid file (i.e., something which will properly be ignored and not crash babel).
- That's it! Contact the openbabel-discuss mailing list with any questions, comments, or to contribute your new format code.
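The call order described in the list above (modify block first, bond perception afterwards, with the -as and -ab options switching perception off, and invalid input rejected rather than crashing) can be sketched as follows. This is only an illustration, not Open Babel code: MolStub is a hypothetical stand-in for OBMol that merely records which calls were made, ReadOptions and ReadCoordinates are invented names, and the line-based "element x y z" input is an assumed toy format. A real format would read its options from the conversion object it is handed rather than from a struct like this.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for OBMol: it only records the calls made on it,
// so the ordering can be inspected without linking against Open Babel.
struct MolStub {
    std::vector<std::string> calls;
    std::size_t atomCount = 0;

    void BeginModify()        { calls.push_back("BeginModify"); }
    void EndModify()          { calls.push_back("EndModify"); }
    void ConnectTheDots()     { calls.push_back("ConnectTheDots"); }
    void PerceiveBondOrders() { calls.push_back("PerceiveBondOrders"); }

    // The stub ignores element and coordinates and just counts atoms.
    void AddAtom(const std::string&, double, double, double) { ++atomCount; }
};

// Hypothetical option holder mirroring the -as and -ab input options.
struct ReadOptions {
    bool singleBondsOnly = false;  // -as: connectivity only, no bond orders
    bool noBonds = false;          // -ab: no bond perception at all
};

// Reads lines of the assumed form "element x y z".
// Returns false (instead of crashing) when the input is malformed or empty.
bool ReadCoordinates(std::istream& in, MolStub& mol, const ReadOptions& opts) {
    mol.BeginModify();  // suspend perception while atoms are being added

    bool ok = true;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        std::string element;
        double x, y, z;
        if (!(fields >> element >> x >> y >> z)) {
            ok = false;  // malformed record: stop reading
            break;
        }
        mol.AddAtom(element, x, y, z);
    }

    mol.EndModify();  // always close the modify block, even on failure
    if (!ok || mol.atomCount == 0) return false;

    // The toy format stores no bonds, so perceive them after EndModify(),
    // unless the user switched perception off.
    if (!opts.noBonds) {
        mol.ConnectTheDots();
        if (!opts.singleBondsOnly) mol.PerceiveBondOrders();
    }
    return true;
}
```

Feeding the function one well-formed and one malformed input mirrors the advice on test files: the first should return true with both perception calls recorded, the second should return false without any perception calls.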
After the code is released in a new version, add an entry for your format to this website under the formats category and on the list of extensions page. See the existing format entries for examples.