Fingerprint format (fpt)

Generate or display molecular fingerprints.

This format constructs and displays fingerprints and, for multiple input objects, the Tanimoto coefficient of each object with the first one and whether each is a superstructure of the first object.

A list of available fingerprint types can be obtained by:

babel -L fingerprints

The current default type, FP2, is of the Daylight type: it indexes a molecule based on the occurrence of linear fragments up to 7 atoms in length. To use a fingerprint type other than the default, use the -xf option, for example:

babel infile.xxx -ofpt -xfFP3

For a single molecule the fingerprint is output in hexadecimal form (intended mainly for debugging).

With multiple molecules, the hexadecimal form is output only if the -xh option is specified. In addition, the Tanimoto coefficient between the first molecule and each of the subsequent ones is displayed. If the first molecule is a substructure of the target molecule, a note saying so is also displayed.

The Tanimoto coefficient is defined as:

Number of bits set in (patternFP & targetFP) / Number of bits set in (patternFP | targetFP)

where the boolean operations between the fingerprints are bitwise.
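The definition above can be sketched in a few lines of Python. This is only an illustration, not Open Babel's own code: the fingerprints are assumed to be held as plain Python integers used as bit masks, the function name tanimoto is hypothetical, and returning 0.0 for two empty fingerprints is an assumption made here to avoid dividing by zero.

```python
def tanimoto(pattern_fp: int, target_fp: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bit masks."""
    # Bits set in both fingerprints (bitwise AND).
    common = bin(pattern_fp & target_fp).count("1")
    # Bits set in either fingerprint (bitwise OR).
    union = bin(pattern_fp | target_fp).count("1")
    if union == 0:
        # Assumption for this sketch: two empty fingerprints score 0.
        return 0.0
    return common / union


# Toy 8-bit fingerprints, invented for this example.
pattern_fp = 0b10110010
target_fp = 0b10011010

# 3 bits in common, 5 bits in the union.
print(tanimoto(pattern_fp, target_fp))  # prints 0.6
```

Identical fingerprints give 1.0 and fingerprints with no bits in common give 0.0, so the value always lies between 0 and 1.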

The Tanimoto coefficient has no absolute meaning and depends on the design of the fingerprint.

Use the -xs option to describe the bits that are set in the fingerprint. The output depends on the fingerprint type. For fingerprint FP4, each bit corresponds to a particular chemical feature. The features are specified as SMARTS patterns in SMARTS_InteLigand.txt, and the output is a tab-separated list of the features of a molecule. For instance, a well-known molecule gives:

Primary_carbon: Carboxylic_acid: Carboxylic_ester: Carboxylic_acid_derivative:
Vinylogous_carbonyl_or_carboxyl_derivative: Vinylogous_ester: Aromatic:
Conjugated_double_bond: C_ONS_bond: 1,3-Tautomerizable: Rotatable_bond: CH-acidic:

For the path-based fingerprint FP2, the output from the -xs option is instead a list of the chemical fragments used to set bits, e.g.:

$ obabel -:"CCC(=O)Cl" -ofpt -xs -xf FP2
>
0 6 1 6 <670>
0 6 1 6 1 6 <260>
0 8 2 6 <623>
...etc

where the first number is 0 for linear fragments but is a bond order for cyclic fragments. The remaining numbers alternate between atomic number and bond order. Note that a bond order of 5 is used for aromatic bonds. For example, bit 623 above is the linear fragment O=C (8 for oxygen, 2 for a double bond and 6 for carbon).
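To make the layout of these lines concrete, here is a minimal Python sketch that splits one such line into its parts. It assumes the lines look exactly like the examples above (whitespace-separated numbers followed by the bit number in angle brackets). The function name parse_fragment_line and the dictionary keys are hypothetical and are not part of Open Babel.

```python
import re


def parse_fragment_line(line: str) -> dict:
    """Split an FP2 '-xs' line such as '0 8 2 6 <623>' into its parts."""
    match = re.fullmatch(r"\s*((?:\d+\s+)+)<(\d+)>\s*", line)
    if match is None:
        raise ValueError(f"not a fragment line: {line!r}")
    numbers = [int(token) for token in match.group(1).split()]
    return {
        "bit": int(match.group(2)),   # bit set by this fragment
        "first": numbers[0],          # 0 for a linear fragment, else a bond order
        "atoms": numbers[1::2],       # atomic numbers along the fragment
        "bonds": numbers[2::2],       # bond orders between consecutive atoms
    }


print(parse_fragment_line("0 8 2 6 <623>"))
# prints {'bit': 623, 'first': 0, 'atoms': [8, 6], 'bonds': [2]}
```

The result for bit 623 reads as oxygen (8) joined by a double bond (2) to carbon (6), matching the O=C fragment described above.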

Note

This is a write-only format.

Write Options

f <id>     fingerprint type
N <num>    fold to specified number of bits, 32, 64, 128, etc.
h          hex output when multiple molecules
o          hex output only
s          describe each set bit
u          describe each unset bit
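
As an illustration of what folding a fingerprint means, the Python sketch below repeatedly halves a fingerprint by OR-ing its upper half onto its lower half. This is one common folding scheme. Whether the N option works in exactly this way is an assumption, and the function name fold and the integer representation are hypothetical.

```python
def fold(fp: int, nbits: int, target_bits: int) -> int:
    """Fold an nbits-wide fingerprint (an int bit mask) down to target_bits."""
    ratio, remainder = divmod(nbits, target_bits)
    # Halving only works when the sizes differ by a power of two.
    if remainder != 0 or ratio & (ratio - 1) != 0:
        raise ValueError("nbits must be target_bits times a power of two")
    while nbits > target_bits:
        half = nbits // 2
        low = fp & ((1 << half) - 1)  # lower half of the bits
        high = fp >> half             # upper half, shifted down
        fp = low | high               # a bit stays set if set in either half
        nbits = half
    return fp


# Toy 16-bit fingerprint, invented for this example.
fp16 = 0b1000000100010000
print(bin(fold(fp16, 16, 8)))  # prints 0b10010001
```

Folding makes the fingerprint shorter at the cost of more bit collisions, since distinct features can end up sharing one bit.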