Difference between revisions of "Fingerprint (format)"

Latest revision as of 15:18, 3 June 2008

Filename Extensions: fpt
Chemical MIME Type: Undefined
Specification URL: Unknown
Import: No
Export: Yes
Open Babel Version: 2.0.0 and later

Options

 Constructs and displays fingerprints and (for multiple input objects)
 the Tanimoto coefficient and whether a superstructure of the first object
 Options e.g. -xfFP3 -xN128
  f<id> fingerprint type
  N# fold to specified number of bits, 32, 64, 128, etc.
  h  hex output when multiple molecules
  s  describe each set bit
  u  describe each unset bit
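
The N# option folds a fingerprint down to fewer bits. The text above does not say how the folding is done; the sketch below assumes the common approach of repeatedly OR-ing the upper half of the bit string onto the lower half. The function name `fold` and the use of a Python integer as the bit string are illustrative choices, not part of Open Babel.

```python
def fold(fp_bits: int, nbits_in: int, nbits_out: int) -> int:
    """Fold a fingerprint of nbits_in bits down to nbits_out bits.

    Assumption: folding halves the width repeatedly, OR-ing the upper
    half onto the lower half, so nbits_in / nbits_out must be a power of 2.
    """
    if nbits_out <= 0 or nbits_in % nbits_out != 0:
        raise ValueError("nbits_in must be a multiple of nbits_out")
    ratio = nbits_in // nbits_out
    if ratio & (ratio - 1) != 0:
        raise ValueError("nbits_in / nbits_out must be a power of two")
    if fp_bits < 0 or fp_bits >> nbits_in != 0:
        raise ValueError("fingerprint does not fit in nbits_in bits")

    width = nbits_in
    while width > nbits_out:
        width //= 2
        mask = (1 << width) - 1
        # Combine the lower half with the upper half shifted down.
        fp_bits = (fp_bits & mask) | (fp_bits >> width)
    return fp_bits
```

Under this scheme a bit that is set before folding always maps to a set bit afterwards, so folding can merge features but never loses a set bit.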
 

Additional Comments

This format generates molecular fingerprints.

For an introduction to fingerprint types see http://www.mesaac.com/Fingerprint.htm

A list of available fingerprint types can be obtained by:

  babel -L fingerprints

The current default type, FP2, is of the Daylight type, indexing a molecule based on the occurrence of linear fragments up to 7 atoms in length. To use a fingerprint type other than the default, use the -xf option, e.g.

  babel infile.xxx -ofpt -xfFP3

For a single molecule the fingerprint is output in hexadecimal form (intended mainly for debugging).

With multiple molecules the hexadecimal form is output only if the -xh option is specified. In addition, the Tanimoto coefficient between the first molecule and each of the subsequent ones is displayed. If the first molecule is a substructure of the target molecule, a note saying so is also displayed.

The Tanimoto coefficient is defined as:

 Number of bits set in (patternFP & targetFP) / Number of bits set in (patternFP | targetFP)

where the boolean operations between the fingerprints are bitwise.
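
As a minimal sketch of this definition, the following treats each fingerprint as a Python integer whose binary digits are the fingerprint bits. The function names and the choice to return 0.0 when neither fingerprint has any bit set are assumptions made for illustration; they are not taken from Open Babel.

```python
def popcount(x: int) -> int:
    """Return the number of bits set in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(pattern_fp: int, target_fp: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integers."""
    union_bits = popcount(pattern_fp | target_fp)
    if union_bits == 0:
        # Both fingerprints are empty; 0.0 is a convention chosen here.
        return 0.0
    common_bits = popcount(pattern_fp & target_fp)
    return common_bits / union_bits


# Two 4-bit fingerprints: 2 bits in common, 4 bits set in either one.
print(tanimoto(0b1011, 0b1110))
```

The result always lies between 0 (no set bits in common) and 1 (identical sets of bits).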

The Tanimoto coefficient has no absolute meaning and depends on the design of the fingerprint.


In fingerprint FP4 each bit corresponds to a particular chemical feature; the features are specified as SMARTS patterns in SMARTS_InteLigand.txt. Use the -xs option to output a tab-separated list of the features in a molecule. For instance, a well-known molecule gives:

Primary_carbon: Carboxylic_acid: Carboxylic_ester: Carboxylic_acid_derivative: Vinylogous_carbonyl_or_carboxyl_derivative: Vinylogous_ester: Aromatic: Conjugated_double_bond: C_ONS_bond: 1,3-Tautomerizable: Rotatable_bond: CH-acidic: