chemfp API
This chapter contains the docstrings for the public portion of the chemfp API. Chemfp also has internal modules and functions that should not be imported or used directly. If you use parts of the undocumented API then your code is more likely to break with newer chemfp releases.
See “Getting started with the API” for some introductory examples.
- chemfp top-level
- chemfp.arena
- chemfp.base_toolkit
- chemfp.bitops
- chemfp.cdk_toolkit
- chemfp.cdk_types
- chemfp.clustering
- chemfp.csv_readers
- chemfp.diversity
- chemfp.encodings
- chemfp.fpb_io
- chemfp.fps_io
- chemfp.fps_search
- chemfp.highlevel.clustering
- chemfp.highlevel.conversion
- chemfp.highlevel.diversity
- chemfp.highlevel.simarray
- chemfp.highlevel.similarity
- chemfp.io
- chemfp.jcmapper_types
- chemfp.openbabel_toolkit
- chemfp.openbabel_types
- chemfp.openeye_toolkit
- chemfp.openeye_types
- chemfp.rdkit_toolkit
- chemfp.rdkit_types
- chemfp.search
- chemfp.simarray_io
- chemfp.text_records
- chemfp.text_toolkit
- chemfp.toolkit
- chemfp.types
Overview
The top-level chemfp module is the starting point for using chemfp. It contains functions to read and write fingerprint files, “high-level” commands for working with chemfp, and more.
The APIs for the FPS and FPB fingerprint readers and writers are defined in chemfp.fps_io and chemfp.fpb_io, which may refer to a Location object defined in chemfp.io.
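To make the FPS reader description more concrete, here is a minimal pure-Python sketch of what reading FPS text involves. It assumes the usual FPS layout: header lines starting with "#", followed by records holding a hex-encoded fingerprint, a tab, and an identifier. The function name read_fps_text is hypothetical and is not part of chemfp; the readers in chemfp.fps_io do considerably more, including validation and location tracking.

```python
def read_fps_text(text):
    """Parse FPS-formatted text into (metadata, records).

    Illustrative only: read_fps_text is a hypothetical helper, not chemfp API.
    """
    metadata = {}
    records = []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "#key=value"; the "#FPS1" line has no "=".
            key, sep, value = line[1:].partition("=")
            if sep:
                metadata[key] = value
            continue
        # Data lines: hex fingerprint, a tab, the id, then optional extra fields.
        hex_fp, _, rest = line.partition("\t")
        fp_id = rest.split("\t")[0]
        records.append((fp_id, bytes.fromhex(hex_fp)))
    return metadata, records


sample = "#FPS1\n#num_bits=16\n0f01\tfirst\nff00\tsecond\n"
metadata, records = read_fps_text(sample)
```

In real code, prefer the chemfp readers, which also report the file position of a malformed record through the Location object.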
The fingerprint arena class is defined in chemfp.arena.
The chemfp.search module contains similarity search functions for searching fingerprint arenas, and the SearchResult and SearchResults result class definitions. It also contains the similarity array functions to generate an all-by-all NumPy comparison array. These are the low-level APIs used for the high-level chemfp.simsearch() and chemfp.simarray() functions.
The chemfp.fps_search module contains similarity search functions for searching FPS files, and the search result class definitions. This is only needed when working in a streaming environment where fingerprint arena creation overhead is too large.
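The streaming idea can be sketched without chemfp: each record is scored as it arrives and then discarded, so no in-memory arena is ever built. The names tanimoto and stream_threshold_search below are hypothetical helpers written for this illustration, not functions from chemfp.fps_search, and the choice to score two empty fingerprints as 0.0 is an assumption of the sketch.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    both = sum(bin(a & b).count("1") for a, b in zip(fp1, fp2))
    either = sum(bin(a | b).count("1") for a, b in zip(fp1, fp2))
    # Assumption for this sketch: two all-zero fingerprints score 0.0.
    return both / either if either else 0.0


def stream_threshold_search(query_fp, records, threshold):
    """Yield (id, score) for each streamed record scoring at least `threshold`."""
    for fp_id, fp in records:
        score = tanimoto(query_fp, fp)
        if score >= threshold:
            yield fp_id, score


# The target records are consumed once, as an iterator, and never stored.
records = iter([("a", b"\x0f"), ("b", b"\xf0"), ("c", b"\x07")])
hits = list(stream_threshold_search(b"\x0f", records, 0.7))
```

The trade-off is that every new query has to re-read the target stream, which is why an arena is usually the better choice when there are many queries.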
The chemfp.diversity module contains chemfp’s diversity pickers, all of which require a fingerprint arena. This is a lower-level API than using chemfp.maxmin(), chemfp.heapsweep(), or chemfp.spherex().
The chemfp.clustering module contains the ButinaClusters result from Butina clustering using chemfp.butina().
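For readers unfamiliar with Butina clustering, the following is a naive pure-Python sketch of the general Taylor-Butina idea: find each fingerprint's neighbors at or above a similarity threshold, then repeatedly take the unassigned fingerprint with the most neighbors as a cluster center. It is not chemfp's implementation; butina_sketch and tanimoto are hypothetical names, the tie-breaking rule is an arbitrary choice, and the result is a plain list of index lists rather than a ButinaClusters object.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    both = sum(bin(a & b).count("1") for a, b in zip(fp1, fp2))
    either = sum(bin(a | b).count("1") for a, b in zip(fp1, fp2))
    return both / either if either else 0.0


def butina_sketch(fps, threshold):
    """Return clusters as lists of indices, with the center listed first."""
    n = len(fps)
    # O(n**2) neighbor lists; fine for a sketch, too slow for real data sets.
    neighbors = [
        {j for j in range(n) if j != i and tanimoto(fps[i], fps[j]) >= threshold}
        for i in range(n)
    ]
    # Visit candidates by decreasing neighbor count, breaking ties by index.
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))
    assigned = set()
    clusters = []
    for center in order:
        if center in assigned:
            continue
        members = [center] + sorted(neighbors[center] - assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters


fps = [b"\x0f", b"\x07", b"\x0e", b"\xf0", b"\xe0"]
clusters = butina_sketch(fps, 0.7)
```

Here fingerprints 1 and 2 are both neighbors of 0 but not of each other, so they end up in the cluster centered on 0.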
The chemfp.cdk_toolkit
, chemfp.openbabel_toolkit
,
chemfp.openeye_toolkit
and chemfp.rdkit_toolkit
modules
contain the public-facing API for chemfp’s cheminformatics toolkit
wrapper implementations. The chemfp.cdk
,
chemfp.openbabel
, chemfp.openeye
, chemfp.rdkit
objects will automatically import the underlying toolkit and forward
to them.
The FingerprintType implementations for the different toolkits are:
- CDK
  - chemfp.cdk_types: core CDK toolkit fingerprints
  - chemfp.cdk_patterns: chemfp’s CDK-based fingerprints
  - chemfp.jcmapper_types: jCompoundMapper fingerprints
- RDKit
  - chemfp.rdkit_types: core RDKit toolkit fingerprints
  - chemfp.rdkit_patterns: chemfp’s RDKit-based fingerprints
- OpenEye
  - chemfp.openeye_types: core OEGraphSim fingerprints
  - chemfp.openeye_patterns: chemfp’s OEChem-based fingerprints
- Open Babel
  - chemfp.openbabel_types: core Open Babel toolkit fingerprints
  - chemfp.openbabel_patterns: chemfp’s Open Babel-based fingerprints
Sometimes you need to work with SMILES or SD files as text records, not molecules. For that, use the chemfp.text_toolkit module.
Sometimes you need to work with CSV files containing structure records or fingerprints. For that, use functions like read_csv_ids_and_fingerprints() and read_csv_rows() from the chemfp.csv_readers module, or the read_csv_ids_and_molecules() function in the toolkit wrapper module.
The chemfp.bitops module has functions to work with fingerprints represented as byte strings or hex-encoded strings, as well as functions for configuring chemfp’s bit operations. Use the chemfp.encodings module to decode from various fingerprint string representations to a byte string.
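The relationship between the hex-encoded and byte-string representations can be shown with plain Python, without chemfp. The sketch below decodes a hex string, counts the on-bits, and lists their positions. The helper names popcount and on_bits are hypothetical, and the bit numbering (bit i stored in byte i // 8 at mask 1 << (i % 8)) is an assumption made for this illustration; check the chemfp.bitops documentation for the convention chemfp actually uses.

```python
def popcount(fp):
    """Number of on-bits in a byte-string fingerprint."""
    return sum(bin(byte).count("1") for byte in fp)


def on_bits(fp):
    """Positions of the on-bits.

    Assumed numbering for this sketch: bit i is in byte i // 8,
    tested with the mask 1 << (i % 8).
    """
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(fp)
        for bit in range(8)
        if byte & (1 << bit)
    ]


hex_fp = "0f01"
fp = bytes.fromhex(hex_fp)   # hex-encoded string -> byte string
count = popcount(fp)         # 4 bits in 0x0f plus 1 bit in 0x01
positions = on_bits(fp)
round_trip = fp.hex()        # byte string -> hex-encoded string
```

Working on the byte string avoids re-decoding the hex text for every operation, which is one reason to decode once and keep fingerprints as bytes.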
Finally, the chemfp.types module contains a few public exceptions which derive from ValueError but which don’t yet also derive from ChemFPError.