Fast cheminformatics fingerprint search, anywhere you use Python

Chemfp is an analytics platform for cheminformatics fingerprints. It contains command-line tools and an extensive Python library for fingerprint generation, high-performance similarity search, diversity selection, and exploratory research.

Its market-leading performance and comprehensive API make it easy for you to add fast similarity search anywhere you use Python.

NEW! Chemfp 4.1 was released on 17 May 2023. See the documentation for the full list of notable changes or go to the download page.

Why chemfp?

If that sounds interesting

To get started, install the pre-compiled Linux version of chemfp with:

python -m pip install chemfp -i https://chemfp.com/packages/

A few features are either limited or disabled. Visit the licensing page to see the licensing terms, to request an evaluation key to unlock those features, and to learn about some of the available licensing options.

You do not need to request a license key for Tanimoto searches of the licensed FPB files available from the datasets page, so long as you follow the terms of the Chemfp Base License Agreement.
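For readers new to the term, a Tanimoto search ranks target fingerprints by the ratio of shared "on" bits to total "on" bits. The following is a minimal pure-Python sketch of that idea, not chemfp's API or implementation. Fingerprints are modeled as Python integers, and the names `popcount`, `tanimoto`, and `threshold_search` are hypothetical helpers introduced only for illustration. Returning 0.0 when both fingerprints are empty is a convention chosen for this sketch.

```python
def popcount(x: int) -> int:
    """Count the number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity: |a AND b| / |a OR b| for bit fingerprints."""
    union = popcount(a | b)
    if union == 0:
        # Both fingerprints are empty; this sketch defines the score as 0.0.
        return 0.0
    return popcount(a & b) / union


def threshold_search(query, targets, threshold=0.7):
    """Return (id, score) pairs with score >= threshold, best first.

    `targets` is an iterable of (id, fingerprint) pairs.
    This is a brute-force linear scan with no indexing or pruning.
    """
    hits = []
    for target_id, fp in targets:
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((target_id, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Toy 8-bit fingerprints (made-up data, for illustration only).
toy_targets = [
    ("A", 0b11110000),
    ("B", 0b11100000),
    ("C", 0b00001111),
    ("D", 0b11111100),
]
print(threshold_search(0b11110000, toy_targets, threshold=0.7))
```

A linear scan like this is only practical for small data sets. It is meant to show what is being computed, not how a high-performance search is organized.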

More information

Chemfp includes extensive documentation. For a more scholarly description, see: Dalke, A. The chemfp project. J. Cheminformatics 11, 76 (2019). doi: 10.1186/s13321-019-0398-8

Open source reference baseline for benchmarking

Chemfp 1.6.1 is the latest version of the no-cost/open source chemfp development track. It only supports Python 2.7. It is being maintained in order to provide a good reference baseline to evaluate similarity search performance, and to support the dwindling number of legacy users who haven't moved to Python 3. See the download page for download details.

Some of the many improvements in chemfp 4.1 are: Butina clustering, read/write to SciPy compressed sparse matrix npz files, CXSMILES support, CSV format readers, parallelized sphere exclusion along with new ranking methods, and an improved API for molecule structure and file processing.

Some of the improvements in chemfp 4.0 were MaxMin and sphere exclusion diversity selection, an improved API for notebook use, pandas integration, and support for CSV/TSV output.
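As background on the MaxMin diversity selection mentioned above, the general idea is greedy: repeatedly pick the candidate whose nearest already-picked neighbor is farthest away. The sketch below is a simple pure-Python illustration under stated assumptions, not chemfp's API or its algorithm. Fingerprints are modeled as Python integers, distance is taken as 1 minus Tanimoto similarity, and the names `tanimoto` and `maxmin_pick` are hypothetical.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity for bit fingerprints stored as integers."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen for this sketch
    return bin(a & b).count("1") / union


def maxmin_pick(fps, num_picks, first=0):
    """Greedy MaxMin selection; returns indices into `fps` in pick order.

    Starts from index `first`, then repeatedly adds the candidate whose
    smallest distance to the picked set is largest. Ties go to the
    lowest index.
    """
    if not fps or num_picks <= 0:
        return []
    picks = [first]
    picked = {first}
    # min_dist[i] is the distance from candidate i to its nearest pick.
    min_dist = [1.0 - tanimoto(fps[first], fp) for fp in fps]
    while len(picks) < min(num_picks, len(fps)):
        candidates = [i for i in range(len(fps)) if i not in picked]
        best = max(candidates, key=lambda i: min_dist[i])
        picks.append(best)
        picked.add(best)
        # Update nearest-pick distances with the newly added pick.
        for i, fp in enumerate(fps):
            d = 1.0 - tanimoto(fps[best], fp)
            if d < min_dist[i]:
                min_dist[i] = d
    return picks


# Toy 8-bit fingerprints (made-up data, for illustration only).
toy_fps = [0b11110000, 0b11100000, 0b00001111, 0b11111100]
print(maxmin_pick(toy_fps, 3))
```

This version does quadratic work in the worst case and keeps everything in memory, which is fine for a handful of fingerprints but not for large collections.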