I've just released chemfp 1.6.1, which is the latest version of the no-cost/open source chemfp development track. You can download it from PyPI using:
python -m pip install chemfp
This minor release added specialized POPCNT implementations for all 8-byte-multiple fingerprint lengths up to 1024 bytes, plus a faster implementation for 8-byte-multiple lengths beyond that. Previously there were only specialized implementations for 24-, 64-, 112-, 128-, and 256-byte fingerprints, which are the most common in cheminformatics.
In one benchmark, small fingerprints (<256 bits) are about 20% faster, medium fingerprints (256 to 1024 bits) are about 10% faster, and larger fingerprints are a few percent faster.
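To make concrete what a popcount does in a similarity search, here is a minimal pure-Python sketch of a Tanimoto calculation on byte-string fingerprints. It is not chemfp's implementation, which is written in C and uses the specialized POPCNT code paths described above. The names byte_popcount and byte_tanimoto are made up for this illustration, and returning 0.0 when both fingerprints are empty is an assumption of this sketch.

```python
# Illustrative only: a lookup-table popcount, not chemfp's C implementation.
# Precompute the number of set bits for every possible byte value.
_POPCOUNT_TABLE = [bin(i).count("1") for i in range(256)]


def byte_popcount(fp):
    """Return the number of set bits in a byte-string fingerprint."""
    # bytearray() yields integers on both Python 2.7 and Python 3.
    return sum(_POPCOUNT_TABLE[b] for b in bytearray(fp))


def byte_tanimoto(fp1, fp2):
    """Return the Tanimoto similarity of two equal-length fingerprints."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    a = bytearray(fp1)
    b = bytearray(fp2)
    # Popcount of the bitwise AND and of the bitwise OR, byte by byte.
    intersection = sum(_POPCOUNT_TABLE[x & y] for x, y in zip(a, b))
    union = sum(_POPCOUNT_TABLE[x | y] for x, y in zip(a, b))
    if union == 0:
        # Assumption for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return intersection / float(union)
```

A hardware POPCNT instruction counts the bits of a whole 8-byte word at once, which is why the specialized implementations are organized around 8-byte-multiple fingerprint lengths.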
It also added two new FingerprintArena methods. sample() randomly selects a subset of the fingerprints and returns them in a new arena. train_test_split() returns two randomly selected and disjoint subsets of the arena, typically used as a training set and a test set.
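The idea behind the two methods can be sketched with plain index lists. This is only an illustration of "random subset" and "two disjoint random subsets". It does not call chemfp, and the function names and parameters (sample_indices, train_test_split_indices, train_size, test_size, rng) are hypothetical and should not be read as the actual FingerprintArena signatures.

```python
import random


def sample_indices(n, num_samples, rng=None):
    """Pick num_samples distinct indices from range(n), in sorted order."""
    if rng is None:
        rng = random.Random()
    return sorted(rng.sample(range(n), num_samples))


def train_test_split_indices(n, train_size, test_size, rng=None):
    """Pick two disjoint random index subsets from range(n)."""
    if train_size + test_size > n:
        raise ValueError("train_size + test_size exceeds the number of items")
    if rng is None:
        rng = random.Random()
    # Draw all indices in one sample without replacement, then cut the
    # result in two. The halves cannot overlap.
    picked = rng.sample(range(n), train_size + test_size)
    return sorted(picked[:train_size]), sorted(picked[train_size:])


# Example: split a hypothetical 100-fingerprint arena into 30 + 20.
train, test = train_test_split_indices(100, 30, 20, rng=random.Random(42))
```

Drawing both subsets from a single sample without replacement is the simplest way to guarantee they are disjoint.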
Finally, it fixes a bug in fpcat where using --reorder would write the FPS header twice.
NOTE: This version only supports Python 2.7. The Open Babel, OpenEye, and RDKit toolkits all dropped support for Python 2.7 by 2019. If you need to generate new fingerprints for use with chemfp 1.6.1, perhaps for benchmarking purposes, then install chemfp 3.4 for a Linux-based OS and have it generate the FPS files for you.