chemfp.fps_io module¶
I/O routines for the FPS format.
This is an internal chemfp module. It should not be imported by programs which use the public API. (Let me know if anything else should be part of the public API.)
This module contains class definitions for a few objects which are
returned as part of the public API. The chemfp.open()
function
returns an FPSReader
which reads from an FPS file, and the
open_fingerprint_writer()
function returns an
FPSWriter
to write to an FPS file.
- class chemfp.fps_io.FPSReader(infile, close, metadata, location, block_reader)¶
Bases:
FingerprintReader
FPS file reader
This class implements the
chemfp.FingerprintReader
API. It is also its own context manager, which automatically closes the file when the manager exits.
The public attributes are:
- metadata: Metadata¶
A
chemfp.Metadata
instance with information about the fingerprint type.
- location: Location¶
A
chemfp.io.Location
instance with parser location and state information.
The FPSReader.location only tracks the “lineno” property and (if possible) the “position”, “end_position”, and “position_units” properties.
- closed: bool¶
False if the file is open, else True
- close()¶
Close the file
- count_tanimoto_hits_arena(queries: _typing.FingerprintArena, threshold: float = 0.7)¶
Count the fingerprints which are sufficiently similar to each query fingerprint
Returns a list containing a count for each query fingerprint in the queries arena. The count is the number of fingerprints in the reader which are at least threshold similar to the query fingerprint.
The order of results is the same as the order of the queries.
- Parameters:
queries (a
FingerprintArena
) – query fingerprints
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
- Returns:
list of integer counts, one for each query
- count_tanimoto_hits_fp(query_fp: bytes, threshold: float = 0.7)¶
Count the fingerprints which are sufficiently similar to the query fingerprint
Return the number of fingerprints in the reader which are at least threshold similar to the query fingerprint query_fp.
- Parameters:
query_fp (byte string) – query fingerprint
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
- Returns:
integer count
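As an illustration of what "at least threshold similar" means, here is a minimal pure-Python sketch of Tanimoto hit counting over (id, fp) pairs. The helper names `tanimoto` and `count_hits` are hypothetical and not part of chemfp, and returning 0.0 for two all-zero fingerprints is an assumption of this sketch.

```python
def tanimoto(fp1: bytes, fp2: bytes) -> float:
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    a = int.from_bytes(fp1, "big")
    b = int.from_bytes(fp2, "big")
    union = bin(a | b).count("1")
    if union == 0:
        # Assumption: two empty fingerprints score 0.0.
        return 0.0
    return bin(a & b).count("1") / union


def count_hits(query_fp: bytes, targets, threshold: float = 0.7) -> int:
    """Count the (id, fp) targets scoring at least `threshold` against the query."""
    return sum(1 for _id, fp in targets if tanimoto(query_fp, fp) >= threshold)


targets = [("a", b"\x0f"), ("b", b"\x03"), ("c", b"\xf0")]
print(count_hits(b"\x0f", targets))        # default threshold of 0.7
print(count_hits(b"\x0f", targets, 0.5))   # a lower threshold admits more hits
```

The comparison is inclusive (`>=`), matching the "at least threshold similar" wording above.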
- count_tversky_hits_fp(query_fp: bytes, threshold: float = 0.7, alpha: float = 1.0, beta: float = 1.0)¶
Count the fingerprints which are sufficiently similar to the query fingerprint
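Return the number of fingerprints in the reader which are at least threshold similar to the query fingerprint query_fp, as measured by the Tversky index with the given alpha and beta.
- Parameters:
query_fp (byte string) – query fingerprint
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
alpha (float) – Tversky alpha parameter (default: 1.0)
beta (float) – Tversky beta parameter (default: 1.0)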
- Returns:
integer count
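To show how alpha and beta affect the score, the following sketch computes the standard Tversky index on byte-string fingerprints. The function name `tversky` is hypothetical, and whether chemfp handles edge cases (such as a zero denominator) the same way is an assumption here.

```python
def tversky(query_fp: bytes, target_fp: bytes,
            alpha: float = 1.0, beta: float = 1.0) -> float:
    """Standard Tversky index: c / (c + alpha*only_query + beta*only_target)."""
    if len(query_fp) != len(target_fp):
        raise ValueError("fingerprints must have the same length")
    q = int.from_bytes(query_fp, "big")
    t = int.from_bytes(target_fp, "big")
    common = bin(q & t).count("1")       # bits set in both
    only_q = bin(q & ~t).count("1")      # bits set only in the query
    only_t = bin(t & ~q).count("1")      # bits set only in the target
    denom = common + alpha * only_q + beta * only_t
    if denom == 0:
        # Assumption: an undefined score is reported as 0.0.
        return 0.0
    return common / denom


query, target = b"\x0f", b"\x03"
print(tversky(query, target))                        # alpha = beta = 1.0 is Tanimoto
print(tversky(query, target, alpha=0.5, beta=0.5))   # alpha = beta = 0.5 is Dice
print(tversky(query, target, alpha=0.0, beta=1.0))   # ignores query-only bits
```

With the defaults alpha = beta = 1.0 the Tversky index reduces to the Tanimoto similarity, which is why the default behavior matches count_tanimoto_hits_fp.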
- iter_blocks()¶
This is not part of the public API
- iter_rows()¶
This is not part of the public API
- knearest_tanimoto_search_arena(queries: _typing.FingerprintArena, k: int = 3, threshold: float = 0.0)¶
Find the k-nearest fingerprints which are sufficiently similar to each of the query fingerprints
For each fingerprint in the queries arena, find the fingerprints in this reader which are at least threshold similar to the query fingerprint, and of those, select the top k hits. The hits are returned as a
SearchResults
, where the hits in each
SearchResult
are sorted by similarity score.
- Parameters:
queries (a
FingerprintArena
) – query fingerprints
k (integer) – the number of nearest neighbors to find (default: 3)
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.0)
- Returns:
a SearchResults
- knearest_tanimoto_search_fp(query_fp: bytes, k: int = 3, threshold: float = 0.0)¶
Find the k-nearest fingerprints which are sufficiently similar to the query fingerprint
Find all of the fingerprints in this reader which are at least threshold similar to the query fingerprint, and of those, select the top k hits. The hits are returned as a
SearchResult
, sorted from highest score to lowest.
- Parameters:
query_fp (byte string) – query fingerprint
k (integer) – the number of nearest neighbors to find (default: 3)
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.0)
- Returns:
a SearchResult
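The two-step selection described for the k-nearest search (filter by threshold, then keep the top k) can be sketched in pure Python. The names `tanimoto` and `knearest` are hypothetical helpers, not chemfp functions, and the result here is a plain list of (id, score) pairs rather than a SearchResult.

```python
import heapq


def tanimoto(fp1: bytes, fp2: bytes) -> float:
    """Tanimoto similarity of two equal-length byte-string fingerprints."""
    a = int.from_bytes(fp1, "big")
    b = int.from_bytes(fp2, "big")
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def knearest(query_fp: bytes, targets, k: int = 3, threshold: float = 0.0):
    """Return up to k (id, score) pairs, highest score first."""
    scored = ((tanimoto(query_fp, fp), id_) for id_, fp in targets)
    # Step 1: keep only targets that are at least `threshold` similar.
    passing = (pair for pair in scored if pair[0] >= threshold)
    # Step 2: select the k best; nlargest returns them in descending order.
    best = heapq.nlargest(k, passing, key=lambda pair: pair[0])
    return [(id_, score) for score, id_ in best]


targets = [("a", b"\x0f"), ("b", b"\x03"), ("c", b"\x07"), ("d", b"\xf0")]
print(knearest(b"\x0f", targets, k=2))
print(knearest(b"\x0f", targets, k=3, threshold=0.6))
```

Because the threshold is applied first, fewer than k hits may come back when not enough targets pass it.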
- knearest_tversky_search_fp(query_fp: bytes, k: int = 3, threshold: float = 0.0, alpha: float = 1.0, beta: float = 1.0)¶
Find the k-nearest fingerprints which are sufficiently similar to the query fingerprint
Find all of the fingerprints in this reader which are at least threshold similar to the query fingerprint, and of those, select the top k hits. The hits are returned as a
SearchResult
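, sorted from highest score to lowest.
- Parameters:
query_fp (byte string) – query fingerprint
k (integer) – the number of nearest neighbors to find (default: 3)
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.0)
alpha (float) – Tversky alpha parameter (default: 1.0)
beta (float) – Tversky beta parameter (default: 1.0)
- Returns:
a SearchResult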
- next()¶
Return the next (id, fp) pair
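To make the (id, fp) pair concrete, here is a minimal sketch of reading FPS-style records from a text stream. It assumes the usual FPS layout: header lines starting with "#", then one record per line with a hex-encoded fingerprint, a tab, and the identifier. The function `iter_fps_records` is a hypothetical stand-in, not the FPSReader implementation.

```python
import io


def iter_fps_records(infile):
    """Yield (id, fp) pairs from an FPS-like text stream."""
    for line in infile:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        hex_fp, _, rest = line.rstrip("\n").partition("\t")
        # Assumption: anything after a second tab is extra data and is ignored.
        record_id = rest.split("\t", 1)[0]
        yield record_id, bytes.fromhex(hex_fp)


text = "#FPS1\n#num_bits=16\n0f00\tfirst\nabcd\tsecond\n"
records = iter_fps_records(io.StringIO(text))
print(next(records))
print(list(records))
```

Calling `next()` on the generator consumes one record at a time, which mirrors how a forward-only reader hands out pairs.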
- threshold_tanimoto_search_arena(queries: _typing.FingerprintArena, threshold: float = 0.7)¶
Find the fingerprints which are sufficiently similar to each of the query fingerprints
For each fingerprint in the queries arena, find all of the fingerprints in this reader which are at least threshold similar. The hits are returned as a
SearchResults
, where the hits in each
SearchResult
are in arbitrary order.
- Parameters:
queries (a
FingerprintArena
) – query fingerprints
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
- Returns:
a SearchResults
- threshold_tanimoto_search_fp(query_fp: bytes, threshold: float = 0.7)¶
Find the fingerprints which are sufficiently similar to the query fingerprint
Find all of the fingerprints in this reader which are at least threshold similar to the query fingerprint query_fp. The hits are returned as a
SearchResult
, in arbitrary order.
- Parameters:
query_fp (byte string) – query fingerprint
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
- Returns:
a SearchResult
- threshold_tversky_search_fp(query_fp: bytes, threshold: float = 0.7, alpha: float = 1.0, beta: float = 1.0)¶
Find the fingerprints which are sufficiently similar to the query fingerprint
Find all of the fingerprints in this reader which are at least threshold similar to the query fingerprint query_fp. The hits are returned as a
SearchResult
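, in arbitrary order.
- Parameters:
query_fp (byte string) – query fingerprint
threshold (float between 0.0 and 1.0, inclusive) – minimum similarity threshold (default: 0.7)
alpha (float) – Tversky alpha parameter (default: 1.0)
beta (float) – Tversky beta parameter (default: 1.0)
- Returns:
a SearchResult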
- class chemfp.fps_io.FPSWriter(output, writer, metadata, location=None)¶
Bases:
FingerprintWriter
Write fingerprints in FPS format.
This is a subclass of
chemfp.FingerprintWriter
.
An FPSWriter is its own context manager, and will close the output file on context exit.
The public attributes are:
- metadata: Metadata¶
A
chemfp.Metadata
instance describing the fingerprints being written.
- format: str¶
The string ‘fps’.
- closed: bool¶
False when the file is open, else True.
- location: Location¶
A
chemfp.io.Location
instance which supports the “recno”, “output_recno”, and “lineno” properties.
- close()¶
Close the writer
This will set self.closed to True.
- write_fingerprint(id: str, fp: bytes)¶
Write a single fingerprint record with the given id and fp
- Parameters:
id (string) – the record identifier
fp (bytes) – the fingerprint
- write_fingerprints(id_fp_pairs: Iterable[Tuple[str, bytes]])¶
Write a sequence of fingerprint records
- Parameters:
id_fp_pairs – An iterable of (id, fingerprint) pairs.
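The writer behavior described above (a context manager with a closed flag, plus single-record and bulk writes) can be sketched with a small stand-in class. `SketchFPSWriter` is hypothetical and is not the FPSWriter implementation. It assumes the usual FPS record layout of hex fingerprint, tab, identifier, and it writes only a bare "#FPS1" header line.

```python
import io


class SketchFPSWriter:
    """Toy writer that emits FPS-like text to a file-like object."""

    def __init__(self, output):
        self._output = output
        self.format = "fps"
        self.closed = False
        self._output.write("#FPS1\n")

    def write_fingerprint(self, id, fp):
        if self.closed:
            raise ValueError("write to a closed writer")
        # One record per line: hex-encoded fingerprint, tab, identifier.
        self._output.write(f"{fp.hex()}\t{id}\n")

    def write_fingerprints(self, id_fp_pairs):
        for id, fp in id_fp_pairs:
            self.write_fingerprint(id, fp)

    def close(self):
        # The underlying stream is left open here so the example can read it back.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


buffer = io.StringIO()
with SketchFPSWriter(buffer) as writer:
    writer.write_fingerprint("first", b"\x0f\x00")
    writer.write_fingerprints([("second", b"\xab\xcd")])
print(writer.closed)
print(buffer.getvalue(), end="")
```

Leaving the `with` block calls `close()`, after which `closed` is True and further writes raise an error in this sketch.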