chemfp.highlevel.simarray module

This module should not be imported directly.

It contains internal implementation details of the high-level API available from the top-level chemfp module.

This module is included in the documentation because parts of this module are returned to the user, and are part of the public API.

class chemfp.highlevel.simarray.SimarrayResult(processor, *, times=None, queries_close=None, targets_close=None)

Bases: SimarrayContent

Stores the result of calling chemfp.simarray().

It contains the input parameters, NumPy array, and timings.

In addition, it is a context manager for any files which may have been opened.

The public attributes are:

processor: SimarrayProcessor

The SimarrayProcessor used to create the array.

query_fp: bytes

The query fingerprint (if specified).

out: a NumPy array

The NumPy array containing the comparisons.

num_bits: int

The number of bits in the fingerprints.

dtype_str: str

A string describing the dtype of the output array.

metric: SimarrayMetric

A SimarrayMetric describing the full metric parameters.

matrix_type: str

One of the following strings, describing the content of self.out:

  • “N” if a 1-D vector containing the comparisons between a single query fingerprint and a set of targets;

  • “NxM” if a 2-D array containing the comparisons between a set of queries and a set of targets;

  • “NxN” if a 2-D array containing the full comparisons between a set of fingerprints and itself;

  • “upper-triangular” if a 2-D array containing the diagonal and upper-triangle comparisons between a set of fingerprints and itself. The lower triangle is left as the default zero value.
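As a minimal sketch of how the four layouts might be walked, the helper below yields one (row id, column id, score) triple per stored comparison. It relies only on the attributes documented here. The helper name iter_scores, the stand-in object, and its identifiers and scores are hypothetical, and it assumes that the target identifiers label both axes in the “NxN” and “upper-triangular” cases.

```python
from types import SimpleNamespace


def iter_scores(result):
    """Yield (row_id, column_id, score) for each stored comparison."""
    out = result.out
    target_ids = result.target_ids
    kind = result.matrix_type

    if kind == "N":
        # 1-D vector: one score per target, no query identifier available.
        for j, target_id in enumerate(target_ids):
            yield None, target_id, out[j]
        return

    # Assumption: in the symmetric cases the targets label both axes.
    row_ids = result.query_ids if kind == "NxM" else target_ids
    for i, row_id in enumerate(row_ids):
        # Skip the zero-filled lower triangle in the upper-triangular case.
        start = i if kind == "upper-triangular" else 0
        for j in range(start, len(target_ids)):
            yield row_id, target_ids[j], out[i][j]


# Hypothetical stand-in for a SimarrayResult; the values are made up.
demo = SimpleNamespace(
    matrix_type="upper-triangular",
    query_ids=None,
    target_ids=["a", "b", "c"],
    out=[[1.0, 0.5, 0.25],
         [0.0, 1.0, 0.75],
         [0.0, 0.0, 1.0]],
)
pairs = list(iter_scores(demo))
```

The out[i][j] indexing works for both nested lists and NumPy arrays, so the same helper should accept a real result object, provided the assumptions above hold.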

times: dict[str, float | None]

A dictionary of timings for the different processing stages, in seconds (as a float) or None if not relevant. The keys are:

  • load_queries - the time to load the queries

  • load_targets - the time to load the targets

  • init - the time to initialize the SimarrayProcessor

  • process - the time to generate the full array

  • total - the total time
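Because any stage may be None, code that reports the timings needs to skip the missing entries. The following is a sketch only: summarize_times is a hypothetical helper, not part of chemfp, and the sample values are invented.

```python
def summarize_times(times):
    """Format the non-None entries of a times dictionary, in stage order."""
    keys = ("load_queries", "load_targets", "init", "process", "total")
    parts = []
    for key in keys:
        seconds = times.get(key)
        if seconds is None:
            # Stage was not relevant, e.g. no separate query set was loaded.
            continue
        parts.append(f"{key}={seconds:.2f}s")
    return ", ".join(parts)


# Invented sample values shaped like the documented dictionary.
sample_times = {
    "load_queries": None,
    "load_targets": 0.5,
    "init": 0.25,
    "process": 2.0,
    "total": 2.75,
}
summary = summarize_times(sample_times)
```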

closed: bool

True if close() was called or the context manager exited, otherwise False. If True then the processor will be None.

close() → None

Close any files which may be open and set the processor to None.

If queries or targets is a memory-mapped FPB file then the respective arena keeps an open file handle so fingerprint and identifier lookups continue to work.

Call close() to close them explicitly, or use this object as a context manager to close them when exiting the context.

The close() method also sets the processor to None because its query and target arenas may refer to those open files.

The close() method may be called multiple times.
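The close semantics described above can be illustrated with a small stand-in class. This is not chemfp's implementation: StandInResult is hypothetical and mimics only the documented behaviour, assuming that closed becomes True and processor becomes None once close() has run.

```python
class StandInResult:
    """Hypothetical stand-in that mimics only the documented close() behaviour."""

    def __init__(self, processor):
        self.processor = processor
        self.closed = False

    def close(self):
        # Dropping the processor releases any arenas that hold open files.
        # Nothing here fails on a repeated call.
        self.processor = None
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions


with StandInResult(processor=object()) as result:
    # Inside the context the processor is still available.
    inside = (result.closed, result.processor is not None)

result.close()  # a second close is harmless
after = (result.closed, result.processor is None)
```

With a real result, anything that depends on the processor or the arenas is best read inside the with block, before the context exits.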

get_description(include_times: bool = True) → str

Return a human-readable description of the simarray generation.

Parameters:

include_times (bool) – if True (the default), include the array generation time and the total time.

Returns:

str

get_times_description() → str

Return a string containing a human-readable description of the timing details.

property queries

The query arena (if present).

Returns None if the SimarrayResult is closed.

property query_ids

The query identifiers if present, otherwise None.

property target_ids

The target identifiers.

property targets

The target arena.

This is also the arena used in NxN generation.

Returns None if the SimarrayResult is closed.