chemfp.highlevel.diversity module

This module should not be imported directly.

It contains internal implementation details of the high-level API available from the top-level chemfp module.

This module is documented because some of its classes are returned to the user and are therefore part of the public API.

class chemfp.highlevel.diversity.BaseHeapSweepSearch(candidates, seed, num_picked, threshold, times, picker, result, candidates_close)

Bases: _CloseArenas

This is the base class for objects returned by chemfp.heapsweep()

It contains the query parameters, search results, and timings.

In addition, it is a context manager for any files which may have been opened.
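
The context-manager behavior can be pictured with a small stand-in. This is only a sketch and not chemfp code: ArenaStandIn and SearchStandIn are hypothetical classes, and it assumes that a true candidates_close flag means the search object opened the arena itself and should close it on exit.

```python
class ArenaStandIn:
    """Hypothetical stand-in for an arena backed by an open file."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SearchStandIn:
    """Hypothetical stand-in for a search result object; not chemfp code."""

    def __init__(self, candidates, candidates_close):
        self.candidates = candidates
        # Assumption: True means this object opened the arena and owns it.
        self.candidates_close = candidates_close

    def close(self):
        # Only close arenas that this object is responsible for.
        if self.candidates_close:
            self.candidates.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions


owned = ArenaStandIn()
with SearchStandIn(owned, candidates_close=True) as search:
    pass  # work with the search results here

borrowed = ArenaStandIn()
with SearchStandIn(borrowed, candidates_close=False):
    pass  # the caller keeps ownership of this arena
```

Under these assumptions, leaving the with block closes only the arena that the search object owns; an arena passed in by the caller stays open.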

candidates: chemfp.arena.FingerprintArena
get_description()

Return a human-readable description of the heapsweep run

num_picked: int
picker: HeapSweepPicker
property picks

Shortcut for self.picker.picks.

result: Picks | PicksAndScores
seed: int
threshold: float
times: dict
class chemfp.highlevel.diversity.BaseSpherexSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, result, times, candidates_close, references_close)

Bases: _CloseArenas

This is the base class for objects returned by chemfp.spherex()

It contains the query parameters, search results, and timings.

In addition, it is a context manager for any files which may have been opened.

candidates: chemfp.arena.FingerprintArena
direction_msg: str
get_description()

Return a human-readable description of the spherex run

num_initial_picks: int | None
num_picked: int
picker: SphereExclusionPicker
references: chemfp.arena.FingerprintArena | None
result: Picks | PicksAndCounts | PicksAndNeighbors
seed: int
threshold: float
times: dict
class chemfp.highlevel.diversity.HeapSweepScoreSearch(candidates, seed, num_picked, threshold, times, picker, result, candidates_close)

Bases: BaseHeapSweepSearch

heapsweep() returns an instance of this class when include_scores=True

as_ctypes()

Return a ctypes view of the underlying hit data

Shortcut for heapsweep_search.result.as_ctypes(). See PicksAndScores.as_ctypes().

The view is a PickAndScore array with attributes named “candidate_idx” and “score”.
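
To show what reading such a view looks like, here is a minimal sketch built on the standard ctypes module. PickAndScoreStandIn is a hypothetical structure: only the field names come from the text above, while the C types (c_int and c_double) are assumptions, and the values are made up.

```python
import ctypes


class PickAndScoreStandIn(ctypes.Structure):
    # Field names follow the text; the C types are assumptions.
    _fields_ = [
        ("candidate_idx", ctypes.c_int),
        ("score", ctypes.c_double),
    ]


# Build a small array of structures with made-up values.
ArrayType = PickAndScoreStandIn * 3
view = ArrayType(
    PickAndScoreStandIn(5, 0.25),
    PickAndScoreStandIn(12, 0.5),
    PickAndScoreStandIn(7, 0.75),
)

# Each element exposes its fields as attributes.
indices = [item.candidate_idx for item in view]
scores = [item.score for item in view]
```

With a real search object, the array would come from as_ctypes() rather than being built by hand.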

as_numpy()

Return a NumPy view of the underlying hit data

Shortcut for heapsweep_search.result.as_numpy(). See PicksAndScores.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “score”.

get_ids()

Return a list of identifiers for the picks

Shortcut for heapsweep_search.result.get_ids(). See PicksAndScores.get_ids().

get_ids_and_scores()

Return a tuple of (id, score) for the picks

Shortcut for heapsweep_search.result.get_ids_and_scores(). See PicksAndScores.get_ids_and_scores().

get_indices()

Return a list of indices into the candidate arena for the picks

Shortcut for heapsweep_search.result.get_indices(). See PicksAndScores.get_indices().

get_indices_and_scores()

Return a tuple of (arena indices, score) for the picks

Shortcut for heapsweep_search.result.get_indices_and_scores(). See PicksAndScores.get_indices_and_scores().

get_scores()

Return a list of scores for the picks

Shortcut for heapsweep_search.result.get_scores(). See PicksAndScores.get_scores().

result: PicksAndScores
to_pandas(columns=['pick_id', 'score'])

Return a pandas DataFrame with the pick ids and scores

Shortcut for heapsweep_search.result.to_pandas(). See PicksAndScores.to_pandas().

The first column contains the ids, the second column contains the scores. The default column headers are “pick_id” and “score”. Use columns to specify different headers.

Parameters:

columns (a list of two strings) – column names for the returned DataFrame

Returns:

a pandas DataFrame

class chemfp.highlevel.diversity.HeapSweepSearch(candidates, seed, num_picked, threshold, times, picker, result, candidates_close)

Bases: BaseHeapSweepSearch

heapsweep() returns an instance of this class when include_scores=False

as_ctypes()

Return a ctypes view of the underlying pick data

Shortcut for heapsweep_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()

Return a NumPy view of the underlying pick data

Shortcut for heapsweep_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()

Return a list of ids for each pick

Shortcut for heapsweep_search.result.get_ids(). See Picks.get_ids().

get_indices()

Return a list of indices into the candidates arena for each pick

Shortcut for heapsweep_search.result.get_indices(). See Picks.get_indices().

result: Picks
to_pandas(*, column='pick_id')

Return the pick ids as a pandas DataFrame

Shortcut for heapsweep_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:

column (a string) – the column header for the pick ids

Returns:

a pandas DataFrame

class chemfp.highlevel.diversity.MaxMinScoreSearch(candidates, references, seed, num_picked, threshold, times, picker, result, candidates_close, references_close)

Bases: BaseMaxMinSearch

maxmin() returns an instance of this type when include_scores=True

as_ctypes()

Return a ctypes view of the underlying hit data

Shortcut for maxmin_search.result.as_ctypes(). See PicksAndScores.as_ctypes().

The view is a PickAndScore array with attributes named “candidate_idx” and “score”.

as_numpy()

Return a NumPy view of the underlying hit data

Shortcut for maxmin_search.result.as_numpy(). See PicksAndScores.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “score”.

get_ids()

Return a list of identifiers for the picks

Shortcut for maxmin_search.result.get_ids(). See PicksAndScores.get_ids().

get_ids_and_scores()

Return a tuple of (id, score) for the picks

Shortcut for maxmin_search.result.get_ids_and_scores(). See PicksAndScores.get_ids_and_scores().

get_indices()

Return a list of indices into the candidate arena for the picks

Shortcut for maxmin_search.result.get_indices(). See PicksAndScores.get_indices().

get_indices_and_scores()

Return a tuple of (arena indices, score) for the picks

Shortcut for maxmin_search.result.get_indices_and_scores(). See PicksAndScores.get_indices_and_scores().

get_scores()

Return a list of scores for the picks

Shortcut for maxmin_search.result.get_scores(). See PicksAndScores.get_scores().

result: PicksAndScores
to_pandas(columns=['pick_id', 'score'])

Return a pandas DataFrame with the pick ids and scores

Shortcut for maxmin_search.result.to_pandas(). See PicksAndScores.to_pandas().

The first column contains the ids, the second column contains the scores. The default column headers are “pick_id” and “score”. Use columns to specify different headers.

Parameters:

columns (a list of two strings) – column names for the returned DataFrame

Returns:

a pandas DataFrame

class chemfp.highlevel.diversity.MaxMinSearch(candidates, references, seed, num_picked, threshold, times, picker, result, candidates_close, references_close)

Bases: BaseMaxMinSearch

maxmin() returns an instance of this type when include_scores=False

as_ctypes()

Return a ctypes view of the underlying pick data

Shortcut for maxmin_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()

Return a NumPy view of the underlying pick data

Shortcut for maxmin_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()

Return a list of ids for each pick

Shortcut for maxmin_search.result.get_ids(). See Picks.get_ids().

get_indices()

Return a list of indices into the candidates arena for each pick

Shortcut for maxmin_search.result.get_indices(). See Picks.get_indices().

result: Picks
to_pandas(*, column='pick_id')

Return the pick ids as a pandas DataFrame

Shortcut for maxmin_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:

column (a string) – the column header for the pick ids

Returns:

a pandas DataFrame

class chemfp.highlevel.diversity.SpherexCountSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, result, times, candidates_close, references_close)

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_counts is True

get_counts()

Return the array of counts for the picks

Shortcut for spherex_search.result.get_counts(). See PicksAndCounts.get_counts().

get_ids()

Return a list of ids for each pick

Shortcut for spherex_search.result.get_ids(). See PicksAndCounts.get_ids().

get_ids_and_counts()

Return a list of (pick id, count) for each pick

Shortcut for spherex_search.result.get_ids_and_counts(). See PicksAndCounts.get_ids_and_counts().

get_indices()

Return a list of indices into the candidates arena for each pick

Shortcut for spherex_search.result.get_indices(). See PicksAndCounts.get_indices().

get_indices_and_counts()

Return a list of (arena index, count) for each pick

Shortcut for spherex_search.result.get_indices_and_counts(). See PicksAndCounts.get_indices_and_counts().

to_pandas(columns=['pick_id', 'count'])

Return a pandas DataFrame with the pick ids and sphere exclusion counts.

Shortcut for spherex_search.result.to_pandas(). See PicksAndCounts.to_pandas().

The first column contains the ids, the second column contains the sphere exclusion counts. The default column headers are “pick_id” and “count”. Use columns to specify different headers.

Parameters:

columns (a list of two strings) – column names for the returned DataFrame

Returns:

a pandas DataFrame

class chemfp.highlevel.diversity.SpherexNeighborSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, result, times, candidates_close, references_close)

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_neighbors is True

get_all_neighbors()

Return the list of all neighbors for each pick

Shortcut for spherex_search.result.get_all_neighbors(). See PicksAndNeighbors.get_all_neighbors().

get_counts()

Return the array of counts for the picks

Shortcut for spherex_search.result.get_counts(). See PicksAndNeighbors.get_counts().

get_ids_and_counts()

Return a list of (pick id, count) for each pick

Shortcut for spherex_search.result.get_ids_and_counts(). See PicksAndNeighbors.get_ids_and_counts().

get_ids_and_neighbors()

Return a tuple of (pick id, neighbors) for each pick

Shortcut for spherex_search.result.get_ids_and_neighbors(). See PicksAndNeighbors.get_ids_and_neighbors().

get_indices_and_counts()

Return a list of (pick index, count) for each pick

Shortcut for spherex_search.result.get_indices_and_counts(). See PicksAndNeighbors.get_indices_and_counts().

get_indices_and_neighbors()

Return a tuple of (candidate arena index, neighbors) for each pick

Shortcut for spherex_search.result.get_indices_and_neighbors(). See PicksAndNeighbors.get_indices_and_neighbors().

to_pandas(*, columns=['pick_id', 'neighbor_id', 'score'], empty=('*', None))

Return a pandas DataFrame with pick id and its sphere neighbor ids and scores

Shortcut for spherex_search.result.to_pandas(). See PicksAndNeighbors.to_pandas().

Each pick has zero or more neighbors. Each neighbor becomes a row in the output table, with the pick id in the first column, the neighbor id in the second, and the hit score in the third.

The default column headers are “pick_id”, “neighbor_id” and “score”. Use columns to specify different headers.

If a pick has no neighbors then by default a row is added with the pick id, ‘*’ as the neighbor id, and None as the score (which pandas treats as an NA value).

Use empty to specify different behavior for picks with no neighbors. If empty is None then no row is added to the table. If empty is a 2-element tuple then the first element is used as the neighbor id and the second as the score.

Parameters:

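columns (a list of three strings) – column names for the returned DataFrame
empty (None or a 2-element tuple) – the neighbor id and score to use for a pick with no neighbors, or None to add no row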

Returns:

a pandas DataFrame
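
The row layout described for this to_pandas(), including the handling of empty, can be sketched in plain Python. This is an illustration only and not how chemfp builds the DataFrame: neighbor_rows is a hypothetical helper, and the pick ids, neighbor ids and scores are made up.

```python
def neighbor_rows(picks, empty=("*", None)):
    """Expand (pick_id, neighbors) pairs into (pick_id, neighbor_id, score) rows.

    Hypothetical helper; `picks` is a list of (pick_id, [(neighbor_id, score), ...]).
    """
    rows = []
    for pick_id, neighbors in picks:
        if neighbors:
            # One output row per neighbor of this pick.
            for neighbor_id, score in neighbors:
                rows.append((pick_id, neighbor_id, score))
        elif empty is not None:
            # A pick without neighbors gets a single placeholder row.
            placeholder_id, placeholder_score = empty
            rows.append((pick_id, placeholder_id, placeholder_score))
        # If empty is None, a pick without neighbors adds no row at all.
    return rows


picks = [("A", [("B", 0.5), ("C", 0.25)]), ("D", [])]

default_rows = neighbor_rows(picks)
skipped_rows = neighbor_rows(picks, empty=None)
custom_rows = neighbor_rows(picks, empty=("none", 0.0))
```

With the default, pick "D" contributes the row ("D", "*", None); with empty=None it contributes nothing, so the table holds only the two rows for pick "A".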

class chemfp.highlevel.diversity.SpherexSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, result, times, candidates_close, references_close)

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_counts and include_neighbors are False

as_ctypes()

Return a ctypes view of the underlying pick data

Shortcut for spherex_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()

Return a NumPy view of the underlying pick data

Shortcut for spherex_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()

Return a list of ids for each pick

Shortcut for spherex_search.result.get_ids(). See Picks.get_ids().

get_indices()

Return a list of indices into the candidates arena for each pick

Shortcut for spherex_search.result.get_indices(). See Picks.get_indices().

to_pandas(*, column='pick_id')

Return the pick ids as a pandas DataFrame

Shortcut for spherex_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:

column (a string) – the column header for the pick ids

Returns:

a pandas DataFrame