chemfp.highlevel.diversity module¶

This module should not be imported directly.

It contains internal implementation details of the high-level API available from the top-level chemfp module.

This module is included in the documentation because parts of this module are returned to the user, and are part of the public API.

class chemfp.highlevel.diversity.BaseHeapSweepSearch(candidates, seed, num_picked, threshold, times, picker, out, candidates_close)¶

Bases: _CloseArenas

This is the base class for objects returned by chemfp.heapsweep()

It contains the query parameters, search result output, and timings.

In addition, it is a context manager for any files which may have been opened.
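The context-manager behavior can be pictured with a small stand-in. This is a minimal sketch, not chemfp code: `FakeArena` and `SearchSketch` are hypothetical names, and reading `candidates_close` as a flag meaning "this object is responsible for closing the arena" is an assumption made for illustration.

```python
class FakeArena:
    """Hypothetical stand-in for an arena backed by an open file."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SearchSketch:
    """Hypothetical stand-in for a search object that may own its arena."""

    def __init__(self, candidates, candidates_close):
        self.candidates = candidates
        # Assumption: True means this object should close the arena on exit.
        self._candidates_close = candidates_close

    def close(self):
        if self._candidates_close:
            self.candidates.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions raised in the with-block


arena = FakeArena()
with SearchSketch(arena, candidates_close=True) as search:
    pass  # work with the search object here

# Leaving the with-block closed the arena the search was responsible for.
arena_closed = arena.closed
```

Using the search object in a `with` statement means any file opened on the caller's behalf is released even if the block raises an exception.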

candidates: chemfp.arena.FingerprintArena¶

get_description()¶: Return a human-readable description of the heapsweep run

num_picked: int¶

out: Picks | PicksAndScores¶

picker: HeapSweepPicker¶

property picks¶: Shortcut for self.picker.picks.

property result¶

pre-chemfp 5.0 attribute to get the search result output.

The original API used “result” to store the low-level search results. This led to confusion when using “result.result”.

chemfp 4.2 introduced “.out” as an alternative, inspired by the NumPy “out” parameter. With chemfp 5.0 “.out” became the preferred way to access the search result output.
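One common way to keep an older attribute name working alongside a newer one is a read-only property that forwards to it. The sketch below shows that pattern only; `AliasSketch` is a hypothetical class, and nothing here is taken from the chemfp implementation.

```python
class AliasSketch:
    """Hypothetical object exposing the same data under two names."""

    def __init__(self, out):
        self.out = out  # the preferred name

    @property
    def result(self):
        # Older name, forwarded to .out so existing code keeps working.
        return self.out


search = AliasSketch(out=["pick-1", "pick-2"])

# Both names refer to the same underlying object.
same_object = search.result is search.out
```

With this layout, new code can use `search.out` while older code that reads `search.result` sees the same object.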

seed: int¶

threshold: float¶

times: dict¶

class chemfp.highlevel.diversity.BaseSpherexSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, out, times, candidates_close, references_close)¶

Bases: _CloseArenas

This is the base class for objects returned by chemfp.spherex()

It contains the query parameters, search result output, and timings.

In addition, it is a context manager for any files which may have been opened.

candidates: chemfp.arena.FingerprintArena¶

direction_msg: str¶

get_description()¶: Return a human-readable description of the spherex run

num_initial_picks: int | None¶

num_picked: int¶

out: Picks | PicksAndCounts | PicksAndNeighbors¶

picker: SphereExclusionPicker¶

references: chemfp.arena.FingerprintArena | None¶

property result¶

pre-chemfp 5.0 attribute to get the search result output.

The original API used “result” to store the low-level search results. This led to confusion when using “result.result”.

chemfp 4.2 introduced “.out” as an alternative, inspired by the NumPy “out” parameter. With chemfp 5.0 “.out” became the preferred way to access the search result output.

seed: int¶

threshold: float¶

times: dict¶

class chemfp.highlevel.diversity.HeapSweepScoreSearch(candidates, seed, num_picked, threshold, times, picker, out, candidates_close)¶

Bases: BaseHeapSweepSearch

heapsweep() returns an instance of this class when include_scores=True

as_ctypes()¶

Return a ctypes view of the underlying hit data

Shortcut for heapsweep_search.result.as_ctypes(). See PicksAndScores.as_ctypes().

The view is a PickAndScore array with attributes named candidate_idx and score.
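To make "an array with attributes named candidate_idx and score" concrete, here is a minimal sketch using only the standard `ctypes` module. The structure name `PickAndScoreSketch` and the C field types (`c_int`, `c_double`) are assumptions for illustration; only the two field names come from the description above. The values are made up.

```python
import ctypes


class PickAndScoreSketch(ctypes.Structure):
    # Field names follow the documentation; the C types are assumed.
    _fields_ = [
        ("candidate_idx", ctypes.c_int),
        ("score", ctypes.c_double),
    ]


# Build a two-element array of the structure with made-up values.
ArrayType = PickAndScoreSketch * 2
picks = ArrayType(PickAndScoreSketch(5, 0.25), PickAndScoreSketch(17, 0.5))

# Each element exposes its fields as attributes.
pairs = [(pick.candidate_idx, pick.score) for pick in picks]
```

A view returned by a real search would be read the same way, by iterating over the array and using attribute access on each element.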

as_numpy()¶

Return a numpy view of the underlying hit data

Shortcut for heapsweep_search.result.as_numpy(). See PicksAndScores.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “score”.

get_ids()¶

Return a list of identifiers for the picks

Shortcut for heapsweep_search.result.get_ids(). See PicksAndScores.get_ids().

get_ids_and_scores()¶

Return (id, score) pairs, one for each pick

Shortcut for heapsweep_search.result.get_ids_and_scores(). See PicksAndScores.get_ids_and_scores().

get_indices()¶

Return a list of indices into the candidate arena for the picks

Shortcut for heapsweep_search.result.get_indices(). See PicksAndScores.get_indices().

get_indices_and_scores()¶

Return (arena index, score) pairs, one for each pick

Shortcut for heapsweep_search.result.get_indices_and_scores(). See PicksAndScores.get_indices_and_scores().

get_scores()¶

Return a list of scores for the picks

Shortcut for heapsweep_search.result.get_scores(). See PicksAndScores.get_scores().

out: PicksAndScores¶

to_pandas(columns=['pick_id', 'score'])¶

Return a pandas DataFrame with the pick ids and scores

Shortcut for heapsweep_search.result.to_pandas(). See PicksAndScores.to_pandas().

The first column contains the ids, the second column contains the scores. The default column headers are “pick_id” and “score”. Use columns to specify different headers.

Parameters:: columns (a list of two strings) – column names for the returned DataFrame
Returns:: a pandas DataFrame

class chemfp.highlevel.diversity.HeapSweepSearch(candidates, seed, num_picked, threshold, times, picker, out, candidates_close)¶

Bases: BaseHeapSweepSearch

heapsweep() returns an instance of this class when include_scores=False

as_ctypes()¶

Return a ctypes view of the underlying pick data

Shortcut for heapsweep_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()¶

Return a NumPy view of the underlying pick data

Shortcut for heapsweep_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()¶

Return a list of ids for each pick

Shortcut for heapsweep_search.result.get_ids(). See Picks.get_ids().

get_indices()¶

Return a list of indices into the candidates arena for each pick

Shortcut for heapsweep_search.result.get_indices(). See Picks.get_indices().

out: Picks¶

to_pandas(*, column='pick_id')¶

Return the pick ids as a pandas DataFrame

Shortcut for heapsweep_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:: column (a string) – the column header for the pick ids
Returns:: a pandas DataFrame

class chemfp.highlevel.diversity.MaxMinScoreSearch(candidates, references, seed, num_picked, threshold, times, picker, out, candidates_close, references_close)¶

Bases: BaseMaxMinSearch

maxmin() returns an instance of this type when include_scores=True

as_ctypes()¶

Return a ctypes view of the underlying hit data

Shortcut for maxmin_search.result.as_ctypes(). See PicksAndScores.as_ctypes().

The view is a PickAndScore array with attributes named candidate_idx and score.

as_numpy()¶

Return a numpy view of the underlying hit data

Shortcut for maxmin_search.result.as_numpy(). See PicksAndScores.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “score”.

get_ids()¶

Return a list of identifiers for the picks

Shortcut for maxmin_search.result.get_ids(). See PicksAndScores.get_ids().

get_ids_and_scores()¶

Return (id, score) pairs, one for each pick

Shortcut for maxmin_search.result.get_ids_and_scores(). See PicksAndScores.get_ids_and_scores().

get_indices()¶

Return a list of indices into the candidate arena for the picks

Shortcut for maxmin_search.result.get_indices(). See PicksAndScores.get_indices().

get_indices_and_scores()¶

Return (arena index, score) pairs, one for each pick

Shortcut for maxmin_search.result.get_indices_and_scores(). See PicksAndScores.get_indices_and_scores().

get_scores()¶

Return a list of scores for the picks

Shortcut for maxmin_search.result.get_scores(). See PicksAndScores.get_scores().

out: PicksAndScores¶

to_pandas(columns=['pick_id', 'score'])¶

Return a pandas DataFrame with the pick ids and scores

Shortcut for maxmin_search.result.to_pandas(). See PicksAndScores.to_pandas().

The first column contains the ids, the second column contains the scores. The default column headers are “pick_id” and “score”. Use columns to specify different headers.

Parameters:: columns (a list of two strings) – column names for the returned DataFrame
Returns:: a pandas DataFrame

class chemfp.highlevel.diversity.MaxMinSearch(candidates, references, seed, num_picked, threshold, times, picker, out, candidates_close, references_close)¶

Bases: BaseMaxMinSearch

maxmin() returns an instance of this type when include_scores=False

as_ctypes()¶

Return a ctypes view of the underlying pick data

Shortcut for maxmin_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()¶

Return a NumPy view of the underlying pick data

Shortcut for maxmin_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()¶

Return a list of ids for each pick

Shortcut for maxmin_search.result.get_ids(). See Picks.get_ids().

get_indices()¶

Return a list of indices into the candidates arena for each pick

Shortcut for maxmin_search.result.get_indices(). See Picks.get_indices().

out: Picks¶

to_pandas(*, column='pick_id')¶

Return the pick ids as a pandas DataFrame

Shortcut for maxmin_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:: column (a string) – the column header for the pick ids
Returns:: a pandas DataFrame

class chemfp.highlevel.diversity.SpherexCountSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, out, times, candidates_close, references_close)¶

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_counts is True

get_counts()¶

Return the array of counts for the picks

Shortcut for spherex_search.result.get_counts(). See PicksAndCounts.get_counts().

get_ids()¶

Return a list of pick ids for each pick

Shortcut for spherex_search.result.get_ids(). See PicksAndCounts.get_ids().

get_ids_and_counts()¶

Return a list of (pick id, count) for each pick

Shortcut for spherex_search.result.get_ids_and_counts(). See PicksAndCounts.get_ids_and_counts().

get_indices()¶

Return a list of indices into the candidates arena for each pick

Shortcut for spherex_search.result.get_indices(). See PicksAndCounts.get_indices().

get_indices_and_counts()¶

Return a list of (arena index, count) for each pick

Shortcut for spherex_search.result.get_indices_and_counts(). See PicksAndCounts.get_indices_and_counts().

to_pandas(columns=['pick_id', 'count'])¶

Return a pandas DataFrame with the pick ids and sphere exclusion counts.

Shortcut for spherex_search.result.to_pandas(). See PicksAndCounts.to_pandas().

The first column contains the ids, the second column contains the sphere exclusion counts. The default column headers are “pick_id” and “count”. Use columns to specify different headers.

Parameters:: columns (a list of two strings) – column names for the returned DataFrame
Returns:: a pandas DataFrame

class chemfp.highlevel.diversity.SpherexNeighborSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, out, times, candidates_close, references_close)¶

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_neighbors is True

get_all_neighbors()¶

Return the list of all neighbors for each pick

Shortcut for spherex_search.result.get_all_neighbors(). See PicksAndNeighbors.get_all_neighbors().

get_counts()¶

Return the array of counts for the picks

Shortcut for spherex_search.result.get_counts(). See PicksAndNeighbors.get_counts().

get_ids_and_counts()¶

Return a list of (pick id, count) for each pick

Shortcut for spherex_search.result.get_ids_and_counts(). See PicksAndNeighbors.get_ids_and_counts().

get_ids_and_neighbors()¶

Return a tuple of (pick id, neighbors) for each pick

Shortcut for spherex_search.result.get_ids_and_neighbors(). See PicksAndNeighbors.get_ids_and_neighbors().

get_indices_and_counts()¶

Return a list of (pick index, count) for each pick

Shortcut for spherex_search.result.get_indices_and_counts(). See PicksAndNeighbors.get_indices_and_counts().

get_indices_and_neighbors()¶

Return a tuple of (candidate arena index, neighbors) for each pick

Shortcut for spherex_search.result.get_indices_and_neighbors(). See PicksAndNeighbors.get_indices_and_neighbors().

to_pandas(*, columns=['pick_id', 'neighbor_id', 'score'], empty=('*', None))¶

Return a pandas DataFrame with pick id and its sphere neighbor ids and scores

Shortcut for spherex_search.result.to_pandas(). See PicksAndNeighbors.to_pandas().

Each pick has zero or more neighbors. Each neighbor becomes a row in the output table, with the pick id in the first column, the neighbor id in the second, and the hit score in the third.

The default column headers are “pick_id”, “neighbor_id” and “score”. Use columns to specify different headers.

If a pick has no neighbors then by default a row is added with the pick id, ‘*’ as the neighbor id, and None as the score (which pandas will treat as a NA value).

Use empty to specify different behavior for picks with no neighbors. If empty is None then no row is added to the table. If empty is a 2-element tuple the first element is used as the neighbor id and the second is used as the score.

Parameters:: columns (a list of three strings) – column names for the returned DataFrame
Returns:: a pandas DataFrame
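The row layout and the effect of empty can be sketched in plain Python, without pandas. This is an illustration of the described behavior, not the library's implementation: the helper `neighbor_rows`, the input shape (a list of `(pick_id, neighbors)` pairs), and all ids and scores are hypothetical.

```python
def neighbor_rows(picks, empty=("*", None)):
    """Build (pick_id, neighbor_id, score) rows, one row per neighbor.

    A pick with no neighbors gets a single placeholder row taken from
    `empty`, or no row at all when `empty` is None.
    """
    rows = []
    for pick_id, neighbors in picks:
        if neighbors:
            for neighbor_id, score in neighbors:
                rows.append((pick_id, neighbor_id, score))
        elif empty is not None:
            placeholder_id, placeholder_score = empty
            rows.append((pick_id, placeholder_id, placeholder_score))
    return rows


# Made-up picks: P1 has two neighbors, P2 has none.
picks = [
    ("P1", [("N1", 0.9), ("N2", 0.8)]),
    ("P2", []),
]

default_rows = neighbor_rows(picks)                     # placeholder row for P2
skipped_rows = neighbor_rows(picks, empty=None)         # P2 is left out
custom_rows = neighbor_rows(picks, empty=("none", 0.0)) # custom placeholder
```

The three calls correspond to the default, to empty=None, and to a custom 2-element tuple. In a DataFrame, each tuple would become one row under the three column headers.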

class chemfp.highlevel.diversity.SpherexSearch(candidates, references, num_initial_picks, threshold, seed, direction_msg, picker, num_picked, out, times, candidates_close, references_close)¶

Bases: BaseSpherexSearch

spherex() returns an instance of this class when include_counts and include_neighbors are False

as_ctypes()¶

Return a ctypes view of the underlying pick data

Shortcut for spherex_search.result.as_ctypes(). See Picks.as_ctypes().

The view is a Pick array with attributes named “candidate_idx” and “popcount”.

as_numpy()¶

Return a NumPy view of the underlying pick data

Shortcut for spherex_search.result.as_numpy(). See Picks.as_numpy().

The view has a structured dtype with fields named “candidate_idx” and “popcount”.

get_ids()¶

Return a list of ids for each pick

Shortcut for spherex_search.result.get_ids(). See Picks.get_ids().

get_indices()¶

Return a list of indices into the candidates arena for each pick

Shortcut for spherex_search.result.get_indices(). See Picks.get_indices().

to_pandas(*, column='pick_id')¶

Return the pick ids as a pandas DataFrame

Shortcut for spherex_search.result.to_pandas(). See Picks.to_pandas().

The default column header is “pick_id”. Use column to specify an alternate header.

Parameters:: column (a string) – the column header for the pick ids
Returns:: a pandas DataFrame