Chemfp toolkit API

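This chapter describes the common chemfp toolkit API, with links to the cheminformatics toolkit-specific implementations. It is not an actual chemfp module. You cannot import “chemfp.toolkit”. The toolkit wrapper modules are chemfp.cdk_toolkit, chemfp.openbabel_toolkit, chemfp.openeye_toolkit, chemfp.rdkit_toolkit, and chemfp.text_toolkit.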

See also examples of using the toolkit API.

Chemfp has a standard API for reading and writing structure files and working with SD tag data, but chemfp doesn’t implement its own I/O routines; it leaves that to a third-party cheminformatics package. For each supported package it implements a toolkit wrapper which maps the chemfp toolkit API onto the underlying toolkit API.

For example, by commenting or uncommenting as appropriate, the following reads “input.smi” as a SMILES file and writes the structures to “output.sdf.gz” as a gzip-compressed SDF.

## Choose your preferred toolkit
#from chemfp import rdkit_toolkit as T
#from chemfp import openeye_toolkit as T
#from chemfp import openbabel_toolkit as T
from chemfp import cdk_toolkit as T

with T.read_molecules("input.smi") as reader:
    with T.open_molecule_writer("output.sdf.gz") as writer:
        writer.write_molecules(reader)

Chemfp also has a "text toolkit", which implements the toolkit API but reads and writes records as text records rather than molecules.

Chemfp does not have a cross-toolkit molecule API. For that, use Cinfony.

name

chemfp.toolkit.name

One of “cdk”, “openbabel”, “openeye”, “rdkit”, or “text”.

[cdk_toolkit.name] [openbabel_toolkit.name] [openeye_toolkit.name] [rdkit_toolkit.name] [text_toolkit.name]

software

chemfp.toolkit.software

A string describing the underlying toolkit version. Examples: “RDKit/2023.09.5”, “OEChem/20220607”, “CDK/2.9”, and “OpenBabel/3.1.0”.

[cdk_toolkit.software] [openbabel_toolkit.software] [openeye_toolkit.software] [rdkit_toolkit.software] [text_toolkit.software]

is_available

chemfp.toolkit.is_available

A boolean value which is True if the underlying third-party cheminformatics toolkit is available. This is useful with the toolkit shortcut variables, for example:

import chemfp

if not chemfp.rdkit.is_available:
    print("RDKit is not available")

[cdk_toolkit.is_available] [openbabel_toolkit.is_available] [openeye_toolkit.is_available] [rdkit_toolkit.is_available]

is_licensed

chemfp.toolkit.is_licensed()

Return True if the underlying toolkit is licensed. CDK, Open Babel, and RDKit are always licensed.

Returns:

a Boolean

[cdk_toolkit.is_licensed] [openbabel_toolkit.is_licensed] [openeye_toolkit.is_licensed] [rdkit_toolkit.is_licensed] [text_toolkit.is_licensed]

suppress_log_output

chemfp.toolkit.suppress_log_output()

Return a context manager to disable toolkit logging.

The toolkits write warning and error messages to stderr, which you may not want. This context manager uses the toolkit-specific methods to disable those messages while in the context block.

For example:

with T.suppress_log_output():
    mol = T.parse_smistring("QWERTY", errors="ignore")
if mol is None:
    print("Could not parse")

The context manager may be entered multiple times. Logging will not be re-enabled until the matching number of exits.
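
The nesting rule can be pictured with a small counting context manager. This is a minimal sketch of the general pattern, not chemfp's implementation; the class name CountingSuppressor and its enabled flag are hypothetical stand-ins for the toolkit-specific logging switches.

```python
class CountingSuppressor:
    """Toy re-entrant context manager (hypothetical, not part of chemfp)."""

    def __init__(self):
        self.depth = 0        # number of active 'with' blocks
        self.enabled = True   # stand-in for "toolkit logging is on"

    def __enter__(self):
        if self.depth == 0:
            self.enabled = False   # the first entry turns logging off
        self.depth += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.depth -= 1
        if self.depth == 0:
            self.enabled = True    # the last matching exit turns it back on
        return False               # never swallow exceptions


suppressor = CountingSuppressor()
states = []
with suppressor:
    with suppressor:
        states.append(suppressor.enabled)   # inside the inner block
    states.append(suppressor.enabled)       # still inside the outer block
states.append(suppressor.enabled)           # every block has exited
print(states)
```

The flag only flips back once the depth counter returns to zero, which is why an inner exit does not re-enable output for the outer block.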

[cdk_toolkit.suppress_log_output] [openbabel_toolkit.suppress_log_output] [openeye_toolkit.suppress_log_output] [rdkit_toolkit.suppress_log_output]

get_formats

chemfp.toolkit.get_formats(include_unavailable=False)

Get the list of structure formats supported by the toolkit.

If include_unavailable is True then also include formats which aren’t available to this specific version of the toolkit.

Parameters:

include_unavailable (True or False) – include unavailable formats?

Returns:

a list of Format objects

[cdk_toolkit.get_formats] [openbabel_toolkit.get_formats] [openeye_toolkit.get_formats] [rdkit_toolkit.get_formats] [text_toolkit.get_formats]

get_input_formats

chemfp.toolkit.get_input_formats()

Get the list of supported toolkit input formats.

Returns:

a list of Format objects

[cdk_toolkit.get_input_formats] [openbabel_toolkit.get_input_formats] [openeye_toolkit.get_input_formats] [rdkit_toolkit.get_input_formats] [text_toolkit.get_input_formats]

get_output_formats

chemfp.toolkit.get_output_formats()

Get the list of supported output formats.

Returns:

a list of Format objects

[cdk_toolkit.get_output_formats] [openbabel_toolkit.get_output_formats] [openeye_toolkit.get_output_formats] [rdkit_toolkit.get_output_formats] [text_toolkit.get_output_formats]

get_format

chemfp.toolkit.get_format(format)

Get the named format, or raise a ValueError.

This will raise a ValueError if the toolkit does not implement the format format_name or that format is not available.

Parameters:

format_name (a string) – the format name

Returns:

a Format object

[cdk_toolkit.get_format] [openbabel_toolkit.get_format] [openeye_toolkit.get_format] [rdkit_toolkit.get_format] [text_toolkit.get_format]

get_input_format

chemfp.toolkit.get_input_format(format)

Get the named input format, or raise a ValueError.

This will raise a ValueError if the toolkit does not implement the format format_name or that format is not an input format.

Parameters:

format_name (a string) – the format name

Returns:

a Format object

[cdk_toolkit.get_input_format] [openbabel_toolkit.get_input_format] [openeye_toolkit.get_input_format] [rdkit_toolkit.get_input_format] [text_toolkit.get_input_format]

get_output_format

chemfp.toolkit.get_output_format(format)

Get the named output format, or raise a ValueError.

This will raise a ValueError if the toolkit does not implement the format format_name or that format is not an output format.

Parameters:

format_name (a string) – the format name

Returns:

a Format object

[cdk_toolkit.get_output_format] [openbabel_toolkit.get_output_format] [openeye_toolkit.get_output_format] [rdkit_toolkit.get_output_format] [text_toolkit.get_output_format]

get_input_format_from_source

chemfp.toolkit.get_input_format_from_source(source=None, format=None)

Get the most appropriate format given the available source and format information.

If format is a Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the source to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.
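
As a rough illustration of the fallback rule, the following sketch guesses a (name, compression) pair from a filename. It assumes that detection looks only at the filename extension and that file objects cannot be inspected; the helper guess_format and its two lookup tables are hypothetical and much smaller than what a real toolkit supports.

```python
import os

# Illustrative tables only; real toolkits recognize many more formats.
COMPRESSIONS = {".gz": "gz", ".zst": "zst"}
FORMAT_NAMES = {".smi": "smi", ".sdf": "sdf"}


def guess_format(source):
    """Return a (name, compression) pair guessed from a filename.

    Hypothetical helper: falls back to uncompressed SMILES when the
    source is not a filename or the extension is not recognized.
    """
    if not isinstance(source, str):
        return ("smi", "")                   # stdin or file object: nothing to inspect
    base, ext = os.path.splitext(source.lower())
    compression = COMPRESSIONS.get(ext, "")
    if compression:
        base, ext = os.path.splitext(base)   # peel off ".gz" or ".zst"
    name = FORMAT_NAMES.get(ext)
    if name is None:
        return ("smi", "")                   # unknown extension
    return (name, compression)


print(guess_format("output.sdf.gz"))
print(guess_format(None))
```

An explicit format argument bypasses this kind of guessing entirely, which is the safer choice when reading from stdin.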

Parameters:
  • source (a filename (as a string), a file object, or None to read from stdin) – the structure data source.

  • format (a Format(-like) object, string, or None) – format information, if known.

Returns:

a Format object

[cdk_toolkit.get_input_format_from_source] [openbabel_toolkit.get_input_format_from_source] [openeye_toolkit.get_input_format_from_source] [rdkit_toolkit.get_input_format_from_source] [text_toolkit.get_input_format_from_source]

get_output_format_from_destination

chemfp.toolkit.get_output_format_from_destination(destination=None, format=None)

Get the most appropriate format given the available destination and format information.

If format is a Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the destination to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.

Parameters:
  • destination (a filename (as a string), a file object, or None to write to stdout) – the structure data destination.

  • format (a Format(-like) object, string, or None) – format information, if known.

Returns:

a Format object

[cdk_toolkit.get_output_format_from_destination] [openbabel_toolkit.get_output_format_from_destination] [openeye_toolkit.get_output_format_from_destination] [rdkit_toolkit.get_output_format_from_destination] [text_toolkit.get_output_format_from_destination]

read_molecules

chemfp.toolkit.read_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads molecules (or records) from a structure file.

Iterate through the structure records in source, which are parsed as format. If format is None then auto-detect the format based on the source. For SD files, id_tag names the SD tag which holds the record id, instead of the title line. (Because read_molecules() returns only molecules, it ignores id_tag; the parameter exists to make it easier to switch between reader functions.)

The reader_args dictionary parameters depend on the format and toolkit. The common ones are:

  • SMILES:

    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None

    • has_header - True or False

  • InChI:

    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None

    • has_header - True or False

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.
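
The three error policies can be sketched with a toy reader. This is only an illustration of the control flow described above; parse_record and read_records are hypothetical names, and the "SMILES id" record layout is an assumption made for the example.

```python
import sys


def parse_record(line):
    """Toy parser: a record is 'SMILES id'; anything else is an error."""
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"cannot parse record {line!r}")
    return fields[0], fields[1]


def read_records(lines, errors="strict"):
    """Yield parsed records, applying one of the three error policies."""
    if errors not in ("strict", "report", "ignore"):
        raise ValueError(f"unsupported errors value: {errors!r}")
    for lineno, line in enumerate(lines, 1):
        try:
            yield parse_record(line)
        except ValueError as err:
            if errors == "strict":
                raise                      # stop at the first bad record
            if errors == "report":
                print(f"line {lineno}: {err}", file=sys.stderr)
            # "report" and "ignore" both continue with the next record


lines = ["CCO ethanol", "broken", "c1ccccc1 benzene"]
kept = list(read_records(lines, errors="ignore"))
print(kept)
```

With errors="strict" the same input raises a ValueError at the second line, so nothing after it is read.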

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

See chemfp.toolkit.read_ids_and_molecules() if you want (id, molecule) pairs instead of just the molecules.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source

  • format (a format name string, or Format object, or None to auto-detect) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader parameters passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track parser state information

Returns:

a MoleculeReader iterating molecules

[cdk_toolkit.read_molecules] [openbabel_toolkit.read_molecules] [openeye_toolkit.read_molecules] [rdkit_toolkit.read_molecules] [text_toolkit.read_molecules]

read_molecules_from_string

chemfp.toolkit.read_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads molecules (or records) from a string containing structure records.

content is a string containing 0 or more records in the format format. See chemfp.toolkit.read_molecules() for details about the other parameters. See chemfp.toolkit.read_ids_and_molecules_from_string() if you want to read (id, molecule) pairs instead of just molecules.

Parameters:
  • content (a string) – the string containing structure records

  • format (a format name string, or Format object) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track parser state information

Returns:

a MoleculeReader iterating molecules

[cdk_toolkit.read_molecules_from_string] [openbabel_toolkit.read_molecules_from_string] [openeye_toolkit.read_molecules_from_string] [rdkit_toolkit.read_molecules_from_string] [text_toolkit.read_molecules_from_string]

read_ids_and_molecules

chemfp.toolkit.read_ids_and_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads (id, molecule) pairs from a structure file.

See chemfp.toolkit.read_molecules() for full parameter details. The major difference is that this returns an iterator of (id, molecule) pairs instead of just the molecules.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source

  • format (a format name string, or Format object, or None to auto-detect) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track parser state information

Returns:

an IdAndMoleculeReader iterating (id, molecule) pairs

[cdk_toolkit.read_ids_and_molecules] [openbabel_toolkit.read_ids_and_molecules] [openeye_toolkit.read_ids_and_molecules] [rdkit_toolkit.read_ids_and_molecules] [text_toolkit.read_ids_and_molecules]

read_csv_ids_and_molecules

chemfp.toolkit.read_csv_ids_and_molecules(source, *, id_column=1, mol_column=2, dialect=None, has_header=True, compression='auto', format='smi', id_tag=None, reader_args=None, errors='report', csv_errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Read ids and molecules from column(s) of a CSV file.

Read from source, which may be a filename, a file-like object, or None (the default) to read from stdin.

Use id_column and mol_column to specify the columns containing the record identifier and molecule record. By default the identifiers come from column 1 (the first column) and the molecules from column 2 (the second column). Columns can be specified by integer position (starting with 1), or by a string matching the title from the header line. If id_column is None then the molecule id will come from parsing the molecule record.

Use dialect to specify the type of CSV file. The default of None infers the dialect from the filename extension: *.csv for comma-separated and *.tsv for tab-separated. The dialect can be specified directly as “csv” or “tsv”, as the name of a registered Python csv dialect (see https://docs.python.org/3/library/csv.html; “excel” is the same as “csv” and “excel-tab” is the same as “tsv”), or as a csv.Dialect or a CSVDialect instance.

If has_header is True then the first line/record contains column titles, and if False then there are no column titles.
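
The column rules can be illustrated with Python's standard csv module. This is a minimal sketch under stated assumptions, not the chemfp reader: resolve_column and read_csv_pairs are hypothetical helpers, the "molecule" is left as unparsed text, and the id_column=None case is not covered.

```python
import csv
import io


def resolve_column(column, header):
    """Convert a 1-based position or a header title to a 0-based index."""
    if isinstance(column, int):
        if column < 1:
            raise ValueError("column positions start at 1")
        return column - 1
    if header is None:
        raise ValueError("column titles require a header line")
    return header.index(column)    # raises ValueError if the title is missing


def read_csv_pairs(text, id_column=1, mol_column=2, has_header=True):
    """Yield (id, structure text) pairs from CSV text."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows) if has_header else None
    id_index = resolve_column(id_column, header)
    mol_index = resolve_column(mol_column, header)
    for row in rows:
        yield row[id_index], row[mol_index]


data = "name,smiles,mw\nethanol,CCO,46.07\nbenzene,c1ccccc1,78.11\n"
by_position = list(read_csv_pairs(data))
by_title = list(read_csv_pairs(data, id_column="name", mol_column="smiles"))
print(by_position == by_title)
```

Selecting by title is more robust when the column order may change between files, while selecting by position is the only option when has_header is False.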

Use compression to specify the file compression format. The default “auto” uses the filename extension. Other options are “gz” and “zst”, or the empty string “” to mean no compression.

Use format to specify the structure format for how to parse the molecule column. The default of ‘smi’ will parse it as a SMILES string and, if id_column=None, will also parse any identifier.

The id_tag and reader_args arguments contain additional format configuration parameters.

The errors and csv_errors parameters describe how to handle failures in molecule parsing and CSV parsing, respectively. The default is to report molecule parse failures to stderr, and to stop parsing if a CSV row does not contain enough columns.

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

The encoding and encoding_errors parameters are strings describing the input file character encoding and how to handle decoding errors. See https://docs.python.org/3/library/codecs.html#standard-encodings and https://docs.python.org/3/library/codecs.html#error-handlers for details.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the CSV source

  • id_column (integer position (starting from 1), string, or None) – the column position or column title containing the identifier

  • mol_column (integer position (starting from 1), string) – the column position or column title containing the structure record

  • dialect (None, a string name, or a Dialect instance) – the CSV dialect

  • has_header (bool) – True if the first record contains titles, False if it does not

  • compression (string or None) – file compression format

  • format (a format name string, or Format object) – the molecule structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle molecule parse errors

  • csv_errors (one of "strict", "report", or "ignore") – specify how to handle CSV errors

  • location (a chemfp.io.Location object, or None) – object used to track parser state information

  • encoding (string) – the name of the file’s character encoding

  • encoding_errors (string) – the method used to handle decoding errors

Returns:

an IdAndMoleculeReader iterating (id, molecule) pairs

[cdk_toolkit.read_csv_ids_and_molecules] [openbabel_toolkit.read_csv_ids_and_molecules] [openeye_toolkit.read_csv_ids_and_molecules] [rdkit_toolkit.read_csv_ids_and_molecules] [text_toolkit.read_csv_ids_and_molecules]

read_ids_and_molecules_from_string

chemfp.toolkit.read_ids_and_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads (id, molecule) pairs from a string containing structure records.

content is a string containing 0 or more records in the format format. See chemfp.toolkit.read_molecules() for details about the other parameters. See chemfp.toolkit.read_molecules_from_string() if you just want to read the molecules instead of (id, molecule) pairs.

Parameters:
  • content (a string) – the string containing structure records

  • format (a format name string, or Format object) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track parser state information

Returns:

an IdAndMoleculeReader iterating (id, molecule) pairs

[cdk_toolkit.read_ids_and_molecules_from_string] [openbabel_toolkit.read_ids_and_molecules_from_string] [openeye_toolkit.read_ids_and_molecules_from_string] [rdkit_toolkit.read_ids_and_molecules_from_string] [text_toolkit.read_ids_and_molecules_from_string]

make_id_and_molecule_parser

chemfp.toolkit.make_id_and_molecule_parser(format, id_tag=None, reader_args=None, errors='strict')

Create a specialized function which takes a record and returns an (id, molecule) pair.

The returned function is optimized for reading many records from individual strings because it only does parameter validation once. However, I haven’t really noticed much of a performance difference between this and chemfp.toolkit.parse_id_and_molecule() so I suggest you use that function directly instead of making a specialized function. (Let me know if making a specialized function is useful.)
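
The "validate once, parse many times" idea is an ordinary closure. The sketch below shows the pattern only; make_line_parser is a hypothetical helper, the delimiter table is an assumption, and the "molecule" is just the SMILES text.

```python
def make_line_parser(delimiter="tab"):
    """Validate 'delimiter' once and return a per-record parser function."""
    separators = {"tab": "\t", "space": " "}
    if delimiter not in separators:
        raise ValueError(f"unsupported delimiter: {delimiter!r}")
    sep = separators[delimiter]          # resolved once, reused for every record

    def parser(record):
        # The 'molecule' in this toy is just the SMILES text.
        smiles, _, record_id = record.rstrip("\n").partition(sep)
        return record_id, smiles

    return parser


parser = make_line_parser("tab")
pairs = [parser(line) for line in ["CCO\tethanol\n", "C\tmethane\n"]]
print(pairs)
```

The per-record function skips the argument checks, which is where any saving would come from.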

See chemfp.toolkit.read_molecules() for details about the other parameters.

Parameters:
  • format (a format name string, or Format object) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

Returns:

a function of the form parser(record string) -> (id, molecule)

[cdk_toolkit.make_id_and_molecule_parser] [openbabel_toolkit.make_id_and_molecule_parser] [openeye_toolkit.make_id_and_molecule_parser] [rdkit_toolkit.make_id_and_molecule_parser] [text_toolkit.make_id_and_molecule_parser]

parse_molecule

chemfp.toolkit.parse_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from the content string and return a molecule.

content is a string containing a single structure record in format format. (Additional records are ignored). See chemfp.toolkit.read_molecules() for details about the other parameters. See chemfp.toolkit.parse_id_and_molecule() if you want the (id, molecule) pair instead of just the molecule.

Parameters:
  • content (a string) – the string containing a structure record

  • format (a format name string, or Format object) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

Returns:

a molecule

[cdk_toolkit.parse_molecule] [openbabel_toolkit.parse_molecule] [openeye_toolkit.parse_molecule] [rdkit_toolkit.parse_molecule] [text_toolkit.parse_molecule]

parse_id_and_molecule

chemfp.toolkit.parse_id_and_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from content and return the (id, molecule) pair.

content is a string containing a single structure record in format format. (Additional records are ignored.)

See chemfp.toolkit.read_molecules() for details about the other parameters. See chemfp.toolkit.parse_molecule() if you just want the molecule and not the (id, molecule) pair.

Parameters:
  • content (a string) – the string containing a structure record

  • format (a format name string, or Format object) – the input structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

Returns:

an (id, molecule) pair

[cdk_toolkit.parse_id_and_molecule] [openbabel_toolkit.parse_id_and_molecule] [openeye_toolkit.parse_id_and_molecule] [rdkit_toolkit.parse_id_and_molecule] [text_toolkit.parse_id_and_molecule]

create_string

chemfp.toolkit.create_string(mol, format, id=None, writer_args=None, errors='strict')

Convert a molecule into a structure record in the given format as a Unicode string.

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so may not be thread-safe.
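
One way such a brief modification could look is a swap-and-restore around the formatting step. This sketch is an assumption about the general technique, not a description of any toolkit wrapper; ToyMolecule and to_smiles_record are hypothetical.

```python
class ToyMolecule:
    """Hypothetical stand-in for a toolkit molecule with a title."""

    def __init__(self, smiles, title):
        self.smiles = smiles
        self.title = title


def to_smiles_record(mol, id=None):
    """Format 'SMILES title', optionally under a different id."""
    if id is None:
        return f"{mol.smiles} {mol.title}\n"
    saved = mol.title
    mol.title = id                 # the molecule is modified here ...
    try:
        return f"{mol.smiles} {mol.title}\n"
    finally:
        mol.title = saved          # ... and restored here, even on error


mol = ToyMolecule("CCO", "ethanol")
record = to_smiles_record(mol, id="ID-001")
print(repr(record), mol.title)
```

Another thread that reads the molecule's title between the swap and the restore would see the temporary id, which is the kind of race the warning refers to.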

Parameters:
  • mol (a molecule) – the molecule to use for the output

  • format (a format name string, or Format object) – the output structure format

  • id (a string, or None to use the molecule's own id) – an alternate record id

  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

Returns:

a Unicode string

[cdk_toolkit.create_string] [openbabel_toolkit.create_string] [openeye_toolkit.create_string] [rdkit_toolkit.create_string] [text_toolkit.create_string]

create_bytes

chemfp.toolkit.create_bytes(mol, format, id=None, writer_args=None, errors='strict', level=None)

Convert a molecule into a structure record in the given format as a byte string.

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so may not be thread-safe.

Parameters:
  • mol (a molecule) – the molecule to use for the output

  • format (a format name string, or Format object) – the output structure format

  • id (a string, or None to use the molecule's own id) – an alternate record id

  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats

Returns:

a byte string

[cdk_toolkit.create_bytes] [openbabel_toolkit.create_bytes] [openeye_toolkit.create_bytes] [rdkit_toolkit.create_bytes] [text_toolkit.create_bytes]

translate_record

chemfp.toolkit.translate_record(content, in_format='smi', out_format='smi', *, id_tag=None, reader_args=None, writer_args=None, id=None, errors='strict')

Translate a molecule record from one format to another.

Use the toolkit to parse the content as format in_format (default: “smi”) and translate it into out_format (default: “smi”). For an SDF record, use id_tag to get the record id from the given SD tag instead of the title line. Use reader_args and writer_args to configure format-specific parameters. Use id to set the id of the output record.

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

Parameters:
  • content (a string) – the string containing a structure record

  • in_format (a format name string, or Format object) – the input structure format

  • out_format (a format name string, or Format object) – the output structure format

  • id_tag (string, or None to use the record title) – SD tag containing the record id

  • reader_args (a dictionary, or None) – reader arguments for the specified in_format

  • writer_args (a dictionary, or None) – writer arguments for the specified out_format

  • id (a string, or None to use the default) – the record id to use for the output record

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

Returns:

a string

[cdk_toolkit.translate_record] [openbabel_toolkit.translate_record] [openeye_toolkit.translate_record] [rdkit_toolkit.translate_record]

open_molecule_writer

chemfp.toolkit.open_molecule_writer(destination=None, format=None, writer_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict', level=None)

Return a MoleculeWriter which can write molecules to a destination.

A MoleculeWriter has the methods write_molecule, write_molecules, and write_ids_and_molecules, which are ways to write a molecule, a molecule iterator, or an (id, molecule) pair iterator to a file.
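
The relationship between the three methods can be sketched with a toy writer. ToyStringWriter is a hypothetical stand-in in which a "molecule" is a (smiles, id) tuple; the assumption that the id from an (id, molecule) pair replaces the molecule's own id is part of the sketch.

```python
import io


class ToyStringWriter:
    """Hypothetical stand-in for a writer; 'molecules' are (smiles, id) tuples."""

    def __init__(self):
        self._out = io.StringIO()

    def write_molecule(self, mol):
        smiles, mol_id = mol
        self._out.write(f"{smiles} {mol_id}\n")

    def write_molecules(self, mols):
        for mol in mols:                            # any iterable of molecules
            self.write_molecule(mol)

    def write_ids_and_molecules(self, pairs):
        for new_id, (smiles, _old_id) in pairs:
            self.write_molecule((smiles, new_id))   # the paired id is used

    def getvalue(self):
        return self._out.getvalue()


writer = ToyStringWriter()
writer.write_molecule(("CCO", "ethanol"))
writer.write_molecules([("C", "methane"), ("CC", "ethane")])
writer.write_ids_and_molecules([("ID-4", ("c1ccccc1", "benzene"))])
text = writer.getvalue()
print(text)
```

Both iterator methods reduce to repeated single-molecule writes, so a reader can be passed straight to write_molecules without building a list first.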

Molecules are written to destination. The output format can be a string like “sdf.gz” or “smi”, a Format, or Format-like object with “name” and “compression” attributes, or None to auto-detect based on the destination. If auto-detection is not possible, the output will be written as uncompressed SMILES.

The writer_args dictionary parameters depend on the format and toolkit. The common ones are:

  • SMILES:

    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None

    • cxsmiles - True to include CXSMILES annotations; default is False

  • InChI and InChIKey:

    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None

    • include_id - True or default to include the id as the second column; False has no id column

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

Parameters:
  • destination (a filename, file object, or None to write to stdout) – the structure destination

  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format

  • writer_args (a dictionary) – writer parameters passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track writer state information

  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_molecule_writer] [openbabel_toolkit.open_molecule_writer] [openeye_toolkit.open_molecule_writer] [rdkit_toolkit.open_molecule_writer] [text_toolkit.open_molecule_writer]

open_molecule_writer_to_string

chemfp.toolkit.open_molecule_writer_to_string(format, writer_args=None, errors='strict', location=None)

Return a MoleculeStringWriter which can write molecule records in the given format to a string.

See chemfp.toolkit.open_molecule_writer() for full parameter details.

Use the writer’s MoleculeStringWriter.getvalue() to get the output as a Unicode string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format

  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track writer state information

Returns:

a MoleculeStringWriter expecting toolkit molecules

[cdk_toolkit.open_molecule_writer_to_string] [openbabel_toolkit.open_molecule_writer_to_string] [openeye_toolkit.open_molecule_writer_to_string] [rdkit_toolkit.open_molecule_writer_to_string] [text_toolkit.open_molecule_writer_to_string]

open_molecule_writer_to_bytes

chemfp.toolkit.open_molecule_writer_to_bytes(format, writer_args=None, errors='strict', location=None, level=None)

Return a MoleculeStringWriter which can write molecule records in the given format to a byte string.

See chemfp.toolkit.open_molecule_writer() for full parameter details.

Use the writer’s MoleculeStringWriter.getvalue() to get the output as a byte string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format

  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit

  • errors (one of "strict", "report", or "ignore") – specify how to handle errors

  • location (a chemfp.io.Location object, or None) – object used to track writer state information

  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats

Returns:

a MoleculeStringWriter expecting toolkit molecules

[cdk_toolkit.open_molecule_writer_to_bytes] [openbabel_toolkit.open_molecule_writer_to_bytes] [openeye_toolkit.open_molecule_writer_to_bytes] [rdkit_toolkit.open_molecule_writer_to_bytes] [text_toolkit.open_molecule_writer_to_bytes]

copy_molecule

chemfp.toolkit.copy_molecule(mol)

Return a new molecule which is a copy of the given molecule.

Parameters:

mol (a toolkit molecule) – the molecule to copy

Returns:

a new toolkit molecule instance

[cdk_toolkit.copy_molecule] [openbabel_toolkit.copy_molecule] [openeye_toolkit.copy_molecule] [rdkit_toolkit.copy_molecule] [text_toolkit.copy_molecule]

add_tag

chemfp.toolkit.add_tag(mol, tag, value)

Add an SD tag value to the molecule object.

Parameters:
  • mol (a toolkit molecule) – the molecule

  • tag (string) – the SD tag name

  • value (string) – the text for the tag

Returns:

None

[cdk_toolkit.add_tag] [openbabel_toolkit.add_tag] [openeye_toolkit.add_tag] [rdkit_toolkit.add_tag] [text_toolkit.add_tag]

get_tag

chemfp.toolkit.get_tag(mol, tag)

Get the named SD tag value, or None if it doesn’t exist.

Parameters:
  • mol (a toolkit molecule) – the molecule

  • tag (string) – the SD tag name

Returns:

a string, or None

[cdk_toolkit.get_tag] [openbabel_toolkit.get_tag] [openeye_toolkit.get_tag] [rdkit_toolkit.get_tag] [text_toolkit.get_tag]

get_tag_pairs

chemfp.toolkit.get_tag_pairs(mol)

Get a list of all SD tag (name, value) pairs for the molecule.

Parameters:

mol (a toolkit molecule) – the molecule

Returns:

a list of (string name, string value) pairs

[cdk_toolkit.get_tag_pairs] [openbabel_toolkit.get_tag_pairs] [openeye_toolkit.get_tag_pairs] [rdkit_toolkit.get_tag_pairs] [text_toolkit.get_tag_pairs]

get_id

chemfp.toolkit.get_id(mol)

Get the molecule’s id from the toolkit molecule object.

Parameters:

mol (a toolkit molecule) – the molecule

Returns:

a string

[cdk_toolkit.get_id] [openbabel_toolkit.get_id] [openeye_toolkit.get_id] [rdkit_toolkit.get_id] [text_toolkit.get_id]

set_id

chemfp.toolkit.set_id(mol, id)

Set the id for the toolkit molecule object.

Parameters:
  • mol (a toolkit molecule) – the molecule

  • id (string) – the new id

Returns:

None

[cdk_toolkit.set_id] [openbabel_toolkit.set_id] [openeye_toolkit.set_id] [rdkit_toolkit.set_id] [text_toolkit.set_id]
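As an illustration of get_id() and set_id() working as a pair, here is a small sketch. The helper name add_id_prefix is hypothetical, and T is assumed to be one of the toolkit wrapper modules.

```python
def add_id_prefix(T, mol, prefix):
    """Prepend prefix to the molecule's id, in place, and return the new id.

    T is a chemfp toolkit wrapper module, e.g. ``chemfp.rdkit_toolkit``.
    """
    new_id = prefix + T.get_id(mol)
    T.set_id(mol, new_id)
    return new_id
```

Because set_id() changes the molecule itself, use copy_molecule() first if the original id must be preserved.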

parse_smistring

chemfp.toolkit.parse_smistring(content: Union[str, bytes], *,..., cxsmiles: bool = True, errors: str = 'strict')

Parse a SMILES string using the toolkit.

This is equivalent to calling:

parse_molecule(content, "smistring", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a SMILES string or line from a SMILES file

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule

[cdk_toolkit.parse_smistring] [openbabel_toolkit.parse_smistring] [openeye_toolkit.parse_smistring] [rdkit_toolkit.parse_smistring] [text_toolkit.parse_smistring]

create_smistring

chemfp.toolkit.create_smistring(mol: Any, *, id: Optional[str] = None, ..., cxsmiles: bool = False, errors: str = 'strict') str | None

Generate a SMILES string from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "smistring", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • cxsmiles (Boolean (default: False)) – If true, generate CXSMILES

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_smistring] [openbabel_toolkit.create_smistring] [openeye_toolkit.create_smistring] [rdkit_toolkit.create_smistring] [text_toolkit.create_smistring]
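A common use of parse_smistring() and create_smistring() is to pass a SMILES string through the toolkit. The sketch below assumes only the parameters documented above; the helper name smiles_roundtrip is hypothetical. Whether the output is canonical depends on the underlying toolkit and its toolkit-specific arguments, which are not covered here.

```python
def smiles_roundtrip(T, smiles, *, cxsmiles=False):
    """Parse a SMILES string with toolkit T and write it back out.

    With the default errors="strict", a SMILES string the toolkit cannot
    parse raises an exception instead of returning a molecule.
    """
    mol = T.parse_smistring(smiles)
    return T.create_smistring(mol, cxsmiles=cxsmiles)
```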

parse_smi

chemfp.toolkit.parse_smi(content: Union[str, bytes], *, ..., cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Parse a SMILES string and its id using the toolkit.

This is equivalent to calling:

parse_molecule(content, "smi", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a SMILES string and its id, such as a line from a SMILES file

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[cdk_toolkit.parse_smi] [openbabel_toolkit.parse_smi] [openeye_toolkit.parse_smi] [rdkit_toolkit.parse_smi] [text_toolkit.parse_smi]

create_smi

chemfp.toolkit.create_smi(mol: Any, *, id: Optional[str] = None, ..., cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict') str | None

Generate a SMILES string and its id from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "smi", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • cxsmiles (Boolean (default: False)) – If true, generate CXSMILES

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_smi] [openbabel_toolkit.create_smi] [openeye_toolkit.create_smi] [rdkit_toolkit.create_smi] [text_toolkit.create_smi]
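The delimiter parameter applies on both the parsing and the writing side. As a hedged sketch, the hypothetical helper convert_smi_line below reads one SMILES record with one delimiter style and writes it with another, using only the documented delimiter names.

```python
def convert_smi_line(T, line, in_delimiter, out_delimiter):
    """Re-delimit one SMILES record, e.g. from tab-separated to comma-separated.

    T is a chemfp toolkit wrapper module. The delimiter arguments are
    values such as "tab", "space", or "comma" from the parameter list.
    """
    mol = T.parse_smi(line, delimiter=in_delimiter)
    return T.create_smi(mol, delimiter=out_delimiter)
```

A call might look like convert_smi_line(T, "CCO\tethanol\n", "tab", "comma"); the exact output text, including how the structure is written, depends on the toolkit.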

read_smi_molecules

chemfp.toolkit.read_smi_molecules(source: Union[NoneType, str, BinaryIO], *, ..., cxsmiles: bool = True, has_header: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read molecules from a SMILES file using the toolkit.

This is mostly equivalent to calling:

read_molecules(source, "smi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the SMILES file to read

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_smi_molecules] [openbabel_toolkit.read_smi_molecules] [openeye_toolkit.read_smi_molecules] [rdkit_toolkit.read_smi_molecules] [text_toolkit.read_smi_molecules]

read_smi_ids_and_molecules

chemfp.toolkit.read_smi_ids_and_molecules(source: Union[NoneType, str, BinaryIO], *, ..., cxsmiles: bool = True, has_header: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read ids and molecules from a SMILES file using the toolkit.

This is mostly equivalent to calling:

read_ids_and_molecules(source, "smi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the SMILES file to read

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_smi_ids_and_molecules] [openbabel_toolkit.read_smi_ids_and_molecules] [openeye_toolkit.read_smi_ids_and_molecules] [rdkit_toolkit.read_smi_ids_and_molecules] [text_toolkit.read_smi_ids_and_molecules]
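For example, the ids and molecules from a SMILES file can be loaded into a dictionary. This is a sketch under two assumptions: that the returned IdAndMoleculeReader can be used as a context manager, as the molecule reader is in the example at the start of this chapter, and that it yields (id, molecule) pairs. The helper name load_smi_by_id is hypothetical.

```python
def load_smi_by_id(T, source, *, has_header=False, delimiter=None):
    """Read a SMILES file and return a dict mapping each id to its molecule.

    T is a chemfp toolkit wrapper module. If an id occurs more than once,
    the last molecule with that id wins.
    """
    mols = {}
    with T.read_smi_ids_and_molecules(
            source, has_header=has_header, delimiter=delimiter) as reader:
        # Assumption: the reader yields (id, molecule) pairs
        for mol_id, mol in reader:
            mols[mol_id] = mol
    return mols
```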

read_smi_molecules_from_string

chemfp.toolkit.read_smi_molecules_from_string(content: Union[str, bytes], *, ..., cxsmiles: bool = True, has_header: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read molecules from a string containing a SMILES file using the toolkit.

This is equivalent to calling:

read_molecules_from_string(content, "smi", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a string containing a SMILES file

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_smi_molecules_from_string] [openbabel_toolkit.read_smi_molecules_from_string] [openeye_toolkit.read_smi_molecules_from_string] [rdkit_toolkit.read_smi_molecules_from_string] [text_toolkit.read_smi_molecules_from_string]

read_smi_ids_and_molecules_from_string

chemfp.toolkit.read_smi_ids_and_molecules_from_string(content: Union[str, bytes], *, ..., cxsmiles: bool = True, has_header: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read ids and molecules from a string containing a SMILES file using the toolkit.

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "smi", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a string containing a SMILES file

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_smi_ids_and_molecules_from_string] [openbabel_toolkit.read_smi_ids_and_molecules_from_string] [openeye_toolkit.read_smi_ids_and_molecules_from_string] [rdkit_toolkit.read_smi_ids_and_molecules_from_string] [text_toolkit.read_smi_ids_and_molecules_from_string]

open_smi_writer

chemfp.toolkit.open_smi_writer(destination: Union[NoneType, str, BinaryIO], *, ..., cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Open a SMILES file to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "smi", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • cxsmiles (Boolean (default: False)) – If true, generate CXSMILES

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_smi_writer] [openbabel_toolkit.open_smi_writer] [openeye_toolkit.open_smi_writer] [rdkit_toolkit.open_smi_writer] [text_toolkit.open_smi_writer]

open_smi_writer_to_string

chemfp.toolkit.open_smi_writer_to_string(*, ..., cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Open a SMILES file to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("smi", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • cxsmiles (Boolean (default: False)) – If true, generate CXSMILES

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_smi_writer_to_string] [openbabel_toolkit.open_smi_writer_to_string] [openeye_toolkit.open_smi_writer_to_string] [rdkit_toolkit.open_smi_writer_to_string] [text_toolkit.open_smi_writer_to_string]
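The following sketch collects SMILES output in memory instead of writing a file. It assumes that the returned writer supports the with statement and write_molecules(), as in the example at the start of this chapter, and that it offers a getvalue() method that still works after the writer is closed; check the writer class documentation for your chemfp version. The helper name molecules_to_smi_text is hypothetical.

```python
def molecules_to_smi_text(T, mols, *, delimiter="tab"):
    """Write toolkit molecules as SMILES records and return the text.

    T is a chemfp toolkit wrapper module and ``mols`` is an iterable of
    molecules created by that same toolkit.
    """
    with T.open_smi_writer_to_string(delimiter=delimiter) as writer:
        writer.write_molecules(mols)
    # Assumption: getvalue() returns the accumulated output after close
    return writer.getvalue()
```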

parse_sdf

chemfp.toolkit.parse_sdf(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse an SDF record using the toolkit.

This is equivalent to calling:

parse_molecule(content, "sdf", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a string containing an SDF record

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule

[cdk_toolkit.parse_sdf] [openbabel_toolkit.parse_sdf] [openeye_toolkit.parse_sdf] [rdkit_toolkit.parse_sdf] [text_toolkit.parse_sdf]

create_sdf

chemfp.toolkit.create_sdf(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') str | None

Generate an SDF record from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "sdf", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_sdf] [openbabel_toolkit.create_sdf] [openeye_toolkit.create_sdf] [rdkit_toolkit.create_sdf] [text_toolkit.create_sdf]
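The id parameter of create_sdf() makes it possible to change the identifier of a record without modifying the molecule. A minimal sketch, with the hypothetical helper name retitle_sdf_record:

```python
def retitle_sdf_record(T, record, new_id):
    """Parse one SDF record and write it back out under a different id.

    T is a chemfp toolkit wrapper module. The toolkit regenerates the
    record, so the output need not match the input character for character.
    """
    mol = T.parse_sdf(record)
    # id= supplies an alternate identifier for the output record
    return T.create_sdf(mol, id=new_id)
```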

read_sdf_molecules

chemfp.toolkit.read_sdf_molecules(source: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Read molecules from an SDF file using the toolkit.

This is mostly equivalent to calling:

read_molecules(source, "sdf", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the SD file to read

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_sdf_molecules] [openbabel_toolkit.read_sdf_molecules] [openeye_toolkit.read_sdf_molecules] [rdkit_toolkit.read_sdf_molecules] [text_toolkit.read_sdf_molecules]

read_sdf_ids_and_molecules

chemfp.toolkit.read_sdf_ids_and_molecules(source: Union[NoneType, str, BinaryIO], *, id_tag: Optional[str] = None, ..., errors: str = 'strict')

Read ids and molecules from an SDF file using the toolkit.

This is mostly equivalent to calling:

read_ids_and_molecules(source, "sdf", id_tag=id_tag, reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the SD file to read

  • id_tag (a string, or None to use the title) – get the id from the named data item instead of using the record title

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_sdf_ids_and_molecules] [openbabel_toolkit.read_sdf_ids_and_molecules] [openeye_toolkit.read_sdf_ids_and_molecules] [rdkit_toolkit.read_sdf_ids_and_molecules] [text_toolkit.read_sdf_ids_and_molecules]
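To illustrate id_tag, the sketch below lists the ids in an SD file, taken either from the record titles or from a named data item. It assumes the reader yields (id, molecule) pairs and can be used in a with statement; the helper name sdf_ids is hypothetical.

```python
def sdf_ids(T, source, *, id_tag=None):
    """Return the ids in an SD file, in file order.

    T is a chemfp toolkit wrapper module. With id_tag=None the ids come
    from the record titles; otherwise from the named SD data item.
    """
    with T.read_sdf_ids_and_molecules(source, id_tag=id_tag) as reader:
        # Assumption: the reader yields (id, molecule) pairs
        return [mol_id for mol_id, mol in reader]
```

A call such as sdf_ids(T, "compounds.sdf.gz", id_tag="PUBCHEM_COMPOUND_CID") would use that tag for the ids; both the filename and the tag name are placeholders.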

read_sdf_molecules_from_string

chemfp.toolkit.read_sdf_molecules_from_string(content: Union[str, bytes], *, ..., errors: str = 'strict')

Read molecules from a string containing an SDF file using the toolkit.

This is equivalent to calling:

read_molecules_from_string(content, "sdf", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of an SD file

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_sdf_molecules_from_string] [openbabel_toolkit.read_sdf_molecules_from_string] [openeye_toolkit.read_sdf_molecules_from_string] [rdkit_toolkit.read_sdf_molecules_from_string] [text_toolkit.read_sdf_molecules_from_string]

read_sdf_ids_and_molecules_from_string

chemfp.toolkit.read_sdf_ids_and_molecules_from_string(content: Union[str, bytes], *, id_tag: Optional[str] = None, ..., errors: str = 'strict')

Read ids and molecules from a string containing an SDF file using the toolkit.

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "sdf", id_tag=id_tag, reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of an SD file

  • id_tag (a string, or None to use the title) – get the id from the named data item instead of using the record title

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_sdf_ids_and_molecules_from_string] [openbabel_toolkit.read_sdf_ids_and_molecules_from_string] [openeye_toolkit.read_sdf_ids_and_molecules_from_string] [rdkit_toolkit.read_sdf_ids_and_molecules_from_string] [text_toolkit.read_sdf_ids_and_molecules_from_string]

open_sdf_writer

chemfp.toolkit.open_sdf_writer(destination: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Open an SDF file to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "sdf", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_sdf_writer] [openbabel_toolkit.open_sdf_writer] [openeye_toolkit.open_sdf_writer] [rdkit_toolkit.open_sdf_writer] [text_toolkit.open_sdf_writer]
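The format-specific reader and writer functions can be combined in the same way as the generic ones. The sketch below converts a SMILES file to an SD file; the helper name smi_to_sdf is hypothetical and T is assumed to be a toolkit wrapper module.

```python
def smi_to_sdf(T, smi_source, sdf_destination, *, errors="strict"):
    """Convert a SMILES file to an SD file using toolkit T.

    Compression is chosen from the filename extensions, so a destination
    such as "output.sdf.gz" is written gzip-compressed.
    """
    with T.read_smi_molecules(smi_source, errors=errors) as reader:
        with T.open_sdf_writer(sdf_destination, errors=errors) as writer:
            # The reader is iterable, so it can be passed to the writer directly
            writer.write_molecules(reader)
```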

open_sdf_writer_to_string

chemfp.toolkit.open_sdf_writer_to_string(*, ..., errors: str = 'strict')

Open an SDF file to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("sdf", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:

errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_sdf_writer_to_string] [openbabel_toolkit.open_sdf_writer_to_string] [openeye_toolkit.open_sdf_writer_to_string] [rdkit_toolkit.open_sdf_writer_to_string] [text_toolkit.open_sdf_writer_to_string]

create_sdf3k

chemfp.toolkit.create_sdf3k(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') str | None

Generate an SDF record in V3000 format from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "sdf3k", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_sdf3k] [openbabel_toolkit.create_sdf3k] [openeye_toolkit.create_sdf3k] [rdkit_toolkit.create_sdf3k]

open_sdf3k_writer

chemfp.toolkit.open_sdf3k_writer(destination: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Open an SDF file in V3000 format to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "sdf3k", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_sdf3k_writer] [openbabel_toolkit.open_sdf3k_writer] [openeye_toolkit.open_sdf3k_writer] [rdkit_toolkit.open_sdf3k_writer]

open_sdf3k_writer_to_string

chemfp.toolkit.open_sdf3k_writer_to_string(*, ..., errors: str = 'strict')

Open an SDF file in V3000 format to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("sdf3k", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:

errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_sdf3k_writer_to_string] [openbabel_toolkit.open_sdf3k_writer_to_string] [openeye_toolkit.open_sdf3k_writer_to_string] [rdkit_toolkit.open_sdf3k_writer_to_string]

parse_inchi

chemfp.toolkit.parse_inchi(content: Union[str, bytes], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Parse an InChI string and its id using the toolkit.

This is equivalent to calling:

parse_molecule(content, "inchi", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – an InChI string or line from an InChI file

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[cdk_toolkit.parse_inchi] [openbabel_toolkit.parse_inchi] [openeye_toolkit.parse_inchi] [rdkit_toolkit.parse_inchi]

create_inchi

chemfp.toolkit.create_inchi(mol: Any, *, id: Optional[str] = None, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict') str | None

Generate an InChI string and its id from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "inchi", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • options (a string (default: "")) – a configuration string to pass to the InChI API

  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API

  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_inchi] [openbabel_toolkit.create_inchi] [openeye_toolkit.create_inchi] [rdkit_toolkit.create_inchi]

read_inchi_molecules

chemfp.toolkit.read_inchi_molecules(source: Union[NoneType, str, BinaryIO], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read molecules from an InChI file (with InChI and optional id) using the toolkit.

This is mostly equivalent to calling:

read_molecules(source, "inchi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the InChI file to read

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_inchi_molecules] [openbabel_toolkit.read_inchi_molecules] [openeye_toolkit.read_inchi_molecules] [rdkit_toolkit.read_inchi_molecules]

read_inchi_ids_and_molecules

chemfp.toolkit.read_inchi_ids_and_molecules(source: Union[NoneType, str, BinaryIO], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read ids and molecules from an InChI file (with InChI and optional id) using the toolkit.

This is mostly equivalent to calling:

read_ids_and_molecules(source, "inchi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the InChI file to read

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_inchi_ids_and_molecules] [openbabel_toolkit.read_inchi_ids_and_molecules] [openeye_toolkit.read_inchi_ids_and_molecules] [rdkit_toolkit.read_inchi_ids_and_molecules]

read_inchi_molecules_from_string

chemfp.toolkit.read_inchi_molecules_from_string(content: Union[str, bytes], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read molecules from a string containing an InChI file (each line has an InChI and optional id) using the toolkit.

This is equivalent to calling:

read_molecules_from_string(content, "inchi", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of an InChI file

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[cdk_toolkit.read_inchi_molecules_from_string] [openbabel_toolkit.read_inchi_molecules_from_string] [openeye_toolkit.read_inchi_molecules_from_string] [rdkit_toolkit.read_inchi_molecules_from_string]

read_inchi_ids_and_molecules_from_string

chemfp.toolkit.read_inchi_ids_and_molecules_from_string(content: Union[str, bytes], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Read ids and molecules from a string containing an InChI file (with InChI and optional id) using the toolkit.

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "inchi", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of an InChI file

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[cdk_toolkit.read_inchi_ids_and_molecules_from_string] [openbabel_toolkit.read_inchi_ids_and_molecules_from_string] [openeye_toolkit.read_inchi_ids_and_molecules_from_string] [rdkit_toolkit.read_inchi_ids_and_molecules_from_string]

open_inchi_writer

chemfp.toolkit.open_inchi_writer(destination: Union[NoneType, str, BinaryIO], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict')

Open an InChI file (with InChI and optional id) to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "inchi", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • options (a string (default: "")) – a configuration string to pass to the InChI API

  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API

  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_inchi_writer] [openbabel_toolkit.open_inchi_writer] [openeye_toolkit.open_inchi_writer] [rdkit_toolkit.open_inchi_writer]

open_inchi_writer_to_string

chemfp.toolkit.open_inchi_writer_to_string(*, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict')

Open an InChI file (with InChI and optional id) to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("inchi", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_inchi_writer_to_string] [openbabel_toolkit.open_inchi_writer_to_string] [openeye_toolkit.open_inchi_writer_to_string] [rdkit_toolkit.open_inchi_writer_to_string]

parse_inchistring

chemfp.toolkit.parse_inchistring(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse an InChI string using the toolkit.

This is equivalent to calling:

parse_molecule(content, "inchistring", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – an InChI string

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[cdk_toolkit.parse_inchistring] [openbabel_toolkit.parse_inchistring] [openeye_toolkit.parse_inchistring] [rdkit_toolkit.parse_inchistring]

create_inchistring

chemfp.toolkit.create_inchistring(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') -> str | None

Generate an InChI string from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "inchistring", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_inchistring] [openbabel_toolkit.create_inchistring] [openeye_toolkit.create_inchistring] [rdkit_toolkit.create_inchistring]
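The following sketch shows one way parse_inchistring() and create_inchistring() might be combined, next to the generic calls they are documented as equivalent to. The function names inchi_roundtrip and inchi_roundtrip_generic are hypothetical, T is the toolkit wrapper module passed in by the caller, and the None check assumes that a parse failure with errors="ignore" returns None.

```python
def inchi_roundtrip(T, inchi, errors="strict"):
    """Parse an InChI string and regenerate it with the same toolkit."""
    mol = T.parse_inchistring(inchi, errors=errors)
    if mol is None:
        # Assumption: an ignored parse error yields None instead of a molecule.
        return None
    return T.create_inchistring(mol, errors=errors)


def inchi_roundtrip_generic(T, inchi, errors="strict"):
    """The same round trip using the generic functions and a format name."""
    mol = T.parse_molecule(inchi, "inchistring", errors=errors)
    if mol is None:
        return None
    return T.create_string(mol, "inchistring", errors=errors)
```

Both functions take the toolkit module as an argument, so the same code can be tried with any wrapper that lists these functions.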

create_inchikey

chemfp.toolkit.create_inchikey(mol: Any, *, id: Optional[str] = None, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict') -> str | None

Generate an InChIKey string and its id from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "inchikey", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_inchikey] [openbabel_toolkit.create_inchikey] [openeye_toolkit.create_inchikey] [rdkit_toolkit.create_inchikey]

open_inchikey_writer

chemfp.toolkit.open_inchikey_writer(destination: Union[NoneType, str, BinaryIO], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict')

Open an InChIKey file (with InChIKey and optional id) to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "inchikey", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_inchikey_writer] [openbabel_toolkit.open_inchikey_writer] [openeye_toolkit.open_inchikey_writer] [rdkit_toolkit.open_inchikey_writer]

open_inchikey_writer_to_string

chemfp.toolkit.open_inchikey_writer_to_string(*, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, include_id: bool = True, errors: str = 'strict')

Open an InChIKey file (with InChIKey and optional id) to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("inchikey", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id

  • include_id (Boolean (default: True)) – if true, include the molecule id in the output

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[cdk_toolkit.open_inchikey_writer_to_string] [openbabel_toolkit.open_inchikey_writer_to_string] [openeye_toolkit.open_inchikey_writer_to_string] [rdkit_toolkit.open_inchikey_writer_to_string]

create_inchikeystring

chemfp.toolkit.create_inchikeystring(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') -> str | None

Generate an InChIKey string from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "inchikeystring", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_inchikeystring] [openbabel_toolkit.create_inchikeystring] [openeye_toolkit.create_inchikeystring] [rdkit_toolkit.create_inchikeystring]
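To make the difference between the "inchikey" and "inchikeystring" formats concrete, here is a small sketch that produces both for one molecule. The helper name inchikey_forms is hypothetical and T is the toolkit wrapper module supplied by the caller; the exact layout of the returned record is left to the toolkit and is not asserted here.

```python
def inchikey_forms(T, mol, id=None):
    """Return (record, key) for one toolkit molecule.

    record comes from the "inchikey" format, which may carry the id.
    key comes from the "inchikeystring" format, which is the key alone.
    """
    record = T.create_inchikey(mol, id=id, delimiter="tab", include_id=True)
    key = T.create_inchikeystring(mol)
    return record, key
```

The record form is the one to use when building an InChIKey file line by line; the string form suits lookups or dictionary keys.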

parse_smiles

chemfp.toolkit.parse_smiles(content: Union[str, bytes], *, ..., delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict')

Parse a SMILES string and its id using the toolkit.

This is equivalent to calling:

parse_molecule(content, "smi", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a SMILES string or line from a SMILES file

  • cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[cdk_toolkit.parse_smiles] [openbabel_toolkit.parse_smiles] [openeye_toolkit.parse_smiles] [rdkit_toolkit.parse_smiles] [text_toolkit.parse_smiles]
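The delimiter styles can be easier to compare with a simplified model. The sketch below is plain Python, not chemfp's implementation, and the function name split_smiles_line is hypothetical. It assumes that "to_eol" ends the SMILES at the first whitespace and takes the rest of the line as the id, that "whitespace" takes the second whitespace-separated field as the id, and that the other styles split on their named character. None and "native" are not modeled because their behavior depends on the toolkit.

```python
def split_smiles_line(line, delimiter="to_eol"):
    """Split one SMILES-file line into (smiles, id); id is None if absent."""
    line = line.rstrip("\r\n")
    if delimiter == "to_eol":
        # SMILES ends at the first whitespace; the id is the rest of the line.
        fields = line.split(None, 1)
    elif delimiter == "whitespace":
        # Any run of whitespace separates fields; the id is the second field.
        fields = line.split()
    elif delimiter in ("tab", "\t"):
        fields = line.split("\t")
    elif delimiter in ("space", " "):
        fields = line.split(" ")
    elif delimiter == "comma":
        fields = line.split(",")
    else:
        raise ValueError(f"delimiter style not modeled here: {delimiter!r}")
    smiles = fields[0] if fields else ""
    id = fields[1] if len(fields) > 1 else None
    return smiles, id
```

Under these assumptions, a line such as "CCO ethyl alcohol" keeps the full name as the id with "to_eol" but only its first word with "whitespace".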

create_smiles

chemfp.toolkit.create_smiles(mol: Any, *, id: Optional[str] = None, ..., cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = None, errors: str = 'strict') -> str | None

Generate a SMILES string and its id from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "smi", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • cxsmiles (Boolean (default: False)) – If true, generate CXSMILES

  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_smiles] [openbabel_toolkit.create_smiles] [openeye_toolkit.create_smiles] [rdkit_toolkit.create_smiles] [text_toolkit.create_smiles]
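The errors parameter appears on every function in this chapter. The sketch below is a plain-Python illustration of such a policy, not chemfp's implementation: the name call_with_error_policy is hypothetical, ValueError stands in for whatever exception the toolkit raises, and "log" is assumed to report the problem and then behave like "ignore".

```python
import logging

logger = logging.getLogger("error_policy_example")


def call_with_error_policy(func, value, errors="strict"):
    """Call func(value) and handle failures according to the errors policy."""
    if errors not in ("strict", "ignore", "log"):
        raise ValueError(f"unsupported errors value: {errors!r}")
    try:
        return func(value)
    except ValueError as err:
        if errors == "strict":
            # Propagate the failure to the caller.
            raise
        if errors == "log":
            # Report the failure, then continue as with "ignore".
            logger.warning("could not process %r: %s", value, err)
        # "ignore" and "log" both return None in place of a result.
        return None
```

This mirrors the documented return value of "a string, or None if errors are ignored": callers using "ignore" or "log" need to check for None before using the result.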

parse_molfile

chemfp.toolkit.parse_molfile(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse a molfile using the toolkit.

This is equivalent to calling:

parse_molecule(content, "molfile", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a molfile record

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[cdk_toolkit.parse_molfile] [openeye_toolkit.parse_molfile] [rdkit_toolkit.parse_molfile]

create_molfile

chemfp.toolkit.create_molfile(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') -> str | None

Generate a molfile from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "molfile", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[cdk_toolkit.create_molfile] [openeye_toolkit.create_molfile] [rdkit_toolkit.create_molfile]
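A minimal sketch of converting between SMILES and molfile records with these functions follows. The helper names smiles_to_molfile and molfile_to_smiles are hypothetical, and T is a toolkit wrapper module that provides the molfile functions (the links above list the CDK, OpenEye, and RDKit wrappers).

```python
def smiles_to_molfile(T, smiles, id=None):
    """Parse a SMILES string and return the molecule as a molfile record."""
    mol = T.parse_smiles(smiles)
    return T.create_molfile(mol, id=id)


def molfile_to_smiles(T, molfile):
    """Parse a molfile record and return the molecule as a SMILES record."""
    mol = T.parse_molfile(molfile)
    return T.create_smiles(mol)
```

Both helpers use the default errors="strict", so a record the toolkit cannot handle raises an exception instead of returning None.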

parse_fasta

chemfp.toolkit.parse_fasta(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse a FASTA record using the toolkit.

This is equivalent to calling:

parse_molecule(content, "fasta", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a FASTA record

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[openbabel_toolkit.parse_fasta] [openeye_toolkit.parse_fasta] [rdkit_toolkit.parse_fasta]

create_fasta

chemfp.toolkit.create_fasta(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') -> str | None

Generate a FASTA record from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "fasta", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[openbabel_toolkit.create_fasta] [openeye_toolkit.create_fasta] [rdkit_toolkit.create_fasta]

read_fasta_molecules

chemfp.toolkit.read_fasta_molecules(source: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Read molecules from a FASTA file using the toolkit.

This is mostly equivalent to calling:

read_molecules(source, "fasta", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the FASTA file to read

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_fasta_molecules] [openeye_toolkit.read_fasta_molecules] [rdkit_toolkit.read_fasta_molecules]

read_fasta_ids_and_molecules

chemfp.toolkit.read_fasta_ids_and_molecules(source: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Read ids and molecules from a FASTA file using the toolkit.

This is mostly equivalent to calling:

read_ids_and_molecules(source, "fasta", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the FASTA file to read

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_fasta_ids_and_molecules] [openeye_toolkit.read_fasta_ids_and_molecules] [rdkit_toolkit.read_fasta_ids_and_molecules]

read_fasta_molecules_from_string

chemfp.toolkit.read_fasta_molecules_from_string(content: Union[str, bytes], *, ..., errors: str = 'strict')

Read molecules from a string containing a FASTA file using the toolkit.

This is equivalent to calling:

read_molecules_from_string(content, "fasta", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of a FASTA file

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_fasta_molecules_from_string] [openeye_toolkit.read_fasta_molecules_from_string] [rdkit_toolkit.read_fasta_molecules_from_string]

read_fasta_ids_and_molecules_from_string

chemfp.toolkit.read_fasta_ids_and_molecules_from_string(content: Union[str, bytes], *, ..., errors: str = 'strict')

Read ids and molecules from a string containing a FASTA file using the toolkit.

This is equivalent to calling:

read_ids_and_molecules_from_string(
    content, "fasta", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of a FASTA file

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_fasta_ids_and_molecules_from_string] [openeye_toolkit.read_fasta_ids_and_molecules_from_string] [rdkit_toolkit.read_fasta_ids_and_molecules_from_string]
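As an illustration of the id-and-molecule readers, the sketch below turns FASTA text into a dictionary of SMILES records keyed by id. The helper name fasta_to_smiles_by_id is hypothetical, T is a toolkit wrapper module that provides the FASTA functions, and the loop assumes the reader yields (id, molecule) pairs and can be used as a context manager like the reader in the chapter introduction.

```python
def fasta_to_smiles_by_id(T, fasta_text, errors="ignore"):
    """Map each FASTA record id to the SMILES record for its molecule."""
    result = {}
    with T.read_fasta_ids_and_molecules_from_string(
            fasta_text, errors=errors) as reader:
        for id, mol in reader:
            # create_smiles() may return None if an error was ignored.
            smiles = T.create_smiles(mol, errors=errors)
            if smiles is not None:
                result[id] = smiles
    return result
```

With errors="ignore", records that cannot be read or converted are left out of the result rather than stopping the loop.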

open_fasta_writer

chemfp.toolkit.open_fasta_writer(destination: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Open a FASTA file to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "fasta", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[openbabel_toolkit.open_fasta_writer] [openeye_toolkit.open_fasta_writer] [rdkit_toolkit.open_fasta_writer]

open_fasta_writer_to_string

chemfp.toolkit.open_fasta_writer_to_string(*, ..., errors: str = 'strict')

Open a FASTA file to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("fasta", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[openbabel_toolkit.open_fasta_writer_to_string] [openeye_toolkit.open_fasta_writer_to_string] [rdkit_toolkit.open_fasta_writer_to_string]

parse_pdb

chemfp.toolkit.parse_pdb(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse a PDB record using the toolkit.

This is equivalent to calling:

parse_molecule(content, "pdb", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a PDB record

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[openbabel_toolkit.parse_pdb] [openeye_toolkit.parse_pdb] [rdkit_toolkit.parse_pdb]

create_pdb

chemfp.toolkit.create_pdb(mol: Any, *, ..., errors: str = 'strict') -> str | None

Generate a PDB record from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "pdb", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[openbabel_toolkit.create_pdb] [openeye_toolkit.create_pdb] [rdkit_toolkit.create_pdb]

read_pdb_molecules

chemfp.toolkit.read_pdb_molecules(source: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Read molecules from a PDB file using the toolkit.

This is mostly equivalent to calling:

read_molecules(source, "pdb", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the PDB file to read

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_pdb_molecules] [openeye_toolkit.read_pdb_molecules] [rdkit_toolkit.read_pdb_molecules]

read_pdb_ids_and_molecules

chemfp.toolkit.read_pdb_ids_and_molecules(source: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Read ids and molecules from a PDB file using the toolkit.

This is mostly equivalent to calling:

read_ids_and_molecules(source, "pdb", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the PDB file to read

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_pdb_ids_and_molecules] [openeye_toolkit.read_pdb_ids_and_molecules] [rdkit_toolkit.read_pdb_ids_and_molecules]

read_pdb_molecules_from_string

chemfp.toolkit.read_pdb_molecules_from_string(content: Union[str, bytes], *, ..., errors: str = 'strict')

Read molecules from a string containing a PDB file using the toolkit.

This is equivalent to calling:

read_molecules_from_string(content, "pdb", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of a PDB file

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_pdb_molecules_from_string] [openeye_toolkit.read_pdb_molecules_from_string] [rdkit_toolkit.read_pdb_molecules_from_string]

read_pdb_ids_and_molecules_from_string

chemfp.toolkit.read_pdb_ids_and_molecules_from_string(content: Union[str, bytes], *, ..., errors: str = 'strict')

Read ids and molecules from a string containing a PDB file using the toolkit.

This is equivalent to calling:

read_ids_and_molecules_from_string(
    content, "pdb", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – the content of a PDB file

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

an IdAndMoleculeReader iterating toolkit molecules

[openbabel_toolkit.read_pdb_ids_and_molecules_from_string] [openeye_toolkit.read_pdb_ids_and_molecules_from_string] [rdkit_toolkit.read_pdb_ids_and_molecules_from_string]

open_pdb_writer

chemfp.toolkit.open_pdb_writer(destination: Union[NoneType, str, BinaryIO], *, ..., errors: str = 'strict')

Open a PDB file to write toolkit molecules.

This is mostly equivalent to calling:

open_molecule_writer(destination, "pdb", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[openbabel_toolkit.open_pdb_writer] [openeye_toolkit.open_pdb_writer] [rdkit_toolkit.open_pdb_writer]
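The PDB reader and writer can be combined in the same way as the generic example in the chapter introduction. In this sketch the helper name copy_pdb_file and the filenames in the usage note are hypothetical, and T is a toolkit wrapper module that provides the PDB functions.

```python
def copy_pdb_file(T, source, destination, errors="ignore"):
    """Read molecules from one PDB file and write them to another.

    Compression is chosen from each filename's extension, so the source
    or destination may end in ".gz".
    """
    with T.read_pdb_molecules(source, errors=errors) as reader:
        with T.open_pdb_writer(destination, errors=errors) as writer:
            writer.write_molecules(reader)
```

For example, copy_pdb_file(rdkit_toolkit, "input.pdb", "output.pdb.gz") would be expected to write a gzip-compressed copy, since the destination ends in ".gz".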

open_pdb_writer_to_string

chemfp.toolkit.open_pdb_writer_to_string(*, ..., errors: str = 'strict')

Open a PDB file to write toolkit molecules to an in-memory string.

This is equivalent to calling:

open_molecule_writer_to_string("pdb", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a MoleculeWriter expecting toolkit molecules

[openbabel_toolkit.open_pdb_writer_to_string] [openeye_toolkit.open_pdb_writer_to_string] [rdkit_toolkit.open_pdb_writer_to_string]

parse_sequence

chemfp.toolkit.parse_sequence(content: Union[str, bytes], *, ..., errors: str = 'strict')

Parse a 1-letter IUPAC DNA, RNA, or protein sequence using the toolkit.

This is equivalent to calling:

parse_molecule(content, "sequence", reader_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • content (a Unicode or byte string) – a protein, RNA, or DNA sequence as 1-letter IUPAC codes

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a toolkit molecule object

[openeye_toolkit.parse_sequence] [rdkit_toolkit.parse_sequence]

create_sequence

chemfp.toolkit.create_sequence(mol: Any, *, id: Optional[str] = None, ..., errors: str = 'strict') -> str | None

Generate a 1-letter IUPAC sequence from a toolkit molecule.

This is equivalent to calling:

create_string(mol, "sequence", id=id, writer_args={...}, errors=errors)

The ... contains toolkit-specific keyword arguments which are not described here.

Parameters:
  • mol (a toolkit molecule) – a molecule object

  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant

  • errors (one of "strict", "ignore", or "log") – specify how to handle errors

Returns:

a string, or None if errors are ignored

[openeye_toolkit.create_sequence] [rdkit_toolkit.create_sequence]
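A short sketch of how the sequence functions might be paired with the FASTA functions: it parses a 1-letter sequence and emits a FASTA record for it. The helper name sequence_to_fasta is hypothetical, and T must be a toolkit wrapper module that lists both parse_sequence() and create_fasta() (the OpenEye and RDKit wrappers, according to the links in this chapter).

```python
def sequence_to_fasta(T, sequence, id=None, errors="strict"):
    """Parse a 1-letter IUPAC sequence and return it as a FASTA record."""
    mol = T.parse_sequence(sequence, errors=errors)
    if mol is None:
        # Assumption: an ignored parse error yields None instead of a molecule.
        return None
    return T.create_fasta(mol, id=id, errors=errors)
```

Whether the id appears in the FASTA header depends on the toolkit, as the parameter description "if relevant" suggests.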