Library Enumeration#
The Library Enumeration module provides tools for generating combinatorial chemical products from reagent libraries using reaction SMARTS patterns. It supports single-step reactions, alternative SMARTS routing, multi-step synthesis pipelines, and protecting group deprotection.
Module Architecture#
The following diagram shows the class hierarchy and dependencies:
Quick Start#
Single reaction with validation:
from TACTICS.library_enumeration import ReactionDef, ReactionConfig, SynthesisPipeline
# Define a reaction
rxn = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    step_index=0,
    description="Amide coupling"
)
# Create configuration and pipeline
config = ReactionConfig(
    reactions=[rxn],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(config)
# Validate against reagent files
result = rxn.validate_reaction(reagent_files=["acids.smi", "amines.smi"])
print(f"Coverage: {result.coverage_stats}")
Enumerate products with SynthesisPipeline:
from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
# Create configuration
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0,
            description="Amide coupling",
        )
    ],
    reagent_file_list=["acids.smi", "amines.smi"]
)
# Create pipeline directly from config
pipeline = SynthesisPipeline(config)
# Enumerate a single product from SMILES
result = pipeline.enumerate_single_from_smiles(["CC(=O)O", "CCN"])
if result.success:
    print(f"Product: {result.product_smiles}")  # CCNC(C)=O
With alternative SMARTS patterns:
from TACTICS.library_enumeration import (
    SynthesisPipeline, ReactionConfig, ReactionDef,
    StepInput, InputSource
)
# Define primary and alternative patterns
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0,
            pattern_id="primary",
            description="Primary amines"
        ),
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH:2]>>[C:1](=O)[N:2]",
            step_index=0,
            pattern_id="secondary",
            description="Secondary amines"
        ),
    ],
    reagent_file_list=["acids.smi", "amines.smi"],
    step_inputs={
        0: [
            StepInput(source=InputSource.REAGENT_FILE, file_index=0),
            StepInput(source=InputSource.REAGENT_FILE, file_index=1)
        ]
    },
    step_modes={0: "alternative"}  # Mark step 0 as having alternatives
)
pipeline = SynthesisPipeline(config)
# Pipeline automatically tries alternative patterns at runtime.
# "CCNC" (N-methylethylamine) is a secondary amine: it has one N-H, so it
# fails the primary [NH2] pattern and matches the secondary [NH] pattern.
result = pipeline.enumerate_single_from_smiles(["CC(=O)O", "CCNC"])
print(f"Pattern used: {result.patterns_used}")  # {0: "secondary"}
Fundamental Types#
These are the basic building blocks used by higher-level classes.
InputSource#
Enum specifying the source of an input for a reaction step.
| Value | Description |
|---|---|
| REAGENT_FILE | Input comes from a reagent file (use file_index) |
| PREVIOUS_STEP | Input comes from output of a previous step (use step_index) |
Example
from TACTICS.library_enumeration import InputSource
source = InputSource.REAGENT_FILE
source = InputSource.PREVIOUS_STEP
ProtectingGroupInfo#
Dataclass defining a protecting group for detection and optional removal.
Dependencies
None - this is a standalone dataclass.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name |  | Yes | Human-readable name (e.g., “Boc”, “Fmoc”) |
| smarts |  | Yes | SMARTS pattern to detect the group |
| deprotection_smarts |  | No | Reaction SMARTS for removal (optional) |
Example
from TACTICS.library_enumeration import ProtectingGroupInfo
boc = ProtectingGroupInfo(
    name="Boc",
    smarts="[NX3][C](=O)OC(C)(C)C",
    deprotection_smarts="[N:1][C](=O)OC(C)(C)C>>[N:1]"
)
DeprotectionSpec#
Pydantic model specifying a deprotection to apply during synthesis. Deprotections can target either a reactant (before the reaction) or the product (after the reaction).
Dependencies
None - uses protecting group names as strings.
| Parameter | Type | Required | Description |
|---|---|---|---|
| group |  | Yes | Name of protecting group (e.g., “Boc”, “Fmoc”) |
| target |  | Yes | Reactant index (int >= 0) for pre-reaction, or "product" for post-reaction |

| Property | Type | Description |
|---|---|---|
| is_product_deprotection |  | True if this deprotects the product (after reaction) |
| reactant_index |  | The reactant index if targeting a reactant, None if targeting product |
Example: Reactant Deprotection
from TACTICS.library_enumeration import DeprotectionSpec
# Remove Boc from the second reactant (index 1) BEFORE reaction
deprot = DeprotectionSpec(group="Boc", target=1)
print(deprot.is_product_deprotection) # False
print(deprot.reactant_index) # 1
Example: Product Deprotection
from TACTICS.library_enumeration import DeprotectionSpec
# Remove Fmoc from the product AFTER reaction
deprot = DeprotectionSpec(group="Fmoc", target="product")
print(deprot.is_product_deprotection) # True
print(deprot.reactant_index) # None
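The two targeting modes can be pictured with a small stand-in class. This is a minimal sketch built on a plain dataclass rather than the library's Pydantic model: `DeprotectionSketch` is a hypothetical name, and the validation rules are assumptions inferred from the examples above (a non-negative reactant index, or the string "product").

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class DeprotectionSketch:
    """Hypothetical stand-in for DeprotectionSpec (illustration only)."""

    group: str
    target: Union[int, str]

    def __post_init__(self) -> None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.target, bool):
            raise ValueError("target must be an int >= 0 or 'product'")
        if isinstance(self.target, int):
            if self.target < 0:
                raise ValueError("reactant index must be >= 0")
        elif self.target != "product":
            raise ValueError("target must be an int >= 0 or 'product'")

    @property
    def is_product_deprotection(self) -> bool:
        # True when the group is removed after the reaction.
        return self.target == "product"

    @property
    def reactant_index(self) -> Optional[int]:
        # Index of the reactant to deprotect, or None for the product.
        return self.target if isinstance(self.target, int) else None


before = DeprotectionSketch(group="Boc", target=1)
after = DeprotectionSketch(group="Fmoc", target="product")
print(before.is_product_deprotection, before.reactant_index)  # False 1
print(after.is_product_deprotection, after.reactant_index)    # True None
```

Keeping both modes in one `target` field means a reaction's list of deprotections can mix pre-reaction and post-reaction entries without a second parameter.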
StepInput#
Pydantic model specifying where an input comes from for a reaction step.
Dependencies
InputSource - enum specifying the source type
Depends on: InputSource
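| Parameter | Type | Required | Description |
|---|---|---|---|
| source | InputSource | Yes | Either InputSource.REAGENT_FILE or InputSource.PREVIOUS_STEP |
| file_index |  | Conditional | Index into reagent_file_list. Required if source is REAGENT_FILE |
| step_index |  | Conditional | Index of previous step. Required if source is PREVIOUS_STEP |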
Example
from TACTICS.library_enumeration import StepInput, InputSource
# Input from first reagent file
input1 = StepInput(source=InputSource.REAGENT_FILE, file_index=0)
# Input from output of step 0
input2 = StepInput(source=InputSource.PREVIOUS_STEP, step_index=0)
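The conditional fields can be illustrated with a small validator. The sketch below is not the library's Pydantic model: `SourceSketch` and `StepInputSketch` are hypothetical names, the enum string values are invented, and the rule that each source requires its matching index is an assumption based on the "Conditional" entries in the parameter table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceSketch(Enum):
    # String values are placeholders; the real enum values are not documented here.
    REAGENT_FILE = "reagent_file"
    PREVIOUS_STEP = "previous_step"


@dataclass(frozen=True)
class StepInputSketch:
    """Hypothetical stand-in for StepInput (illustration only)."""

    source: SourceSketch
    file_index: Optional[int] = None
    step_index: Optional[int] = None

    def __post_init__(self) -> None:
        # Each source type needs the index that tells the pipeline where to look.
        if self.source is SourceSketch.REAGENT_FILE and self.file_index is None:
            raise ValueError("file_index is required for REAGENT_FILE inputs")
        if self.source is SourceSketch.PREVIOUS_STEP and self.step_index is None:
            raise ValueError("step_index is required for PREVIOUS_STEP inputs")


from_file = StepInputSketch(source=SourceSketch.REAGENT_FILE, file_index=0)
from_step = StepInputSketch(source=SourceSketch.PREVIOUS_STEP, step_index=0)
```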
Reaction Definition#
ReactionDef#
The core class for defining a single reaction with built-in validation and visualization.
This is the fundamental building block of the SMARTS toolkit. Each ReactionDef represents
a single chemical reaction that can validate reagent compatibility, visualize template matches,
and be combined into multi-step syntheses via ReactionConfig.
Dependencies
DeprotectionSpec - for deprotections (optional)
ProtectingGroupInfo - used during validation (optional)
Returns ValidationResult from validation methods
Depends on: DeprotectionSpec (optional), ProtectingGroupInfo (optional)
Constructor Parameters#
| Parameter | Type | Required | Description |
|---|---|---|---|
| reaction_smarts |  | Yes | Reaction SMARTS string (validated on creation) |
| step_index |  | No | Step in the sequence (0 = first step). Default: 0 |
| pattern_id |  | No | Identifier for alternatives (auto-generated if not provided) |
| description |  | No | Human-readable description |
| deprotections |  | No | Deprotections to apply for this reaction. Default: [] |
Properties#
| Property | Type | Description |
|---|---|---|
|  |  | Number of reactants in the reaction |
|  |  | True if |
| coverage_stats |  | Coverage percentage per position (0-100) |
|  |  | Cached validation result (None if not validated) |
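A per-position coverage percentage can be read as "the share of reagents at that position that match the reaction template". The sketch below illustrates that arithmetic only; `coverage_percent` is a hypothetical helper, and the boolean match flags stand in for whatever substructure matching the library performs internally.

```python
from typing import Dict, List


def coverage_percent(match_flags: Dict[int, List[bool]]) -> Dict[int, float]:
    """Return the percentage (0-100) of matching reagents per position."""
    stats: Dict[int, float] = {}
    for position, flags in match_flags.items():
        # An empty reagent list is reported as 0% rather than dividing by zero.
        stats[position] = 100.0 * sum(flags) / len(flags) if flags else 0.0
    return stats


# Position 0: both acids match. Position 1: one of four amines matches.
stats = coverage_percent({0: [True, True], 1: [True, False, False, False]})
print(stats)  # {0: 100.0, 1: 25.0}
```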
Validation Methods#
| Method | Description |
|---|---|
| validate_reaction | Validate reaction against reagent files or SMILES lists. Returns ValidationResult |
|  | Get list of |
|  | Get list of |
|  | Get RDKit Mol template for a specific position |
|  | Human-readable validation summary string |
validate_reaction parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| reagent_files |  | No* | Paths to reagent files (.smi format) |
| reagent_smiles |  | No* | Lists of |
|  |  | No | Custom protecting groups for detection |
| deprotect |  | No | Apply deprotection during validation. Default: False |
|  |  | No | Remove salt fragments during validation. Default: False |
| test_reactions |  | No | Run test reactions to verify products form. Default: False |
* Must provide either reagent_files or reagent_smiles
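For readers unfamiliar with `.smi` files, the sketch below parses the common layout of one SMILES per line followed by an optional whitespace-separated name. The exact layout the library accepts is not specified here, so treat this as an assumption; `parse_smi_lines` and the fallback naming scheme are hypothetical.

```python
from typing import List, Tuple


def parse_smi_lines(lines: List[str]) -> List[Tuple[str, str]]:
    """Parse '.smi'-style lines into (smiles, name) pairs."""
    records: List[Tuple[str, str]] = []
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines
        parts = text.split(maxsplit=1)
        smiles = parts[0]
        # Fall back to a positional name when the line carries no name column.
        name = parts[1] if len(parts) > 1 else f"reagent_{line_number}"
        records.append((smiles, name))
    return records


records = parse_smi_lines(["CC(=O)O acetic_acid", "", "CCN"])
print(records)  # [('CC(=O)O', 'acetic_acid'), ('CCN', 'reagent_3')]
```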
Visualization Methods#
| Method | Description |
|---|---|
| visualize_template_match | Visualize which atoms in a molecule match the reaction template |
|  | Visualize the reaction scheme |
Example#
from TACTICS.library_enumeration import ReactionDef, DeprotectionSpec
# Define reaction with deprotection on reactant
rxn = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    step_index=0,
    pattern_id="amide_coupling",
    description="Amide bond formation",
    deprotections=[DeprotectionSpec(group="Boc", target=1)]
)
# Validate against reagent files
result = rxn.validate_reaction(
    reagent_files=["acids.smi", "amines.smi"],
    deprotect=True,
    test_reactions=True
)
# Check coverage
print(f"Position 0 coverage: {rxn.coverage_stats[0]:.1f}%")
print(f"Position 1 coverage: {rxn.coverage_stats[1]:.1f}%")
print(f"Reaction success rate: {result.reaction_success_rate:.1f}%")
# Get compatible reagents for position 1
compatible = rxn.get_compatible_reagents(1)
print(f"Found {len(compatible)} compatible amines")
# Troubleshoot a problematic reagent (Boc-ethylenediamine)
img = rxn.visualize_template_match("CC(C)(C)OC(=O)NCCN", position=1)
Configuration#
ReactionConfig#
Container for synthesis configuration holding one or more ReactionDef objects.
Use ReactionConfig to define complex syntheses including:
Single reactions (one ReactionDef)
Alternative SMARTS patterns (multiple ReactionDef objects with the same step_index)
Multi-step syntheses (multiple ReactionDef objects with different step_index values)
Protecting group deprotections (via ReactionDef.deprotections)
Dependencies
Composes multiple lower-level classes:
ReactionDef - one or more reaction definitions (required)
StepInput - input sources for multi-step synthesis (required for multi-step)
ProtectingGroupInfo - custom protecting groups (optional)
Depends on: ReactionDef, StepInput, ProtectingGroupInfo
Constructor Parameters#
| Parameter | Type | Required | Description |
|---|---|---|---|
| reactions |  | Yes | List of reaction definitions (minimum 1) |
| reagent_file_list |  | No | Paths to reagent files. Default: [] |
| step_inputs |  | Conditional | Mapping of step_index to input sources. Required if multiple reactions |
| step_modes |  | No | Mark steps with alternative patterns, e.g. {0: "alternative"} |
| protecting_groups |  | No | Custom protecting group definitions |
Properties#
| Property | Type | Description |
|---|---|---|
|  |  | Number of unique steps |
|  |  | True if more than one step |
|  |  | Step indices that have alternative patterns |
|  |  | Sorted list of all step indices |
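The step-level properties above all follow from grouping reactions by `step_index`. The sketch below shows that grouping on plain `(step_index, pattern_id)` tuples; `group_by_step` is a hypothetical helper, and inferring alternatives from the group size is an assumption (the real configuration also marks them explicitly through `step_modes`).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_step(reactions: List[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group (step_index, pattern_id) pairs into {step_index: [pattern_id, ...]}."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for step_index, pattern_id in reactions:
        grouped[step_index].append(pattern_id)
    # Sort by step index so iteration follows the synthesis order.
    return dict(sorted(grouped.items()))


reactions = [(1, "primary"), (0, "primary"), (0, "secondary")]
by_step = group_by_step(reactions)

num_steps = len(by_step)                      # number of unique steps
is_multi_step = num_steps > 1                 # more than one step
step_indices = list(by_step)                  # sorted step indices
steps_with_alternatives = [s for s, ids in by_step.items() if len(ids) > 1]
print(num_steps, is_multi_step, step_indices, steps_with_alternatives)
```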
Methods#
| Method | Description |
|---|---|
|  | Get all ReactionDef objects for a given step |
|  | Get the primary reaction (pattern_id=’primary’ or first) |
|  | Get StepInput configuration for a step |
|  | Check if step has alternative SMARTS patterns |
|  | Validate all reactions, returns |
Example: Single Reaction#
from TACTICS.library_enumeration import ReactionDef, ReactionConfig, SynthesisPipeline
rxn = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    description="Amide coupling"
)
config = ReactionConfig(
    reactions=[rxn],
    reagent_file_list=["acids.smi", "amines.smi"]
)
# Create pipeline directly from config
pipeline = SynthesisPipeline(config)
Example: Multi-Step Synthesis#
from TACTICS.library_enumeration import (
    ReactionDef, ReactionConfig, StepInput, InputSource, DeprotectionSpec,
    SynthesisPipeline
)
# Step 0: Amide coupling
step0 = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    step_index=0,
    description="Amide coupling"
)
# Step 1: Reductive amination (with Boc deprotection on reactant)
step1 = ReactionDef(
    reaction_smarts="[NH2:1].[CH:2]=O>>[NH:1][CH2:2]",
    step_index=1,
    description="Reductive amination",
    deprotections=[DeprotectionSpec(group="Boc", target=0)]  # Deprotect first input
)
config = ReactionConfig(
    reactions=[step0, step1],
    reagent_file_list=["acids.smi", "amines.smi", "aldehydes.smi"],
    step_inputs={
        0: [
            StepInput(source=InputSource.REAGENT_FILE, file_index=0),
            StepInput(source=InputSource.REAGENT_FILE, file_index=1)
        ],
        1: [
            StepInput(source=InputSource.PREVIOUS_STEP, step_index=0),
            StepInput(source=InputSource.REAGENT_FILE, file_index=2)
        ],
    }
)
pipeline = SynthesisPipeline(config)
Example: Alternative SMARTS#
from TACTICS.library_enumeration import (
    ReactionDef, ReactionConfig, StepInput, InputSource, SynthesisPipeline
)
# Primary pattern for primary amines
primary = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    step_index=0,
    pattern_id="primary",  # Required: first alternative should be "primary"
    description="Primary amines"
)
# Alternative for secondary amines
secondary = ReactionDef(
    reaction_smarts="[C:1](=O)[OH].[NH:2]>>[C:1](=O)[N:2]",
    step_index=0,
    pattern_id="secondary",
    description="Secondary amines"
)
config = ReactionConfig(
    reactions=[primary, secondary],
    reagent_file_list=["acids.smi", "amines.smi"],
    step_inputs={
        0: [
            StepInput(source=InputSource.REAGENT_FILE, file_index=0),
            StepInput(source=InputSource.REAGENT_FILE, file_index=1)
        ]
    },
    step_modes={0: "alternative"}  # Mark step 0 as having alternatives
)
pipeline = SynthesisPipeline(config)
# Runtime pattern fallback: pipeline automatically tries patterns until one succeeds
# No need to call auto_detect_compatibility()
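The runtime fallback described in the comments above amounts to trying each pattern in order and keeping the first one that yields a product. The sketch below shows that control flow only, with toy functions standing in for real SMARTS reactions; `run_with_fallback`, `toy_primary`, and `toy_secondary` are hypothetical names, and the library's actual ordering and matching rules may differ.

```python
from typing import Callable, Dict, Optional, Tuple

# A pattern is modelled as a function returning a product string or None.
PatternFn = Callable[[str, str], Optional[str]]


def toy_primary(acid: str, amine: str) -> Optional[str]:
    # Toy rule: pretend only ethylamine matches the primary-amine pattern.
    return f"{acid}+{amine} via primary" if amine == "CCN" else None


def toy_secondary(acid: str, amine: str) -> Optional[str]:
    # Toy rule: pretend only N-methylethylamine matches the secondary pattern.
    return f"{acid}+{amine} via secondary" if amine == "CCNC" else None


def run_with_fallback(
    patterns: Dict[str, PatternFn], acid: str, amine: str
) -> Tuple[Optional[str], Optional[str]]:
    """Try patterns in insertion order; return (pattern_id, product) or (None, None)."""
    for pattern_id, apply_pattern in patterns.items():
        product = apply_pattern(acid, amine)
        if product is not None:
            return pattern_id, product
    return None, None


patterns = {"primary": toy_primary, "secondary": toy_secondary}
used, product = run_with_fallback(patterns, "CC(=O)O", "CCNC")
print(used)  # secondary
```

Listing the "primary" pattern first means it is always attempted before any alternative, which mirrors the ordering used in the configuration example.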
Pipeline#
SynthesisPipeline#
Main entry point for executing syntheses and enumerating products.
SynthesisPipeline orchestrates the synthesis process, handling:
Single reactions from SMARTS strings
Automatic routing to compatible alternative patterns (runtime fallback)
Multi-step syntheses with intermediate tracking
Batch enumeration with optional parallelization
Integration with Thompson Sampling via multiprocessing support
Dependencies
ReactionConfig - passed to constructor
ReactionDef - accessed via config
Returns EnumerationResult, EnumerationError, AutoDetectionResult
Depends on: ReactionConfig
Constructor#
from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
config = ReactionConfig(
    reactions=[ReactionDef(reaction_smarts="...", step_index=0)],
    reagent_file_list=["reagents.smi"]
)
pipeline = SynthesisPipeline(config)
| Parameter | Type | Required | Description |
|---|---|---|---|
|  | ReactionConfig | Yes | Reaction configuration with reactions and reagent files |
Enumeration Methods#
| Method | Description |
|---|---|
|  | Enumerate single product from RDKit Mol objects |
| enumerate_single_from_smiles | Enumerate single product from SMILES strings |
| enumerate_library | Enumerate all combinations from reagent files |
|  | Enumerate specific combinations, optionally in parallel |
enumerate_single_from_smiles parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
|  |  | Yes | SMILES strings for each reagent position |
|  |  | No | Keys for compatibility lookup (names or identifiers) |
|  |  | No | Store intermediate products for multi-step. Default: False |
Returns: EnumerationResult
Utility Methods#
| Method | Description |
|---|---|
|  | Get internal validator for troubleshooting |
|  | Validate all steps and patterns |
| auto_detect_compatibility | Pre-detect which reagents work with which patterns (optional optimization) |
|  | Manually register a reagent’s compatible patterns |
|  | Find pattern compatible with all reagents |
|  | Get full compatibility mapping |
|  | Serialize for multiprocessing workers |
|  | Reconstruct pipeline in worker process (class method) |
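The serialize/reconstruct pair in the table follows a common multiprocessing pattern: send a plain, picklable payload to each worker and rebuild the heavier object there. The sketch below shows the round trip with a stand-in class; `PipelineSketch`, `to_payload`, and `from_payload` are hypothetical names and do not reflect the library's actual method names or payload layout.

```python
import pickle
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class PipelineSketch:
    """Hypothetical stand-in holding only plain, picklable configuration."""

    reaction_smarts: List[str]
    reagent_files: List[str]

    def to_payload(self) -> Dict[str, Any]:
        # Plain strings and lists pickle reliably across process boundaries.
        return asdict(self)

    @classmethod
    def from_payload(cls, payload: Dict[str, Any]) -> "PipelineSketch":
        # A worker would call this once, then reuse the rebuilt object.
        return cls(**payload)


original = PipelineSketch(
    reaction_smarts=["[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]"],
    reagent_files=["acids.smi", "amines.smi"],
)
# Simulate the trip to a worker process with an explicit pickle round trip.
payload_bytes = pickle.dumps(original.to_payload())
rebuilt = PipelineSketch.from_payload(pickle.loads(payload_bytes))
print(rebuilt == original)  # True
```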
Properties#
| Property | Type | Description |
|---|---|---|
|  |  | Number of reaction steps |
|  |  | Number of reagent files/positions required |
|  |  | True if multi-step synthesis |
|  |  | True if any step has alternative patterns |
|  |  | Available pattern IDs at each step |
|  |  | All reactions from config |
|  |  | Reagent file paths from config |
Example: Basic Enumeration#
from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0
        )
    ],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(config)
# Single enumeration
result = pipeline.enumerate_single_from_smiles(["CC(=O)O", "CCN"])
if result.success:
    print(f"Product: {result.product_smiles}")
    print(f"Pattern used: {result.patterns_used}")
else:
    print(f"Failed: {result.error}")
Example: Library Enumeration#
from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef, results_to_dataframe
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0
        )
    ],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(config)
# Enumerate all combinations with 4 parallel workers
all_results = pipeline.enumerate_library(n_jobs=4, show_progress=True)
# Analyze results
successes = [r for r in all_results if r.success]
failures = [r for r in all_results if not r.success]
print(f"Success rate: {len(successes)/len(all_results)*100:.1f}%")
# Convert to Polars DataFrame
df = results_to_dataframe(all_results)
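Enumerating "all combinations" is a Cartesian product over the reagent positions, so the library size is the product of the reagent list lengths. The sketch below illustrates that with the standard library only; `iter_combinations` is a hypothetical helper and the reagent SMILES are example inputs, not files shipped with the module.

```python
from itertools import product
from math import prod
from typing import Iterator, Sequence, Tuple


def iter_combinations(
    reagent_lists: Sequence[Sequence[str]],
) -> Iterator[Tuple[str, ...]]:
    """Lazily yield one reagent tuple per combination, one entry per position."""
    return product(*reagent_lists)


acids = ["CC(=O)O", "OC(=O)c1ccccc1"]
amines = ["CCN", "NCc1ccccc1", "CN"]

# Library size grows multiplicatively with each reagent position.
library_size = prod(len(reagents) for reagents in (acids, amines))
combos = list(iter_combinations([acids, amines]))
print(library_size, combos[0])  # 6 ('CC(=O)O', 'CCN')
```

Because the product grows multiplicatively, iterating lazily rather than materializing the full list is usually preferable for large libraries.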
Protecting Groups and Deprotection#
The SMARTS toolkit includes built-in support for common protecting groups and allows defining custom groups for specialized chemistry.
Built-in Protecting Groups#
The following protecting groups are available by default:
| Name | Protects | Detection SMARTS | Result |
|---|---|---|---|
| Boc | Amine (N) |  | Free amine |
| Fmoc | Amine (N) |  | Free amine |
| Cbz | Amine (N) |  | Free amine |
| Acetamide | Amine (N) |  | Free amine |
| TBS | Alcohol (O) |  | Free alcohol |
| O-Benzyl | Alcohol (O) |  | Free alcohol |
| Trityl | Amine/Alcohol |  | Free N/O |
| tBu-ester | Carboxylic acid |  | Free acid |
| Me-ester | Carboxylic acid |  | Free acid |
| Et-ester | Carboxylic acid |  | Free acid |
Accessing protecting group information:
from TACTICS.library_enumeration.smarts_toolkit.constants import (
    get_all_protecting_group_names,
    get_protecting_group,
)
# List all available protecting groups
print(get_all_protecting_group_names())
# ['Boc', 'Fmoc', 'Cbz', 'Acetamide', 'TBS', 'O-Benzyl', 'Trityl', 'tBu-ester', 'Me-ester', 'Et-ester']
# Get details for a specific group
boc = get_protecting_group("Boc")
print(f"Name: {boc.name}")
print(f"Detection SMARTS: {boc.smarts}")
print(f"Deprotection SMARTS: {boc.deprotection_smarts}")
Reactant Deprotection#
Deprotect a reactant before the reaction runs using target=<reactant_index>:
from TACTICS.library_enumeration import (
    ReactionConfig, ReactionDef, DeprotectionSpec, SynthesisPipeline
)
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0,
            deprotections=[
                # Remove Boc from second reactant (index 1) before reaction
                DeprotectionSpec(group="Boc", target=1),
            ],
        )
    ],
    reagent_file_list=["acids.smi", "boc_amines.smi"],
)
pipeline = SynthesisPipeline(config)
# Boc-protected amine will be deprotected before amide coupling
result = pipeline.enumerate_single_from_smiles([
    "CC(=O)O",              # Acetic acid
    "CC(C)(C)OC(=O)NCCN"    # Boc-ethylenediamine
])
Product Deprotection#
Deprotect the product after the reaction runs using target="product":
from TACTICS.library_enumeration import (
    ReactionConfig, ReactionDef, DeprotectionSpec, SynthesisPipeline
)
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0,
            deprotections=[
                # Remove Fmoc from product after reaction
                DeprotectionSpec(group="Fmoc", target="product"),
            ],
        )
    ],
    reagent_file_list=["acids.smi", "amines.smi"],
)
pipeline = SynthesisPipeline(config)
Custom Protecting Groups#
Define custom protecting groups using ProtectingGroupInfo:
from TACTICS.library_enumeration import (
    ReactionConfig, ReactionDef, DeprotectionSpec, ProtectingGroupInfo,
    SynthesisPipeline
)
# Define a custom protecting group
alloc = ProtectingGroupInfo(
    name="Alloc",
    smarts="[NX3]C(=O)OCC=C",
    deprotection_smarts="[N:1]C(=O)OCC=C>>[N:1]"
)
config = ReactionConfig(
    reactions=[
        ReactionDef(
            reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
            step_index=0,
            deprotections=[
                DeprotectionSpec(group="Alloc", target="product"),
            ],
        ),
    ],
    reagent_file_list=["acids.smi", "amines.smi"],
    protecting_groups=[alloc],  # Register custom group
)
pipeline = SynthesisPipeline(config)
Result Types#
ValidationResult#
Comprehensive results from SMARTS validation.
Returned by ReactionDef.validate_reaction().
| Attribute | Type | Description |
|---|---|---|
|  |  |  |
|  |  |  |
|  |  | Unparseable SMILES entries |
|  |  | Duplicate entries |
|  |  |  |
|  |  |  |
| coverage_stats |  | Percent compatible per position (0-100) |
| reaction_success_rate |  | Percent of test reactions that succeeded |
|  |  | Critical errors |
|  |  | Non-critical warnings |
Methods and Properties:
| Member | Description |
|---|---|
|  | True if all positions have >0% coverage and no critical errors |
|  | Total compatible reagents across all positions |
|  | Total incompatible reagents |
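The validity rule in the table (every position has more than 0% coverage and there are no critical errors) can be written as a short predicate. This is a sketch of the stated rule only; `is_valid_sketch` is a hypothetical name, and treating an empty coverage mapping as invalid is an assumption.

```python
from typing import Dict, List


def is_valid_sketch(coverage_stats: Dict[int, float], errors: List[str]) -> bool:
    """Apply the rule: all positions covered (> 0%) and no critical errors."""
    # Assumption: with no coverage data at all, nothing has been validated.
    if not coverage_stats:
        return False
    every_position_covered = all(pct > 0 for pct in coverage_stats.values())
    return every_position_covered and not errors


print(is_valid_sketch({0: 100.0, 1: 25.0}, []))              # True
print(is_valid_sketch({0: 100.0, 1: 0.0}, []))               # False
print(is_valid_sketch({0: 100.0, 1: 25.0}, ["bad SMARTS"]))  # False
```

Note that the rule is permissive: a position with 1% coverage still counts as valid, so the per-position percentages remain worth inspecting.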
EnumerationResult#
Complete result from pipeline enumeration.
Returned by SynthesisPipeline enumeration methods.
| Attribute | Type | Description |
|---|---|---|
|  |  | Final product as RDKit Mol object |
| product_smiles |  | SMILES of final product |
|  |  | Name derived from reagent keys |
|  |  |  |
|  |  |  |
| error |  | Error details if failed |
Properties:
| Property | Description |
|---|---|
| success | True if enumeration succeeded (product is not None and no error) |
EnumerationError#
Details about a failed enumeration.
Contained in EnumerationResult.error when enumeration fails.
| Attribute | Type | Description |
|---|---|---|
| step_index |  | Which step failed |
|  |  | Which pattern was attempted |
| error_type |  | One of: |
| message |  | Human-readable description |
| reagent_smiles |  | Input SMILES that caused the failure |
Example: Handling Errors
result = pipeline.enumerate_single_from_smiles(["CC(=O)O", "CCC"]) # Propane has no amine
if not result.success:
    err = result.error
    print(f"Failed at step {err.step_index}")
    print(f"Error type: {err.error_type}")
    print(f"Message: {err.message}")
    print(f"Reagents: {err.reagent_smiles}")
AutoDetectionResult#
Results from automatic SMARTS compatibility detection.
Returned by SynthesisPipeline.auto_detect_compatibility().
| Attribute | Type | Description |
|---|---|---|
|  |  |  |
|  |  |  |
|  |  |  |
|  |  |  |
|  |  | Warning messages |
Constants#
The module provides default protecting groups and salt fragments for common use cases.
DEFAULT_PROTECTING_GROUPS#
List of 10 common protecting groups (see Built-in Protecting Groups for details).
DEFAULT_SALT_FRAGMENTS#
List of ~25 common salt/counterion fragments including:
Halides (Cl⁻, Br⁻, I⁻)
Metal cations (Na⁺, K⁺, Li⁺, Ca²⁺, Mg²⁺)
Organic acids (TFA, acetate, formate)
Ammonium salts
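Salt removal can be pictured as dropping known counterion fragments from a dot-separated SMILES. The sketch below uses exact string matching against a small example set, which is a deliberate simplification: `EXAMPLE_SALTS` is hypothetical and is not the contents of DEFAULT_SALT_FRAGMENTS, and a real implementation would more likely compare parsed molecules than raw strings.

```python
from typing import Set

# Hypothetical fragment set for illustration; not DEFAULT_SALT_FRAGMENTS.
EXAMPLE_SALTS: Set[str] = {"Cl", "Br", "[Na+]", "[Cl-]", "OC(=O)C(F)(F)F"}


def strip_salts(smiles: str, salts: Set[str]) -> str:
    """Drop dot-separated fragments whose SMILES string is in `salts`."""
    fragments = smiles.split(".")
    kept = [frag for frag in fragments if frag not in salts]
    # If every fragment looks like a salt, return the input unchanged
    # rather than producing an empty molecule.
    return ".".join(kept) if kept else smiles


print(strip_salts("CCN.Cl", EXAMPLE_SALTS))        # CCN
print(strip_salts("[Na+].[Cl-]", EXAMPLE_SALTS))   # [Na+].[Cl-]
```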
Utility Functions#
| Function | Description |
|---|---|
| get_protecting_group | Get ProtectingGroupInfo by name. Raises |
| get_all_protecting_group_names | Get list of all default protecting group names. |
| results_to_dataframe | Convert list of EnumerationResult to Polars DataFrame. |
|  | Convert failed EnumerationResult objects to DataFrame. |
|  | Get summary statistics of enumeration failures. |
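A failure summary of the kind described in the last row can be approximated by counting errors per type and per step. The sketch below uses a stand-in error record with the `step_index`, `error_type`, and `message` attributes shown in the error-handling example; `ErrorSketch`, `summarize_failures`, and the error type labels are hypothetical and are not the library's actual values.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ErrorSketch:
    """Hypothetical stand-in for an enumeration error record."""

    step_index: int
    error_type: str  # label values below are invented for illustration
    message: str


def summarize_failures(errors: List[ErrorSketch]) -> Dict[str, Any]:
    """Count failures overall, per error type, and per step."""
    return {
        "total": len(errors),
        "by_type": dict(Counter(err.error_type for err in errors)),
        "by_step": dict(Counter(err.step_index for err in errors)),
    }


errors = [
    ErrorSketch(0, "no_match", "amine pattern not found"),
    ErrorSketch(0, "no_match", "acid pattern not found"),
    ErrorSketch(1, "invalid_smiles", "could not parse reagent"),
]
summary = summarize_failures(errors)
print(summary["total"], summary["by_type"], summary["by_step"])
```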