Welcome to TACTICS documentation!#

TACTICS (Thompson Sampling-Assisted Chemical Targeting and Iterative Compound Selection for Drug Discovery) is a Python package for Thompson Sampling-based optimization of combinatorial chemical libraries. Its unified architecture is built around a single sampler class that accepts interchangeable selection strategies, warmup strategies, and evaluators.

Key Features#

  • Unified Thompson Sampler: Single sampler class that accepts different selection strategies

  • Multiple Selection Strategies: Greedy, Roulette Wheel (thermal cycling), UCB, Epsilon-Greedy, Bayes-UCB

  • Flexible Warmup Strategies: Balanced (recommended), Standard, Enhanced (legacy)

  • Multiple Evaluators: Lookup, Database, Fingerprint, Molecular Weight (MW), ML classifier, ROCS, FRED docking

  • Parallel Evaluation: Built-in multiprocessing support for expensive evaluators

  • Pydantic Configuration: Type-safe configuration with validation (Pydantic v2)

  • Configuration Presets: Pre-configured setups for common use cases

  • Extensible Design: Easy integration of custom strategies, warmup methods, and evaluators

  • Library Enumeration: Efficient generation of combinatorial reaction products

  • SMARTS Toolkit: Advanced pattern validation (ReactionDef), multi-SMARTS routing, and multi-step synthesis

  • Library Analysis: Comprehensive analysis and visualization tools
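To illustrate what "a single sampler that accepts different selection strategies" can mean in practice, here is a minimal sketch in plain Python. It is not the TACTICS implementation: the function names, the shared `(means, stds, rng)` signature, and the parameters `epsilon` and `c` are assumptions made for illustration. Each function receives the posterior mean and standard deviation of every reagent in one component and returns the index of the reagent to try next.

```python
import random


def greedy(means, stds, rng):
    """Classic Thompson Sampling: draw once from each posterior, keep the best draw."""
    draws = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return max(range(len(draws)), key=draws.__getitem__)


def epsilon_greedy(means, stds, rng, epsilon=0.1):
    """With probability epsilon pick a random reagent, otherwise defer to greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return greedy(means, stds, rng)


def ucb(means, stds, rng, c=2.0):
    """Upper confidence bound: posterior mean plus c standard deviations.

    rng is unused; it is accepted only so all strategies share one signature.
    """
    bounds = [m + c * s for m, s in zip(means, stds)]
    return max(range(len(bounds)), key=bounds.__getitem__)


# Hypothetical registry: the sampler looks a strategy up by name and calls it.
STRATEGIES = {"greedy": greedy, "epsilon_greedy": epsilon_greedy, "ucb": ucb}

means = [0.0, 1.0, 0.5]
stds = [0.1, 0.1, 2.0]
rng = random.Random(0)
for name, strategy in STRATEGIES.items():
    print(name, strategy(means, stds, rng))
```

Because every strategy has the same call signature, the sampling loop does not need to know which one it is running. In this toy data, UCB favors the third reagent because its large uncertainty outweighs its lower mean.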

Architecture Overview#

The following diagram shows how the main components of TACTICS work together:

digraph TACTICS {
    rankdir=TB;
    node [shape=box, style="rounded,filled", fontname="Helvetica", fontsize=10];
    edge [fontname="Helvetica", fontsize=9];
    nodesep=0.2;
    ranksep=0.4;

    // Core TACTICS Engine at top
    TACTICS [label="TACTICS\nThompson Sampler", fillcolor="#FFD700", style="filled,bold,rounded"];

    // Three main pluggable components side by side
    subgraph cluster_plugins {
        label="Pluggable Components";
        style=filled;
        color="#F5F5F5";
        Strategies [label="Selection Strategy\n(5 options)", fillcolor="#FFB6C1"];
        Warmups [label="Warmup Strategy\n(3 options)", fillcolor="#00CED1"];
        Evaluators [label="Evaluator\n(7 options)", fillcolor="#DDA0DD"];
    }

    // Library Enumeration
    subgraph cluster_enum {
        label="Library Enumeration";
        style=filled;
        color="#F5F5F5";
        Pipeline [label="SynthesisPipeline", fillcolor="#DCDCDC"];
        ReactionDef [label="ReactionDef", fillcolor="#DCDCDC"];
    }

    // Configuration
    Config [label="ThompsonSamplingConfig\n(Pydantic v2)", fillcolor="#98FB98"];

    // Vertical flow
    Config -> TACTICS [style=bold];
    TACTICS -> Strategies;
    TACTICS -> Warmups;
    TACTICS -> Evaluators;
    TACTICS -> Pipeline [style=dashed];
}

The diagram shows TACTICS as the central orchestrator that coordinates:

  • Selection Strategies: Greedy, RouletteWheel, UCB, EpsilonGreedy, BayesUCB

  • Warmup Strategies: Balanced (recommended), Standard, Enhanced

  • Evaluators: Lookup, DB, Fingerprint, MW, ROCS, FRED, MLClassifier

  • Library Enumeration: SynthesisPipeline with ReactionDef for product generation
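The library enumeration component is easiest to picture through the combinatorics behind it. The sketch below uses only the standard library and treats reagents as plain SMILES strings. It does not run any reaction and is not how `SynthesisPipeline` works internally; `library_size` and `enumerate_library` are hypothetical helper names.

```python
import itertools
import math


def library_size(reagent_lists):
    """Number of products in the full library: the product of the list lengths."""
    return math.prod(len(reagents) for reagents in reagent_lists)


def enumerate_library(reagent_lists):
    """Lazily yield every reagent combination, one reagent per component."""
    yield from itertools.product(*reagent_lists)


acids = ["CC(=O)O", "OC(=O)c1ccccc1"]
amines = ["CN", "CCN", "NCc1ccccc1"]

print(library_size([acids, amines]))             # 6
print(next(enumerate_library([acids, amines])))  # ('CC(=O)O', 'CN')
```

The size grows multiplicatively with each component, so three components of 1,000 reagents each already give a billion combinations. That growth is why sampling a small fraction of the library is attractive when each evaluation is expensive.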

Data Flow#

The Thompson Sampling optimization follows this workflow:

digraph DataFlow {
    rankdir=TB;
    node [shape=box, style="rounded,filled", fontname="Helvetica", fontsize=10];
    edge [fontname="Helvetica", fontsize=9];
    nodesep=0.3;
    ranksep=0.4;

    Reagents [label="Reagent Files (.smi)", shape=folder, fillcolor="#ADD8E6"];
    Config [label="Configuration", fillcolor="#FFFACD"];
    Warmup [label="Warmup Phase\n(Initialize Priors)", fillcolor="#E0FFFF"];
    Search [label="Search Phase\n(Thompson Sampling)", fillcolor="#FFD700"];
    Evaluate [label="Evaluate\n(Score Compounds)", fillcolor="#E6E6FA"];
    Update [label="Update Posteriors\n(Bayesian)", fillcolor="#90EE90"];
    Results [label="Results DataFrame\n(Polars)", shape=cylinder, fillcolor="#ADD8E6"];

    Reagents -> Config;
    Config -> Warmup;
    Warmup -> Search;
    Search -> Evaluate;
    Evaluate -> Update;
    Update -> Search [label="iterate", style=dashed];
    Search -> Results [label="complete"];
}
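The warmup, search, evaluate, and update steps in this workflow can be sketched in a few dozen lines of plain Python. This is a toy model under stated assumptions, not the TACTICS code: it assumes a normal posterior per reagent with known noise variance, maximizes the score, and identifies reagents by integer index. `ReagentPosterior`, `run_thompson_sampling`, and all parameter names are hypothetical.

```python
import random


class ReagentPosterior:
    """Normal posterior over one reagent's mean score (known noise variance)."""

    def __init__(self, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
        self.mean = prior_mean
        self.var = prior_var
        self.noise_var = noise_var

    def sample(self, rng):
        return rng.gauss(self.mean, self.var ** 0.5)

    def update(self, score):
        # Conjugate normal-normal update for a single observation.
        precision = 1.0 / self.var + 1.0 / self.noise_var
        self.mean = (self.mean / self.var + score / self.noise_var) / precision
        self.var = 1.0 / precision


def run_thompson_sampling(n_per_component, evaluate, warmup_trials=3, cycles=200, seed=0):
    rng = random.Random(seed)
    posteriors = [[ReagentPosterior() for _ in range(n)] for n in n_per_component]
    results = {}  # unique reagent combination -> score

    def score_and_update(combo):
        score = evaluate(combo)
        results[combo] = score
        for component, reagent in enumerate(combo):
            posteriors[component][reagent].update(score)

    # Warmup phase: pair every reagent with random partners to seed its posterior.
    for component, n in enumerate(n_per_component):
        for reagent in range(n):
            for _ in range(warmup_trials):
                combo = [rng.randrange(m) for m in n_per_component]
                combo[component] = reagent
                score_and_update(tuple(combo))

    # Search phase: draw from each posterior and keep the best draw per component.
    for _ in range(cycles):
        combo = tuple(
            max(range(len(ps)), key=lambda i: ps[i].sample(rng))
            for ps in posteriors
        )
        score_and_update(combo)

    return results


# Toy evaluator: the score is the sum of the reagent indices.
results = run_thompson_sampling((5, 5), evaluate=lambda combo: float(sum(combo)))
best = max(results, key=results.get)
print(best, results[best], len(results))
```

Each evaluated combination updates the posterior of every reagent it contains, which is what lets the search concentrate on promising reagents without scoring the whole library.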

Quick Start#

Parallel Batch Processing#

For expensive evaluators (docking, ML models), use parallel batch mode:

from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler, get_preset
from TACTICS.thompson_sampling.core.evaluator_config import FredEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Configure slow evaluator (molecular docking)
evaluator = FredEvaluatorConfig(design_unit_file="receptor.oedu")

# Get parallel batch preset
config = get_preset(
    "parallel_batch",
    synthesis_pipeline=pipeline,
    evaluator_config=evaluator,
    mode="minimize",  # Docking scores (lower is better)
    batch_size=100    # Sample 100 compounds per cycle
)

# Create sampler and run optimization
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()
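The idea behind batch mode is that the compounds sampled in one cycle are independent of each other, so they can be scored concurrently. The sketch below shows that pattern with the standard library only. It is an illustration, not the TACTICS internals: `slow_score` is a hypothetical stand-in for a docking call, and a thread pool is used so the snippet runs anywhere without a `__main__` guard.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_score(smiles):
    """Hypothetical stand-in for an expensive evaluator such as docking."""
    time.sleep(0.01)           # Simulate a costly call
    return float(len(smiles))  # Placeholder score, not chemistry


def evaluate_batch(batch, score_fn=slow_score, executor_cls=ThreadPoolExecutor, max_workers=4):
    """Score a batch concurrently; results keep the order of the input batch."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(score_fn, batch))


batch = ["CC(=O)NC", "CC(=O)NCC", "O=C(NC)c1ccccc1"]
scores = evaluate_batch(batch)
print(dict(zip(batch, scores)))
```

For CPU-bound scoring functions, passing `executor_cls=ProcessPoolExecutor` (also from `concurrent.futures`) is the usual substitution. In that case the scoring function must be importable at module level, and the calling code should sit under `if __name__ == "__main__":`.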

Custom Configuration#

For full control over all parameters:

from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSamplingConfig, ThompsonSampler
from TACTICS.thompson_sampling.strategies.config import RouletteWheelConfig
from TACTICS.thompson_sampling.warmup.config import BalancedWarmupConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Create fully customized configuration
config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,
    num_ts_iterations=5000,
    num_warmup_trials=5,
    strategy_config=RouletteWheelConfig(
        mode="maximize",
        alpha=0.1,  # Initial heating temperature
        beta=0.1,   # Initial cooling temperature
    ),
    warmup_config=BalancedWarmupConfig(
        observations_per_reagent=5,
        use_per_reagent_variance=True,
    ),
    evaluator_config=LookupEvaluatorConfig(
        ref_filename="scores.csv",
        score_col="binding_affinity"
    ),
    batch_size=10,
    results_filename="my_results.csv",
    log_filename="optimization.log"
)

# Create sampler and run optimization
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()
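The roulette wheel strategy configured above is described in terms of temperatures. As background, the sketch below shows generic Boltzmann-weighted roulette wheel selection with a single `temperature` parameter. It does not reproduce how `RouletteWheelConfig` uses `alpha` and `beta`, which this page does not specify; `boltzmann_weights` and `roulette_select` are hypothetical names.

```python
import math
import random


def boltzmann_weights(scores, temperature, mode="maximize"):
    """Turn scores into selection probabilities; low temperature sharpens them."""
    sign = 1.0 if mode == "maximize" else -1.0
    scaled = [sign * s / temperature for s in scores]
    top = max(scaled)  # Subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def roulette_select(scores, temperature, rng, mode="maximize"):
    """Pick one index with probability proportional to its Boltzmann weight."""
    weights = boltzmann_weights(scores, temperature, mode)
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


scores = [1.0, 2.0, 3.0]
rng = random.Random(0)
for temperature in (0.1, 1.0, 100.0):
    weights = boltzmann_weights(scores, temperature)
    print(temperature, [round(w, 3) for w in weights], roulette_select(scores, temperature, rng))
```

At low temperature nearly all of the probability mass sits on the best score, which behaves like greedy selection. At high temperature the weights approach uniform, which favors exploration. Cycling the temperature is one way to alternate between the two regimes.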

Advanced Usage: Direct Sampler Control#

For maximum flexibility, build the sampler from a configuration and drive the warmup and search phases yourself, inspecting the results of each:

from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler
from TACTICS.thompson_sampling.presets import get_preset
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["reagents1.smi", "reagents2.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Use a preset configuration
config = get_preset(
    "balanced_sampling",
    synthesis_pipeline=pipeline,
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv")
)

# Create sampler from config
sampler = ThompsonSampler.from_config(config)

# Run optimization with full control
warmup_results = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
print(f"Warmup completed: {len(warmup_results)} compounds evaluated")

search_results = sampler.search(num_cycles=config.num_ts_iterations)
print(f"Search completed: {len(search_results)} unique compounds")

# Cleanup multiprocessing resources
sampler.close()

Multi-SMARTS Routing Example#

For reactions that need more than one SMARTS pattern (e.g., primary vs. secondary amines), use SynthesisPipeline.from_alternatives():

from TACTICS.thompson_sampling import ThompsonSamplingConfig, ThompsonSampler
from TACTICS.thompson_sampling.strategies.config import RouletteWheelConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig
from TACTICS.library_enumeration import SynthesisPipeline

# Create pipeline with alternative SMARTS patterns
pipeline = SynthesisPipeline.from_alternatives(
    primary_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
    alternatives={
        "secondary_amine": "[C:1](=O)[OH].[NH:2]>>[C:1](=O)[N:2]"
    },
    reagent_files=["acids.smi", "amines.smi"],
    auto_detect=True  # Auto-detect reagent compatibility
)

# Create Thompson Sampling config with synthesis_pipeline
config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,
    num_ts_iterations=1000,
    strategy_config=RouletteWheelConfig(mode="maximize"),
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv"),
)

# Create sampler and run optimization
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()

Installation#

# Basic installation (run from the root of a cloned TACTICS repository)
pip install -e .

# With test dependencies
pip install -e ".[test]"

# With Jupyter notebook support
pip install -e ".[notebook]"

Requirements#

  • Python 3.11+

  • RDKit for molecular operations

  • Polars for efficient data handling

  • Pydantic v2 for configuration validation

Optional dependencies:

  • OpenEye Toolkit for ROCS and FRED evaluators

  • scikit-learn for ML classifier evaluator
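A quick way to check an environment against the requirements above is to test whether each package can be found without importing it. The sketch below uses only the standard library. The import names (`rdkit`, `polars`, `pydantic`, `openeye`, `sklearn`) are the usual module names for these packages, `check_environment` is a hypothetical helper, and the check does not verify package versions such as Pydantic v2.

```python
import importlib.util
import sys

# Display name -> importable module name.
REQUIRED = {"RDKit": "rdkit", "Polars": "polars", "Pydantic": "pydantic"}
OPTIONAL = {"OpenEye Toolkit": "openeye", "scikit-learn": "sklearn"}


def check_environment(min_python=(3, 11)):
    """Report whether the Python version and each package are available."""
    report = {"Python version": sys.version_info >= min_python}
    for label, module in {**REQUIRED, **OPTIONAL}.items():
        # find_spec returns None when a top-level module is not installed.
        report[label] = importlib.util.find_spec(module) is not None
    return report


for name, ok in check_environment().items():
    kind = "optional" if name in OPTIONAL else "required"
    print(f"{name} ({kind}): {'ok' if ok else 'missing'}")
```

A missing optional package only disables the corresponding evaluators (ROCS and FRED for OpenEye, the ML classifier for scikit-learn).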

Indices and tables#