Thompson Sampling#
The Thompson Sampling module implements a unified, flexible Thompson Sampling framework for chemical library exploration. It provides pluggable selection strategies, warmup approaches, and evaluators for efficiently screening ultra-large combinatorial libraries.
The module follows a composition-based architecture where the core ThompsonSampler
class accepts pluggable components:
Selection Strategies - How to choose reagents during search
Warmup Strategies - How to initialize priors before search
Evaluators - How to score generated compounds
Module Architecture#
Quick Start#
Using presets (recommended):
```python
from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler, get_preset
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# 1. Create synthesis pipeline (single source of truth)
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# 2. Get preset configuration
config = get_preset(
    "fast_exploration",
    synthesis_pipeline=pipeline,
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv"),
    mode="minimize",
    num_iterations=1000
)

# 3. Create sampler and run optimization
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()

# mode="minimize": the lowest scores are the best hits
print(results_df.sort("score").head(10))
```
Direct sampler control:
```python
from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling.core.sampler import ThompsonSampler
from TACTICS.thompson_sampling.strategies import RouletteWheelSelection
from TACTICS.thompson_sampling.warmup import BalancedWarmup
from TACTICS.thompson_sampling.factories import create_evaluator
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# 1. Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# 2. Create components
strategy = RouletteWheelSelection(mode="maximize", alpha=0.1, beta=0.05)
warmup = BalancedWarmup(observations_per_reagent=3)
evaluator = create_evaluator(LookupEvaluatorConfig(ref_filename="scores.csv"))

# 3. Create sampler with pipeline
sampler = ThompsonSampler(
    synthesis_pipeline=pipeline,
    selection_strategy=strategy,
    warmup_strategy=warmup,
    batch_size=10
)

# 4. Set evaluator and run
sampler.set_evaluator(evaluator)
warmup_df = sampler.warm_up(num_warmup_trials=3)
results_df = sampler.search(num_cycles=1000)
sampler.close()
```
ThompsonSampler#
The main class for Thompson Sampling optimization.
The ThompsonSampler is the central orchestrator that coordinates selection strategies,
warmup strategies, and evaluators to efficiently explore combinatorial chemical libraries.
Dependencies
Requires these components:
SynthesisPipeline - single source of truth for reactions and reagents
SelectionStrategy - for reagent selection during search
WarmupStrategy - for initializing priors (optional, defaults to StandardWarmup)
Evaluator - for scoring compounds (set via set_evaluator())
Depends on: SynthesisPipeline, SelectionStrategy, WarmupStrategy, Evaluator
Constructor#
| Parameter | Type | Required | Description |
|---|---|---|---|
| synthesis_pipeline | SynthesisPipeline | Yes | Pipeline containing reaction config and reagent files (single source of truth). |
| selection_strategy | SelectionStrategy | Yes | Selection strategy instance (Greedy, RouletteWheel, UCB, etc.). |
| warmup_strategy | WarmupStrategy | No | Warmup strategy. Default: StandardWarmup(). |
| batch_size | | No | Compounds to sample per cycle. Default: 1. |
| processes | | No | CPU cores for parallel evaluation. Default: 1 (sequential). |
| | | No | Min compounds per core before batch evaluation. Default: 10. |
| | | No | Stop after this many consecutive duplicates. Default: None. |
| | | No | Path for log file output. |
| | | No | Pre-enumerated product CSV for testing mode. |
| | | No | Use Boltzmann-weighted updates (legacy RWS). Default: False. |
Factory Method: from_config#
Create a sampler from a Pydantic configuration.
| Parameter | Type | Required | Description |
|---|---|---|---|
| config | ThompsonSamplingConfig | Yes | Configuration with strategy, warmup, and evaluator settings. |

Returns

| Type | Description |
|---|---|
| ThompsonSampler | Configured sampler ready for warmup and search. |
Example
```python
from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling.core.sampler import ThompsonSampler
from TACTICS.thompson_sampling.config import ThompsonSamplingConfig
from TACTICS.thompson_sampling.strategies.config import RouletteWheelConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Create Thompson Sampling config
config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,
    num_ts_iterations=1000,
    strategy_config=RouletteWheelConfig(mode="maximize"),
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv")
)
sampler = ThompsonSampler.from_config(config)
```
Core Methods#
warm_up#
Initialize reagent posteriors with warmup evaluations.
| Parameter | Type | Required | Description |
|---|---|---|---|
| num_warmup_trials | | No | Trials per reagent. Default: 3. |

Returns

| Type | Description |
|---|---|
| | Warmup results with columns: |
search#
Run the main Thompson Sampling search loop.
| Parameter | Type | Required | Description |
|---|---|---|---|
| num_cycles | | No | Maximum sampling cycles. Default: 100. |
| | | No | Stop after this many unique evaluations. |

Returns

| Type | Description |
|---|---|
| | Search results with columns: |
evaluate#
Evaluate a single reagent combination.
| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Reagent indices for each component. |

Returns

| Type | Description |
|---|---|
| | (product_smiles, product_name, score). |
Setup Methods#
| Method | Description |
|---|---|
| set_evaluator() | Set the scoring evaluator. |
| | Load pre-enumerated products for testing. |
| close() | Cleanup multiprocessing resources. |
Note
The synthesis_pipeline is now passed to the constructor and is the single source
of truth for reactions and reagents. The old read_reagents() and set_reaction()
methods have been removed.
Selection Strategies#
Selection strategies determine how reagents are chosen during the search phase.
All strategies implement the SelectionStrategy abstract base class.
SelectionStrategy (Base Class)#
Abstract base class for all selection strategies. Extend this to create custom strategies.
| Method | Description |
|---|---|
| | Select one reagent from the list. |
| | Select multiple reagents (optional override). |
GreedySelection#
Simple greedy selection using argmax/argmin of sampled scores.
Extends: SelectionStrategy
Fast convergence but may get stuck in local optima
Best for: Simple optimization landscapes, limited budgets
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | | No | |
Example
```python
from TACTICS.thompson_sampling.strategies import GreedySelection

strategy = GreedySelection(mode="maximize")

# For docking scores (lower is better)
strategy = GreedySelection(mode="minimize")
```
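To illustrate the idea behind greedy selection, the sketch below draws one sample from each reagent's posterior and takes the argmax (or argmin) of the draws. It assumes independent normal posteriors described by a mean and a standard deviation; the function name and the posterior form are illustrative assumptions and are not taken from the TACTICS source.

```python
import random


def greedy_thompson_pick(means, stds, mode="maximize", rng=None):
    """Sample once from each (assumed normal) posterior and return the best index."""
    if rng is None:
        rng = random.Random()
    # One posterior draw per reagent
    samples = [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]
    # argmax for maximize, argmin for minimize
    best = max if mode == "maximize" else min
    return best(range(len(samples)), key=samples.__getitem__)


# Hypothetical posteriors for three reagents
means = [0.2, 0.9, 0.5]
stds = [0.05, 0.05, 0.05]
rng = random.Random(0)
print(greedy_thompson_pick(means, stds, mode="maximize", rng=rng))
print(greedy_thompson_pick(means, stds, mode="minimize", rng=rng))
```

Because the choice is always the single best draw, exploration comes only from the posterior spread, which is why this approach converges quickly but can settle on a local optimum.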
RouletteWheelSelection#
Roulette wheel selection with thermal cycling and Component-Aware Thompson Sampling (CATS).
Extends: SelectionStrategy
Boltzmann-weighted selection with adaptive temperature control
Component rotation for systematic exploration
CATS: Shannon entropy-based criticality analysis
Best for: Complex multi-modal landscapes, large libraries
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | | No | |
| alpha | | No | Base temperature for heated component. Default: 0.1. |
| beta | | No | Base temperature for cooled components. Default: 0.05. |
| | | No | Fraction before CATS starts. Default: 0.20. |
| | | No | Fraction when CATS fully applied. Default: 0.60. |
| | | No | Min observations before trusting criticality. Default: 5. |
Example
```python
from TACTICS.thompson_sampling.strategies import RouletteWheelSelection

# Standard thermal cycling
strategy = RouletteWheelSelection(
    mode="maximize",
    alpha=0.1,
    beta=0.05
)

# Higher exploration
strategy = RouletteWheelSelection(
    mode="maximize",
    alpha=0.2,
    beta=0.1
)
```
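As a rough illustration of Boltzmann-weighted (roulette wheel) selection, the sketch below converts sampled scores into selection probabilities with a softmax at a given temperature and then draws one index. It is a generic sketch: the function names are hypothetical, and how TACTICS maps alpha and beta onto per-component temperatures, thermal cycling, and CATS is not reproduced here.

```python
import math
import random


def boltzmann_weights(scores, temperature, mode="maximize"):
    """Return softmax probabilities of the scores at the given temperature."""
    sign = 1.0 if mode == "maximize" else -1.0
    scaled = [sign * s / temperature for s in scores]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def roulette_pick(scores, temperature, mode="maximize", rng=None):
    """Draw one index with probability proportional to its Boltzmann weight."""
    if rng is None:
        rng = random.Random()
    weights = boltzmann_weights(scores, temperature, mode)
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


scores = [0.1, 0.2, 0.3]  # hypothetical sampled scores for three reagents
print(boltzmann_weights(scores, temperature=0.05))  # sharper distribution
print(boltzmann_weights(scores, temperature=0.2))   # flatter distribution
print(roulette_pick(scores, temperature=0.1, rng=random.Random(1)))
```

A higher temperature flattens the distribution and spreads picks across more reagents, which is consistent with the "higher exploration" example using larger alpha and beta values.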
UCBSelection#
Upper Confidence Bound selection with deterministic behavior.
Extends: SelectionStrategy
Balances exploitation and exploration via confidence bounds
Best for: Situations requiring deterministic, reproducible behavior
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | | No | |
| c | | No | Exploration parameter. Higher = more exploration. Default: 2.0. |
Example
```python
from TACTICS.thompson_sampling.strategies import UCBSelection

strategy = UCBSelection(mode="maximize", c=2.0)

# Higher exploration
strategy = UCBSelection(mode="maximize", c=4.0)
```
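The role of the exploration parameter can be seen in a textbook UCB1-style score, sketched below. The exact bound used by UCBSelection is not given on this page, so the formula (mean plus c times the square root of ln(N)/n) and the function names are assumptions for illustration only.

```python
import math


def ucb_scores(means, counts, c=2.0, mode="maximize"):
    """Return UCB1-style scores; the highest score is the reagent to pick."""
    total = sum(counts)
    scores = []
    for mean, n in zip(means, counts):
        if n == 0:
            # Untested reagents get an infinite bonus so they are tried first
            scores.append(math.inf)
            continue
        bonus = c * math.sqrt(math.log(total) / n)
        # For minimization, negate the mean so "higher is better" still holds
        value = mean if mode == "maximize" else -mean
        scores.append(value + bonus)
    return scores


def ucb_pick(means, counts, c=2.0, mode="maximize"):
    """Deterministically pick the index with the highest UCB score."""
    scores = ucb_scores(means, counts, c, mode)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical statistics: equal means, but reagent 1 has far fewer observations
print(ucb_pick(means=[0.5, 0.5], counts=[10, 1]))
```

No random draw is involved, so identical inputs always give the same pick; raising c widens the bonus for rarely tested reagents.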
EpsilonGreedySelection#
Simple exploration strategy with decaying epsilon.
Extends: SelectionStrategy
Random selection with probability epsilon, greedy otherwise
Best for: Baseline comparisons, simple exploration needs
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | | No | |
| epsilon | | No | Initial exploration probability [0, 1]. Default: 0.1. |
| decay | | No | Decay rate per iteration. Default: 0.995. |
Example
```python
from TACTICS.thompson_sampling.strategies import EpsilonGreedySelection

# 20% exploration with decay
strategy = EpsilonGreedySelection(
    mode="maximize",
    epsilon=0.2,
    decay=0.995
)
```
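The sketch below shows how a decaying epsilon could work, assuming a multiplicative schedule in which the exploration probability at iteration t is epsilon * decay**t. The schedule and the helper names are assumptions for illustration; the page only states that epsilon decays at a per-iteration rate.

```python
import random


def epsilon_at(iteration, epsilon=0.1, decay=0.995):
    """Exploration probability after `iteration` steps (assumed multiplicative decay)."""
    return epsilon * decay ** iteration


def epsilon_greedy_pick(values, iteration, epsilon=0.1, decay=0.995,
                        mode="maximize", rng=None):
    """Pick a random index with probability epsilon_at(...), else the best index."""
    if rng is None:
        rng = random.Random()
    if rng.random() < epsilon_at(iteration, epsilon, decay):
        return rng.randrange(len(values))  # explore
    best = max if mode == "maximize" else min
    return best(range(len(values)), key=values.__getitem__)  # exploit


# With epsilon=0.2 and decay=0.995 the exploration probability shrinks over time
for t in (0, 100, 500):
    print(t, round(epsilon_at(t, epsilon=0.2, decay=0.995), 4))
```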
BayesUCBSelection#
Bayesian UCB with Student-t quantiles and CATS integration.
Extends: SelectionStrategy
Theoretically grounded Bayesian confidence bounds
Percentile-based thermal cycling (analog to temperature)
Component-aware exploration based on Shannon entropy
Best for: Complex landscapes, escaping local optima
Requires: scipy
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | | No | |
| initial_p_high | | No | Base percentile for heated component [0.5, 0.999]. Default: 0.90. |
| initial_p_low | | No | Base percentile for cooled components [0.5, 0.999]. Default: 0.60. |
| exploration_phase_end | | No | Fraction before CATS starts. Default: 0.20. |
| | | No | Fraction when CATS fully applied. Default: 0.60. |
| | | No | Min observations before trusting criticality. Default: 5. |
Example
```python
from TACTICS.thompson_sampling.strategies import BayesUCBSelection

strategy = BayesUCBSelection(mode="maximize")

# More aggressive exploration
strategy = BayesUCBSelection(
    mode="maximize",
    initial_p_high=0.95,
    initial_p_low=0.70,
    exploration_phase_end=0.25
)
```
Warmup Strategies#
Warmup strategies determine how reagent combinations are sampled to initialize posteriors before the main search begins.
WarmupStrategy (Base Class)#
Abstract base class for warmup strategies.
| Method | Description |
|---|---|
| | Generate list of combinations to evaluate. |
| | Estimate number of evaluations. |
| | Return strategy name. |
BalancedWarmup (Recommended)#
Balanced warmup guaranteeing exactly K observations per reagent with stratified partners.
Extends: WarmupStrategy
Uniform coverage across all reagents
Per-reagent variance estimation with James-Stein shrinkage
Reduces bias from random sampling
Best for: Most use cases, especially asymmetric component sizes
| Parameter | Type | Required | Description |
|---|---|---|---|
| observations_per_reagent | | No | Observations per reagent. Default: 3. |
| use_per_reagent_variance | | No | Use per-reagent variance estimation. Default: True. |
| shrinkage_strength | | No | James-Stein shrinkage strength. Default: 3.0. |
| | | No | Random seed for reproducibility. |
Example
```python
from TACTICS.thompson_sampling.warmup import BalancedWarmup

warmup = BalancedWarmup(observations_per_reagent=5)

# With per-reagent variance
warmup = BalancedWarmup(
    observations_per_reagent=5,
    use_per_reagent_variance=True,
    shrinkage_strength=3.0
)
```
StandardWarmup#
Standard warmup testing each reagent with random partners.
Extends: WarmupStrategy
Simple and straightforward
Ensures all reagents evaluated
Expected evaluations: sum(reagent_counts) * num_trials
| Parameter | Type | Required | Description |
|---|---|---|---|
| | | No | Random seed for reproducibility. |
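The expected-evaluations formula listed for StandardWarmup, sum(reagent_counts) * num_trials, can be checked with a small worked example. The reagent counts below are made up for illustration, and the helper function is not part of TACTICS.

```python
def expected_warmup_evaluations(reagent_counts, num_trials):
    """Each reagent in every component is tested num_trials times."""
    return sum(reagent_counts) * num_trials


# Hypothetical two-component library: 100 acids and 250 amines
reagent_counts = [100, 250]
warmup_cost = expected_warmup_evaluations(reagent_counts, num_trials=3)
library_size = reagent_counts[0] * reagent_counts[1]

print(warmup_cost)   # 1050
print(library_size)  # 25000
```

Warmup cost grows with the sum of the component sizes while the full library grows with their product, so the warmup touches only a small fraction of a large library.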
EnhancedWarmup (Legacy)#
Stochastic parallel pairing with shuffling from the original RWS algorithm.
Extends: WarmupStrategy
Parallel pairing of reagents across components
Required for replicating legacy RWS results
Best for: legacy_rws_maximize and legacy_rws_minimize presets
| Parameter | Type | Required | Description |
|---|---|---|---|
| | | No | Random seed for reproducibility. |
Evaluators#
Evaluators score compounds based on various criteria. Choose based on your data source and computational requirements.
Evaluator (Base Class)#
Abstract base class for all evaluators.
| Method | Description |
|---|---|
| | Score a compound (accepts Mol or product_name depending on evaluator). |
| | Number of evaluations performed. |
LookupEvaluator#
Fast evaluator that looks up pre-computed scores from a CSV file.
Extends: Evaluator
Use for: Pre-computed scores, benchmarking
Recommendation: Use processes=1 (parallel overhead exceeds lookup time)

| Parameter | Type | Required | Description |
|---|---|---|---|
| ref_filename | | Yes | Path to CSV file with scores. |
| score_col | | No | Column name for scores. Default: |
| | | No | Column name for compound IDs. Default: |
Example
```python
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig
from TACTICS.thompson_sampling.factories import create_evaluator

config = LookupEvaluatorConfig(
    ref_filename="scores.csv",
    score_col="binding_affinity"
)
evaluator = create_evaluator(config)
```
DBEvaluator#
Fast evaluator using SQLite database for large datasets.
Extends: Evaluator
Use for: Large pre-computed datasets (millions of compounds)
Recommendation: Use processes=1

| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Path to SQLite database. |
| | | No | Key prefix for lookups. Default: |
FPEvaluator#
Evaluator using Morgan fingerprint Tanimoto similarity.
Extends: Evaluator
Use for: Similarity-based virtual screening
Returns: Tanimoto similarity [0, 1]
| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Reference molecule SMILES. |
| | | No | Morgan fingerprint radius. Default: 2. |
| | | No | Fingerprint bit length. Default: 2048. |
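For readers unfamiliar with the score FPEvaluator returns, the Tanimoto coefficient between two bit fingerprints is the number of shared on-bits divided by the number of on-bits set in either. The pure-Python sketch below works on sets of on-bit indices; it is an illustration of the coefficient only, not the evaluator's implementation, and the bit indices are invented.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


# Hypothetical on-bit indices for a reference molecule and a candidate product
reference_bits = {1, 2, 3}
candidate_bits = {2, 3, 4}

print(tanimoto(reference_bits, candidate_bits))  # 0.5
print(tanimoto(reference_bits, reference_bits))  # 1.0
```

Because the score is bounded in [0, 1] with 1 meaning identical fingerprints, a similarity search of this kind would normally be run with a maximizing strategy.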
MWEvaluator#
Simple evaluator returning molecular weight. Primarily for testing.
Extends: Evaluator
ROCSEvaluator#
3D shape-based evaluator using OpenEye ROCS.
Extends: Evaluator
Use for: Shape-based virtual screening
Requires: OpenEye Toolkit license
Recommendation: Use processes>1 for parallel evaluation

| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Path to reference structure (.sdf). |
| | | No | Max conformers to generate. Default: 50. |
FredEvaluator#
Molecular docking evaluator using OpenEye FRED.
Extends: Evaluator
Use for: Structure-based virtual screening
Requires: OpenEye Toolkit license
Recommendation: Use processes>1 for parallel evaluation
Mode: minimize (lower docking scores = better)

| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Path to receptor file (.oedu). |
| | | No | Max conformers to generate. Default: 100. |
MLClassifierEvaluator#
Evaluator using a trained scikit-learn classifier.
Extends: Evaluator
Use for: ML-based scoring with trained models
Requires: scikit-learn, trained model pickle file
| Parameter | Type | Required | Description |
|---|---|---|---|
| | | Yes | Path to pickled sklearn model. |
Strategy Selection Guide#
Choose the right strategy based on your use case:
| Strategy | Best For | Pros | Cons |
|---|---|---|---|
| Greedy | Simple landscapes, limited budgets | Fast convergence | Can get stuck in local optima |
| RouletteWheel | Complex multi-modal landscapes | Thermal cycling, CATS, adaptive | More parameters to tune |
| UCB | Deterministic optimization needs | Theoretically grounded | Less stochastic |
| BayesUCB | Complex landscapes, escaping optima | Bayesian bounds, CATS | Requires scipy |
| EpsilonGreedy | Baseline comparisons | Very simple | Less sophisticated |
Evaluator Selection Guide#
Choose based on your data source and computational requirements:
Fast Evaluators (use processes=1):
LookupEvaluator: Pre-computed scores in CSV
DBEvaluator: Pre-computed scores in SQLite
Computational Evaluators:
FPEvaluator: Fingerprint similarity (fast)
MWEvaluator: Molecular weight (testing only)
Slow Evaluators (use processes>1):
ROCSEvaluator: 3D shape similarity (requires OpenEye)
FredEvaluator: Molecular docking (requires OpenEye)
MLClassifierEvaluator: ML model predictions
See the Configuration System page for preset configurations and detailed examples.