# Strategies Guide
This guide helps you choose the right TACTICS configuration for your use case. If you just want the best method, use `get_preset()` with no arguments — it defaults to the best-performing configuration from large-scale benchmarking.
## Which Preset Should I Use?
TACTICS provides five presets. Each corresponds to a method configuration validated across 21 combinatorial libraries (114,450+ total trials).

| Preset | Selection Strategy | Warmup | Recovery | When to use |
|---|---|---|---|---|
| `recommended` (default) | Top-Two TS | Enhanced | 86.1% | New work. Best overall. |
| `recommended_rws` | RWS / CATS | Enhanced | 85.5% | When you want the original TACTICS method. |
| `baseline` | Greedy | Balanced | ~81% | Reference point to measure strategy gains. |
| `legacy_rws` | RWS (round-robin) | Enhanced | ~76% | Reproducing Zhao et al. 2025 results. |

Recovery is mean top-100 recovery across all benchmark libraries and queries.
Both `recommended` and `recommended_rws` default to `batch_size=100`.
## Basic Usage
```python
from TACTICS.thompson_sampling import ThompsonSampler, get_preset
from TACTICS.library_enumeration import SynthesisPipeline
from TACTICS.library_enumeration.smarts_toolkit import ReactionConfig, ReactionDef
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

pipeline = SynthesisPipeline(ReactionConfig(
    reactions=[ReactionDef(reaction_smarts="...", step_index=0)],
    reagent_file_list=["acids.smi", "amines.smi"],
))

# Best method — no need to specify a preset name
config = get_preset(
    synthesis_pipeline=pipeline,
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv"),
)

sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()
```
## Docking (Minimize Mode)
For scoring functions where lower is better (docking scores, binding energies):
```python
# Import path assumed to match LookupEvaluatorConfig above
from TACTICS.thompson_sampling.core.evaluator_config import FredEvaluatorConfig

config = get_preset(
    synthesis_pipeline=pipeline,
    evaluator_config=FredEvaluatorConfig(design_unit_file="receptor.oedu"),
    mode="minimize",
)
```
## Custom Batch Size
Both `recommended` and `recommended_rws` accept a `batch_size` parameter (default 100). Adjust based on your evaluator speed:
```python
# Slow evaluator — larger batches for better throughput
config = get_preset(
    synthesis_pipeline=pipeline,
    evaluator_config=evaluator,
    batch_size=500,
)

# Fast evaluator — single compound per cycle for tightest feedback loop
config = get_preset(
    synthesis_pipeline=pipeline,
    evaluator_config=evaluator,
    batch_size=1,
)
```
## Understanding the Methods
This section explains why the recommended methods work. Skip this if you just want to run TACTICS — the presets handle everything.
### How Thompson Sampling Works
TACTICS maintains a Bayesian posterior distribution (Normal-Normal conjugate) for each reagent in the combinatorial library. At each iteration:

1. Sample from each reagent's posterior
2. Select reagents to form a compound (using the selection strategy)
3. Evaluate the compound (using the evaluator)
4. Update the posteriors with the observed score
The selection strategy determines how reagents are chosen from the sampled posteriors. This is where the methods differ.
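The four steps above can be sketched end to end on a toy two-component library. Everything here is illustrative: `ReagentPosterior`, `toy_evaluate`, and the greedy per-component selection are stand-ins, not the TACTICS API. The update is the standard Normal-Normal conjugate formula with known observation noise.

```python
import random

random.seed(0)  # reproducible sketch

class ReagentPosterior:
    """Normal-Normal conjugate posterior over one reagent's mean score."""

    def __init__(self, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
        self.mean, self.var, self.obs_var = prior_mean, prior_var, obs_var

    def sample(self):
        # Step 1: draw a plausible mean score from the current belief.
        return random.gauss(self.mean, self.var ** 0.5)

    def update(self, score):
        # Step 4: conjugate update for a Normal likelihood with known variance.
        precision = 1.0 / self.var + 1.0 / self.obs_var
        self.mean = (self.mean / self.var + score / self.obs_var) / precision
        self.var = 1.0 / precision

def toy_evaluate(acid_idx, amine_idx):
    # Step 3 stand-in for a real evaluator (lookup table, docking, ...).
    return -abs(acid_idx - 3) - abs(amine_idx - 7)

acids = [ReagentPosterior() for _ in range(10)]
amines = [ReagentPosterior() for _ in range(10)]
for _ in range(200):
    # Step 2: greedy per-component selection from the sampled posteriors.
    a = max(range(10), key=lambda i: acids[i].sample())
    b = max(range(10), key=lambda j: amines[j].sample())
    score = toy_evaluate(a, b)
    acids[a].update(score)
    amines[b].update(score)

best_acid = max(range(10), key=lambda i: acids[i].mean)
```

The posteriors of frequently selected reagents shrink in variance, concentrating the search on the most promising parts of the library.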
### Top-Two Thompson Sampling (Recommended)
Top-Two TS targets best-arm identification rather than cumulative regret minimization. At each step, it samples posteriors twice:

- The leader is the reagent with the best first sample
- The challenger is the reagent with the best second sample (excluding the leader)
- With probability \(\beta\), the challenger is selected instead of the leader
This mechanism allocates exploration budget toward reagents where posterior uncertainty matters — exactly the reagents whose top-K membership is ambiguous.
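The leader/challenger rule can be sketched as follows. This is an illustrative helper, not a TACTICS function; `posteriors` is assumed to be a list of objects exposing a `sample()` draw.

```python
import random

def top_two_select(posteriors, beta=0.5, rng=random):
    """Return one reagent index chosen by Top-Two Thompson Sampling."""
    n = len(posteriors)
    # First independent draw: the leader is the argmax.
    first = [p.sample() for p in posteriors]
    leader = max(range(n), key=first.__getitem__)
    # Second independent draw: the challenger is the argmax excluding the leader.
    second = [p.sample() for p in posteriors]
    challenger = max((i for i in range(n) if i != leader),
                     key=second.__getitem__)
    # With probability beta, explore the challenger instead of the leader.
    return challenger if rng.random() < beta else leader
```

When two reagents have overlapping posteriors, each is frequently the other's challenger, so the exploration budget concentrates exactly where the ranking is still ambiguous.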
TACTICS adds adaptive per-component thermal cycling on top of TT-TS: posteriors are scaled up (heated) or down (cooled) to control exploration intensity. The adaptive mechanism tracks per-component disagreement rates via exponential moving averages and self-tunes the heated scale, which is critical for imbalanced libraries (e.g., 130 acids × 3,844 amines).
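A minimal sketch of the self-tuning idea, assuming an exponential moving average of leader/challenger disagreement per component. All names and constants here are hypothetical, not TACTICS parameters.

```python
def update_heated_scale(ema, scale, disagreed, alpha=0.1,
                        target=0.3, lr=0.05, lo=1.0, hi=3.0):
    """One illustrative self-tuning step for a component's heated scale.

    `disagreed` is True when the leader and challenger differed this
    cycle. The EMA tracks the disagreement rate; the scale is nudged up
    when disagreement falls below `target` (too little exploration) and
    down when it rises above it.
    """
    ema = (1 - alpha) * ema + alpha * float(disagreed)  # update the rate estimate
    scale += lr * (target - ema)        # more heat if under-exploring
    scale = min(max(scale, lo), hi)     # keep the scale in a sane range
    return ema, scale
```

Because each component carries its own EMA and scale, a small, quickly solved component cools down while a large, uncertain one stays heated.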
### Roulette Wheel Selection / CATS
RWS uses Boltzmann-weighted selection with thermal cycling. Instead of always picking the best-sampled reagent (greedy), it converts sampled posteriors into selection probabilities via a Boltzmann distribution and samples from that.
CATS (Component-Aware Thompson Sampling) adds GMIC-weighted component rotation: instead of cycling through components in round-robin order, it measures each component’s GMIC criticality score and preferentially heats the most “flexible” component (the one where the top-K set is least certain).
This is the mechanism responsible for the largest gains over legacy methods (+4.3 pts on 2-component libraries).
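The Boltzmann roulette wheel can be sketched as follows (an illustrative helper, not the TACTICS API):

```python
import math
import random

def boltzmann_select(sampled_values, temperature=1.0, rng=random):
    """Roulette-wheel pick over Boltzmann weights of sampled posteriors."""
    m = max(sampled_values)  # shift by the max for numerical stability
    weights = [math.exp((v - m) / temperature) for v in sampled_values]
    total = sum(weights)
    # Spin the wheel: draw a point in [0, total) and find its segment.
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1  # guard against floating-point underrun
```

A low temperature approaches greedy argmax; a high temperature flattens the selection probabilities, which is what heating a component does during thermal cycling.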
### GMIC Criticality
GMIC (Generalized Mean Information Coefficient) measures how concentrated or spread the posterior means are for a component’s reagents. A high GMIC indicates a component where a few reagents dominate — this component is “solved” and needs less exploration. A low GMIC indicates a flat landscape where more exploration would help.
TACTICS uses GMIC to allocate thermal cycling budget: components with lower GMIC get heated more often, directing search toward unsolved parts of the chemical space.
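The exact GMIC formula is beyond this guide, but the concentrated-versus-flat intuition can be illustrated with a simple stand-in score: 1 minus the normalized entropy of the softmaxed posterior means. This is not the actual GMIC computation.

```python
import math

def concentration(posterior_means):
    """Illustrative criticality score in [0, 1]: near 1 when a few
    reagents dominate (the component is "solved"), near 0 for a flat,
    unsolved landscape."""
    m = max(posterior_means)
    w = [math.exp(v - m) for v in posterior_means]
    total = sum(w)
    p = [x / total for x in w]
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return 1.0 - entropy / math.log(len(p))
```

Under a GMIC-style allocation, the component with the lowest score would be heated most often, steering exploration toward the unsolved part of the space.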
### Enhanced vs Balanced Warmup
Enhanced warmup (recommended) uses stochastic parallel pairing: all reagents are shuffled and paired exhaustively in each warmup trial. On imbalanced libraries, this naturally over-samples the small component, pre-solving its ranking during warmup. GMIC then detects this differential posterior quality and redirects search to the unsolved large component.
Balanced warmup guarantees exactly K observations per reagent via stratified partner selection. It is useful for isolating the framework’s warmup contribution (Balanced-Greedy provides +1.5 pts over Legacy-Greedy on 2-component libraries), but Enhanced warmup is universally better when paired with GMIC rotation.
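The pairing scheme can be sketched for a two-component library. `enhanced_warmup_pairs` is a hypothetical helper written for illustration, not the TACTICS API.

```python
import random

def enhanced_warmup_pairs(acids, amines, num_trials, rng=random):
    """Stochastic parallel pairing sketch: each trial shuffles both
    reagent lists and zips them, cycling the smaller list so every
    reagent of the larger component appears exactly once per trial.
    The smaller component is therefore over-sampled."""
    big, small = (acids, amines) if len(acids) >= len(amines) else (amines, acids)
    pairs = []
    for _ in range(num_trials):
        b, s = list(big), list(small)
        rng.shuffle(b)
        rng.shuffle(s)
        for i, reagent in enumerate(b):
            pairs.append((reagent, s[i % len(s)]))  # cycle the small component
    return pairs
```

With 130 acids and 3,844 amines, each acid would be paired roughly 30 times per trial while each amine appears once, which is the over-sampling effect that pre-solves the small component's ranking.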
### Boltzmann Weighting
When `use_boltzmann_weighting=True` (enabled in all recommended presets), the posterior update weights observations exponentially: good scores have more influence on the posterior than bad scores. This accelerates convergence toward high-scoring reagents.
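The effect can be illustrated with an exponentially weighted mean. This is a sketch of the weighting idea in maximize mode, not the exact TACTICS update.

```python
import math

def boltzmann_weighted_mean(scores, temperature=1.0):
    """Exponentially weighted mean of observed scores: high scores get
    exponentially larger weights, so the estimate is pulled toward the
    best observations faster than a plain average would be."""
    m = max(scores)  # shift by the max for numerical stability
    weights = [math.exp((s - m) / temperature) for s in scores]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

For example, `boltzmann_weighted_mean([0.0, 10.0], temperature=0.1)` is close to 10, whereas the plain mean of the same observations is 5.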
## Custom Configuration
For full control, construct a `ThompsonSamplingConfig` directly:
```python
from TACTICS.thompson_sampling import ThompsonSamplingConfig, ThompsonSampler
from TACTICS.thompson_sampling.strategies.config import TopTwoConfig
from TACTICS.thompson_sampling.warmup.config import EnhancedWarmupConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,
    num_ts_iterations=5000,
    num_warmup_trials=5,
    strategy_config=TopTwoConfig(
        mode="maximize",
        beta=0.5,                    # Challenger selection probability
        heated_scale=1.5,            # Posterior inflation for heated component
        cooled_scale=0.75,           # Posterior deflation for cooled component
        adaptive_disagreement=True,  # Per-component adaptive scaling
    ),
    warmup_config=EnhancedWarmupConfig(),
    evaluator_config=LookupEvaluatorConfig(
        ref_filename="scores.csv",
        score_col="binding_affinity",
    ),
    batch_size=100,
    use_boltzmann_weighting=True,
)

sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()
```
See the Configuration System reference for all available parameters.
## Selection Strategies Reference
All strategies inherit from `SelectionStrategy` and are interchangeable via the `strategy_config` parameter.
| Strategy | Tier | Description |
|---|---|---|
| Top-Two TS | Recommended | Top-Two TS with adaptive thermal cycling and GMIC rotation. Best overall (86.1% recovery). |
| RWS / CATS | Recommended | CATS with Boltzmann thermal cycling and GMIC rotation. Close second (85.5% recovery). |
| Greedy | Baseline | Pure argmax. Simplest strategy — sample posteriors, pick the best. |
| UCB | Baseline | Upper Confidence Bound. Consistently outperformed by TopTwo/RWS. |
| Epsilon-greedy | Baseline | Epsilon-greedy with decay. Consistently outperformed by TopTwo/RWS. |
| Bayesian UCB | Baseline | Bayesian UCB with Student-t quantiles. Consistently outperformed by TopTwo/RWS. |
See the Thompson Sampling reference for detailed parameter documentation for each strategy.
## Warmup Strategies Reference
| Strategy | Tier | Description |
|---|---|---|
| Enhanced | Recommended | Stochastic parallel pairing. Universally optimal — used by all recommended presets. |
| Balanced | Alternative | Exactly K observations per reagent. Useful for controlled experiments isolating warmup contribution. |
| Legacy | Baseline | Random partner selection. For comparison only. |