Introduction
Benchmarking black-box optimizers is deceptively hard in practice: the optimizer might be a small, modern Python package, while the benchmarks you want to evaluate on often come with heavyweight (and sometimes conflicting) dependencies, pinned Python versions, external simulators, environment variables, or even “bitrotted” setup instructions. Getting a fair, reproducible comparison across a diverse benchmark set quickly turns into dependency-management work rather than optimization research.
Bencher (also see the paper) was created to make this workflow boring again. The key idea is to decouple benchmark execution from optimization logic via a simple client–server architecture:
- Benchmarks run inside a Bencher server and are isolated from each other (each benchmark, or compatible benchmark group, lives in its own Python environment).
- Your optimizer talks to the server via a stable RPC interface (gRPC), so your project only needs a lightweight client dependency.
- The server is easy to deploy locally via Docker and can also be run on HPC systems via Singularity/Apptainer (typically as a background instance), which makes experiments much more reproducible across machines and clusters.
In this post, we’ll integrate Bencher into an existing optimizer codebase (TuRBO) so that TuRBO can evaluate points on Bencher’s benchmark suite without pulling benchmark dependencies into TuRBO’s environment. Concretely, we will:
- Clone TuRBO and create a small runnable main.py.
- Start the Bencher server (Docker) and connect to it from Python.
- Wrap Bencher benchmarks as a callable objective function that TuRBO can optimize.
- (Optional) Package everything for HPC execution with Apptainer, keeping the Bencher service running in the background while the optimization runs.
With that motivation in place, let’s start from a clean TuRBO checkout and build up the integration step by step.
git clone https://github.com/uber-research/TuRBO.git
cd TuRBO
Create a virtual environment and install dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
From here on, assume all commands run inside this virtual environment.
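A quick sanity check that the install worked is to import TuRBO's main class from the repository root (a minimal check; if it fails, one of the dependencies from requirements.txt did not install correctly):

# check_install.py -- run from the TuRBO repository root, inside the venv
from turbo import Turbo1  # fails if e.g. torch or gpytorch is missing

print("TuRBO is importable:", Turbo1.__name__)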
Next, add a main.py file in the project root (based largely on the official TuRBO example notebook):
from turbo import Turbo1
import numpy as np
class Levy:
def __init__(self, dim=10):
self.dim = dim
self.lb = -5 * np.ones(dim)
self.ub = 10 * np.ones(dim)
def __call__(self, x):
assert x.ndim == 1
assert len(x) == self.dim
assert np.all(x <= self.ub) and np.all(x >= self.lb)
w = 1 + (x - 1.0) / 4.0
val = (
np.sin(np.pi * w[0]) ** 2
+ np.sum(
(w[1 : self.dim - 1] - 1) ** 2
* (1 + 10 * np.sin(np.pi * w[1 : self.dim - 1] + 1) ** 2)
)
+ (w[self.dim - 1] - 1) ** 2
* (1 + np.sin(2 * np.pi * w[self.dim - 1]) ** 2)
)
return val
def main():
f = Levy(dim=10)
turbo1 = Turbo1(
f=f, # objective function
lb=f.lb, # lower bounds (numpy array)
ub=f.ub, # upper bounds (numpy array)
n_init=20, # initial points (Latin hypercube)
max_evals=1000, # maximum number of function evaluations
batch_size=10, # batch size
verbose=True, # print progress
use_ard=True, # ARD kernel for GP
max_cholesky_size=2000,
n_training_steps=50,
min_cuda=1024, # run on CPU for small datasets
device="cpu", # "cpu" or "cuda"
dtype="float64", # "float64" or "float32"
)
turbo1.optimize()
if __name__ == "__main__":
main()
Adding Bencher
Running the Bencher container
Pull the Bencher Docker image from Docker Hub:
docker pull gaunab/bencher:latest
On non-amd64 machines (e.g., Apple Silicon), pull the amd64 variant explicitly:
docker pull --platform linux/amd64 gaunab/bencher:latest
Start the container in the background and expose the gRPC port:
docker run -d --restart always -p 50051:50051 gaunab/bencher:latest
This maps container port 50051 to localhost:50051 on the host. If 50051 is already taken, map a different host port (e.g., -p 50052:50051) and adjust the client configuration accordingly.
Verify it is running:
docker ps
Example output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e971fa532d01 gaunab/bencher:latest "python3.11 /entrypo…" 55 minutes ago Up 55 minutes 0.0.0.0:50051->50051/tcp, [::]:50051->50051/tcp gifted_chaum
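You can also confirm that the gRPC port is reachable from Python before wiring up the optimizer. This is only a TCP-level check (it verifies the port mapping, not the Bencher API itself):

import socket

# Adjust host/port if you mapped a different host port (e.g., 50052).
with socket.create_connection(("localhost", 50051), timeout=5):
    print("Bencher port is reachable")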
Adding Bencher benchmarks to TuRBO
Next, add bencherscaffold as a dependency (e.g., in setup.py). Then install the project package locally:
pip install .
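For reference, the relevant part of setup.py might look like the following sketch (TuRBO's actual setup.py lists additional packages and metadata, which should be kept as-is):

from setuptools import setup, find_packages

setup(
    name="turbo",
    packages=find_packages(),
    install_requires=[
        # ... TuRBO's existing dependencies ...
        "bencherscaffold",  # Bencher gRPC client
    ],
)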
Now update main.py to evaluate TuRBO’s objective function through Bencher:
import numpy as np
from bencherscaffold.client import BencherClient
from bencherscaffold.protoclasses.bencher_pb2 import Value, ValueType
from turbo import Turbo1
BENCHER_BENCHMARK_DIMS = {
# Add more benchmarks here as needed
"lasso-dna": 180,
"mopta08": 124,
"svm": 388,
"rover": 60,
}
class BencherObjective:
def __init__(self, client: BencherClient, benchmark_name: str):
if benchmark_name not in BENCHER_BENCHMARK_DIMS:
raise ValueError(
f"Unknown benchmark '{benchmark_name}'. "
f"Available: {sorted(BENCHER_BENCHMARK_DIMS)}"
)
self.benchmark_name = benchmark_name
self.dim = BENCHER_BENCHMARK_DIMS[benchmark_name]
self.lb = np.zeros(self.dim)
self.ub = np.ones(self.dim)
self.client = client
self._Value = Value
self._ValueType = ValueType
def __call__(self, x: np.ndarray) -> float:
assert isinstance(x, np.ndarray), "x must be a numpy array"
assert x.ndim == 1, "x must be 1D"
assert x.size == self.dim, f"x must have length {self.dim}"
assert np.all((0.0 <= x) & (x <= 1.0)), "x must be in [0, 1]"
# Convert x into Bencher Value protos
point = [
self._Value(type=self._ValueType.CONTINUOUS, value=float(v))
for v in x
]
# Evaluate
res = self.client.evaluate_point(
benchmark_name=self.benchmark_name,
point=point,
)
return float(res)
def main():
client = BencherClient(hostname="localhost", port=50051) # adjust if needed
f = BencherObjective(client=client, benchmark_name="mopta08")
turbo1 = Turbo1(
f=f,
lb=f.lb,
ub=f.ub,
# keep / tune the remaining parameters as desired
n_init=20,
max_evals=1000,
batch_size=10,
verbose=True,
use_ard=True,
max_cholesky_size=2000,
n_training_steps=50,
min_cuda=1024,
device="cpu",
dtype="float64",
)
turbo1.optimize()
if __name__ == "__main__":
main()
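Before launching a full run, it can help to evaluate a single point by hand and confirm that the client actually reaches the server. A quick interactive check, assuming the classes above are importable from main.py:

import numpy as np
from bencherscaffold.client import BencherClient
from main import BencherObjective

client = BencherClient(hostname="localhost", port=50051)
f = BencherObjective(client=client, benchmark_name="mopta08")
x0 = np.full(f.dim, 0.5)  # midpoint of the unit cube
print(f"f(x0) = {f(x0):.4f}")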
Running on an HPC cluster with Apptainer
On many HPC systems Docker isn’t available, but Apptainer/Singularity is. A practical pattern is:
- Build an Apptainer image based on the Bencher Docker image.
- Start Bencher as a background service via an Apptainer instance.
- Run your optimizer (which connects to Bencher over localhost inside the same instance).
Create an Apptainer definition file that inherits from the Bencher image:
Bootstrap: docker
From: gaunab/bencher:latest
%environment
export LANG=C.UTF-8
export PATH="/root/.local/bin:$PATH"
%files
./turbo /opt/TuRBO/turbo
./main.py /opt/TuRBO/main.py
./setup.py /opt/TuRBO/setup.py
%post
set -e
apt-get update -y
apt-get install -y --no-install-recommends python3-venv build-essential
python3 -m venv /opt/venv
. /opt/venv/bin/activate
pip install --upgrade pip
cd /opt/TuRBO
pip install .
# Reduce image size
apt-get purge -y --auto-remove build-essential
rm -rf /var/lib/apt/lists/*
rm -rf /root/.cache/pip
%startscript
echo "Starting Bencher service..."
# Keep Bencher running in the instance.
exec python3 /entrypoint.py
%runscript
cd /opt/TuRBO
exec /opt/venv/bin/python /opt/TuRBO/main.py "$@"
Save this as TuRBO.def.
Key points:
- From: gaunab/bencher:latest ensures the container includes the Bencher service.
- The %startscript launches Bencher when the instance starts (so it stays up while you run experiments).
- The %runscript executes your optimizer, passing through any CLI arguments.
Typical workflow on the cluster
Build the image:
apptainer build TuRBO.sif TuRBO.def
Start an instance (this triggers %startscript and keeps Bencher running):
apptainer instance start TuRBO.sif TuRBO
Run the optimizer inside the instance:
apptainer run instance://TuRBO
Because %runscript uses "$@", you can pass command-line arguments through to main.py (if you add argument parsing).
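If you want to select the benchmark or evaluation budget from the command line, a small argparse layer in main() is enough. The argument names below are illustrative, not part of TuRBO or Bencher:

import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Run TuRBO on a Bencher benchmark")
    parser.add_argument("--benchmark", default="mopta08",
                        choices=sorted(BENCHER_BENCHMARK_DIMS))
    parser.add_argument("--max-evals", type=int, default=1000)
    return parser.parse_args()

# In main():
#   args = parse_args()
#   f = BencherObjective(client=client, benchmark_name=args.benchmark)
#   turbo1 = Turbo1(f=f, ..., max_evals=args.max_evals, ...)

Arguments then pass straight through the instance, e.g. apptainer run instance://TuRBO --benchmark svm --max-evals 2000.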