Search Algorithms

Choose how Kubeflow searches for the best hyperparameters.

Overview

Different algorithms have different trade-offs:

| Algorithm | Best For | Trade-off |
|---|---|---|
| Random Search | Quick exploration, simple problems | May miss optimal regions |
| Bayesian Optimization | Expensive training, small budgets | More overhead per trial |
| Grid Search | Exhaustive search, few parameters | Doesn’t scale well |
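To see why grid search stops scaling, it helps to count trials. The sketch below uses a hypothetical search space (the parameter names and values are illustrative and not part of the Kubeflow API) and plain Python rather than any Kubeflow call:

```python
from itertools import product
from math import prod

# Hypothetical search space: each hyperparameter has a few candidate values.
grid = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
    "dropout": [0.0, 0.1, 0.3],
}


def grid_size(space):
    """Number of trials an exhaustive grid search needs for this space."""
    return prod(len(values) for values in space.values())


def grid_points(space):
    """Yield every combination of values as a dict, one per trial."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


print(grid_size(grid))  # 3 * 3 * 3 = 27 trials

# Two more hyperparameters with 3 values each multiply the cost by 9.
bigger = {
    **grid,
    "weight_decay": [0.0, 1e-4, 1e-2],
    "warmup_steps": [0, 100, 500],
}
print(grid_size(bigger))  # 3 ** 5 = 243 trials
```

The trial count is the product of the number of values per hyperparameter, so it grows exponentially with the number of hyperparameters. This is why grid search is listed only for cases with few parameters.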

Random Search (Default)

Randomly samples hyperparameters from the search space:

from kubeflow.optimizer.types import RandomSearch

# `client`, `template`, and `search_space` are assumed to be defined earlier.
client.optimize(
    trial_template=template,
    search_space=search_space,
    algorithm=RandomSearch(),
)

When to use:

  • You have a large search space

  • Training is relatively fast

  • You want a simple baseline

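Pros: Simple, parallelizes well because trials are independent of each other, and often competitive with more sophisticated methods when only a few hyperparameters matter.

Cons: Does not learn from earlier trials, so it may miss optimal regions.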
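As a rough illustration of what random sampling from a search space looks like, here is a minimal pure-Python sketch. The `SPACE` layout, the `sample` helper, and the parameter names are hypothetical and do not reflect how Kubeflow represents search spaces internally:

```python
import math
import random

# Hypothetical search space: (low, high, scale) per hyperparameter.
SPACE = {
    "learning_rate": (1e-5, 1e-1, "log"),
    "dropout": (0.0, 0.5, "linear"),
}


def sample(space, rng):
    """Draw one configuration; every draw is independent of earlier ones."""
    config = {}
    for name, (low, high, scale) in space.items():
        if scale == "log":
            # Sample uniformly in log space so each order of magnitude
            # is equally likely.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            config[name] = rng.uniform(low, high)
    return config


rng = random.Random(0)  # fixed seed so the draws are reproducible

# No trial depends on another trial's result, so all configurations can be
# drawn up front and dispatched in parallel.
trials = [sample(SPACE, rng) for _ in range(20)]
print(trials[0])
```

Because no configuration depends on an earlier result, the whole batch can run concurrently. The same independence is why nothing is learned between trials.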

Bayesian Optimization

Fits a probabilistic model to the results of completed trials and uses it to choose which hyperparameters to try next:

from kubeflow.optimizer.types import BayesianOptimization

# `client`, `template`, and `search_space` are assumed to be defined earlier.
client.optimize(
    trial_template=template,
    search_space=search_space,
    algorithm=BayesianOptimization(),
)

When to use:

  • Training is expensive (hours per run)

  • You have a limited compute budget

  • You want to minimize the number of trials

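Pros: Learns from previous trials, so it typically needs fewer trials than random search to reach a good configuration.

Cons: Parallelizes less well because each suggestion depends on earlier results; more complex to configure and reason about.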
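To make the idea of a model-guided search concrete, the following is a toy one-dimensional sketch: a Gaussian-process surrogate with an RBF kernel and a lower-confidence-bound rule for picking the next point. It is an assumption-laden illustration in pure Python, not the algorithm Kubeflow runs. The `objective` function, the kernel length scale, and the exploration weight of 2.0 are all made up for the example.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel: nearby inputs get correlated predictions."""
    return math.exp(-((a - b) ** 2) / (2 * length**2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def posterior(xs, ys, x_new, noise=1e-4):
    """Gaussian-process posterior mean and standard deviation at x_new."""
    offset = sum(ys) / len(ys)  # constant prior mean: average observed loss
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    weights = solve(K, [y - offset for y in ys])
    mean = offset + sum(ki * wi for ki, wi in zip(k, weights))
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))


def objective(x):
    """Stand-in for an expensive training run; returns a loss to minimize."""
    return (x - 0.3) ** 2


candidates = [i / 100 for i in range(101)]
xs = [0.0, 1.0]  # two initial trials
ys = [objective(x) for x in xs]

for _ in range(8):
    untried = [c for c in candidates if c not in xs]

    def lcb(c):
        # Lower confidence bound: favors low predicted loss or high uncertainty.
        mean, std = posterior(xs, ys, c)
        return mean - 2.0 * std

    x_next = min(untried, key=lcb)
    xs.append(x_next)
    ys.append(objective(x_next))  # the next suggestion needs this result first

best_y, best_x = min(zip(ys, xs))
print(best_x, best_y)
```

Each pass through the loop refits the model on every result so far before proposing the next point. That refitting is the extra per-trial overhead, and it is why suggestions cannot all be generated up front the way random samples can.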

Algorithm Recommendations

| Scenario | Recommended Algorithm |
|---|---|
| First exploration of a new model | Random Search with 20-50 trials |
| Training takes hours per run | Bayesian Optimization |
| Only 2-3 hyperparameters to tune | Grid Search |
| Large compute budget available | Random Search with many parallel trials |
| Need to find good config quickly | Bayesian Optimization with early stopping |
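A quick back-of-the-envelope check shows why a few dozen random trials is a reasonable first budget. The sketch assumes trials are sampled uniformly and independently, and that "good" configurations occupy 5% of the search space. The 5% figure is an illustrative assumption, not a value from Kubeflow.

```python
def hit_probability(n_trials, good_fraction=0.05):
    """Chance that at least one of n independent uniform samples is 'good'."""
    return 1.0 - (1.0 - good_fraction) ** n_trials


for n in (20, 50):
    print(n, round(hit_probability(n), 3))
# 20 0.642
# 50 0.923
```

Under these assumptions, 20 trials find a good configuration about 64% of the time and 50 trials about 92% of the time. If the good region is smaller than 5%, more trials or a model-guided method is needed.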