API Reference

OptimizerClient

class kubeflow.optimizer.OptimizerClient(backend_config: KubernetesBackendConfig | None = None)[source]

Bases: object

__init__(backend_config: KubernetesBackendConfig | None = None)[source]

Initialize a Kubeflow Optimizer client.

Parameters:

backend_config (KubernetesBackendConfig | None) – Backend configuration. Pass a KubernetesBackendConfig, or None to use a default KubernetesBackendConfig.

Raises:

ValueError – Invalid backend configuration.

optimize(trial_template: TrainJobTemplate, *, trial_config: TrialConfig | None = None, search_space: dict[str, Any], objectives: list[Objective] | None = None, algorithm: BaseAlgorithm | None = None) str[source]

Create an OptimizationJob for hyperparameter tuning.

Parameters:
  • trial_template (TrainJobTemplate) – The TrainJob template defining the training script.

  • trial_config (TrialConfig | None) – Optional configuration to run Trials.

  • objectives (list[Objective] | None) – List of objectives to optimize.

  • search_space (dict[str, Any]) – Dictionary mapping parameter names to Search specifications using Search.uniform(), Search.loguniform(), Search.choice(), etc.

  • algorithm (BaseAlgorithm | None) – The optimization algorithm to use. Defaults to RandomSearch.

Returns:

The unique name of the Experiment that has been generated.
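A minimal end-to-end sketch of how these arguments fit together, assuming the Kubeflow SDK is installed and a cluster is reachable. The helper name run_sweep, the hyperparameter names learning_rate and batch_size, and the trial counts are illustrative assumptions, not part of the API. The trial_template is taken as an argument because building a TrainJobTemplate is outside the scope of this section.

```python
from typing import Any


def run_sweep(trial_template: Any) -> tuple[str, Any]:
    """Submit a sweep, wait for it to finish, and return (job name, best result)."""
    # Lazy import: keeps this sketch loadable on machines without the SDK.
    from kubeflow.optimizer import Objective, OptimizerClient, Search, TrialConfig

    client = OptimizerClient()  # no backend_config given, so the default is used

    job_name = client.optimize(
        trial_template,
        # search_space is keyword-only; the parameter names are hypothetical.
        search_space={
            "learning_rate": Search.loguniform(1e-5, 1e-1),
            "batch_size": Search.choice([16, 32, 64]),
        },
        trial_config=TrialConfig(num_trials=20, parallel_trials=4),
        objectives=[Objective(metric="loss")],  # direction defaults to minimize
    )

    # With the default status set this blocks until the job is Complete,
    # or raises if the job fails or the timeout expires.
    client.wait_for_job_status(name=job_name)
    return job_name, client.get_best_results(name=job_name)
```

The second element of the returned tuple can be None if no best trial is available.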

Raises:

list_jobs() list[OptimizationJob][source]

List the created OptimizationJobs.

Returns:

List of created OptimizationJobs. If no OptimizationJob exists, an empty list is returned.

Raises:

get_job(name: str) OptimizationJob[source]

Get the OptimizationJob object.

Parameters:

name (str) – Name of the OptimizationJob.

Returns:

An OptimizationJob object.

Raises:

get_job_logs(name: str, trial_name: str | None = None, follow: bool = False) Iterator[str][source]

Get logs from a specific trial of an OptimizationJob.

You can watch for the logs in realtime as follows:

```python
from kubeflow.optimizer import OptimizerClient

# Get logs from the best current trial
for logline in OptimizerClient().get_job_logs(name="n7fb28dbee94"):
    print(logline)

# Get logs from a specific trial
for logline in OptimizerClient().get_job_logs(
    name="n7fb28dbee94", trial_name="n7fb28dbee94-abc123", follow=True
):
    print(logline)
```

Parameters:
  • name (str) – Name of the OptimizationJob.

  • trial_name (str | None) – Optional name of a specific Trial. If not provided, logs from the current best trial are returned. If no best trial is available yet, logs from the first trial are returned.

  • follow (bool) – Whether to stream logs in realtime as they are produced.

Returns:

Iterator of log lines.

Raises:

get_best_results(name: str) Result | None[source]

Get the best hyperparameters and metrics from an OptimizationJob.

This method retrieves the optimal hyperparameters and their corresponding metrics from the best trial found during the optimization process.

Parameters:

name (str) – Name of the OptimizationJob.

Returns:

A Result object containing the best hyperparameters and metrics, or None if no best trial is available yet.
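Because the return value can be None, callers need to handle the case where no best trial exists yet. A small sketch, assuming client is an OptimizerClient (or anything with the same get_best_results method); the helper name summarize_best is hypothetical, and the Result object is only formatted as a whole because its fields are not described in this section.

```python
from typing import Any


def summarize_best(client: Any, name: str) -> str:
    """Return a one-line summary of the best result for an OptimizationJob."""
    best = client.get_best_results(name=name)
    if best is None:
        # No trial has been marked as best yet, for example early in the run.
        return f"OptimizationJob {name!r} has no best trial yet"
    return f"Best result for OptimizationJob {name!r}: {best}"
```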

Raises:

wait_for_job_status(name: str, status: set[str] = {'Complete'}, timeout: int = 3600, polling_interval: int = 2, callbacks: list[Callable[[OptimizationJob], None]] | None = None) OptimizationJob[source]

Wait for an OptimizationJob to reach a desired status.

Parameters:
  • name (str) – Name of the OptimizationJob.

  • status (set[str]) – Expected statuses. Must be a subset of Created, Running, Complete, and Failed statuses.

  • timeout (int) – Maximum number of seconds to wait for the OptimizationJob to reach one of the expected statuses.

  • polling_interval (int) – The polling interval in seconds to check OptimizationJob status.

  • callbacks (list[Callable[[OptimizationJob], None]] | None) – Optional list of callback functions to be invoked after each polling interval. Each callback should accept a single argument: the OptimizationJob object.

Returns:

The OptimizationJob object once it has reached one of the expected statuses.

Raises:
  • ValueError – The input values are incorrect.

  • RuntimeError – Failed to get the OptimizationJob, or the OptimizationJob unexpectedly reached the Failed status.

  • TimeoutError – Timed out waiting for the OptimizationJob to reach an expected status.
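The polling behaviour described above can be illustrated with a standalone sketch. This is not the library's implementation: the function name wait_for_status, the get_status callable, and the choice to pass the status string (rather than an OptimizationJob) to callbacks are simplifying assumptions made so the example runs without a cluster.

```python
import time
from collections.abc import Callable, Iterable, Set

VALID_STATUSES = frozenset({"Created", "Running", "Complete", "Failed"})


def wait_for_status(
    get_status: Callable[[], str],
    status: Set[str] = frozenset({"Complete"}),
    timeout: float = 3600,
    polling_interval: float = 2,
    callbacks: Iterable[Callable[[str], None]] = (),
) -> str:
    """Poll get_status() until it returns one of the expected statuses."""
    if not status or not status <= VALID_STATUSES:
        raise ValueError(f"status must be a non-empty subset of {sorted(VALID_STATUSES)}")

    deadline = time.monotonic() + timeout
    while True:
        current = get_status()
        for callback in callbacks:
            callback(current)  # invoked once per poll
        if current in status:
            return current
        if current == "Failed":
            # Failed was not among the expected statuses, so give up early.
            raise RuntimeError("job reached unexpected Failed status")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not reach {sorted(status)} within {timeout}s")
        time.sleep(polling_interval)
```

Including "Failed" in status makes the sketch return on failure instead of raising, which matches the "unexpected Failed status" wording above.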

delete_job(name: str)[source]

Delete the OptimizationJob.

Parameters:

name (str) – Name of the OptimizationJob.

Raises:

get_job_events(name: str) list[Event][source]

Get events for an OptimizationJob.

This provides additional clarity about the state of the OptimizationJob when logs alone are not sufficient. Events include information about trial state changes, errors, and other significant occurrences.

Parameters:

name (str) – Name of the OptimizationJob.

Returns:

A list of Event objects associated with the OptimizationJob.

Raises:

Search Space

class kubeflow.optimizer.Search[source]

Bases: object

Helper class for defining search space parameters.

static uniform(min: float, max: float) V1beta1ParameterSpec[source]

Sample a float value uniformly between min and max.

Parameters:
  • min (float) – Lower boundary for the float value.

  • max (float) – Upper boundary for the float value.

Returns:

Katib ParameterSpec object.

static loguniform(min: float, max: float) V1beta1ParameterSpec[source]

Sample a float value with log-uniform distribution between min and max.

Parameters:
  • min (float) – Lower boundary for the float value.

  • max (float) – Upper boundary for the float value.

Returns:

Katib ParameterSpec object.
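To make the difference between the two distributions concrete, the following standalone sketch draws samples with the Python standard library. It only illustrates what "uniform" and "log-uniform" mean; it is not how Search or Katib generate trial values, and the function names are hypothetical.

```python
import math
import random


def sample_uniform(low: float, high: float, rng: random.Random) -> float:
    """Every value in [low, high] is equally likely."""
    return rng.uniform(low, high)


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Every order of magnitude in [low, high] is equally likely."""
    if not 0 < low <= high:
        raise ValueError("log-uniform sampling requires 0 < low <= high")
    # Sample uniformly in log space, then map back with exp.
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
uniform_draws = [sample_uniform(1e-5, 1e-1, rng) for _ in range(10_000)]
log_draws = [sample_loguniform(1e-5, 1e-1, rng) for _ in range(10_000)]

# Share of draws below 1e-3, the geometric midpoint of the range.
uniform_share = sum(x < 1e-3 for x in uniform_draws) / len(uniform_draws)
log_share = sum(x < 1e-3 for x in log_draws) / len(log_draws)
print(f"uniform: {uniform_share:.2%} below 1e-3, log-uniform: {log_share:.2%} below 1e-3")
```

This is why a log-uniform range is the usual choice for parameters such as learning rates that span several orders of magnitude: a plain uniform range would place almost all samples near the upper bound.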

static choice(values: list) V1beta1ParameterSpec[source]

Sample a categorical value from the list.

Parameters:

values (list) – List of categorical values.

Returns:

Katib ParameterSpec object.

Configuration

class kubeflow.optimizer.TrialConfig(num_trials: int = 10, parallel_trials: int = 1, max_failed_trials: int | None = None) None[source]

Bases: object

Trial configuration for hyperparameter optimization.

Parameters:
  • num_trials (int) – Number of trials to run. Defaults to 10.

  • parallel_trials (int) – Number of trials to run in parallel. Defaults to 1.

  • max_failed_trials (int | None) – Maximum number of failed trials before stopping. Defaults to None.

num_trials: int = 10
parallel_trials: int = 1
max_failed_trials: int | None = None

class kubeflow.optimizer.Objective(metric: str = 'loss', direction: Direction = Direction.MINIMIZE) None[source]

Bases: object

Objective configuration for hyperparameter optimization.

Parameters:
  • metric (str) – The name of the metric to optimize. Defaults to 'loss'.

  • direction (Direction) – Whether to maximize or minimize the metric. Defaults to Direction.MINIMIZE.

metric: str = 'loss'
direction: Direction = 'minimize'