API Reference
OptimizerClient
- class kubeflow.optimizer.OptimizerClient(backend_config: KubernetesBackendConfig | None = None)
Bases: object
- __init__(backend_config: KubernetesBackendConfig | None = None)
Initialize a Kubeflow Optimizer client.
- Parameters:
backend_config (KubernetesBackendConfig | None) – Backend configuration. Either KubernetesBackendConfig or None to use default config class. Defaults to KubernetesBackendConfig.
- Raises:
ValueError – Invalid backend configuration.
- optimize(trial_template: TrainJobTemplate, *, trial_config: TrialConfig | None = None, search_space: dict[str, Any], objectives: list[Objective] | None = None, algorithm: BaseAlgorithm | None = None) -> str
Create an OptimizationJob for hyperparameter tuning.
- Parameters:
trial_template (TrainJobTemplate) – The TrainJob template defining the training script.
trial_config (TrialConfig | None) – Optional configuration to run Trials.
objectives (list[Objective] | None) – List of objectives to optimize.
search_space (dict[str, Any]) – Dictionary mapping parameter names to Search specifications using Search.uniform(), Search.loguniform(), Search.choice(), etc.
algorithm (BaseAlgorithm | None) – The optimization algorithm to use. Defaults to RandomSearch.
- Returns:
The unique name of the Experiment that has been generated.
- Raises:
ValueError – Input arguments are invalid.
TimeoutError – Timeout to create Experiment.
RuntimeError – Failed to create Experiment.
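The sketch below chains the documented calls into one submit-and-wait flow. The helper name `run_search` is hypothetical. The client, template, and search space are passed in by the caller, so the snippet relies only on the method signatures listed on this page and not on any particular cluster setup.

```python
from typing import Any


def run_search(client: Any, trial_template: Any, search_space: dict[str, Any]) -> Any:
    """Hypothetical helper: submit a job, wait for it, and return the best result.

    `client` is expected to behave like OptimizerClient.
    """
    # optimize() returns the generated unique name; search_space is keyword-only.
    name = client.optimize(trial_template, search_space=search_space)
    # Block until the job reaches the default {"Complete"} status.
    client.wait_for_job_status(name)
    # The result may be None when no best trial is available.
    return client.get_best_results(name)
```

Passing an `OptimizerClient()` instance as `client` would run this flow against a real cluster. Errors raised by the individual calls propagate unchanged to the caller.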
- list_jobs() -> list[OptimizationJob]
List the created OptimizationJobs.
- Returns:
List of created OptimizationJobs. If no OptimizationJob exists, an empty list is returned.
- Raises:
TimeoutError – Timeout to list OptimizationJobs.
RuntimeError – Failed to list OptimizationJobs.
- get_job(name: str) -> OptimizationJob
Get the OptimizationJob object.
- Parameters:
name (str) – Name of the OptimizationJob.
- Returns:
An OptimizationJob object.
- Raises:
TimeoutError – Timeout to get an OptimizationJob.
RuntimeError – Failed to get an OptimizationJob.
- get_job_logs(name: str, trial_name: str | None = None, follow: bool = False) -> Iterator[str]
Get logs from a specific trial of an OptimizationJob.
You can watch for the logs in realtime as follows:
```python
from kubeflow.optimizer import OptimizerClient

# Get logs from the current best trial.
for logline in OptimizerClient().get_job_logs(name="n7fb28dbee94"):
    print(logline)

# Stream logs from a specific trial as they are produced.
for logline in OptimizerClient().get_job_logs(
    name="n7fb28dbee94", trial_name="n7fb28dbee94-abc123", follow=True
):
    print(logline)
```
- Parameters:
name (str) – Name of the OptimizationJob.
trial_name (str | None) – Optional name of a specific Trial. If not provided, logs from the current best trial are returned. If no best trial is available yet, logs from the first trial are returned.
follow (bool) – Whether to stream logs in realtime as they are produced.
- Returns:
Iterator of log lines.
- Raises:
TimeoutError – Timeout to get an OptimizationJob.
RuntimeError – Failed to get an OptimizationJob.
- get_best_results(name: str) -> Result | None
Get the best hyperparameters and metrics from an OptimizationJob.
This method retrieves the optimal hyperparameters and their corresponding metrics from the best trial found during the optimization process.
- Parameters:
name (str) – Name of the OptimizationJob.
- Returns:
A Result object containing the best hyperparameters and metrics, or None if no best trial is available yet.
- Raises:
TimeoutError – Timeout to get an OptimizationJob.
RuntimeError – Failed to get an OptimizationJob.
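Because the return value can be None, callers need to handle the case where no best trial is available yet. A minimal sketch follows. The helper name `describe_best` is hypothetical, and the Result is only printed with `repr()` because its fields are not described on this page.

```python
from typing import Any


def describe_best(client: Any, name: str) -> str:
    """Hypothetical helper: summarize the best result of an OptimizationJob.

    `client` is expected to behave like OptimizerClient.
    """
    result = client.get_best_results(name)
    if result is None:
        # No best trial is available yet, as documented under Returns.
        return f"{name}: no best trial available yet"
    return f"{name}: best result {result!r}"
```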
- wait_for_job_status(name: str, status: set[str] = {'Complete'}, timeout: int = 3600, polling_interval: int = 2, callbacks: list[Callable[[OptimizationJob], None]] | None = None) -> OptimizationJob
Wait for an OptimizationJob to reach a desired status.
- Parameters:
name (str) – Name of the OptimizationJob.
status (set[str]) – Expected statuses. Must be a subset of Created, Running, Complete, and Failed statuses.
timeout (int) – Maximum number of seconds to wait for the OptimizationJob to reach one of the expected statuses.
polling_interval (int) – The polling interval in seconds to check OptimizationJob status.
callbacks (list[Callable[[OptimizationJob], None]] | None) – Optional list of callback functions to be invoked after each polling interval. Each callback should accept a single argument: the OptimizationJob object.
- Returns:
An OptimizationJob object that reaches the desired status.
- Raises:
ValueError – The input values are incorrect.
RuntimeError – Failed to get OptimizationJob or OptimizationJob reaches unexpected Failed status.
TimeoutError – Timeout to wait for OptimizationJob status.
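The standalone sketch below shows how a polling wait with these parameters could behave. It is a simplified illustration of the documented contract, not the library's implementation. The function name `wait_for_status`, the injectable `get_job` and `sleep` arguments, and the `status` attribute on the job object are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Iterable, Optional

VALID_STATUSES = {"Created", "Running", "Complete", "Failed"}


def wait_for_status(
    get_job: Callable[[str], Any],
    name: str,
    status: Iterable[str] = ("Complete",),
    timeout: int = 3600,
    polling_interval: int = 2,
    callbacks: Optional[list] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll get_job(name) until the job's status is one of `status`."""
    expected = set(status)
    if not expected <= VALID_STATUSES:
        raise ValueError(f"status must be a subset of {sorted(VALID_STATUSES)}")

    deadline = time.monotonic() + timeout
    while True:
        job = get_job(name)
        # Each callback receives the job object fetched by this poll.
        for callback in callbacks or []:
            callback(job)
        if job.status in expected:
            return job
        # Failed is an error unless the caller explicitly asked to wait for it.
        if job.status == "Failed":
            raise RuntimeError(f"{name} reached unexpected Failed status")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timed out waiting for {name}")
        sleep(polling_interval)
```

With the real client, a callback such as `lambda job: print(job)` passed through `callbacks` gives a simple progress log on every poll.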
- delete_job(name: str)
Delete the OptimizationJob.
- Parameters:
name (str) – Name of the OptimizationJob.
- Raises:
TimeoutError – Timeout to delete OptimizationJob.
RuntimeError – Failed to delete OptimizationJob.
- get_job_events(name: str) -> list[Event]
Get events for an OptimizationJob.
This provides additional clarity about the state of the OptimizationJob when logs alone are not sufficient. Events include information about trial state changes, errors, and other significant occurrences.
- Parameters:
name (str) – Name of the OptimizationJob.
- Returns:
A list of Event objects associated with the OptimizationJob.
- Raises:
TimeoutError – Timeout to get OptimizationJob events.
RuntimeError – Failed to get OptimizationJob events.
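One way to use events for diagnosis is to fetch them only when waiting on the job fails. The sketch below assumes a client that behaves like OptimizerClient. The helper name `wait_or_report` and its `report` parameter are hypothetical, and each Event is passed to `report` whole because its fields are not described on this page.

```python
from typing import Any, Callable


def wait_or_report(client: Any, name: str, report: Callable[[Any], None] = print) -> Any:
    """Hypothetical helper: wait for a job and report its events if waiting fails."""
    try:
        return client.wait_for_job_status(name)
    except (RuntimeError, TimeoutError):
        # Surface the job's events before re-raising the original error.
        for event in client.get_job_events(name):
            report(event)
        raise
```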
Search Space
- class kubeflow.optimizer.Search
Bases: object
Helper class for defining search space parameters.
- static uniform(min: float, max: float) -> V1beta1ParameterSpec
Sample a float value uniformly between min and max.
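To make "uniformly between min and max" concrete, the standalone sketch below draws values with the standard library. It does not call Search.uniform(), which returns a V1beta1ParameterSpec and not a number. The function names are hypothetical. The log-uniform variant uses the usual definition (uniform in log space) and is an assumption about what Search.loguniform() means, included only for contrast.

```python
import math
import random


def sample_uniform(rng: random.Random, low: float, high: float) -> float:
    """Draw a value with equal probability density anywhere in [low, high]."""
    return rng.uniform(low, high)


def sample_loguniform(rng: random.Random, low: float, high: float) -> float:
    """Draw a value whose logarithm is uniform; requires 0 < low <= high."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so repeated runs draw the same values
print([sample_uniform(rng, 0.0, 1.0) for _ in range(3)])
print([sample_loguniform(rng, 1e-5, 1e-1) for _ in range(3)])
```

Log-uniform sampling spreads draws evenly across orders of magnitude, which is why it is a common choice for parameters such as learning rates.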
Configuration
- class kubeflow.optimizer.TrialConfig(num_trials: int = 10, parallel_trials: int = 1, max_failed_trials: int | None = None) -> None
Bases: object
Trial configuration for hyperparameter optimization.
- Parameters:
num_trials (int) – Number of trials to run. Defaults to 10.
parallel_trials (int) – Number of trials to run in parallel. Defaults to 1.
max_failed_trials (Optional[int]) – Maximum number of failed trials before stopping.
- class kubeflow.optimizer.Objective(metric: str = 'loss', direction: Direction = Direction.MINIMIZE) -> None
Bases: object
Objective configuration for hyperparameter optimization.
- Parameters:
metric (str) – The name of the metric to optimize. Defaults to “loss”.
direction (Direction) – Whether to maximize or minimize the metric. Defaults to “minimize”.
- direction: Direction = 'minimize'
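The effect of the direction setting can be illustrated with a small standalone sketch: given one metric value per trial, the best trial is the one with the lowest value when minimizing and the highest when maximizing. The function name `best_trial`, the trial names, and the string value "maximize" are assumptions made for the example. Only 'minimize' appears on this page.

```python
def best_trial(metrics: dict[str, float], direction: str = "minimize") -> str:
    """Return the name of the trial with the best metric for the given direction."""
    if direction not in ("minimize", "maximize"):
        raise ValueError(f"unknown direction: {direction!r}")
    pick = min if direction == "minimize" else max
    # Compare trials by their metric value and return the winning trial name.
    return pick(metrics, key=metrics.__getitem__)


losses = {"trial-a": 0.31, "trial-b": 0.12, "trial-c": 0.27}
print(best_trial(losses))              # lowest loss wins
print(best_trial(losses, "maximize"))  # highest value wins
```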