
Bayesian Optimisation for Hyperparameter Tuning in Python SciKit Learn

Hyperparameter tuning is crucial for building high-performing machine learning models. Bayesian Optimization is a powerful approach that intelligently explores the search space using probabilistic models such as Gaussian Processes. Unlike Grid Search and Random Search, it focuses on promising hyperparameter regions, reducing unnecessary evaluations and making it highly efficient. For over 20 years, I’ve driven innovation, built scalable solutions, and led organizations to transformative success, helping them redefine their technological future with confidence. This tech concept explains Bayesian Optimization with Python examples, using skopt for Random Forest and optuna for XGBoost.

For more detailed information on hyperparameters, see: Hyperparameter Tuning for Optimal Model Performance: Finding the Perfect Balance for Machine Learning Models >>

What is Bayesian Optimization?

Bayesian Optimization estimates the relationship between hyperparameters and model performance using a probabilistic model. Based on past results, it selects the next set of hyperparameters to evaluate (a minimal sketch follows the list below), leading to:

  • Faster convergence: It identifies optimal hyperparameters quickly.
  • Efficient search: It avoids testing poor-performing configurations.
  • Less computational cost: It reduces the number of required evaluations.
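
To make this concrete, here is a minimal sketch of the underlying optimization loop using skopt’s gp_minimize on a toy one-dimensional function. The function and bounds are illustrative assumptions standing in for an expensive model-training run, and it assumes scikit-optimize (installed in Example 1 below) is available:

from skopt import gp_minimize
from skopt.space import Real

# Toy stand-in for an expensive black-box objective
# (e.g. "train a model with these hyperparameters and return the validation error")
def black_box(params):
    x = params[0]
    return (x - 2.0) ** 2  # known minimum at x = 2

# A Gaussian Process models the objective; each iteration picks the next point to evaluate
result = gp_minimize(
    func=black_box,
    dimensions=[Real(-5.0, 5.0)],  # search space: one continuous parameter
    n_calls=15,                    # total budget of evaluations
    random_state=42
)
print("Best x:", result.x, "Best value:", result.fun)

BayesSearchCV in Example 1 wraps a loop like this around cross-validated model training; Optuna in Example 2 follows the same idea with its own Bayesian sampler (TPE by default).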

When to Use Bayesian Optimization?

  • When hyperparameter tuning is expensive due to long training times.
  • When optimizing deep learning and complex machine learning models.
  • When a large hyperparameter search space needs efficient exploration.

Example 1: Bayesian Optimization with skopt on Random Forest

Step 1: Install Required Libraries

pip install scikit-optimize

Step 2: Import Libraries & Load Dataset

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from skopt import BayesSearchCV
from skopt.space import Integer, Categorical
from sklearn.datasets import load_iris

# Load Iris dataset
data = load_iris()
X, y = data.data, data.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 3: Define Search Space & Perform Bayesian Optimization

# Define hyperparameter search space
search_space = {
    'n_estimators': Integer(50, 300),  # Number of trees
    'max_depth': Integer(5, 20),       # Maximum depth of trees
    'criterion': Categorical(['gini', 'entropy'])  # Splitting criterion
}

# Initialize the model
rf = RandomForestClassifier(random_state=42)

# Perform Bayesian Optimization
bayes_search = BayesSearchCV(estimator=rf, search_spaces=search_space,
                             n_iter=20, cv=5, scoring='accuracy',
                             n_jobs=-1, random_state=42)

# Fit the model
bayes_search.fit(X_train, y_train)

Step 4: Get the Best Parameters & Evaluate Performance

# Get the best hyperparameters
best_params = bayes_search.best_params_
print("Best Hyperparameters:", best_params)

# Train the model with the best hyperparameters
best_model = bayes_search.best_estimator_
y_pred = best_model.predict(X_test)

# Evaluate model performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Best Model Accuracy: {accuracy:.4f}")

Example 2: Bayesian Optimization with optuna on XGBoost

Step 1: Install Optuna & XGBoost

pip install optuna xgboost

Step 2: Import Libraries & Load Dataset

import optuna
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_digits

# Load Digits dataset
data = load_digits()
X, y = data.data, data.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 3: Define Objective Function for Bayesian Optimization

def objective(trial):
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 300),
        'max_depth': trial.suggest_int('max_depth', 3, 20),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0)
    }

    model = xgb.XGBClassifier(**param, eval_metric='mlogloss')  # use_label_encoder is deprecated and no longer needed
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    return accuracy_score(y_test, y_pred)
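
One variation worth noting (an assumption on my part, not part of the original walkthrough): scoring each trial against the held-out test set lets test-set information leak into tuning. A more cautious objective evaluates candidates with cross-validation on the training data only, for example:

from sklearn.model_selection import cross_val_score

def objective_cv(trial):
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 300),
        'max_depth': trial.suggest_int('max_depth', 3, 20),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0)
    }
    model = xgb.XGBClassifier(**param, eval_metric='mlogloss')
    # Mean 3-fold accuracy on the training data; the test set stays untouched until the end
    return cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()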

Step 4: Run Bayesian Optimization with Optuna

# Create Optuna study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)

# Get the best hyperparameters
best_params = study.best_params
print("Best Hyperparameters:", best_params)

# Train and evaluate the model with the best hyperparameters
best_model = xgb.XGBClassifier(**best_params, eval_metric='mlogloss')  # use_label_encoder is deprecated and no longer needed
best_model.fit(X_train, y_train)
y_pred = best_model.predict(X_test)

# Evaluate model performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Best Model Accuracy: {accuracy:.4f}")

Why Choose Bayesian Optimization?

  • More efficient: Bayesian Optimization focuses on promising hyperparameter regions.
  • Reduces unnecessary evaluations: Saves computational resources.
  • Ideal for expensive models: Works well when training takes a long time.

My Tech Advice: Tuning an ML model is an ongoing process. Bayesian Optimization is a powerful and efficient hyperparameter tuning technique that reduces unnecessary iterations by learning from past evaluations and intelligently selecting promising hyperparameters. It is especially useful for models where training is expensive, delivering optimal performance with fewer evaluations. Start optimizing your models with Bayesian Optimization today!

#AskDushyant
Note: The examples and pseudo code are for illustration only. You must modify and experiment with the concept to meet your specific needs.
#TechConcept #TechAdvice #AI #ML #Python #ModelTuning
