How to use PyCaret for classification

GeoSense ✅
2 min read · Mar 25

PyCaret is a popular open-source, low-code library for machine learning in Python. It wraps data preparation, model training, and model comparison behind a handful of function calls. In this post, we’ll go through the steps involved in using PyCaret for classification.

First, let’s make sure we have PyCaret installed. We can install it using pip:

pip install pycaret

Now, let’s start by loading some data for classification. We’ll use the famous Iris dataset as an example:

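from sklearn.datasets import load_iris
import pandas as pd

# Load the Iris data and put the four feature columns in a DataFrame
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Add the class labels (0, 1, 2) as the target column
df['target'] = iris.target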

Next, we need to initialize a PyCaret classification experiment with the setup() function:

# Import only the functions used in this post
from pycaret.classification import setup, compare_models, predict_model
clf = setup(data=df, target='target')

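This initializes a PyCaret environment for classification with the specified data and target variable. PyCaret infers the column types, applies its default preprocessing, and splits the data into training and testing sets. By default 70% of the rows go to training; you can change this with the train_size argument of setup().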
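To make the split step concrete, here is a minimal sketch in plain Python of a shuffled 70/30 train/test split. It is not PyCaret's own code: the function name `train_test_split_rows`, the seed value, and the integer stand-in rows are assumptions made up for illustration, and PyCaret may shuffle or stratify differently.

```python
import random

def train_test_split_rows(rows, train_size=0.7, seed=123):
    """Shuffle rows and split them into a training list and a test list.

    Hypothetical helper for illustration; PyCaret does its own split
    inside setup().
    """
    if not 0 < train_size < 1:
        raise ValueError("train_size must be between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = int(round(len(shuffled) * train_size))
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 150 Iris rows (just row indices here).
rows = list(range(150))
train, test = train_test_split_rows(rows)
print(len(train), len(test))  # 105 45
```

The test rows are held out so that the models are scored on data they did not see during training.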

Now, we can compare different classification models using the compare_models() function:

best_model = compare_models()

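This will train several classification models, evaluate each one with cross-validation, and return the best-performing one. Models are ranked by accuracy by default; pass the sort argument (for example sort='F1') to rank by a different metric.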
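The idea behind this step (score every candidate with cross-validation, then keep the top one) can be sketched in plain Python. This is a toy illustration only, not how PyCaret is implemented: the two classifiers, the synthetic one-feature data, and every name below are hypothetical.

```python
import random

def make_toy_data(n_per_class=30, seed=0):
    """Two well-separated classes with one feature (made-up data)."""
    rng = random.Random(seed)
    data = [([rng.gauss(0.0, 1.0)], 0) for _ in range(n_per_class)]
    data += [([rng.gauss(5.0, 1.0)], 1) for _ in range(n_per_class)]
    rng.shuffle(data)
    return data

class ToyMajorityClass:
    """Baseline: always predict the most frequent training label."""
    def fit(self, X, y):
        self.label = max(sorted(set(y)), key=y.count)
        return self
    def predict(self, X):
        return [self.label for _ in X]

class ToyNearestCentroid:
    """Predict the class whose feature mean is closest to the sample."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            points = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(col) for col in zip(*points)]
        return self
    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]

def cross_val_accuracy(model_factory, data, k=5):
    """Mean accuracy over k folds; each fold is used once as the test set."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = model_factory().fit([x for x, _ in train], [y for _, y in train])
        preds = model.predict([x for x, _ in test])
        correct = sum(p == y for p, (_, y) in zip(preds, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

data = make_toy_data()
candidates = {"majority_class": ToyMajorityClass,
              "nearest_centroid": ToyNearestCentroid}

# Rank the candidates by mean cross-validated accuracy, best first.
leaderboard = sorted(
    ((name, cross_val_accuracy(factory, data)) for name, factory in candidates.items()),
    key=lambda row: row[1],
    reverse=True,
)
best_name, best_score = leaderboard[0]
print(leaderboard)
```

Swapping the sort key for another metric changes which model comes out on top, which is why it helps to choose the metric deliberately rather than rely on the default.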

Finally, we can use the trained model to make predictions on new data:

predictions = predict_model(best_model, data=new_data)

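This will use the best model to make predictions on new_data, a pandas DataFrame that you supply. It should have the same feature columns as the dataset used for training; the target column is not needed. The result is a DataFrame containing the input rows with the predictions appended.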
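Before calling predict_model, it can help to confirm that the new data really has the expected feature columns. The sketch below is one simple way to do that in plain Python; `check_feature_columns` is a hypothetical helper written for this post, not part of PyCaret, and the `new_columns` list is a made-up example.

```python
def check_feature_columns(training_columns, new_columns, target="target"):
    """Compare column names and report mismatches.

    Returns (missing, unexpected): features the new data lacks, and
    columns in the new data that were not training features.
    """
    expected = [c for c in training_columns if c != target]
    missing = [c for c in expected if c not in new_columns]
    unexpected = [c for c in new_columns if c not in expected and c != target]
    return missing, unexpected

# Column names of the Iris DataFrame built earlier, plus the target.
training_columns = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
    "target",
]

# Made-up example of new data with one feature missing and one extra column.
new_columns = ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "id"]

missing, unexpected = check_feature_columns(training_columns, new_columns)
print("missing:", missing)
print("unexpected:", unexpected)
```

With pandas DataFrames you would pass list(df.columns) and list(new_data.columns) as the two arguments.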

In conclusion, PyCaret provides a simple and efficient way to train and deploy classification models in Python. By following these steps, you can quickly compare different classification models and choose the best one for your dataset.
