With all of the packages and tools available, building a machine learning model isn’t difficult. However, building a good machine learning model is another story. If you think that machine learning involves throwing hundreds of columns of data into a notebook and using Scikit-Learn to build a model, think again.
Feature Importance Explained
Feature importance is a step in building a machine learning model that involves calculating a score for every input feature to establish its importance in the model’s decision-making process. The higher the score for a feature, the larger the effect it has on the model’s prediction of the target variable.
One crucial step that is often ignored is feature importance, that is, identifying which features matter most to your model. Irrelevant data introduces bias that distorts the final results of your machine learning model. In this article, we will discuss feature importance, a step that plays a pivotal role in machine learning.
We’ll cover what feature importance is, why it’s so useful, how you can implement it in Python and how you can visualize it in Gradio.
What Is Feature Importance?
Feature importance refers to techniques that calculate a score for all the input features of a given model. The scores represent the “importance” of each feature: a higher score means that the feature has a larger effect on the model’s prediction of the target variable.
Let’s take a real-life example for a better understanding. Suppose you have to buy a new house near your workplace. While purchasing a house, you might think of different factors. The most important factor in your decision making might be the location of the property, and so, you’ll likely only look for houses that are near your workplace. Feature importance works in a similar way. It will rank features based on the effect that they have on the model’s prediction.
Why Is Feature Importance Useful?
Feature importance is extremely useful for the following reasons:
1. Data Comprehension
Building a model is one thing, but understanding the data that goes into the model is another. Like a correlation matrix, feature importance lets you understand the relationship between the features and the target variable. It also helps you identify which features are irrelevant to the model.
2. Model Improvement
When training your model, you can use the scores calculated from feature importance to reduce the dimensionality of the model. Features with higher scores are usually kept, while features with lower scores are dropped, as they contribute little to the model’s predictions. This simplifies the model and speeds up training and inference, ultimately improving the model’s performance; a short sketch of this idea follows below.
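As an illustration, here’s a minimal sketch of this idea using scikit-learn’s SelectFromModel, which drops every feature whose importance falls below a threshold. The diabetes data set and the "mean" threshold here are arbitrary choices for the example:
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
X, y = load_diabetes(return_X_y=True)
# Keep only the features whose importance is above the mean importance
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="mean",
)
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)  # fewer columns after selection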
3. Model Interpretability
Feature importance is also useful for interpreting and communicating your model to other stakeholders. By calculating a score for each feature, you can determine which features contribute the most to the predictive power of your model.
How to Calculate Feature Importance
There are different ways to calculate feature importance, but this article will focus on two methods: Gini importance and permutation feature importance.
Gini Importance
In Scikit-Learn, Gini importance is used to calculate node impurity. A node’s importance is the reduction in impurity it achieves, weighted by the fraction of samples that reach that node out of the total number of samples; this fraction is known as the node probability. Let us suppose we have a tree in which each node has at most two child nodes. The importance of node j is then:
ni_j = w_j * C_j - w_left(j) * C_left(j) - w_right(j) * C_right(j)
Here, we have:
- ni_j: the importance of node j.
- w_j: the weighted number of samples reaching node j.
- C_j: the impurity value of node j.
- left(j): the child node on the left of node j.
- right(j): the child node on the right of node j.
This equation gives us the importance of a node j, which is used to calculate the feature importance for every decision tree. Because a single feature can be used in different branches of the tree, we sum the importances of all the nodes that split on that feature:
fi_i = (sum of ni_j over all nodes j that split on feature i) / (sum of ni_k over all nodes k)
Each feature importance is then normalized against the sum of all feature importances in the tree:
norm_fi_i = fi_i / (sum of fi_j over all features j)
Finally, averaging the normalized importances over all T trees in our random forest gives the overall feature importance:
RF_fi_i = (sum of norm_fi_i over all trees t) / T
With this, you can get a better grasp of how feature importance works in random forests.
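To make these formulas concrete, here’s a minimal sketch that recomputes a single decision tree’s feature importances from the fitted tree structure and checks the result against scikit-learn’s built-in feature_importances_. The diabetes data set and tree depth are arbitrary choices for the example:
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor
X, y = load_diabetes(return_X_y=True)
dt = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
t = dt.tree_
# ni_j = w_j * C_j - w_left(j) * C_left(j) - w_right(j) * C_right(j)
node_importance = np.zeros(t.node_count)
for j in range(t.node_count):
    left, right = t.children_left[j], t.children_right[j]
    if left == -1:  # leaf node: no split, no importance
        continue
    node_importance[j] = (
        t.weighted_n_node_samples[j] * t.impurity[j]
        - t.weighted_n_node_samples[left] * t.impurity[left]
        - t.weighted_n_node_samples[right] * t.impurity[right]
    )
# fi_i: sum the importances of the nodes that split on feature i ...
feature_importance = np.zeros(X.shape[1])
for j in range(t.node_count):
    if t.children_left[j] != -1:
        feature_importance[t.feature[j]] += node_importance[j]
# ... then normalize so the importances sum to one
feature_importance /= feature_importance.sum()
print(np.allclose(feature_importance, dt.feature_importances_))  # True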
Permutation Feature Importance
The idea behind permutation feature importance is simple: a feature’s importance is measured by the increase in the model’s prediction error after we randomly permute that feature’s values. If permuting the values causes a large change in the error, the feature is important to our model.
The best thing about this method is that it can be applied to every machine learning model. The approach is model agnostic, which gives you a lot of freedom, and there are no complex mathematical formulas behind it. Permutation feature importance is based on an algorithm that works as follows (a code sketch follows the list).
- Calculate the model’s mean squared error with the original feature values.
- Shuffle the values of one feature and make predictions on the shuffled data.
- Calculate the mean squared error with the shuffled values.
- Take the difference between the two errors.
- Repeat for every feature, then sort the differences in descending order to rank the features from most to least important.
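To make this concrete, here’s a minimal from-scratch sketch of that loop; the ridge model and the diabetes data set are arbitrary choices for illustration:
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = Ridge().fit(X_train, y_train)
rng = np.random.default_rng(0)
baseline = mean_squared_error(y_test, model.predict(X_test))
# For each feature: shuffle its column, re-predict, and record the error increase
importance = {}
for col in X_test.columns:
    X_perm = X_test.copy()
    X_perm[col] = rng.permutation(X_perm[col].values)
    importance[col] = mean_squared_error(y_test, model.predict(X_perm)) - baseline
# Sort in descending order: the largest error increase marks the most important feature
for col, delta in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{col}: {delta:.4f}")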
How to Calculate Feature Importance in Python
In this section, we’ll create a random forest model using the Boston housing data set.
1. Import the Required Libraries and Data Set
First, we’ll import all the required libraries and our data set.
# Importing libraries
import numpy as np
import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2, so this walkthrough
# requires scikit-learn < 1.2
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from matplotlib import pyplot as plt
2. Train Test Split
The next step is to load the data set and split it into a test and training set.
boston = load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
y = boston.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
3. Create a Random Forest Model
Next, we’ll create the random forest model.
rf = RandomForestRegressor(n_estimators=150)
rf.fit(X_train, y_train)
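As an optional sanity check, not strictly part of the walkthrough, we can score the fit on the held-out test set; for regressors, score returns the R² value:
# Optional: check fit quality on the held-out test set (R^2 for regressors)
print("Test R^2:", rf.score(X_test, y_test))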
4. Apply Feature Importance and Plot Results
Once the model is created, we can compute the feature importances and plot them on a graph to interpret the results.
# Sort the features by importance and plot them as a horizontal bar chart
sort = rf.feature_importances_.argsort()
plt.barh(boston.feature_names[sort], rf.feature_importances_[sort])
plt.xlabel("Feature Importance")
plt.show()
In the resulting plot, RM, the average number of rooms per dwelling, is the most important feature in predicting the target variable.
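We imported permutation_importance earlier but haven’t used it yet; as a complement, here is a minimal sketch that ranks the same features with the permutation method described above, evaluated on the held-out test set:
# Permutation importance on the held-out test set
result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=42)
psort = result.importances_mean.argsort()
plt.barh(boston.feature_names[psort], result.importances_mean[psort])
plt.xlabel("Permutation Importance")
plt.show()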
How to Calculate Feature Importance with Gradio
Gradio is a package that makes it easy to create simple, interactive interfaces for machine learning models. With Gradio, you can evaluate and test your model in real time. It can also calculate feature importance with a single parameter, and we can interact with the feature values to see how they affect the prediction.
Here’s an example:
1. Import the Required Libraries and Data Set
First, we’ll import all the required libraries and our data set. In this example, we’ll use the iris data set from the Seaborn library.
# Importing libraries
import numpy as np
import pandas as pd
import seaborn as sns
# Importing data
iris = sns.load_dataset("iris")
2. Fit the Data Set to the Model
Then, we’ll split the data set and fit it on the model.
from sklearn.model_selection import train_test_split
X = iris.drop("species", axis=1)
y = iris["species"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
from sklearn.svm import SVC
model = SVC(probability=True)
model.fit(X_train, y_train)
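As an optional sanity check before building the interface, we can confirm the classifier performs reasonably on the held-out test set:
# Optional: check classification accuracy on the held-out test set
print("Test accuracy:", model.score(X_test, y_test))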
3. Create a Prediction Function
We’ll also create a prediction function that will be used in our Gradio interface.
def predict_flower(sepal_length, sepal_width, petal_length, petal_width):
    # The column names must match the ones the model was trained on
    df = pd.DataFrame.from_dict({"sepal_length": [sepal_length],
                                 "sepal_width": [sepal_width],
                                 "petal_length": [petal_length],
                                 "petal_width": [petal_width]})
    predict = model.predict_proba(df)[0]
    return {model.classes_[i]: predict[i] for i in range(3)}
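It’s worth calling the function once locally to confirm it returns a class-to-probability dictionary; the values below are the first row of the iris data set, a setosa sample:
# Quick local check before wiring the function into Gradio;
# the probabilities should concentrate on the setosa class
print(predict_flower(5.1, 3.5, 1.4, 0.2))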
4. Install Gradio and Create an Interface
Finally, we’ll install Gradio with pip and create our interface.
# Installing and importing Gradio
!pip install gradio
import gradio as gr
# Note: this example uses Gradio’s legacy API; in newer releases,
# gr.inputs.Slider(default=...) became gr.Slider(value=...) and the
# interpretation parameter of gr.Interface was removed.
sepal_length = gr.inputs.Slider(minimum=0, maximum=10, default=5, label="sepal_length")
sepal_width = gr.inputs.Slider(minimum=0, maximum=10, default=5, label="sepal_width")
petal_length = gr.inputs.Slider(minimum=0, maximum=10, default=5, label="petal_length")
petal_width = gr.inputs.Slider(minimum=0, maximum=10, default=5, label="petal_width")
gr.Interface(predict_flower, [sepal_length, sepal_width, petal_length, petal_width], "label", live=True, interpretation="default").launch(debug=True)
The gr.Interface constructor takes an interpretation parameter, which surfaces the importance of each feature for the model directly in the interface. In the resulting output, the legend tells you how changing each feature will affect the prediction: increasing petal length and petal width will increase the model’s confidence in the virginica class. Petal length is more “important” only in the sense that increasing it turns the interpretation “redder,” i.e., raises the model’s confidence, faster.
If you made it this far, congrats. Hopefully, you have a thorough understanding of what feature importance is, why it’s useful and how you can use it.