What is the Inception Score (IS)?

Published on Apr 28, 2025
Generative AI enthusiast with expertise in RAG (Retrieval-Augmented Generation) and LangChain, passionate about building intelligent AI-driven solutions

Imagine you’re generating synthetic fashion designs using a GAN, and you want to assess whether your AI is producing realistic and varied outfits. How do you measure that—especially without human judgment? This is where the Inception Score (IS) becomes incredibly valuable. Widely used in evaluating Generative Adversarial Networks (GANs), IS quantifies how realistic and diverse your AI-generated images are.

Let’s explore how Inception Score works, its strengths and weaknesses, and how it compares to other evaluation metrics.

What is the inception score (IS)?

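The Inception Score is a metric designed to evaluate the performance of generative models, especially GANs, by assessing two qualities: image quality (does each generated image look like a recognizable object?) and image diversity (do the generated images cover many different classes?).

It leverages a pre-trained Inception v3 classifier to estimate these two qualities without needing labeled data.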

Next, let’s see how it actually works under the hood.

How does the inception score work?

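The IS uses the following process:

1. Pass each generated image x through the pre-trained Inception v3 classifier to get the conditional class distribution p(y|x).
2. Average these distributions over all generated images to get the marginal class distribution p(y).
3. Compute the KL divergence between p(y|x) and p(y) for each image, average over the images, and exponentiate the result:

$$IS = \exp\left( \mathbb{E}_x \left[ KL\left( p(y|x) \,\|\, p(y) \right) \right] \right)$$

Intuition: a realistic image yields a confident, low-entropy p(y|x), while a diverse set of images yields a spread-out, high-entropy p(y). The KL divergence is large only when both conditions hold, so a higher IS indicates images that are both recognizable and varied.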
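To make the formula concrete, here is a minimal NumPy sketch that walks through each step on a toy example. The prediction values are made up for illustration (4 images, 3 classes) and do not come from a real classifier.

```python
import numpy as np

# Hypothetical softmax outputs p(y|x) for 4 generated images over 3 classes.
p_yx = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.90, 0.05, 0.05],
])

# Marginal class distribution p(y): the average prediction over all images.
p_y = p_yx.mean(axis=0)

# KL(p(y|x) || p(y)) for each image, summed over the classes.
kl_per_image = np.sum(p_yx * (np.log(p_yx) - np.log(p_y)), axis=1)

# IS is the exponential of the mean KL divergence.
inception_score = float(np.exp(kl_per_image.mean()))

print("p(y):", p_y)
print("KL per image:", kl_per_image)
print("IS:", inception_score)
```

Because two of the four images fall into the same class, p(y) is not uniform, and the score ends up below the maximum of 3 that a 3-class toy setup could reach.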

But no metric is perfect. Let’s look at IS limitations next.

What are the limitations of the inception score?

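The limitations are as follows:

- IS never compares generated images with real data, so it cannot tell how close the outputs are to the target distribution.
- It does not reliably detect mode collapse: a generator that produces one convincing image per class can still score highly.
- It relies on an Inception v3 classifier trained on ImageNet, so scores are less meaningful for images that look nothing like ImageNet classes.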
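The mode-collapse blind spot can be illustrated with a small sketch. It assumes hypothetical classifier outputs for a 10-class problem (the 0.91 / 0.01 values are invented) and a helper name, `inception_score_from_preds`, that is not part of any library. Since IS only sees class predictions, a generator that repeats the same single image per class scores exactly as well as one image per class would.

```python
import numpy as np

def inception_score_from_preds(preds, eps=1e-12):
    """Single-split IS from an (N, K) array of class probabilities."""
    p_y = preds.mean(axis=0, keepdims=True)
    kl = np.sum(preds * (np.log(preds + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

num_classes = 10

# Hypothetical predictions for ONE confident image per class:
# 0.91 on the true class, 0.01 on each of the other nine (rows sum to 1).
one_per_class = np.full((num_classes, num_classes), 0.01)
np.fill_diagonal(one_per_class, 0.91)

# A collapsed generator that emits those same 10 images 100 times each.
collapsed = np.tile(one_per_class, (100, 1))

score_small = inception_score_from_preds(one_per_class)
score_collapsed = inception_score_from_preds(collapsed)

print("10 distinct images:   ", score_small)
print("1000 duplicated images:", score_collapsed)
```

Both scores are identical, which is why IS alone cannot confirm variety within a class.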

To tackle these, researchers often compare IS with a more robust metric, FID.

Inception score vs. Fréchet inception distance

| Feature | Inception Score (IS) | Fréchet Inception Distance (FID) |
| --- | --- | --- |
| Purpose | Measures image quality and diversity | Measures similarity between real and generated images |
| Based on | KL divergence of class probabilities | Fréchet distance of embedding distributions |
| Compares to real data | No | Yes |
| Handles mode collapse | No | Yes |
| Ease of computation | Easy | Slightly complex |
| Use case | Quick training feedback | Benchmarking and production evaluation |
| Popularity | Used in older GAN research | Preferred in modern evaluations |
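For comparison, the Fréchet distance in the FID column can be sketched in a few lines. This is a minimal illustration, assuming the embeddings have already been extracted; random vectors stand in for real Inception features here, and `frechet_distance` is a hypothetical helper name rather than a library function.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two (N, D) feature arrays."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    # Matrix square root of the covariance product. Small imaginary parts
    # can appear from numerical error, so keep only the real component.
    cov_sqrt = np.real(sqrtm(cov_r @ cov_g))
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * cov_sqrt))

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))   # stand-in for embeddings of real images
shifted = real + 3.0               # same spread, different mean

print("identical sets:", frechet_distance(real, real))
print("shifted set:   ", frechet_distance(real, shifted))
```

Unlike IS, the function takes two inputs, which is what lets FID compare generated images against real data. Lower values mean the two sets are closer.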

How to calculate the inception score?

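To compute IS:

1. Run the generated images through Inception v3 and collect the softmax output p(y|x) for each image.
2. Average the outputs to get the marginal distribution p(y).
3. Compute the KL divergence between each p(y|x) and p(y).
4. Average the divergences and take the exponential.

This gives a numerical score where higher is better. In practice the images are divided into several splits, and the mean and standard deviation of the per-split scores are reported.

Let’s implement this next.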

How to implement the inception score?

You can implement the Inception Score by passing generated images through a pretrained classifier (like InceptionV3), collecting softmax outputs, and computing the KL divergence between conditional and marginal class distributions.

Here’s a simplified end-to-end version using Keras and NumPy:


from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image
import numpy as np
from scipy.stats import entropy

# Load pretrained InceptionV3 with its 1000-class softmax head.
# (The pooling argument only applies when include_top=False, so it is omitted.)
model = InceptionV3(include_top=True, weights='imagenet')

def preprocess_images(img_list):
    """Turn a list of PIL images into one batch of InceptionV3 inputs."""
    processed = []
    for img in img_list:
        img = img.convert('RGB').resize((299, 299))
        x = image.img_to_array(img)
        x = np.expand_dims(x, axis=0)
        x = preprocess_input(x)  # scales pixel values to [-1, 1]
        processed.append(x)
    return np.vstack(processed)

def calculate_inception_score(img_list, splits=10):
    imgs = preprocess_images(img_list)
    preds = model.predict(imgs, verbose=0)  # p(y|x), shape (N, 1000)

    N = preds.shape[0]
    split_scores = []

    for k in range(splits):
        part = preds[k * N // splits: (k + 1) * N // splits]
        py = np.mean(part, axis=0)  # marginal p(y) for this split
        # entropy(p, q) with two arguments returns KL(p || q)
        scores = [entropy(pyx, py) for pyx in part]
        split_scores.append(np.exp(np.mean(scores)))

    # Mean and standard deviation of the per-split scores
    return np.mean(split_scores), np.std(split_scores)

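The code above resizes each image to 299×299 and applies the InceptionV3 preprocessing, collects the softmax predictions for the whole batch, and then, for each of the 10 splits, takes the exponential of the mean KL divergence between the per-image predictions and their average. It returns the mean and standard deviation of the per-split scores.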

How to Implement the Inception Score With NumPy?

Here is a NumPy implementation of the Inception Score that works on predictions you have already collected:


import numpy as np
from scipy.stats import entropy

def inception_score(preds, splits=10):
    N = preds.shape[0]
    split_scores = []

    for k in range(splits):
        # Predictions belonging to the k-th split
        part = preds[k * N // splits: (k + 1) * N // splits]
        py = np.mean(part, axis=0)  # marginal p(y) for this split
        # entropy(p, q) with two arguments returns KL(p || q)
        scores = [entropy(pyx, py) for pyx in part]
        split_scores.append(np.exp(np.mean(scores)))

    return np.mean(split_scores), np.std(split_scores)

preds should be a NumPy array of predicted class probabilities for each image.
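As a sketch of how the input requirement might be enforced, the variant below checks the shape and row sums of preds before scoring and replaces the per-row loop with array operations. The function name, the eps smoothing term, and the Dirichlet-sampled test data are all assumptions made for illustration; note also that np.array_split can divide the rows slightly differently from the slicing used above when N is not a multiple of splits.

```python
import numpy as np

def inception_score_vectorized(preds, splits=10, eps=1e-12):
    """IS from an (N, num_classes) array of probabilities, with input checks."""
    preds = np.asarray(preds, dtype=np.float64)
    if preds.ndim != 2:
        raise ValueError("preds must have shape (N, num_classes)")
    if not np.allclose(preds.sum(axis=1), 1.0, atol=1e-4):
        raise ValueError("each row of preds must sum to 1 (apply softmax first)")
    if preds.shape[0] < splits:
        raise ValueError("need at least one image per split")

    scores = []
    for part in np.array_split(preds, splits):
        p_y = part.mean(axis=0, keepdims=True)  # marginal for this split
        # KL(p(y|x) || p(y)) for every row at once; eps guards against log(0)
        kl = np.sum(part * (np.log(part + eps) - np.log(p_y + eps)), axis=1)
        scores.append(np.exp(kl.mean()))
    return float(np.mean(scores)), float(np.std(scores))

# Usage with synthetic probabilities standing in for classifier output.
rng = np.random.default_rng(42)
fake_preds = rng.dirichlet(alpha=np.full(10, 0.1), size=500)
mean_is, std_is = inception_score_vectorized(fake_preds, splits=5)
print(f"IS = {mean_is:.3f} ± {std_is:.3f}")
```

Failing early on raw logits or a flattened array avoids silently reporting a meaningless score.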

Next up, let’s use Keras to automate prediction and get IS-ready scores.

How to Implement the Inception Score With Keras?

Here is the code snippet showing how to get the predictions needed for the Inception Score with Keras:


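from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image
import numpy as np

# include_top=True keeps the 1000-class softmax layer that IS needs.
# (The pooling argument only applies when include_top=False, so it is omitted.)
model = InceptionV3(include_top=True, weights='imagenet')

def get_predictions(img_list):
    """Return softmax predictions, shape (N, 1000), for a list of PIL images."""
    processed_imgs = np.array([
        preprocess_input(image.img_to_array(img.convert('RGB').resize((299, 299))))
        for img in img_list
    ])
    preds = model.predict(processed_imgs, verbose=0)
    return preds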

Combine this with the NumPy inception_score() function above to compute IS for your GAN outputs.

As you implement this, be aware of some core issues still lingering with IS.

Problems With the Inception Score

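The core problems are the limitations covered earlier: IS never looks at real data, it can miss mode collapse, and it depends on a classifier trained on ImageNet. Despite these, IS is still widely used. Let’s wrap this up.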

Conclusion

Hence, the Inception Score remains a quick and easy way to evaluate how realistic and diverse your generative model outputs are—especially when used alongside other metrics like FID. While not flawless, IS is a powerful first-step tool in the validation pipeline for Generative AI models.

FAQs

1. What is a good Inception Score?

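There is no universal threshold for a good Inception Score. Higher is better, but what counts as good depends on the dataset, and scores are only comparable when they are computed with the same classifier, the same number of images, and the same number of splits.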

2. What is the Inception Score scale?

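The Inception Score has no fixed scale such as 0 to 100, but it is bounded: it ranges from 1 (every image gets the same prediction) to the number of classes the classifier predicts, which is 1,000 for the ImageNet-trained Inception v3.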
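A short sketch can show the two extremes the score reaches for a K-class classifier. The prediction arrays are constructed by hand rather than produced by a model, and the helper name is hypothetical.

```python
import numpy as np

def inception_score_single_split(preds, eps=1e-12):
    """Single-split IS from an (N, K) array of class probabilities."""
    p_y = preds.mean(axis=0, keepdims=True)
    kl = np.sum(preds * (np.log(preds + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

K = 1000  # number of classes predicted by the ImageNet Inception v3

# Worst case: the classifier is equally unsure about every image.
uniform = np.full((K, K), 1.0 / K)

# Best case: each image is assigned to a different class with full confidence.
one_hot = np.eye(K)

lowest = inception_score_single_split(uniform)
highest = inception_score_single_split(one_hot)

print("lowest: ", lowest)    # close to 1
print("highest:", highest)   # close to K
```

Real generators land somewhere between these two values.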

3. How to calculate the Inception Score?

Here’s a simplified version in Python using PyTorch and torchvision:

import numpy as np
import torch
import torch.nn.functional as F
from scipy.stats import entropy
from torchvision.models import inception_v3, Inception_V3_Weights
from torchvision.transforms import Compose, Normalize, Resize, ToTensor

def calculate_inception_score(images, splits=10):
    # weights= replaces the deprecated pretrained=True (torchvision >= 0.13)
    model = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1, transform_input=False).eval()
    # Resize to 299x299 and scale pixel values to [-1, 1]
    preprocess = Compose([Resize((299, 299)), ToTensor(), Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    preds = []
    for img in images:
        # images are expected to be PIL images; force 3 channels
        img_tensor = preprocess(img.convert('RGB')).unsqueeze(0)
        with torch.no_grad():
            pred = F.softmax(model(img_tensor), dim=1).cpu().numpy()
        preds.append(pred)

    preds = np.vstack(preds)  # shape (N, 1000)
    split_scores = []

    for k in range(splits):
        part = preds[k * len(preds) // splits: (k + 1) * len(preds) // splits]
        py = np.mean(part, axis=0)  # marginal p(y) for this split
        # entropy(p, q) with two arguments returns KL(p || q)
        scores = [entropy(pyx, py) for pyx in part]
        split_scores.append(np.exp(np.mean(scores)))

    return np.mean(split_scores), np.std(split_scores)

4. What is the Inception Score in generative AI?

The Inception Score is a metric used in Generative AI (especially for GANs) to evaluate two things: the quality of generated images and their diversity.

It uses a pretrained Inception v3 model to measure how realistic and varied generated images are.
