Generative AI: Variational Autoencoders (VAEs)

Published on Apr 16, 2025

Generative AI enthusiast with expertise in RAG (Retrieval-Augmented Generation) and LangChain, passionate about building intelligent AI-driven solutions

Ever used a program that sharpens blurry images or generates lifelike faces? Generative AI is behind that magic, and the Variational Autoencoder (VAE) is one of the most effective tools in that domain. Applied in everything from medical research to image generation, VAEs learn to compress data and to generate new variations of it. In this blog we discuss what VAEs are, how they differ from autoencoders and GANs, and how to build one in TensorFlow with working code examples.

What is Generative AI?

Generative AI is a class of artificial intelligence models capable of creating new content: text, images, audio, even code. These models learn the patterns and structure of their training data and produce outputs that resemble it without copying it. Well-known examples include GPT for text, GANs for images, and VAEs for probabilistic modeling.

Now that we understand generative AI, let’s look at one of the powerful tools within this space—Variational Autoencoders.

What is a variational autoencoder (VAE)?

A variational autoencoder (VAE) is a kind of neural network that learns to encode input data into a probability distribution and then decode samples from that distribution to rebuild the data. Unlike conventional autoencoders, which map each input to a single fixed vector, VAEs learn the parameters of a distribution (usually Gaussian) over the latent space.

This lets VAEs create fresh data by sampling from the learnt distribution—a fundamental feature of generative models.

But how are VAEs different from regular autoencoders? Let’s compare the two.

Autoencoders vs. variational autoencoders

| Feature | Autoencoder | Variational Autoencoder (VAE) |
|---|---|---|
| Latent space | Deterministic (a single vector per input) | Probabilistic (mean and variance per input) |
| Output | Deterministic reconstruction | Approximate reconstruction (varies with the sampled latent vector) |
| Generative capability | Limited | Strong |
| Loss function | Reconstruction loss | Reconstruction loss + KL divergence |

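VAEs add a KL divergence term to the loss function. It penalizes the encoder whenever the latent distribution it produces drifts away from a standard normal distribution, which keeps the latent space smooth enough to sample from.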
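For a Gaussian latent distribution with independent dimensions, this KL term has a simple closed form. The sketch below computes it in plain Python, assuming the encoder outputs a mean and a log-variance for each latent dimension; the function name kl_to_standard_normal is ours and is used only for illustration.

```python
import math

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions.

    `mean` and `log_var` are equal-length lists, one entry per dimension.
    Algebraically the same as -0.5 * sum(1 + log_var - mean^2 - exp(log_var)).
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mean, log_var)
    )

# A latent distribution that already matches N(0, 1) incurs no penalty.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # 0.0

# Moving a mean away from 0 costs 0.5 * mean^2 in that dimension.
print(kl_to_standard_normal([2.0, 0.0], [0.0, 0.0]))  # 2.0
```

The penalty grows as the mean moves away from 0 or the variance moves away from 1, which is what pulls the encoded distributions toward the standard normal.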

Let’s understand how VAEs work under the hood.

How do VAEs work in neural networks?

VAEs consist of three main parts:

  • Encoder: Maps input data to a latent distribution (mean and variance).

  • Sampling Layer: Draws a latent vector from the encoded distribution using the reparameterization trick.

  • Decoder: Reconstructs the data from the sampled latent vector.

Reparameterization Trick:

To enable backpropagation through a stochastic process, VAEs use:

z = mu + sigma * epsilon  # epsilon ~ N(0, 1)

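Here mu and sigma are the mean and standard deviation produced by the encoder. Because all the randomness is isolated in epsilon, z is an ordinary differentiable function of mu and sigma, so gradients can flow through the sampling operation.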
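The effect can be checked numerically without any deep learning framework. The sketch below uses only Python's standard library; sample_latent and the chosen values of mu and sigma are illustrative and are not part of the model built later.

```python
import random
import statistics

def sample_latent(mu, sigma, rng):
    """Draw z = mu + sigma * epsilon with epsilon ~ N(0, 1)."""
    epsilon = rng.gauss(0.0, 1.0)  # all randomness lives in epsilon
    return mu + sigma * epsilon

rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma = 3.0, 0.5
samples = [sample_latent(mu, sigma, rng) for _ in range(10_000)]

# The samples follow N(mu, sigma^2), yet z is a plain function of
# mu and sigma: dz/dmu = 1 and dz/dsigma = epsilon.
print(statistics.mean(samples))
print(statistics.stdev(samples))
```

The printed mean and standard deviation should land close to 3.0 and 0.5, which confirms that shifting and scaling standard normal noise gives samples from the intended distribution.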

Now that we know the theory, let’s build one step by step.

Designing the Model

We’ll use TensorFlow and Keras to build a VAE for the MNIST dataset.

1: Imports


import tensorflow as tf
from tensorflow.keras import layers
import numpy as np

2: Sampling Layer


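class Sampling(layers.Layer):
    """Draws z from N(z_mean, exp(z_log_var)) via the reparameterization trick."""

    def call(self, inputs):
        z_mean, z_log_var = inputs
        # epsilon ~ N(0, 1), same shape as z_mean
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        # exp(0.5 * log_var) turns the log-variance into a standard deviation
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon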

This custom layer implements the reparameterization trick.

Now, let’s implement the full VAE architecture.

Implementing a Variational Autoencoder in TensorFlow

3: Encoder


latent_dim = 2

encoder_inputs = tf.keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(encoder_inputs)
x = layers.Dense(128, activation='relu')(x)
z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)
z = Sampling()([z_mean, z_log_var])
encoder = tf.keras.Model(encoder_inputs, [z_mean, z_log_var, z], name="encoder")

4: Decoder


latent_inputs = tf.keras.Input(shape=(latent_dim,))
x = layers.Dense(128, activation='relu')(latent_inputs)
x = layers.Dense(28 * 28, activation='sigmoid')(x)
decoder_outputs = layers.Reshape((28, 28, 1))(x)
decoder = tf.keras.Model(latent_inputs, decoder_outputs, name="decoder")

5: VAE Model


class VAE(tf.keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def train_step(self, data):
        # Keras may pass (inputs, targets); a VAE only needs the inputs.
        if isinstance(data, tuple):
            data = data[0]
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(data)
            reconstruction = self.decoder(z)
            # Sum the cross-entropy over pixels, then average over the batch.
            reconstruction_loss = tf.reduce_mean(
                tf.reduce_sum(
                    tf.keras.losses.binary_crossentropy(data, reconstruction),
                    axis=(1, 2),
                )
            )
            # Sum the KL term over latent dimensions, then average over the
            # batch, so both terms are on the same per-image scale.
            kl_loss = -0.5 * tf.reduce_mean(
                tf.reduce_sum(
                    1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1
                )
            )
            total_loss = reconstruction_loss + kl_loss
        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        return {"loss": total_loss}

With the model in place, let’s draw our blog to a close and cover common questions.

Conclusion

Variational autoencoders bridge the gap between probabilistic graphical models and deep learning. They are a strong choice for generative tasks such as image synthesis, anomaly detection, and representation learning, where control over the latent space matters.

If you want certifications in Generative AI and large language models, Edureka offers the best certifications and training in this field.

For a wide range of courses, training, and certification programs across various domains, check out Edureka’s website to explore more and enhance your skills!

FAQs

1. What is the difference between a variational and a standard autoencoder?

  • Standard autoencoders learn a deterministic mapping: each input is encoded to a single latent vector and decoded back.

  • VAEs learn a distribution over the latent space, enabling sampling and generative capabilities.

2. What are the uses of VAEs?

  • Image generation

  • Anomaly detection

  • Latent space interpolation

  • Data compression and reconstruction
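Latent space interpolation is the easiest of these uses to show in a few lines. The sketch below, in plain Python, builds an evenly spaced path between two latent vectors; the interpolate helper and the two example vectors are hypothetical and stand in for codes produced by a trained encoder.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

# Hypothetical 2-D latent codes for two inputs (for example their z_mean).
z_a = [-1.0, 0.5]
z_b = [1.0, -0.5]
for z in interpolate(z_a, z_b, steps=5):
    print(z)
```

Each vector on the path could then be passed to a trained decoder to render the intermediate images. That step is omitted here because it needs a trained model.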

3. What is the difference between PCA and a variational autoencoder?

| Feature | PCA | VAE |
|---|---|---|
| Transformation type | Linear projection | Non-linear, learned by a deep network |
| Loss function | Minimizes squared reconstruction error (solved in closed form) | Reconstruction loss + KL divergence, minimized by gradient descent |
| Generative capability | None in its standard form | Strong |

4. What are the drawbacks of VAEs?

  • Reconstructions tend to be blurry, especially with images, which is commonly attributed to the model's Gaussian assumptions.

  • Outputs are usually less sharp than those of GANs, which matters for high-fidelity data generation.

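5. Which is better, GANs or VAEs?

Neither is better in general; the right choice depends on the task.

| Criteria | GANs | VAEs |
|---|---|---|
| Sharpness | High | Moderate |
| Training stability | Difficult (adversarial) | More stable |
| Latent space | Unstructured | Structured and interpretable |
| Use cases | Image synthesis, style transfer | Representation learning, interpolation |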