Astrophysics & AI with Python: Forging Cosmic Nebulas with Generative Adversarial Networks

DEV Community

Input: A real image $X$ or a synthetic image $G(Z)$.

Output: A probability score (1 for real, 0 for fake).

The Training Loop: An Arms Race

Training is a minimax game. The Discriminator tries to maximize its ability to classify correctly, while the Generator tries to minimize the Discriminator's success. Mathematically, the two networks compete over a single value function $V(D, G)$, and we look for its saddle point:

$$\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)} [\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)} [\log (1 - D(G(z)))]$$

The Critic ( $D$ ): Maximizes the equation. It wants $\log D(x)$ (real data) to be high and $\log(1 - D(G(z)))$ (fake data) to be high.
The Forger ( $G$ ): Minimizes the equation. It cannot affect the first term, so it focuses on the second: it wants $D(G(z))$ to be close to 1, making $\log(1 - D(G(z)))$ approach negative infinity.

This feedback loop pushes the Generator toward increasingly sharp, realistic textures. In the ideal limit the Critic can no longer tell real from fake and outputs $D(x) = 1/2$ for every image.
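To make the value function concrete, here is a minimal sketch that estimates $V(D, G)$ from a handful of Discriminator scores. The function name `gan_value`, the `eps` guard, and the example scores are illustrative assumptions, not part of any library.

```python
import math

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) from discriminator outputs.

    d_real: D(x) scores for real samples, each in (0, 1)
    d_fake: D(G(z)) scores for generated samples, each in (0, 1)
    eps:    hypothetical floor that keeps log() away from zero
    """
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident, correct Critic pushes V toward 0, its upper bound.
strong_critic = gan_value(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])

# A Critic reduced to guessing gives V = 2 * log(0.5) = -log(4).
confused_critic = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"strong critic:   V = {strong_critic:.4f}")
print(f"confused critic: V = {confused_critic:.4f}")
```

The Critic is rewarded for moving $V$ up toward 0, and the Forger for dragging it back down, which is the arms race described above.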

Architectural Deep Dive: DCGANs for Nebulas

To capture the fine spatial structure and high dynamic range of astronomical images, we use Deep Convolutional GANs (DCGANs). These replace pooling layers with strided convolutions (in $D$) and upsample with transposed convolutions (in $G$).
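The size arithmetic behind strided and transposed convolutions can be checked without any deep learning library. The sketch below uses the standard output-size formulas for square inputs with dilation 1 and no output padding. The helper names are hypothetical, and the (kernel, stride, padding) triples are the ones used by the networks later in this article.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a square Conv2d (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride, padding):
    """Output side length of a square ConvTranspose2d (dilation 1, output_padding 0)."""
    return (size - 1) * stride - 2 * padding + kernel

# Generator path: one (4, 1, 0) projection, then four (4, 2, 1) doublings.
g_sizes = [1]
for kernel, stride, padding in [(4, 1, 0)] + [(4, 2, 1)] * 4:
    g_sizes.append(conv_transpose_out(g_sizes[-1], kernel, stride, padding))

# Discriminator path: four (4, 2, 1) halvings, then one (4, 1, 0) collapse.
d_sizes = [64]
for kernel, stride, padding in [(4, 2, 1)] * 4 + [(4, 1, 0)]:
    d_sizes.append(conv_out(d_sizes[-1], kernel, stride, padding))

print(g_sizes)  # [1, 4, 8, 16, 32, 64]
print(d_sizes)  # [64, 32, 16, 8, 4, 1]
```

With kernel 4, stride 2, and padding 1, each layer exactly doubles or halves the side length, which is why that triple appears so often in DCGAN-style code.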

Data Preprocessing: Handling Cosmic Dynamic Range

Before coding, we must address the unique nature of astronomical data (often stored as FITS files). Nebulas have extreme dynamic range: bright cores sit next to faint dust lanes that can be orders of magnitude dimmer.

Logarithmic Stretch: We apply a log stretch to compress the range, so that faint structure is not crushed to near zero next to the bright cores.
Normalization: DCGAN Generators typically end in a tanh activation and therefore output values in the range $[-1, 1]$, so we rescale the real pixel data to the same range.
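The two preprocessing steps above can be sketched with NumPy as follows. This assumes the image is already loaded as a non-negative 2D array (reading the FITS file itself is left out), and the function name `preprocess` and the stretch strength `a` are hypothetical choices rather than a standard recipe.

```python
import numpy as np

def preprocess(image, a=1000.0):
    """Log-stretch a flux array and rescale it to [-1, 1].

    `a` is a hypothetical stretch strength; larger values lift faint pixels more.
    """
    image = np.asarray(image, dtype=np.float64)
    # Shift and scale raw flux to [0, 1]; guard against a constant image.
    lo, hi = image.min(), image.max()
    scaled = (image - lo) / max(hi - lo, np.finfo(np.float64).tiny)
    # Log stretch: maps 0 -> 0 and 1 -> 1 while expanding the faint end.
    stretched = np.log1p(a * scaled) / np.log1p(a)
    # Map [0, 1] to [-1, 1] to match a tanh output layer.
    return (2.0 * stretched - 1.0).astype(np.float32)

# Synthetic stand-in for a nebula cutout: faint background plus one bright core pixel.
rng = np.random.default_rng(0)
flux = rng.exponential(scale=1.0, size=(64, 64))
flux[32, 32] = 1.0e4

x = preprocess(flux)
print(x.shape, x.dtype, float(x.min()), float(x.max()))
```

Because the stretch is monotonic, pixel ordering is preserved. To compare generated images with physical flux values, the same transform would need to be inverted afterwards.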

Python Implementation: The Genesis of a Synthetic Nebula

Below is a working skeleton of a DCGAN designed to generate $64 \times 64$ RGB images. The code uses PyTorch, one of the most widely used frameworks in deep learning research.

1. The Generator (The Forger)

The Generator takes a latent vector $Z$ and uses ConvTranspose2d to "paint" an image, upsampling from a $1 \times 1$ feature map to a $64 \times 64$ image.

import torch
import torch.nn as nn

# Configuration
LATENT_DIM = 100
IMAGE_CHANNELS = 3
FEATURES_G = 64  # Base number of feature-map channels

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.net = nn.Sequential(
            # Input: latent vector Z (LATENT_DIM x 1 x 1), projected to a 4x4 feature map
            nn.ConvTranspose2d(LATENT_DIM, FEATURES_G * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(FEATURES_G * 8),
            nn.ReLU(True),
            # Upsample to 8x8
            nn.ConvTranspose2d(FEATURES_G * 8, FEATURES_G * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G * 4),
            nn.ReLU(True),
            # Upsample to 16x16
            nn.ConvTranspose2d(FEATURES_G * 4, FEATURES_G * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G * 2),
            nn.ReLU(True),
            # Upsample to 32x32
            nn.ConvTranspose2d(FEATURES_G * 2, FEATURES_G, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G),
            nn.ReLU(True),
            # Final layer: output a 3-channel 64x64 image
            nn.ConvTranspose2d(FEATURES_G, IMAGE_CHANNELS, 4, 2, 1, bias=False),
            nn.Tanh(),  # Outputs values between -1 and 1
        )

    def forward(self, x):
        return self.net(x)

# Initialize
netG = Generator()

2. The Discriminator (The Critic)

The Discriminator acts as a binary classifier. It uses strided Conv2d layers to downsample the image step by step until only a single probability score remains.

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.net = nn.Sequential(
            # Input: 3 x 64 x 64 image, downsampled to 32x32
            nn.Conv2d(IMAGE_CHANNELS, FEATURES_G, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # Downsample to 16x16
            nn.Conv2d(FEATURES_G, FEATURES_G * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # Downsample to 8x8
            nn.Conv2d(FEATURES_G * 2, FEATURES_G * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # Downsample to 4x4
            nn.Conv2d(FEATURES_G * 4, FEATURES_G * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(FEATURES_G * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # Final layer: collapse the 4x4 map to a single score per image
            nn.Conv2d(FEATURES_G * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),  # Squashes output to a 0-1 probability
        )

    def forward(self, x):
        # Flatten (N, 1, 1, 1) to (N,) so it matches the label tensors
        return self.net(x).view(-1, 1).squeeze(1)

# Initialize
netD = Discriminator()

3. The Training Loop

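The core logic involves alternating updates. We train $D$ to maximize the probability of assigning the correct label to both real and fake images, then train $G$. In the original formulation $G$ minimizes $\log(1 - D(G(z)))$; in practice it is more common to maximize $\log(D(G(z)))$ instead (the "non-saturating" loss), because this gives stronger gradients early in training. The code below takes the second route by scoring fake images against the "real" label.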

import torch.optim as optim

# Optimizers (Adam is standard for GANs)
optimizerD = optim.Adam(netD.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=0.0002, betas=(0.5, 0.999))

# Loss function
criterion = nn.BCELoss()

# Dummy training loop structure
def train_step(real_images):
    batch_size = real_images.size(0)
    device = real_images.device  # Keep labels and noise on the same device as the data
    real_labels = torch.ones(batch_size, device=device)
    fake_labels = torch.zeros(batch_size, device=device)

    # --- Train Discriminator ---
    optimizerD.zero_grad()
    # 1. Train with real images
    output_real = netD(real_images)
    errD_real = criterion(output_real, real_labels)
    errD_real.backward()
    # 2. Train with fake images
    noise = torch.randn(batch_size, LATENT_DIM, 1, 1, device=device)
    fake_images = netG(noise)
    output_fake = netD(fake_images.detach())  # Detach to avoid updating G
    errD_fake = criterion(output_fake, fake_labels)
    errD_fake.backward()
    optimizerD.step()

    # --- Train Generator ---
    optimizerG.zero_grad()
    # We want the Discriminator to think these fake images are real (label 1)
    output_fake_for_G = netD(fake_images)
    errG = criterion(output_fake_for_G, real_labels)
    errG.backward()
    optimizerG.step()

    # Return the total Discriminator loss and the Generator loss
    return errD_real.item() + errD_fake.item(), errG.item()
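To connect this loop back to the value function, it may help to write out what `nn.BCELoss` computes for a single prediction $p$ and label $y$ (in the code it is averaged over the batch). This is a short derivation from the standard binary cross-entropy definition, not something specific to this article:

$$\text{BCE}(p, y) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right]$$

For the Discriminator step, real images use $y = 1$ and fakes use $y = 0$:

$$\text{BCE}(D(x), 1) + \text{BCE}(D(G(z)), 0) = -\left[\, \log D(x) + \log (1 - D(G(z))) \,\right]$$

so minimizing `errD_real + errD_fake` is the same as maximizing a batch estimate of $V(D, G)$. For the Generator step, fakes are scored against $y = 1$:

$$\text{BCE}(D(G(z)), 1) = -\log D(G(z))$$

so minimizing `errG` maximizes $\log D(G(z))$ rather than minimizing $\log(1 - D(G(z)))$.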

Challenges: Mode Collapse and Stability

Training GANs is notoriously difficult. Two major pitfalls can ruin your synthetic nebulas:

Mode Collapse: The Generator finds one specific nebula shape that reliably fools the Discriminator and stops exploring the latent space. Your entire dataset becomes 10,000 copies of the same cloud.
Vanishing Gradients: If the Discriminator becomes too good too quickly, it outputs near-zero for fake images. The Generator's gradients vanish, and it stops learning.
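The vanishing-gradient problem can be seen with a few lines of arithmetic. The sketch below assumes the Discriminator's output is a sigmoid of some pre-activation value `a` (a logit), and compares the gradient that reaches the Generator under the original loss $\log(1 - D(G(z)))$ with the gradient under the alternative $-\log D(G(z))$. The function names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da of log(1 - sigmoid(a)), the loss G minimizes in the original minimax game."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)), the alternative loss G minimizes."""
    return sigmoid(a) - 1.0

# More negative logits mean a more confident "fake" verdict from the Critic.
for logit in (0.0, -2.0, -5.0, -10.0):
    print(
        f"D(G(z)) = {sigmoid(logit):.5f}  "
        f"saturating: {grad_saturating(logit):+.5f}  "
        f"non-saturating: {grad_non_saturating(logit):+.5f}"
    )
```

As the Critic grows more confident, the first gradient shrinks toward zero while the second approaches -1, which is one reason the second form is usually preferred for the Generator update.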

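To mitigate these problems, researchers often use Wasserstein GANs (WGAN) or spectral normalization. Both constrain how sharply the Discriminator's output can change with its input (a Lipschitz constraint), which provides smoother gradients and keeps the Discriminator from becoming too confident.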
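As one possible illustration, PyTorch ships a `spectral_norm` wrapper in `torch.nn.utils` that can be applied to each convolution of a Critic. The sketch below is an assumption about how one might adapt a 64x64 Discriminator, not the article's reference design: the names `sn_conv` and `SNDiscriminator` are hypothetical, and dropping BatchNorm in favor of spectral normalization is a design choice made here for simplicity.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(in_ch, out_ch, kernel, stride, padding):
    """Conv2d wrapped in spectral normalization (hypothetical helper)."""
    return spectral_norm(nn.Conv2d(in_ch, out_ch, kernel, stride, padding, bias=False))

class SNDiscriminator(nn.Module):
    """Sketch of a 64x64 critic that uses spectral norm instead of BatchNorm."""

    def __init__(self, channels=3, features=64):
        super().__init__()
        self.net = nn.Sequential(
            sn_conv(channels, features, 4, 2, 1),          # 64 -> 32
            nn.LeakyReLU(0.2, inplace=True),
            sn_conv(features, features * 2, 4, 2, 1),      # 32 -> 16
            nn.LeakyReLU(0.2, inplace=True),
            sn_conv(features * 2, features * 4, 4, 2, 1),  # 16 -> 8
            nn.LeakyReLU(0.2, inplace=True),
            sn_conv(features * 4, features * 8, 4, 2, 1),  # 8 -> 4
            nn.LeakyReLU(0.2, inplace=True),
            sn_conv(features * 8, 1, 4, 1, 0),             # 4 -> 1
            nn.Sigmoid(),  # Kept so the output still works with nn.BCELoss
        )

    def forward(self, x):
        return self.net(x).view(-1)

# Smoke test on random input shaped like a batch of 64x64 RGB images.
scores = SNDiscriminator()(torch.randn(8, 3, 64, 64))
print(scores.shape)  # torch.Size([8])
```

A full WGAN would go further, replacing the sigmoid and BCE loss with an unbounded critic score, so this sketch covers only the spectral normalization half of the suggestion.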

Conclusion

By implementing a GAN, we transform a random noise vector into a structured, visually plausible nebula. This "Cosmic Forger" isn't just a novelty; it is a potential tool for augmenting scarce datasets, validating hydrodynamical simulations, and exploring the unobserved parameter space of the universe.

Let's Discuss

Synthetic vs. Real: If a GAN generates a nebula that looks indistinguishable from a Hubble image but represents a physical impossibility (e.g., violating gas density laws), is it still useful for scientific research?
Future of Discovery: Could we eventually train a GAN on "normal" galaxy distributions and use its failures to generate anomalies that lead to the discovery of new types of celestial objects?

The concepts and code demonstrated here are drawn from the ebook Astrophysics & AI: Building Research Agents for Astronomy, Cosmology, and SETI.