WinDiNet repurposes the LTX-Video video diffusion transformer as a fast, differentiable surrogate for computational fluid dynamics (CFD) simulations of urban wind patterns. Fine-tuned on 10,000 CFD simulations, it generates full 112-frame velocity rollouts in under a second.
Prerequisites: Python 3.10+, CUDA-capable GPU (48 GB VRAM recommended for training).
pip install -e .
# For training (adds decord, pandas, scipy):
pip install -e ".[training]"
# For inverse design optimization (adds scipy):
pip install -e ".[inverse]"
Pretrained checkpoints are hosted on HuggingFace and downloaded automatically on first use. No manual setup is needed.
The repository contains three files:
- dit.safetensors -- Finetuned diffusion transformer (DiT)
- scalar_embedding.safetensors -- Scalar conditioning module
- vae_decoder.safetensors -- Physics-informed VAE decoder
To download manually:
huggingface-cli download rabischof/windinet --local-dir checkpoints/
Each sample consists of a pair of files in the input directory:
- name.png -- Building footprint image (black=building, white=fluid). Resized to 256x256 internally.
- name.json -- Scalar conditioning: {"inlet_speed_mps": 10.0, "field_size_m": 1400}
See examples/footprints/ for sample inputs and examples/predictions/ for corresponding outputs.
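As a minimal sketch of the pairing rule above, the helper below walks an input directory and checks that every name.png has a name.json with both scalar keys. The function name find_samples and the constant REQUIRED_KEYS are hypothetical and not part of the repository; only the file naming and the two JSON keys come from this README.

```python
import json
from pathlib import Path

# Keys taken from the example JSON in this README.
REQUIRED_KEYS = ("inlet_speed_mps", "field_size_m")


def find_samples(input_dir):
    """Return {name: scalars} for every name.png that has a valid name.json."""
    samples = {}
    for png_path in sorted(Path(input_dir).glob("*.png")):
        json_path = png_path.with_suffix(".json")
        if not json_path.exists():
            raise FileNotFoundError(f"missing scalar file for {png_path.name}")
        scalars = json.loads(json_path.read_text())
        missing = [key for key in REQUIRED_KEYS if key not in scalars]
        if missing:
            raise KeyError(f"{json_path.name} lacks keys: {missing}")
        # Store both scalars as floats so downstream code sees one type.
        samples[png_path.stem] = {key: float(scalars[key]) for key in REQUIRED_KEYS}
    return samples
```

Running a check like this before inference can catch a missing or misspelled scalar file early, instead of part-way through a batch.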
python scripts/inference.py configs/inference.yaml \
--input_dir examples/footprints/
All three checkpoints are downloaded automatically on first run. Settings are in configs/inference.yaml. CLI flags (--checkpoint, --num_inference_steps, etc.) override the config.
Each prediction is saved as {name}.npz and {name}.mp4:
NPZ fields:
- u_fields: horizontal velocity [T, H, W] in m/s (float16)
- v_fields: vertical velocity [T, H, W] in m/s (float16)
- bldg_mask: building footprint [H, W] (bool)
MP4 video: wind magnitude with coolwarm colormap.
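A saved prediction can be inspected with NumPy alone. The sketch below reads the three NPZ fields listed above and returns the wind speed magnitude with building cells set to NaN. The helper name load_speed is hypothetical, and it assumes bldg_mask is True inside buildings.

```python
import numpy as np


def load_speed(npz_path):
    """Load one prediction and return wind speed [T, H, W] with buildings as NaN."""
    with np.load(npz_path) as data:
        # Upcast from float16 before combining the components.
        u = data["u_fields"].astype(np.float32)
        v = data["v_fields"].astype(np.float32)
        bldg = data["bldg_mask"].astype(bool)
    speed = np.hypot(u, v)
    # Assumption: True marks building cells, which carry no meaningful velocity.
    speed[:, bldg] = np.nan
    return speed
```

For example, np.nanmax(load_speed("name.npz"), axis=0) would give the peak speed per cell over the rollout.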
WinDiNet training has two stages: (1) finetuning the VAE decoder with physics-informed losses, then (2) training the diffusion transformer with scalar conditioning.
Finetune the VAE decoder to improve reconstruction of wind velocity fields. The physics-informed loss enforces incompressibility and wall boundary conditions:
python scripts/finetune_vae.py configs/finetune_vae.yaml
Edit configs/finetune_vae.yaml to set data.data_root to your wind simulation dataset. The dataset should contain subdirectories, each with a fields.npz (keys: u_fields, v_fields, bldg_mask) and a meta.json (key: wind_speed_mps).
The loss function combines three terms (see windinet/training/losses.py):
- Distance-weighted MSE: reconstruction loss with higher weight near building boundaries
- Divergence loss: penalises violations of incompressibility (du/dx + dv/dy ~ 0)
- Wall no-penetration loss: enforces zero normal velocity at building walls
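To illustrate the divergence term, here is a NumPy sketch of a mean-squared divergence penalty using finite differences. It is not the code in windinet/training/losses.py: the function names, the uniform grid spacing dx and dy, and the assumption that x runs along the last axis and y along the second-to-last are all illustrative.

```python
import numpy as np


def divergence(u, v, dx=1.0, dy=1.0):
    """Finite-difference du/dx + dv/dy for fields shaped [..., H, W]."""
    du_dx = np.gradient(u, dx, axis=-1)  # x assumed along the W axis
    dv_dy = np.gradient(v, dy, axis=-2)  # y assumed along the H axis
    return du_dx + dv_dy


def divergence_penalty(u, v, fluid_mask, dx=1.0, dy=1.0):
    """Mean squared divergence over fluid cells only (fluid_mask is [H, W] bool)."""
    div = divergence(u, v, dx, dy)
    return float(np.mean(div[..., fluid_mask] ** 2))
```

A field such as u = x, v = -y is divergence-free and gives a penalty of zero, while u = x, v = y has divergence 2 everywhere and gives 4.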
The resulting checkpoint is used at inference via WINDINET_VAE_ADAPTER_CKPT.
Encode wind field simulations into VAE latents and extract scalar conditioning values:
python scripts/preprocess_dataset.py /path/to/wind_dataset/train \
--output-dir /path/to/preprocessed
The script reads fields.npz + meta.json from each sample directory, truncates to 112 simulation frames, prepends a conditioning frame, encodes through the VAE, and saves latent tensors + scalars as .pt files.
python scripts/train.py configs/windinet.yaml
Edit configs/windinet.yaml to set data.preprocessed_data_root and output_dir.
python scripts/metrics.py \
--pred_dir /path/to/predictions \
--samples_root /path/to/gt_samples \
--manifest /path/to/dataset.json \
--out_dir /path/to/metrics
Outputs per_sample.csv and summary.csv with: vRMSE, MAE (m/s), MRE (%), MSE, Spectral Divergence, Wasserstein distance.
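The exact formulas used by scripts/metrics.py are not spelled out here. As a rough guide, the sketch below computes MSE and MAE in their standard form, plus vRMSE under the assumption that it means RMSE normalised by the standard deviation of the ground truth. The function name basic_errors and the eps parameter are hypothetical.

```python
import numpy as np


def basic_errors(pred, target, eps=1e-7):
    """Return MSE, MAE and an assumed variance-scaled RMSE for two arrays."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    err = pred - target
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    # Assumption: vRMSE = sqrt(MSE / Var(target)); eps guards constant targets.
    vrmse = float(np.sqrt(mse / (np.var(target) + eps)))
    return {"MSE": mse, "MAE": mae, "vRMSE": vrmse}
```

Under this definition a vRMSE of 1 corresponds to the error of always predicting the mean of the ground truth.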
Optimise building layouts for pedestrian wind comfort using WinDiNet as a differentiable surrogate:
python scripts/inverse_design.py configs/inverse_opt.yaml
See configs/inverse_opt.yaml for all parameters. An example building layout is provided in examples/inverse_optimization/. The optimizer adjusts building positions to minimise a Pedestrian Wind Comfort (PWC) loss that penalises dangerous (>15 m/s), uncomfortable (>5 m/s), and stagnant (<1 m/s) wind conditions.
The inverse design framework is designed to be extensible in two directions:
Custom objectives (inverse/objective.py): Add new loss functions that operate on the predicted velocity fields (u, v) and a spatial mask. Any function returning a dict with a "total" key works as a drop-in replacement. For example, you could add building-code compliance checks, noise-based comfort metrics, or pollutant dispersion penalties. See the module docstring for the expected interface.
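A custom objective following this interface might look like the sketch below, which uses the 15, 5 and 1 m/s thresholds of the PWC loss as squared hinge penalties. It is written in NumPy purely for illustration; inside the optimizer the same arithmetic would need to be expressed with differentiable tensor operations. The function name, the equal weighting of the three terms, and the squared-hinge form are assumptions, not the contents of inverse/objective.py.

```python
import numpy as np


def pwc_objective(u, v, mask, danger=15.0, discomfort=5.0, stagnant=1.0):
    """Squared-hinge comfort penalties over masked cells; returns a dict with "total"."""
    # mask is an [H, W] boolean array selecting the cells to evaluate.
    speed = np.hypot(u, v)[..., mask]
    terms = {
        "danger": float(np.mean(np.maximum(speed - danger, 0.0) ** 2)),
        "discomfort": float(np.mean(np.maximum(speed - discomfort, 0.0) ** 2)),
        "stagnation": float(np.mean(np.maximum(stagnant - speed, 0.0) ** 2)),
    }
    # Assumption: equal weights; a real objective would likely expose them.
    terms["total"] = sum(terms.values())
    return terms
```

Each term is zero when every masked cell lies between 1 and 5 m/s, so a uniformly comfortable field scores a total of zero.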
Custom building parametrizations (inverse/footprint.py): Replace the axis-aligned rectangle representation with arbitrary differentiable shapes (splines, polygons, level sets). Any nn.Module that returns a soft (H, W) occupancy map from its forward() method is compatible with the optimizer. See the module docstring for the required interface.
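To make the idea of a soft occupancy map concrete, the NumPy sketch below rasterises one axis-aligned rectangle with sigmoid edges so that occupancy varies smoothly with the rectangle's position and size. It is an illustration only: the function name, the unit-square coordinates, and the sharpness parameter are assumptions, and a version for the optimizer would wrap the same arithmetic in an nn.Module with learnable parameters.

```python
import numpy as np


def _sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def soft_rectangle(cx, cy, width, height, grid=(256, 256), sharpness=50.0):
    """Soft occupancy in [0, 1] for an axis-aligned rectangle on a unit-square grid."""
    h, w = grid
    ys = (np.arange(h) + 0.5) / h  # cell-centre y coordinates in [0, 1]
    xs = (np.arange(w) + 0.5) / w  # cell-centre x coordinates in [0, 1]
    # Positive inside the rectangle, negative outside, along each axis.
    inside_x = _sigmoid(sharpness * (width / 2 - np.abs(xs - cx)))
    inside_y = _sigmoid(sharpness * (height / 2 - np.abs(ys - cy)))
    # The outer product yields an (H, W) map close to 1 inside and 0 outside.
    return np.outer(inside_y, inside_x)
```

A larger sharpness gives crisper edges but smaller gradients away from the boundary, which is the usual trade-off for soft rasterisation.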
WinDiNet modifies LTX-Video in two ways:
- VAE Physics Adapter (windinet/vae_adapter.py): The VAE decoder is finetuned with physics-informed losses (windinet/training/losses.py), improving reconstruction fidelity for wind velocity fields. Loaded at inference time via WINDINET_VAE_ADAPTER_CKPT.
- Scalar Conditioning (windinet/scalar_embeddings.py): Replaces text conditioning with Fourier-feature-encoded scalar inputs (inlet speed, field size), enabling precise physical parameterization.
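For readers unfamiliar with Fourier-feature encoding, the sketch below shows one common form: a scalar is mapped to sines and cosines at geometrically spaced frequencies. This is not the code in windinet/scalar_embeddings.py. The function name, the number of bands, the power-of-two frequencies, and the assumption that the input is first normalised to [0, 1] are all illustrative.

```python
import numpy as np


def fourier_features(x, num_bands=8):
    """Encode a scalar in [0, 1] as [sin(2^k pi x), cos(2^k pi x)] for k < num_bands."""
    freqs = (2.0 ** np.arange(num_bands)) * np.pi
    angles = freqs * float(x)
    # Sines first, then cosines, giving a vector of length 2 * num_bands.
    return np.concatenate([np.sin(angles), np.cos(angles)])
```

An inlet speed could, for instance, be divided by an assumed maximum speed before encoding, and the resulting vector passed through a small MLP to produce the conditioning embedding.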
Built on LTX-Video-Trainer by Lightricks, licensed under Apache 2.0.