Awesome Image Generation

A curated list of AI image generation APIs, SDKs, and production-ready tools. Focused on services developers can integrate today.

Maintained by Backblaze.

Contents

  • Text-to-Image APIs
  • Open Source Models
  • Open Source Frameworks and UIs
  • Image Editing and Enhancement
  • SDKs and Developer Tooling
  • GPU Cloud Providers
  • Image Storage and Delivery
  • Evaluation and Observability
  • Templates and Example Projects
  • Contributing
  • License

Text-to-Image APIs

Commercial image-generation APIs with hosted inference and developer SDKs.

  • Adobe Firefly API – Image generation, editing, Photoshop automation, and Lightroom operations. Part of Firefly Services platform. Docs | SDK: JS/TS (official)
  • Amazon Titan Image Generator – Text-to-image via AWS Bedrock. Image conditioning, color palette guidance, background removal, and variations. Docs | SDK: Python (boto3), Java, PHP
  • Black Forest Labs (FLUX Pro) – FLUX 1.1 Pro and FLUX.2 (32B params) via REST API. Founded by the original creators of Stable Diffusion. Also on Replicate, fal.ai, Together AI. Docs
  • fal.ai – Serverless inference hosting 1000+ image models. Markets itself as the fastest diffusion inference engine. Hosts FLUX, SD, and more. SOC 2 compliant. Docs | SDK: Python, JS
  • Google Gemini Image API – Native image generation via Gemini models (gemini-2.5-flash-image, gemini-3.1-flash-image-preview). Text-to-image, editing, multi-turn. Python/JS/Go/Java SDKs. Free tier via AI Studio. Docs | SDK: Python (google-genai), JS (google/generative-ai), Go, Java
  • Google Imagen (Vertex AI) – Imagen 4 via Vertex AI. Text-to-image, editing, outpainting, inpainting, customization. Docs | SDK: Python (google-cloud-aiplatform), Node
  • Ideogram – Known for high-quality text rendering in images. Ideogram 3.0 supports generation, remix, edit, and character reference. OpenAI-compatible interface. Docs
  • Leonardo AI – Text-to-image, image-to-image, and image-to-video. Webhooks, LoRA models, and "Get API Code" export from web UI. Docs | SDK: TypeScript, Python
  • Midjourney – Official API released late 2025. Enterprise/Pro plan holders only; no public self-service access. Docs
  • OpenAI GPT Image – gpt-image-1, gpt-image-1.5, gpt-image-1-mini. Natively multimodal generation, editing, and inpainting. DALL-E 2/3 deprecated May 2026. Docs | SDK: Python, Node
  • Recraft AI – Raster and vector image generation. V4 model (Feb 2026). Background removal, inpainting, outpainting, vectorization. OpenAI-compatible interface. Docs
  • Stability AI – Stable Diffusion 3.5 and Stable Image via REST API. Text-to-image, image-to-image, upscaling, inpainting. Docs
  • xAI Image Generation API – grok-imagine-image model via REST API. Text-to-image and image editing. Batch up to 10 images, 1k/2k resolution. OpenAI-compatible interface. Docs | SDK: Python (xai-sdk), JS (openai-compatible)
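Most of these hosted APIs follow a similar request shape. As a concrete sketch, here is text-to-image with the OpenAI Python SDK; the prompt, size, and output filename are example values, and gpt-image-1 returns the image base64-encoded rather than as a URL:

```python
import base64

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64-encoded image payload and write it to disk; returns bytes written."""
    raw = base64.b64decode(b64_data)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; reads OPENAI_API_KEY from the environment

    client = OpenAI()
    result = client.images.generate(
        model="gpt-image-1",  # check the provider's docs for current model ids
        prompt="a watercolor lighthouse at dusk",
        size="1024x1024",
    )
    # gpt-image-1 responses carry base64 image data in data[0].b64_json
    save_b64_image(result.data[0].b64_json, "lighthouse.png")
```

The OpenAI-compatible providers above (Ideogram, Recraft, xAI) generally accept the same client with a different base URL; consult each service's docs for exact parameters.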

Open Source Models

Open-weight image-generation models you can run locally or self-host.

  • DeepFloyd IF – Cascaded pixel-space diffusion (64px → 256px → 1024px). Strong text rendering. Zero-Shot FID 6.66 on COCO.
  • FLUX.1 [dev] – 12B param guidance-distilled model. High quality, competitive with closed-source. Non-commercial license.
  • FLUX.1 [schnell] – 12B param rectified flow transformer. 1-4 step generation. Fully open for commercial use. Docs
  • FLUX.1 Kontext [dev] – 12B param instruction-based image editing model. Edit existing images via text prompts; character/style reference without finetuning. Non-commercial license. Docs
  • FLUX.2 [dev] – 32B param model with generation, editing, and multi-reference combining.
  • GLM-Image – 16B hybrid autoregressive + diffusion model from Zhipu AI. Excels at text rendering inside images. Supports T2I and I2I. Runs via GlmImagePipeline in diffusers. Docs
  • HiDream-I1 – 17B sparse diffusion transformer for text-to-image. Three variants (Full, Dev, Fast). Top benchmark scores; diffusers-native via HiDreamImagePipeline. Docs
  • Kandinsky 3 – Open-source T2I from AI Forever. 2x larger U-Net and 10x larger text encoder vs v2.x. Docs
  • LCM / LCM-LoRA – Latent Consistency Models enabling 2-4 step generation. LCM-LoRA is a lightweight ~100MB adapter for any SDXL model. Docs
  • PixArt-Alpha / PixArt-Sigma – DiT-based T2I at 10.8% of SD1.5 training cost. Near-commercial quality. Docs
  • Playground v2.5 – Aesthetic-focused model fine-tuned on SDXL architecture.
  • Qwen-Image – Alibaba's open-weight T2I family. Qwen-Image-2512 (text-to-image) and Qwen-Image-Edit variants. Strong text rendering including Chinese. Diffusers-native, Apache 2.0. Docs
  • SDXL-Turbo – Adversarial distillation of SDXL enabling single-step generation.
  • Stable Diffusion 1.5 – 860M UNet, runs on consumer GPUs. Foundation for massive community ecosystem of LoRAs, fine-tunes, and extensions.
  • Stable Diffusion 3.5 Large – MMDiT architecture with three text encoders (including T5-XXL). Highest-quality Stability open model. Docs
  • Stable Diffusion XL (SDXL) – Native 1024x1024. Improved text-in-image and limb generation. Base + refiner pipeline.
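As a minimal sketch of running one of these open-weight models locally, here is SDXL-Turbo through HuggingFace Diffusers' AutoPipelineForText2Image. It assumes a CUDA GPU and the diffusers/torch packages; the prompt and filename are examples. Turbo models are distilled for 1-4 steps and run with classifier-free guidance disabled:

```python
def turbo_settings(steps: int = 1) -> dict:
    """SDXL-Turbo is distilled for 1-4 steps and is run without CFG (guidance_scale=0.0)."""
    if not 1 <= steps <= 4:
        raise ValueError("SDXL-Turbo expects 1-4 inference steps")
    return {"num_inference_steps": steps, "guidance_scale": 0.0}

if __name__ == "__main__":
    import torch
    from diffusers import AutoPipelineForText2Image  # pip install diffusers

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16
    ).to("cuda")  # use "mps" or "cpu" if no CUDA GPU is available
    image = pipe("a cinematic photo of a red fox in snow", **turbo_settings(1)).images[0]
    image.save("fox.png")
```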

Open Source Frameworks and UIs

Graphical and programmatic interfaces for running diffusion pipelines.

  • AI Toolkit (ostris) – All-in-one training suite for diffusion models. GUI and CLI. Trains FLUX.1/2, SDXL, SD 1.5, Qwen-Image, HiDream, and video models on consumer hardware.
  • AUTOMATIC1111 WebUI – Most widely used Gradio-based SD web UI. 161k+ stars. Extensive extension ecosystem. Docs
  • ComfyUI – Node-based graph UI and backend for diffusion models. Highly customizable, API-accessible. Supports SD, SDXL, Flux, and modern models. Docs
  • ComfyUI-Manager – Extension for ComfyUI that installs, updates, and manages 800+ custom nodes via a GUI or CLI. Auto-installed with ComfyUI Desktop. Docs
  • DiffSynth-Studio – Python diffusion engine by ModelScope. Inference and LoRA training for FLUX.1/2, Qwen-Image, Z-Image, and JoyAI-Image. Low-VRAM optimizations, ControlNet, IP-Adapter support.
  • Fooocus – Midjourney-inspired SDXL UI. Prompt-only workflow, no manual parameter tweaking.
  • Forge – Fork of AUTOMATIC1111 with improved GPU memory management and performance. Compatible with A1111 extensions.
  • InvokeAI – Creative engine for SD models targeting professionals. Industry-leading WebUI. Docs
  • kohya_ss – Gradio-based GUI for Kohya's SD training scripts. Supports LoRA, DreamBooth, and fine-tuning for SD 1.5, SDXL, SD3, and FLUX.1.
  • OneTrainer – GUI and CLI training suite for diffusion models. Supports FLUX.1/2, Chroma, SD 1.5/2/3, SDXL, PixArt, HiDream, and Hunyuan Video.
  • stable-diffusion.cpp – Diffusion model inference in pure C/C++ with no external dependencies. Runs SD 1.x/2.x/XL/3.5, FLUX.1/2, Chroma, Qwen-Image, and Z-Image. CPU/CUDA/Metal/Vulkan backends.
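ComfyUI's backend is scriptable over HTTP: a workflow exported from the UI with "Save (API Format)" can be queued against the /prompt endpoint. A minimal sketch, assuming a local instance on ComfyUI's default port 8188 (the client_id string and filename are illustrative):

```python
import json
import urllib.request

def build_prompt_payload(workflow: dict, client_id: str = "readme-example") -> bytes:
    """Wrap an API-format workflow graph in the JSON body ComfyUI's /prompt endpoint expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> dict:
    """POST a workflow to a running ComfyUI instance and return the queue response."""
    req = urllib.request.Request(
        f"http://{host}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Export a workflow via "Save (API Format)" in the ComfyUI menu, then queue it.
    with open("workflow_api.json") as f:
        print(queue_prompt(json.load(f)))
```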

Image Editing and Enhancement

Conditioning, adaptation, restoration, and upscaling tools.

  • ControlNet – Precise structural control for diffusion models via edge maps, depth, pose, normals. Available for SD1.5, SDXL, and Flux. Docs
  • GFPGAN – Face restoration from Tencent ARC. Restores facial details from degraded images. Often paired with Real-ESRGAN.
  • IP-Adapter – Lightweight adapter (~100MB) for image-based prompting. New cross-attention layers for image feature conditioning. Docs
  • Real-ESRGAN – Image and video upscaler, up to 8x. Handles real-world blind super-resolution with noise/artifact removal. Docs
  • Upscayl – Desktop GUI for AI image upscaling on Linux, macOS, and Windows. Uses Real-ESRGAN and other models; up to 16x upscale. Requires Vulkan GPU. Docs

SDKs and Developer Tooling

Libraries and client SDKs for integrating image generation into apps.

  • fal.ai SDK – Python and JS SDKs for serverless inference. Also a Vercel AI SDK provider. Docs | SDK: Python (pip install fal-client), Node (npm install @fal-ai/client)
  • Gradio – Python library for building interactive ML demos and web UIs. Foundation for AUTOMATIC1111, Fooocus, and HuggingFace Spaces. Includes gradio-client for programmatic access. Docs | SDK: Python (pip install gradio)
  • HuggingFace Diffusers – The canonical PyTorch library for diffusion models. SD 1.5, SDXL, SD3, Flux, ControlNet, IP-Adapter, and more. Docs | SDK: Python (pip install diffusers)
  • OpenAI SDK – Official SDK for GPT Image generation and editing. client.images.generate() and client.images.edit(). SDK: Python (pip install openai), Node (npm install openai)
  • Replicate SDK – Python/JS client for 50,000+ hosted ML models. Pay-per-second, no GPU management. Docs | SDK: Python (pip install replicate), Node (npm install replicate)

GPU Cloud Providers

Serverless and on-demand GPU platforms for running image models.

  • fal.ai (GPU) – Serverless GPU inference; markets itself as the fastest diffusion inference engine. 1000+ hosted models. Docs
  • Lambda Labs – On-demand A100 and H100 GPUs. Competitive pricing (~$1.10/hr A100 80GB). Docs
  • Modal – Serverless Python GPU cloud. Sub-second cold starts. Docs | SDK: Python (pip install modal)
  • Replicate – Serverless model hosting for open-source image models. Docs
  • RunPod – GPU pods and serverless endpoints. 48% of serverless cold starts under 200ms. Docs
  • Together AI – Inference API for 200+ open models. Docs
  • WaveSpeed AI – Serverless inference platform with 700+ image and video models. Sub-second cold starts for FLUX and other diffusion models. OpenAI-compatible REST API. Docs | SDK: Python, JS

Image Storage and Delivery

Object stores and CDNs suited to generated-image workloads.

  • Backblaze B2 – S3-compatible object storage at low cost. Free egress via Cloudflare. Docs | B2 integration
  • Cloudflare Images – Image CDN on Cloudflare's global network. Pre-defined variants for transformations.
  • Cloudinary – Enterprise image/video CDN with AI-powered transformations. Docs | SDK: Python, Node, Ruby, PHP, Java, .NET
  • Imgix – Real-time image processing CDN. URL-parameter-based transforms. Connects to existing S3/GCS storage. Docs
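A common production pattern is to push each generated image straight to object storage. A sketch using boto3 against B2's S3-compatible API; the endpoint URL, bucket name, credentials, and the image_key helper are illustrative placeholders, not fixed values:

```python
import hashlib

def image_key(prompt: str, seed: int, ext: str = "png") -> str:
    """Derive a stable object key from prompt + seed so re-uploading the same generation overwrites it."""
    digest = hashlib.sha256(f"{prompt}:{seed}".encode("utf-8")).hexdigest()[:16]
    return f"generations/{digest}.{ext}"

if __name__ == "__main__":
    import boto3  # pip install boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-004.backblazeb2.com",  # your bucket's S3 endpoint
        aws_access_key_id="<keyID>",
        aws_secret_access_key="<applicationKey>",
    )
    s3.upload_file("lighthouse.png", "my-generations-bucket", image_key("lighthouse", 42))
```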

Evaluation and Observability

Metrics, leaderboards, and quality tooling for generated images.

  • CLIP Score – Measures semantic alignment between text prompts and generated images using CLIP embeddings. Available via torchmetrics.multimodal.CLIPScore.
  • ImageReward – First general-purpose human preference reward model for T2I (NeurIPS 2023). Trained on 137k expert comparison pairs. Docs
  • IQA-PyTorch – Comprehensive image quality toolbox. PSNR, SSIM, LPIPS, FID, NIQE, MUSIQ, TOPIQ, NIMA, BRISQUE, and more.
  • pytorch-fid – PyTorch FID (Fréchet Inception Distance) implementation. Measures distribution similarity between real and generated images. SDK: Python (pip install pytorch-fid)
  • torch-fidelity – High-fidelity ISC, FID, KID, and PRC metrics. Supports InceptionV3, CLIP, DINOv2, VGG16 feature extractors. Docs | SDK: Python (pip install torch-fidelity)
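CLIP Score, as implemented in torchmetrics, is defined as 100 · max(cos(E_I, E_T), 0) over CLIP image and text embeddings. A dependency-free sketch of just the scoring formula applied to precomputed embeddings (producing the embeddings themselves requires a CLIP model):

```python
import math

def clip_score(image_emb, text_emb, w: float = 100.0) -> float:
    """CLIPScore(I, T) = w * max(cos(E_I, E_T), 0); higher means better prompt alignment."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = math.sqrt(sum(a * a for a in image_emb)) * math.sqrt(sum(b * b for b in text_emb))
    return w * max(dot / norm, 0.0)

# Negative cosine similarity is clamped to zero, so scores fall in [0, 100].
```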

Templates and Example Projects

Reference implementations, demos, and starter projects.


Contributing

Contributions are welcome. See CONTRIBUTING.md. One entry per PR — edit entries.yaml only and let the maintainers regenerate README.md.

License

Released under CC0 1.0 Universal. You may copy, modify, and redistribute without attribution.

About Backblaze B2

Backblaze B2 Cloud Storage is S3-compatible object storage designed for AI and media workloads. This list is maintained as part of our work making B2 a convenient storage layer for AI workflows.
