| # | Model | Family | $/img | Overall | VQ | PA | TR | CLIP | Sharp | NIMA | ARNIQA | TOPIQ | MUSIQ | Score/$ | imgsys | Time | Seed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 🥇 | Gemini 3.1 Flash Image | Google | $0.080 | 4.186 | 4.4 | 4.8 | 4.7 | 0.292 | 0.935 | 0.532 | 0.682 | 0.607 | 0.707 | 52 | 1268 | 42.0s | 42 |
| 🥈 | GPT Image 1.5 | OpenAI | $0.034 | 4.182 | 4.4 | 4.8 | 4.7 | 0.301 | 0.923 | 0.546 | 0.654 | 0.598 | — | 123 | — | 37.1s | — |
| 🥉 | Imagen 4 Ultra | Google | $0.060 | 4.170 | 4.4 | 4.7 | 4.7 | 0.300 | 0.902 | 0.545 | 0.676 | 0.635 | 0.705 | 70 | 1148 | 19.9s | 42 |
| 4 | Grok Imagine | xAI | $0.020 | 4.155 | 4.3 | 4.7 | 4.6 | 0.295 | 0.888 | 0.550 | 0.685 | 0.663 | — | 208 | 1175 | 4.7s | — |
| 5 | Imagen 4 | Google | $0.040 | 4.153 | 4.4 | 4.7 | 4.6 | 0.297 | 0.890 | 0.549 | 0.677 | 0.644 | 0.705 | 104 | 1133 | 13.5s | 42 |
| 6 | Gemini 3 Pro Image | Google | $0.150 | 4.136 | 4.4 | 4.8 | 4.6 | 0.294 | 0.916 | 0.522 | 0.659 | 0.579 | 0.693 | 28 | 1233 | 29.0s | 42 |
| 7 | FLUX.2 [max] | Black Forest Labs | $0.070 | 4.132 | 4.3 | 4.7 | 4.5 | 0.295 | 0.894 | 0.535 | 0.659 | 0.635 | 0.702 | 59 | 1167 | 50.3s | 42 |
| 8 | FLUX.2 [flex] | Black Forest Labs | $0.050 | 4.119 | 4.3 | 4.7 | 4.5 | 0.295 | 0.889 | 0.553 | 0.646 | 0.595 | 0.696 | 82 | 1160 | 92.8s | 42 |
| 9 | FLUX.2 [pro] | Black Forest Labs | $0.030 | 4.116 | 4.3 | 4.7 | 4.5 | 0.294 | 0.879 | 0.528 | 0.661 | 0.637 | 0.703 | 137 | 1155 | 73.4s | 42 |
| 10 | Gemini 2.5 Flash Image | Google | $0.040 | 4.112 | 4.4 | 4.7 | 4.6 | 0.295 | 0.847 | 0.544 | 0.665 | 0.603 | 0.689 | 103 | 1156 | 12.4s | 42 |
| 11 | GPT Image 1 Mini | OpenAI | $0.011 | 4.109 | 4.4 | 4.7 | 4.5 | 0.300 | 0.827 | 0.541 | 0.640 | 0.606 | — | 374 | 1104 | 28.5s | — |
| 12 | GPT Image 1 | OpenAI | $0.042 | 4.098 | 4.4 | 4.7 | 4.4 | 0.301 | 0.811 | 0.535 | 0.636 | 0.604 | — | 98 | 1115 | 32.2s | — |
| 13 | Bria FIBO | Other | $0.040 | 4.092 | 4.3 | 4.6 | 4.4 | 0.291 | 0.914 | 0.555 | 0.675 | 0.637 | — | 102 | — | 33.6s | 42 |
| 14 | Recraft V4 | Recraft | $0.040 | 4.075 | 4.3 | 4.6 | 4.5 | 0.286 | 0.906 | 0.519 | 0.684 | 0.612 | — | 102 | 1100 | 17.8s | — |
| 15 | Ideogram V3 Turbo | Ideogram | $0.030 | 4.073 | 4.3 | 4.5 | 4.6 | 0.287 | 0.854 | 0.551 | 0.707 | 0.633 | — | 136 | 1050 | 11.0s | 42 |
| 16 | Reve v1.0 | Reve | $0.040 | 4.054 | 4.4 | 4.5 | 4.5 | 0.295 | 0.798 | 0.527 | 0.683 | 0.604 | — | 101 | 1177 | 7.0s | — |
| 17 | Seedream v4 | ByteDance | $0.030 | 4.045 | 4.4 | 4.7 | 4.7 | 0.300 | 0.865 | 0.506 | 0.545 | 0.421 | — | 135 | 1118 | 16.4s | 42 |
| 18 | Recraft V3 | Recraft | $0.040 | 4.041 | 4.3 | 4.4 | 4.3 | 0.292 | 0.917 | 0.542 | 0.708 | 0.659 | — | 101 | 1021 | 8.1s | — |
| 19 | Wan 2.6 T2I (Open) | Alibaba | $0.030 | 4.020 | 4.3 | 4.4 | 4.5 | 0.289 | 0.856 | 0.548 | 0.690 | 0.602 | — | 134 | 1135 | 86.4s | 42 |
| 20 | Qwen Image 2.0 (Open) | Alibaba | $0.035 | 4.010 | 4.3 | 4.6 | 4.7 | 0.287 | 0.904 | 0.499 | 0.640 | 0.462 | — | 115 | 1139 | 8.7s | 42 |
| 21 | Imagen 4 Fast | Google | $0.020 | 4.008 | 4.3 | 4.5 | 4.1 | 0.291 | 0.858 | 0.531 | 0.671 | 0.619 | 0.701 | 200 | — | 6.9s | 42 |
| 22 | FLUX.2 [dev] (Open) | Black Forest Labs | $0.012 | 4.006 | 4.3 | 4.7 | 4.2 | 0.304 | 0.832 | 0.529 | 0.635 | 0.519 | 0.659 | 334 | 1149 | 15.0s | 42 |
| 23 | Z-Image Turbo | Other | $0.005 | 4.004 | 4.4 | 4.5 | 4.2 | 0.290 | 0.806 | 0.492 | 0.687 | 0.639 | — | 801 | 1080 | 1.9s | 42 |
| 24 | ImagineArt 1.5 | Other | $0.030 | 4.002 | 4.3 | 4.6 | 4.4 | 0.299 | 0.915 | 0.528 | 0.577 | 0.453 | — | 133 | — | 22.5s | 42 |
| 25 | Hunyuan Image 3.0 (Open) | Tencent | $0.100 | 3.993 | 4.2 | 4.6 | 3.9 | 0.295 | 0.876 | 0.532 | 0.682 | 0.627 | — | 40 | 1152 | 23.7s | 42 |
| 26 | Luma Photon | Luma | $0.019 | 3.986 | 4.3 | 4.5 | 4.4 | 0.294 | 0.828 | 0.551 | 0.580 | 0.493 | — | 210 | — | 15.6s | — |
| 27 | Qwen Image (Open) | Alibaba | $0.020 | 3.973 | 4.3 | 4.6 | 4.3 | 0.302 | 0.753 | 0.526 | 0.615 | 0.536 | — | 199 | 1058 | 67.5s | 42 |
| 28 | Imagen 3 | Google | $0.050 | 3.968 | 4.3 | 4.5 | 3.7 | 0.295 | 0.875 | 0.543 | 0.668 | 0.618 | 0.700 | 79 | 1059 | 30.8s | 42 |
| 29 | FLUX Pro 1.1 | Black Forest Labs | $0.040 | 3.966 | 4.3 | 4.4 | 4.2 | 0.285 | 0.876 | 0.546 | 0.617 | 0.579 | 0.673 | 99 | 1016 | 32.2s | 42 |
| 30 | Seedream v5 Lite | ByteDance | $0.035 | 3.956 | 4.4 | 4.7 | 4.8 | 0.292 | 0.735 | 0.530 | 0.401 | 0.341 | — | 113 | 1111 | 41.6s | 42 |
| 31 | Seedream v4.5 | ByteDance | $0.040 | 3.940 | 4.4 | 4.7 | 4.7 | 0.297 | 0.873 | 0.515 | 0.344 | 0.270 | — | 98 | 1144 | 32.6s | 42 |
| 32 | SD 3.5 Large (Open) | Other | $0.065 | 3.910 | 4.2 | 4.5 | 3.5 | 0.292 | 0.884 | 0.538 | 0.662 | 0.613 | — | 60 | 939 | 35.8s | 42 |
| 33 | FLUX.1 [schnell] (Open, Best Value) | Black Forest Labs | $0.003 | 3.810 | 4.1 | 4.3 | 3.2 | 0.293 | 0.798 | 0.530 | 0.675 | 0.649 | 0.708 | 1270 | 950 | 3.8s | 42 |
Prompt Browser (100 prompts × 33 models)
Portrait “A young woman with freckles and green eyes, golden hour sunlight, shallow depth of field, 85mm portrait lens”
Portrait “An elderly man with a weathered face and kind smile, wearing a wool flat cap, dramatic Rembrandt lighting”
Portrait “A Japanese woman in a red kimono standing under cherry blossoms, soft bokeh background, natural light”
Portrait “A bearded blacksmith in a dark workshop, sparks flying from the anvil, cinematic side lighting”
Portrait “A child blowing dandelion seeds in a sunlit meadow, backlit with lens flare, candid moment”
Portrait “A woman with elaborate gele headwrap and bold eye makeup, colorful Ankara print fabric, studio portrait on dark background, beauty lighting”
Portrait “An Indian bride in traditional red and gold lehenga, intricate henna on her hands, soft warm lighting”
Portrait “A street musician playing saxophone under a lamppost at night, rain-soaked pavement reflecting light”
Portrait “A freckled boy with bright blue eyes and messy hair, laughing, close-up, outdoor natural light”
Portrait “A ballet dancer mid-leap in an abandoned warehouse, dust particles in the air, dramatic window light”
Portrait “A tattooed chef in a white coat, arms crossed, confident pose in a professional kitchen, environmental portrait”
Portrait “An astronaut without a helmet, face lit by Earth glow, stars in the background, photorealistic”
Product “A pair of white leather sneakers on a marble countertop, soft studio lighting, product photography, 4K”
Product “A bottle of amber perfume on a bed of dried roses, dark moody background, luxury product photography”
Product “A minimalist wristwatch with a black leather strap on a concrete slab, harsh directional light, editorial style”
Product “A stack of colorful macarons on a pastel blue plate, top-down flat lay, bright even lighting”
Product “A matte black coffee mug with steam rising, placed on a wooden desk next to an open book, lifestyle product photo”
Product “A pair of gold hoop earrings on a velvet jewelry display, close-up macro shot, soft gradient background”
Product “A green glass bottle of craft beer with condensation droplets, placed on a rustic wooden bar, warm ambient light”
Product “A sleek laptop on a clean white desk with a potted succulent and a cup of espresso, minimalist workspace”
Product “A tube of red lipstick standing upright, melting slightly at the tip, with a smear on white background, beauty advertising”
Product “A leather messenger bag on a sun-dappled step in a Marrakech souk, colorful textiles and brass lanterns blurred in the background, lifestyle product”
Text Rendering “A glowing neon sign reading OPEN 24 HOURS in a dark rainy alley, reflections on wet cobblestones”
Text Rendering “A vintage movie poster with the title THE LAST HORIZON in bold serif letters, 1960s style illustration”
Text Rendering “A hand-lettered chalkboard menu outside a coffee shop reading ESPRESSO $3 LATTE $4 MOCHA $5”
Text Rendering “A birthday cake with HAPPY 30TH BIRTHDAY SARAH written in pink frosting, candles lit, confetti”
Text Rendering “A street sign at the intersection of BROADWAY and 42ND STREET, New York City, photorealistic”
Text Rendering “A weathered wooden sign reading WELCOME TO PINE VALLEY EST. 1892, rustic forest setting”
Text Rendering “A book cover with the title QUANTUM DREAMS by A. CHEN, minimalist design, white on black”
Text Rendering “A protest sign held up in a crowd reading SAVE THE PLANET, hand-painted letters, photojournalism style”
Text Rendering “A storefront window with GRAND OPENING painted in gold leaf letters, balloons visible inside”
Text Rendering “A license plate reading EVALYTIC on the back of a red sports car, close-up shot, shallow depth of field”
Text Rendering “A newspaper headline reading AI REVOLUTION BEGINS TODAY, black and white, vintage printing style”
Text Rendering “A graffiti mural on a brick wall spelling DREAM BIG in colorful bubble letters, urban street”
Landscape “A misty mountain range at sunrise, layers of purple and orange haze, pine trees in foreground, landscape photography”
Landscape “A turquoise glacial lake surrounded by snow-capped mountains, perfect mirror reflection, Patagonia”
Landscape “A vast wheat field under a dramatic thunderstorm sky, single lightning bolt, golden hour, wide angle”
Landscape “A waterfall cascading into a tropical pool surrounded by lush green ferns, long exposure silky water effect”
Landscape “An aerial view of a winding river through autumn forest, red and orange foliage, drone photography”
Landscape “A frozen lake with cracks in the ice, Northern Lights dancing in the sky above, Iceland”
Landscape “A desert sand dune at sunset with long shadows, Sahara, minimal composition, warm tones”
Landscape “A coastal cliff with crashing waves below, lighthouse on the edge, stormy sky, dramatic seascape”
Landscape “A field of lavender stretching to the horizon in Provence, golden sunset, rows converging to vanishing point”
Landscape “A volcano erupting at night with lava flowing down the slope, red glow against dark sky, long exposure”
Artistic Style “A cat sitting on a windowsill, impressionist oil painting style, visible brushstrokes, soft warm palette like Renoir”
Artistic Style “A cyberpunk street market in Tokyo, anime style, neon signs, rain, crowded, Blade Runner meets Studio Ghibli”
Artistic Style “A woman reading a book in a garden, watercolor painting, delicate washes, floral border, vintage botanical illustration style”
Artistic Style “A still life of fruit and wine on a table, Dutch Golden Age painting style, dramatic chiaroscuro, oil on canvas”
Artistic Style “A lone samurai standing on a misty bridge, ukiyo-e woodblock print style, limited color palette, flat perspective”
Artistic Style “A portrait in the style of pop art, bold primary colors, Ben-Day dots, thick black outlines, Andy Warhol inspired”
Artistic Style “A surreal melting clock draped over a barren tree in a desert, Salvador Dali style, dreamlike atmosphere”
Artistic Style “A medieval castle on a hilltop, fantasy art style, dramatic clouds, painterly, digital matte painting”
Artistic Style “A geometric abstract composition with red, blue, and yellow rectangles on white, Piet Mondrian style”
Artistic Style “A noir detective in a foggy alley, graphic novel style, high contrast black and white, ink illustration”
Composition “A red cube on top of a blue sphere, both sitting on a green table, simple studio lighting”
Composition “Three cats of different colors sitting in a row on a fence: one orange, one black, one white”
Composition “A small house between two tall trees, with a path leading to the front door, a mailbox on the left side”
Composition “Five red apples arranged in a circle on a white table, one apple is green, overhead shot”
Composition “A dog wearing sunglasses riding a skateboard down a city sidewalk, motion blur on background”
Composition “A woman holding an umbrella in the rain, reflected in a puddle below, split composition”
Composition “A cup of coffee on a saucer, with a spoon to the right and a sugar cube to the left, top-down view”
Composition “A bird perched on the left hand of a person, the right hand holding a flower, symmetrical composition”
Composition “A busy farmer's market scene with at least six different fruit stalls, people walking, overhead canopy”
Composition “A telescope pointing at the moon through an open window, curtains blowing in the wind, starry night outside”
Animal / Nature “A corgi wearing a red beret sitting in a Parisian cafe, photorealistic, shallow depth of field”
Animal / Nature “A majestic lion with a full mane, golden savanna background, National Geographic style wildlife photography”
Animal / Nature “A hummingbird hovering next to a bright red flower, frozen in mid-flight, iridescent feathers, macro photography”
Animal / Nature “A black cat sitting on a stack of old books in a cozy library, warm golden light from a desk lamp, dust particles in the air”
Animal / Nature “An owl perched on a snow-covered branch, intense amber eyes, soft snowfall, shallow depth of field”
Animal / Nature “A fox in a field of wildflowers, backlit by sunset, warm orange tones, wildlife photography”
Animal / Nature “Two kittens playing with a ball of red yarn on a hardwood floor, playful motion, soft window light”
Animal / Nature “A giant sea turtle swimming over a coral reef, underwater photography, clear blue water, sunbeams from above”
Animal / Nature “A golden retriever catching a frisbee in mid-air at a park, action shot, motion freeze, bright day”
Animal / Nature “A peacock displaying its full tail feathers, vibrant iridescent blues and greens, garden setting”
Architecture “A modern glass skyscraper reflecting clouds at sunset, shot from ground level looking up, architectural photography”
Architecture “A cozy Scandinavian living room with a fireplace, mid-century modern furniture, large windows with snow outside”
Architecture “A narrow cobblestone alley in Venice with colorful buildings, laundry hanging between windows, warm afternoon light”
Architecture “An abandoned Art Deco theater with peeling gold paint, ornate ceiling, dust and debris, urbex photography”
Architecture “A Japanese zen garden with raked gravel, moss-covered stones, wooden temple in the background, misty morning”
Architecture “A spiral staircase viewed from directly above, geometric pattern, black and white, abstract architectural”
Architecture “A futuristic city skyline with flying vehicles, sleek towers, holographic advertisements, sci-fi concept art”
Architecture “A traditional riad courtyard with intricate zellige mosaic tilework, central fountain, potted orange trees, warm Moroccan light”
Abstract “An explosion of colorful paint splashes forming a human silhouette, white background, dynamic energy”
Abstract “A fractal pattern made of galaxies and nebulae, cosmic scale, deep space colors, mathematical beauty”
Abstract “Smoke tendrils forming the shape of a phoenix, backlit in orange and red, black background”
Abstract “A macro photograph of oil droplets on water, iridescent rainbow colors, abstract pattern, sharp focus”
Abstract “Intertwining ribbons of light in blue and gold against a dark void, long exposure light painting”
Abstract “A glass sphere refracting a distorted cityscape, sitting on a reflective surface, surreal perspective”
Abstract “Geometric tessellation pattern transitioning from triangles to hexagons, gradient from blue to orange, M.C. Escher inspired”
Abstract “A double exposure photograph combining a forest with a woman's profile, ethereal, dreamy atmosphere”
Food “A perfectly plated sushi omakase on a black slate board, wasabi and ginger, top-down food photography”
Food “A rustic sourdough bread loaf, freshly baked with a cracked crust, steam rising, wooden cutting board, warm kitchen light”
Food “A dripping chocolate lava cake cut open on a white plate, vanilla ice cream melting beside it, fine dining”
Food “A colorful acai bowl topped with fresh berries, granola, coconut flakes, and a drizzle of honey, overhead shot”
Food “A steaming bowl of ramen with chashu pork, soft-boiled egg, nori, and green onions, close-up, moody lighting”
Sci-Fi / Fantasy “A 3D render of a cute robot watering plants in a miniature garden, Pixar style, soft global illumination”
Other “An isometric low-poly village with tiny houses, trees, and a river, pastel colors, game art style”
Portrait “A children's book illustration of a bear and a rabbit having a tea party in the forest, whimsical, colored pencil”
Sci-Fi / Fantasy “A detailed cross-section of a fantasy treehouse showing rooms inside, cutaway illustration, warm colors”
Other “A clay render of a sports car, white material, studio lighting, automotive design visualization, octane render”
Methodology
Benchmark Setup
- 33 models evaluated on fal.ai infrastructure
- 100 curated prompts from PartiPrompts, DrawBench, OneIG-Bench + custom. Each prompt contributes equally to final scores (simple average).
- Judge: 3-Judge Median (Claude Sonnet 4.6, GPT-5.2, Gemini 2.5 Flash). Each image is scored by 3 independent VLM judges. Final score = median of 3, mitigating single-judge bias.
- Seed: 25 models run with seed=42; 8 models have no seed support (single run; averaging over 100 prompts provides statistical robustness). With a fixed seed, the same prompt produces the same image on each run, which makes results reproducible and comparisons fair.
- Image size: square_hd
- Concurrency: 5 parallel requests
- Pipeline: text2img
- Date: 2026-03-23
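The aggregation described in the setup (median of three judges per image, then a simple average over prompts) can be sketched as follows. This is a minimal illustration, not the Evalytic SDK's code: the judge keys, prompt IDs, and scores are hypothetical, and the helper names `prompt_score` and `model_score` are made up for this example.

```python
from statistics import mean, median

# Hypothetical Visual Quality scores for one model on two prompts.
# These numbers are illustrative only, not benchmark data.
judge_scores = {
    "prompt_001": {"claude": 4.5, "gpt": 4.2, "gemini": 3.1},
    "prompt_002": {"claude": 4.0, "gpt": 4.4, "gemini": 4.3},
}

def prompt_score(scores_by_judge):
    """Median of the three judges, so a single outlier cannot move the result."""
    return median(scores_by_judge.values())

def model_score(scores_by_prompt):
    """Simple average over prompts: every prompt carries equal weight."""
    return mean(prompt_score(s) for s in scores_by_prompt.values())

per_prompt = {p: prompt_score(s) for p, s in judge_scores.items()}
overall_vq = model_score(judge_scores)

print(per_prompt)
print(round(overall_vq, 3))
```

In the first prompt the low outlier (3.1) does not affect the result, because the median picks the middle score.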
VLM Judge Evaluation Criteria
- Visual Quality: Overall image quality — sharpness, color fidelity, coherence, realism, absence of artifacts. Scale: 1.0–5.0 with 0.1 increments.
- Prompt Adherence: How faithfully the image matches the prompt — objects, attributes, spatial relationships, style. Scale: 1.0–5.0.
- Text Rendering: Accuracy of rendered text (signs, labels). Only evaluated on 12 text-containing prompts. Scale: 1.0–5.0.
Inter-Judge Agreement
- 73.5% of dimensions: judges agree within 0.5 points
- Average judge gap: 0.45 points
- Extreme disagreement (gap > 1.0): 7.3% — most common in text rendering
- Median-of-3 approach reduces outlier impact from any single judge
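One way to compute agreement statistics of this kind is sketched below. It assumes the "gap" is the highest judge score minus the lowest for a given scored dimension, which the page does not spell out. The score triples are hypothetical, so the printed values will not match the 73.5%, 0.45, and 7.3% figures above.

```python
# Hypothetical (judge_a, judge_b, judge_c) scores, one triple per scored dimension.
triples = [
    (4.5, 4.4, 4.2),
    (4.0, 4.6, 4.3),
    (3.0, 4.5, 4.4),
    (4.8, 4.8, 4.7),
]

def gap(triple):
    """Assumed definition: spread between the highest and lowest judge score."""
    return max(triple) - min(triple)

gaps = [gap(t) for t in triples]

within_half = sum(g <= 0.5 for g in gaps) / len(gaps)  # share agreeing within 0.5
extreme = sum(g > 1.0 for g in gaps) / len(gaps)       # share with gap > 1.0
average_gap = sum(gaps) / len(gaps)

print(f"within 0.5: {within_half:.1%}")
print(f"extreme (gap > 1.0): {extreme:.1%}")
print(f"average gap: {average_gap:.3f}")
```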
Data & Reproducibility
- Prompt dataset (100 prompts, JSON) — curated from PartiPrompts, DrawBench, OneIG-Bench + custom
- Evalytic SDK (open source) — the same tool that generated this leaderboard
- Raw scores embedded in this page (View Source → #leaderboard-data)
Reproduce this benchmark
    pip install evalytic
    evalytic bench \
      --models flux-schnell flux-pro imagen-4 \
      --judges "gemini-2.5-flash,gpt-5.2,claude-sonnet-4-6" \
      --prompts prompts-text2img-v1.json \
      --image-size square_hd --seed 42 --yes
Column Definitions
- Overall — Weighted combination of VLM judge scores and deterministic metrics. Formula: 1 + 4 × (vlm_w × (vlm_avg−1)/4 + det_w × det_avg). Default: 60% VLM + 40% deterministic. Adjustable via the weight panel above.
- VQ (Visual Quality) — VLM judge score (1–5). Evaluates overall image quality: sharpness, color fidelity, coherence, realism, and absence of visual artifacts.
- PA (Prompt Adherence) — VLM judge score (1–5). How faithfully the generated image matches the text prompt — objects, attributes, spatial relationships, and style.
- TR (Text Rendering) — VLM judge score (1–5). Accuracy of text rendered within the image (signs, labels, logos). Only scored on prompts that contain text elements (~12% of prompts). Shows “—” for models/prompts without text.
- CLIP — Deterministic metric (0–1). CLIP ViT-L/14 cosine similarity between prompt text and generated image. Higher = better semantic alignment with prompt.
- Sharp (Sharpness) — Deterministic metric (0–1). Variance of Laplacian applied to the image. Higher = sharper, more detailed image. Low values may indicate blur or softness.
- NIMA — Deterministic metric (0–1). Neural Image Assessment (NIMA) trained on human aesthetic ratings (AVA dataset). Higher = more aesthetically pleasing.
- TOPIQ — Deterministic metric (0–1). Top-down Image Quality via CFANet architecture (KonIQ-10k trained). State-of-the-art no-reference quality metric.
- Score/$ — Value efficiency: Overall score divided by cost per image. Higher = better quality per dollar spent.
- imgsys — ELO rating from imgsys.org community voting. Independent reference point for cross-validation.
- Time — Average image generation time in seconds.
- Seed — Fixed random seed for reproducibility. Models that support seed get deterministic output. Models without seed support are run 3× per prompt with median score taken.
- $/img — Cost per image generation in USD on fal.ai.
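The Overall formula and the Score/$ definition above can be written out as a short sketch. It takes `vlm_avg` (on the 1–5 judge scale) and `det_avg` (on the 0–1 metric scale) as already-computed inputs, because the page does not state exactly how those averages are formed. The function names are hypothetical and are not part of the Evalytic SDK.

```python
def overall_score(vlm_avg, det_avg, vlm_w=0.6, det_w=0.4):
    """Overall = 1 + 4 * (vlm_w * (vlm_avg - 1) / 4 + det_w * det_avg).

    Defaults follow the stated 60% VLM + 40% deterministic weighting.
    """
    vlm_unit = (vlm_avg - 1) / 4  # map the 1-5 judge scale onto 0-1
    return 1 + 4 * (vlm_w * vlm_unit + det_w * det_avg)

def score_per_dollar(overall, cost_per_image):
    """Score/$: Overall score divided by cost per image in USD."""
    return overall / cost_per_image

# Illustrative inputs, not taken from the leaderboard.
example = overall_score(vlm_avg=4.6, det_avg=0.62)
print(round(example, 3))

# Using the top row's published Overall (4.186) and cost ($0.080).
print(round(score_per_dollar(4.186, 0.080), 1))
```

The result stays on the 1–5 scale: perfect inputs (`vlm_avg=5`, `det_avg=1`) give 5, and the worst inputs (`vlm_avg=1`, `det_avg=0`) give 1. The second call returns about 52.3, in line with the 52 listed for the top-ranked model.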