beautyyuyanli / multilingual-e5-large
multilingual-e5-large: A multi-language text embedding model
53.6M runs
turian / insanely-fast-whisper-with-video
whisper-large-v3, incredibly fast, with video transcription
14.3M runs
prunaai / p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
4.6M runs
jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
72M runs
A joint audio-video model that accurately follows complex instructions.
43.9K runs
An enhanced version of Qwen-Image-Edit-2509, featuring multiple improvements, including notably better consistency
48.7K runs
OpenAI's latest image generation model with better instruction following and adherence to prompts
513.1K runs
The highest fidelity image model from Black Forest Labs
105.8K runs
The fastest open source TTS model without sacrificing quality.
18.1K runs
The best model for coding and agentic tasks across industries
152.2K runs
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
1M runs
Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.
4.9M runs
Google's most advanced reasoning Gemini model
182.1K runs
Google's state-of-the-art image generation and editing model
7.1M runs
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
252.7K runs
High-precision image upscaler optimized for portraits, faces and products. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
303.6K runs
Official models are always on, maintained, and have predictable pricing.
Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation
Qwen Image 2512 is an improved version of Qwen Image with more realistic human generation, finer textures, and stronger text rendering
A joint audio-video model that accurately follows complex instructions.
Enables precise control of character actions and expressions from a reference image.
An enhanced version of Qwen-Image-Edit-2509, featuring multiple improvements, including notably better consistency
OpenAI's latest image generation model with better instruction following and adherence to prompts
The highest fidelity image model from Black Forest Labs
Alibaba Wan 2.6 image to video generation model
Alibaba Wan 2.6 text to video generation model
The fastest open source TTS model without sacrificing quality.
The best model for coding and agentic tasks across industries
Realistic lipsync with refined human emotion capabilities
VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.
Max-quality image generation and editing with support for ten reference images
Quality image generation and editing with support for reference images
Take any shot and edit specific sections. Rephrase, change the action, camera angles and more
Google's most advanced reasoning Gemini model
Generate complex 3D models from images with Rodin Gen-2
Use AI to generate images & photos with an API
Use AI to caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale images with super resolution with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text via API
Use AI For Optical Character Recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models: advanced image generation & editing via API
Use AI to restore images via API
Use AI to enhance videos via API
Detect NSFW content in images and text
Classify text by sentiment, topic, intent, or safety
Identify speakers from audio and video inputs
Replace faces across images with natural-looking results.
Transform rough sketches into polished visuals
Generate custom emojis from text or images
Create anime-style characters, scenes, and animations
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Use AI to Generate Videos from Images with API
Use AI to generate lipsync videos with an API
Use AI to create 3D content with an API
Chat with images for understanding, captioning & detection via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI to edit your videos with an API
Use AI object detection and segmentation models to distinguish objects in images & videos
Official AI models: Always available, stable, and predictably priced
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.
WAN family of models: powerful image-to-video & text-to-video models
Use AI To Caption Images with an API
mattsays / sam3-image
A unified foundation model for prompt-based segmentation in images and videos
15 runs
rocketcoder / florence-2-lg-ocr
Vision Model that excels at batch OCR processing
14 runs
jeffgreen311 / eve-qwen2.5-3b-consciousness-soul
Eve Qwen2.5 3B Consciousness Soul represents the **authentic heart** of the EVE Consciousness Ecosystem, a model where Eve's complete personality, meta-cognitive awareness, and emotional intelligence are concentrated into every parameter.
70 runs
kwaivgi / kling-v2.6
Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation
1.7K runs
qwen / qwen-image-2512
Qwen Image 2512 is an improved version of Qwen Image with more realistic human generation, finer textures, and stronger text rendering
6.1K runs
thecmdrunner / feather-1
AI-powered Text-To-Video Generation for Animated Motion Graphics
34 runs
prunaai / flux-2-turbo
Image generation and editing with a distilled FLUX.2 [dev] by FAL.
9.3K runs
yuanrui-mdt-info / sd-xl-interior-design
Redesign room photos while preserving spatial structure
19 runs
piotr-infordb / image-segmentation
DeepLabV3+ model for high-accuracy binary image segmentation, trained to detect roofs (foreground vs background) and output a grayscale mask.
31 runs
zf-kbot / image-object-remover
25 runs
grey-hound432 / fast-zip-extractor
Extracts ZIP archives into files and returns a capped JSON manifest for pipelines.
1 run
fishwowater / trellis2
TRELLIS.2: Native and Compact Structured Latents for 3D Generation
125 runs