These large language models understand and generate natural language. They power chatbots, search engines, writing aids, and more.
Use these for:
Language models keep getting bigger and better at these tasks. The largest models today exhibit impressive reasoning skills. But you can get great results from smaller, faster, cheaper models too.
Featured models
OpenAI's new model excelling at coding, writing, and reasoning.
Updated 1 week, 2 days ago
848.7K runs
Google's most advanced reasoning Gemini model
Updated 1 month, 1 week ago
101.9K runs
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
Updated 3 months ago
399.6K runs
Recommended Models
If you want speed and low latency, google/gemini-2.5-flash and openai/gpt-5-nano are strong choices. Both are designed for fast responses and lower compute use while keeping good reasoning quality.
For conversational tasks at scale, anthropic/claude-4.5-haiku also offers quick turnarounds with solid performance.
openai/gpt-5-mini and anthropic/claude-4.5-sonnet both deliver high-quality writing, summarization, and reasoning at a manageable cost.
If you want strong reasoning without high overhead, deepseek-ai/deepseek-r1 and meta/meta-llama-3.1-405b-instruct offer impressive results for their size.
For natural dialogue and chatbots, openai/gpt-5, anthropic/claude-4.5-sonnet, and google/gemini-2.5-flash are all reliable.
They handle multi-turn conversations, context retention, and friendly tone well. Smaller variants like openai/gpt-5-nano or anthropic/claude-4.5-haiku are ideal for lighter-weight chat assistants.
anthropic/claude-4.5-sonnet and deepseek-ai/deepseek-r1 are tuned for structured reasoning, code generation, and debugging support.
openai/gpt-5 also performs well for both natural language and code reasoning tasks, especially in multi-step logic or problem-solving scenarios.
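The recommendations above can be kept as a small lookup table in application code. The following is a minimal sketch, not part of Replicate's API: the model identifiers are copied from the text, while the use-case labels (`speed`, `balanced`, and so on) and the `recommend` helper are hypothetical names chosen for illustration.

```python
# Shortlists copied from the recommendations above.
# The dictionary keys are hypothetical labels, not Replicate terminology.
RECOMMENDED = {
    "speed": ["google/gemini-2.5-flash", "openai/gpt-5-nano"],
    "chat_at_scale": ["anthropic/claude-4.5-haiku"],
    "balanced": ["openai/gpt-5-mini", "anthropic/claude-4.5-sonnet"],
    "reasoning": ["deepseek-ai/deepseek-r1", "meta/meta-llama-3.1-405b-instruct"],
    "chat": ["openai/gpt-5", "anthropic/claude-4.5-sonnet", "google/gemini-2.5-flash"],
    "light_chat": ["openai/gpt-5-nano", "anthropic/claude-4.5-haiku"],
    "code": ["anthropic/claude-4.5-sonnet", "deepseek-ai/deepseek-r1", "openai/gpt-5"],
}


def recommend(use_case: str) -> list[str]:
    """Return a copy of the shortlist for a use case, or raise ValueError."""
    try:
        return list(RECOMMENDED[use_case])
    except KeyError:
        options = ", ".join(sorted(RECOMMENDED))
        raise ValueError(
            f"unknown use case {use_case!r}; expected one of: {options}"
        ) from None
```

Calling `recommend("code")` returns the three models named for code tasks, in the order the text lists them.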
Large language models differ mainly by scale, tuning, and purpose.
These models output natural language text, often in conversational or structured formats.
They can generate, summarize, translate, or explain information, and some also handle light reasoning, analysis, or code generation.
Several models in this collection are open-weight and can be self-hosted, such as meta/meta-llama-3.1-405b-instruct or openai/gpt-oss-120b.
To publish your own model on Replicate, package it with Cog: a cog.yaml file describes the environment, and a predictor class defines the input and output fields. Then push it to your account to run on managed GPUs.
Many of these models are available for commercial use, depending on their license. Always review the License section of each model page before deployment, as some require attribution or restrict redistribution.
You can run them directly on Replicate by providing a text prompt in the model’s playground or via API.
For example, type a question or instruction and receive a natural language response. Some models, like google/gemini-2.5-flash or openai/gpt-4o, may also accept image or multimodal inputs depending on version.
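As a rough sketch of the API route, the snippet below uses the `replicate` Python client and its `replicate.run` call. It assumes the package is installed and that `REPLICATE_API_TOKEN` is set in the environment. The `prompt` and `system_prompt` input names are assumptions that hold for many text models but not all, so check each model's API schema. The helpers `build_input`, `join_output`, and `ask` are hypothetical names used only here.

```python
import os


def build_input(prompt, system_prompt=None):
    """Assemble the input payload. Field names are assumptions; verify per model."""
    payload = {"prompt": prompt}
    if system_prompt:
        payload["system_prompt"] = system_prompt
    return payload


def join_output(output):
    """Text models may return a string or an iterable of text chunks; handle both."""
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)


def ask(model, prompt, system_prompt=None):
    """Run one prompt against a model on Replicate and return the full text."""
    import replicate  # imported lazily so the helpers work without the package

    output = replicate.run(model, input=build_input(prompt, system_prompt))
    return join_output(output)


# Only call the API when a token is actually configured.
if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    print(ask("openai/gpt-5-nano", "Explain what a context window is in one sentence."))
```

Joining the chunks gives the complete response at once. For a chat interface you would more likely display the chunks as they arrive.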
Recommended Models
Fastest, most cost-effective GPT-5 model from OpenAI
Updated 1 week, 2 days ago
3.2M runs
Faster version of OpenAI's flagship GPT-5 model
Updated 1 week, 2 days ago
579K runs
Google’s hybrid "thinking" AI model optimized for speed and cost-efficiency
Updated 2 weeks, 1 day ago
491.4K runs
120b open-weight language model from OpenAI
Updated 1 month, 2 weeks ago
137.2K runs
Latest hybrid thinking model from DeepSeek
Updated 1 month, 2 weeks ago
159.9K runs
Grok 4 is xAI’s most advanced reasoning model. Excels at logical thinking and in-depth analysis. Ideal for insightful discussions and complex problem-solving.
Updated 1 month, 2 weeks ago
22.8K runs
A reasoning model trained with reinforcement learning, on par with OpenAI o1
Updated 1 month, 2 weeks ago
2.2M runs
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions
Updated 1 month, 2 weeks ago
6.9M runs
20b open-weight language model from OpenAI
Updated 1 month, 2 weeks ago
125.4K runs
Claude Haiku 4.5 gives you similar levels of coding performance to Claude Sonnet 4, but at one-third the cost and more than twice the speed
Updated 2 months, 2 weeks ago
38.4K runs
Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K context length, fine-tuned for improved reasoning and instruction-following capabilities.
Updated 3 months, 1 week ago
1.6M runs
OpenAI's flagship GPT model for complex tasks.
Updated 3 months, 2 weeks ago
268.8K runs
Fastest, most cost-effective GPT-4.1 model from OpenAI
Updated 3 months, 2 weeks ago
727K runs
Fast, affordable version of GPT-4.1
Updated 3 months, 2 weeks ago
1.3M runs
OpenAI's high-intelligence chat model
Updated 3 months, 3 weeks ago
336.2K runs
OpenAI's fast, lightweight reasoning model
Updated 4 months, 2 weeks ago
377.4K runs
A small model alternative to o1
Updated 4 months, 2 weeks ago
3.3K runs
OpenAI's first o-series reasoning model
Updated 4 months, 2 weeks ago
16.3K runs
Low latency, low cost version of OpenAI's GPT-4o model
Updated 4 months, 2 weeks ago
12.3M runs
Updated Qwen3 model for instruction following
Updated 4 months, 3 weeks ago
142.7K runs
Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions
Updated 6 months, 2 weeks ago
1.5M runs
DeepSeek-V3-0324 is the leading non-reasoning model, a milestone for open source
Updated 9 months ago
4.4M runs
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
Updated 10 months, 1 week ago
3.5M runs
Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)
Updated 10 months, 2 weeks ago
2.9M runs
Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)
Updated 10 months, 2 weeks ago
604K runs
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
Updated 1 year, 5 months ago
33.2M runs
Base version of Llama 3, a 70 billion parameter language model from Meta.
Updated 1 year, 8 months ago
852.5K runs
A 70 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 8 months ago
164.3M runs
An 8 billion parameter language model from Meta, fine-tuned for chat completions
Updated 1 year, 8 months ago
395M runs
Base version of Llama 3, an 8 billion parameter language model from Meta.
Updated 1 year, 8 months ago
51.2M runs
2B instruct version of Google’s Gemma model
Updated 1 year, 10 months ago
134.3K runs
LLaVA v1.6: Large Language and Vision Assistant (Vicuna-13B)
Updated 1 year, 10 months ago
3.7M runs
LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)
Updated 1 year, 10 months ago
5M runs
7 billion parameter version of Stability AI's language model
Updated 2 years, 8 months ago
140.6K runs
A language model by Google for tasks like classification, summarization, and more
Updated 2 years, 8 months ago
151.3K runs
Transformers implementation of the LLaMA language model
Updated 2 years, 9 months ago
99.4K runs