Free AI Models Guide

Last Updated: April 2026

Table of Contents¶

Hugging Face Inference Providers
Other Free Tiers
Recommendations by Use Case
Model Comparison Tables

Free AI Models Overview¶

Hugging Face Inference Providers¶

OpenCode natively supports Hugging Face Inference Providers - giving you access to open models from 17+ providers.

Quick Setup¶

Hugging Face Setup

# 1. Create token at huggingface.co/settings/tokens
#    (needs "Make calls to Inference Providers" permission)

# 2. Run auth login
opencode auth login

# 3. Select Hugging Face when prompted
# Enter your token: hf_

# 4. Select a model
/models

Best Coding Models¶

Model	Best For	Provider	Context	Notes
Qwen2.5-Coder-32B	Code reasoning	Featherless	131K	Good for complex code tasks

Other Notable Models¶

For Reasoning:

DeepSeek-R1 - Chain-of-thought reasoning, great for complex logic
Kimi-K2.5 - Fast, good for general tasks
GLM-4.7 - Free tier available

For General Use:

Llama 3.1 8B - Fast, cheapest
Qwen2.5 7B - Good balance
Gemma 4 31B - Google's best

Resources:

OpenCode Free Models¶

OpenCode offers built-in free models through OpenCode Zen - a curated set of models developed by Zen Labs that have been tested and benchmarked specifically for coding agents.

Getting Started

OpenCode comes with free models ready to use. Just run:

opencode
/models
# Select from available free models

Available Free Models¶

Model	Context	Best For	Notes
Big Pickle	200K	Complex coding	Stealth model, optimized for agents
MiniMax M2.5 Free	128K+	General coding	Strong at tool use
Qwen3.6 Plus Free	131K+	Complex tasks	High performance, reasoning capable
MiMo V2 Pro Free	131K+	Fast inference	One of the fastest available
Nemotron 3 Super Free	1M	Long context	NVIDIA's open-weight agent model

Limited Time

These free models are available while OpenCode collects feedback. During this period, data may be used to improve the model. For sensitive code, consider paid options or local models (Ollama).

About Big Pickle¶

Big Pickle is a stealth model developed by Zen Labs, the team behind OpenCode. It's optimized specifically for coding agents with a 200K token context window - the largest among free options.

How to Access¶

opencode

# In terminal:
/models

# Select from available free models

Other Free Tiers¶

Free Tier Comparison¶

Provider	Free Credits	Best Models	Sign Up
Google AI Studio	15 RPM, 250K TPM	Gemini 2.5 Pro/Flash	aistudio.google.com
GitHub Models	50-150 req/day	o3-mini, GPT-4.1	github.com/models
NVIDIA NIM	1,000 credits	DeepSeek R1, Llama	build.nvidia.com
Hugging Face	Monthly credits	300+ models	huggingface.co
Groq	Low-cost	Llama 3.3, DeepSeek R1, Qwen QwQ	console.groq.com
xAI	$25 credits	Grok 4	x.ai

Google AI Studio (Recommended for Free)¶

Best Free Option

Gemini 2.5 Pro (best free model)
Gemini 2.5 Flash (fastest)
1M token context
Sign up at aistudio.google.com

GitHub Models¶

GPT-4.1 (excellent coding)
o3-mini (reasoning)
Direct integration with OpenCode via /connect

Resources:

Awesome Free AI APIs

Groq (Fastest Inference)¶

Fastest Inference Available

Groq is not free - but it offers the fastest inference speed available (~1000+ tokens/second). It's low-cost and integrates natively with OpenCode.

Model	Context	Notes
llama-3.3-70b-versatile	128K	Best overall
deepseek-r1-distill-llama-70b	128K	Strong reasoning
qwen-qwq-32b	128K	Fast reasoning
gemma-2-9b-it	8K	Lightweight

Setup¶

Get a free API key at console.groq.com/keys
In OpenCode, run /connect → search for "Groq" → paste your API key
Run /models to select a Groq model

Recommendations by Use Case¶

Best for Coding Tasks¶

Qwen2.5-Coder-32B (HF) - Code reasoning
GPT-4.1 (GitHub Models) - General coding
Gemini 2.5 Pro (Google AI Studio) - Long context

Best for Reasoning¶

DeepSeek-R1 (HF Hyperbolic) - Chain-of-thought
o3-mini (GitHub Models) - Reasoning
Gemini 2.5 Pro (Google) - Long context

Best for Speed¶

Groq - Fastest inference (1000+ t/s)
Qwen3-Coder-Next - 128 t/s

Best for Free Tier (No Credit Card)¶

Big Pickle (OpenCode built-in)
GLM-4.7 Flash (HF)
Gemini 2.5 Flash (Google)

Model Comparison Tables¶

Coding Models Comparison¶

Model	Provider	Context	Speed	Notes	Best For
Qwen2.5-Coder-32B	Featherless	131K	Medium	Good code reasoning	Code reasoning
DeepSeek-R1	Hyperbolic	131K	Medium	Chain-of-thought	Complex reasoning
GPT-4.1	GitHub	32K	Fast	Free tier	General coding

Free Models Comparison¶

Model	Source	Notes
Big Pickle	OpenCode	Works out of box
GLM 4.7 Flash	Hugging Face	Slower
Gemini 2.5 Flash	Google	Generous free tier
Gemma 4 31B	Hugging Face	Google's best open model

TODO: Review Schedule¶

Review monthly - check for new models
Check free tier limits changed
Update recommendations based on benchmarks

Resources¶

This is a living document. Revisit and update regularly.