Dolphin LLM Guide

Dolphin LLM

The Dolphin family of LLMs — open source, uncensored, and steerable models from the community.

What is Dolphin?¶

Dolphin is a family of open-source LLMs developed by Eric Hartford, Cognitive Computations, and collaborators.

Key Characteristics¶

Uncensored — Dolphin removes alignment and bias filters, making the model more compliant and steerable
Steerable — You set the system prompt, you decide the alignment, you have control of your data
Open Source — Fully open weights under various licenses (Llama, Apache 2.0)
Community Driven — Trained on curated datasets from the open source community
General Purpose — Designed to be similar to ChatGPT, Claude, and Gemini but with full user control

The Philosophy¶

Why Dolphin?

Unlike commercial models: 1. No hidden system prompts that change without notice 2. Your data stays private — Dolphin can't see or use your queries 3. Full steerability — You control the model's behavior and ethics 4. No imposed guidelines — You decide what's appropriate

Dolphin Model Family¶

Recent Models¶

Model	Size	Base Model	Context	License
Dolphin 3.0 Llama 3.1 8B	8B	Llama 3.1	8K+	Meta Llama 3.1
Dolphin 3.0 Llama 3.2 1B	1B	Llama 3.2	8K+	Meta Llama 3.2
Dolphin 3.0 Llama 3.2 3B	3B	Llama 3.2	8K+	Meta Llama 3.2
Dolphin 3.0 Qwen 2.5 3B	3B	Qwen 2.5	8K+	Apache 2.0
Dolphin X1 8B	8B	Llama 3.1 8B	32K	Meta Llama 3.1
Dolphin X1 405B	405B	Llama 3.1 405B	32K	Meta Llama 3.1
Dolphin 2.9 Llama 3 8B	8B	Llama 3	8K	Meta Llama 3
Dolphin 2.9 Mistral 7B	7B	Mistral	8K+	Apache 2.0
Dolphin 2.8 Mistral 7B	7B	Mistral	8K+	Apache 2.0

Available Quantizations¶

Dolphin models are available in various GGUF formats for different use cases:

Q4_K_M — Balance of size and quality
Q5_K_S — Higher quality
Q8_0 — Best quality, larger size
EXL2 — Various bit widths (2-8bpw)

Pricing¶

Free and Open

Dolphin is free to use! But you'll need some hardware or an API provider.

Cost Options¶

Method	Cost	Requirements
Self-hosted (Ollama/LM Studio)	Free	GPU with 4-24GB VRAM
Hugging Face Inference	Varies	API credits
Cloud vLLM	Compute cost	GPU rental

Hardware Requirements¶

Model Size	Minimum VRAM	Recommended
1B parameters	2GB	4GB
3B parameters	6GB	8GB
8B parameters	16GB	24GB
70B parameters	140GB	8x A100/H100
405B parameters	800GB+	Multi-GPU cluster

Video Overview¶

How to Use¶

Using Ollama¶

# Pull a Dolphin model
ollama pull dolphin

# Or specific version
ollama pull dolphin3.0-llama3.1-8b

Using LM Studio¶

Download LM Studio from lmstudio.ai
Search for "dolphin" in the model browser
Download your desired model
Load and chat locally

Using Hugging Face Transformers¶

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "cognitivecomputations/Dolphin3.0-Llama3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

Using vLLM (Production)¶

vllm serve cognitivecomputations/Dolphin3.0-Llama3.1-8B

Resources¶

Disclaimer: Dolphin is an uncensored model. You're responsible for the content you create using it.