Llama 3.1 8B Instant


📋 Overview#

  • ID: llama-3.1-8b-instant
  • Provider: Groq
  • Authors: Meta
  • Release Date: 2024-07-23
  • Knowledge Cutoff: 2023-12-01
  • Open Weights: true
  • Context Window: 131k tokens
  • Max Output: 8k tokens
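To make the limits above concrete, here is a minimal sketch of building a chat-completion request for this model. It assumes Groq's OpenAI-compatible `/openai/v1/chat/completions` endpoint and a `GROQ_API_KEY` environment variable; the `build_request` helper and its defaults are illustrative, not part of Groq's SDK.

```python
MODEL_ID = "llama-3.1-8b-instant"
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build a chat-completion payload for llama-3.1-8b-instant.

    The model reads up to ~131k context tokens but emits at most 8k,
    so the requested max_tokens is capped at 8192 (assumed exact cap).
    """
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": min(max_tokens, 8192),
    }

# To actually send it (requires a GROQ_API_KEY):
# import json, os, urllib.request
# req = urllib.request.Request(
#     GROQ_URL,
#     data=json.dumps(build_request("Hello!")).encode(),
#     headers={"Authorization": "Bearer " + os.environ["GROQ_API_KEY"],
#              "Content-Type": "application/json"},
# )
# response = json.load(urllib.request.urlopen(req))
```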

🔬 Technical Specifications#

Sampling Controls: Temperature, Top-P

🎯 Capabilities#

Feature Overview#

  • Text generation and processing
  • Supported input modalities
  • Supported output modalities
  • Temperature sampling control
  • Nucleus sampling (top-p)
  • Maximum token limit
  • Stop sequences
  • Response streaming

Input/Output Modalities#

| Direction | Text | Image | Audio | Video | PDF |
|-----------|------|-------|-------|-------|-----|
| Input     | ✓    | -     | -     | -     | -   |
| Output    | ✓    | -     | -     | -     | -   |

Core Features#

Tool Calling, Tool Definitions, Tool Choice, Web Search, File Attachments

Response Delivery#

Streaming, Structured Output, JSON Mode, Function Call, Text Format
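Since streaming is listed among the delivery options, here is a sketch of reassembling text from a streamed response. It assumes Groq emits OpenAI-style server-sent events, where each `data:` line carries a JSON chunk with the text delta at `choices[0].delta.content` and the stream ends with `data: [DONE]`.

```python
import json

def extract_stream_text(sse_lines):
    """Collect text deltas from OpenAI-style SSE chunk lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separators and keep-alives
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)
```

In practice these lines would come from iterating over the HTTP response body with `stream=True` set in the request.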

🎛️ Generation Controls#

Sampling & Decoding#

| Control     | Range   |
|-------------|---------|
| Temperature | 0.0-2.0 |
| Top-P       | 0.0-1.0 |

Length & Termination#

| Control        | Range     |
|----------------|-----------|
| Max Tokens     | 1-8k      |
| Stop Sequences | supported |
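The ranges in the two tables above can be checked client-side before a request goes out. This validator is a sketch (the API performs its own validation, and the 8k cap is assumed to be exactly 8192):

```python
def validate_sampling(temperature: float, top_p: float, max_tokens: int) -> dict:
    """Check generation controls against the documented ranges:
    temperature 0.0-2.0, top_p 0.0-1.0, max_tokens 1-8192."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be in [1, 8192]")
    return {"temperature": temperature, "top_p": top_p, "max_tokens": max_tokens}
```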

💰 Pricing#

Pricing shown for Groq

Token Pricing#

| Input    | Output   | Reasoning | Cache Read | Cache Write |
|----------|----------|-----------|------------|-------------|
| $0.05/1M | $0.08/1M | -         | -          | -           |

💰 Cost Calculator#

Calculate costs for common usage patterns:

| Use Case                           | Input      | Output     | Total Cost |
|------------------------------------|------------|------------|------------|
| Quick chat (1K in, 500 out)        | 1k tokens  | 500 tokens | $0.000090  |
| Document summary (10K in, 1K out)  | 10k tokens | 1k tokens  | $0.000580  |
| RAG query (50K in, 2K out)         | 50k tokens | 2k tokens  | $0.002660  |
| Code generation (5K in, 10K out)   | 5k tokens  | 10k tokens | $0.001050  |

Pricing Formula:

Cost = (Input Tokens / 1M × $0.05) + (Output Tokens / 1M × $0.08)
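The formula translates directly to code; this small helper reproduces the calculator rows above.

```python
INPUT_PRICE = 0.05   # USD per 1M input tokens
OUTPUT_PRICE = 0.08  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Apply the per-token pricing formula for llama-3.1-8b-instant."""
    return (input_tokens / 1_000_000 * INPUT_PRICE
            + output_tokens / 1_000_000 * OUTPUT_PRICE)
```

For example, `request_cost(10_000, 1_000)` gives $0.000580, matching the document-summary row.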

📊 Example Costs#

Real-world usage examples and their costs:

| Usage Tier                   | Daily Volume | Monthly Tokens | Monthly Cost |
|------------------------------|--------------|----------------|--------------|
| Personal (10 chats/day)      | 10 chats     | 675k           | $0.0405      |
| Small Team (100 chats/day)   | 100 chats    | 9.0M           | $0.5400      |
| Enterprise (1000 chats/day)  | 1000 chats   | 135.0M         | $8.10        |
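The tier costs above all work out to a blended rate of $0.06 per 1M tokens, which corresponds to a 2:1 input:output token mix. This estimator is a sketch under that assumed mix:

```python
# Blended $/1M at an assumed 2:1 input:output token mix = $0.06/1M
BLENDED_PER_M = (2 * 0.05 + 1 * 0.08) / 3

def monthly_cost(monthly_tokens: float) -> float:
    """Estimate monthly spend from total token volume, assuming the
    2:1 input:output split implied by the usage-tier table."""
    return monthly_tokens / 1_000_000 * BLENDED_PER_M
```

Plugging in the tier volumes recovers the table: 675k tokens costs about $0.04 and 135M tokens about $8.10 per month.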

📋 Metadata#


Last Updated: 2025-10-19 18:13:08 UTC



Last Updated: 2025-10-21 23:55:56 UTC | Generated by ModelWiki