LLM token cost
Is asking an LLM for a short answer to a question cost-effective?
Published: Thursday, Apr 4, 2024 Last modified: Thursday, Nov 14, 2024
Mistral
https://docs.mistral.ai/platform/pricing/
Mistral Large
- Input: $8 / 1M tokens
- Output: $24 / 1M tokens
OpenAI
GPT-4 Turbo
- Input $10.00 / 1M tokens
- Output $30.00 / 1M tokens
- Context window: 128,000 tokens
- “Returns a maximum of 4,096 output tokens.”
Limits
https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
Depending on the model used, requests can use up to 128,000 tokens shared between prompt and completion. Some models, like GPT-4 Turbo, have different limits on input and output tokens. There are often creative ways to solve problems within the limit, e.g. condensing your prompt, breaking the text into smaller pieces, etc.
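The "breaking the text into smaller pieces" idea above can be sketched with the rough heuristic from the same OpenAI help article that one token corresponds to about four characters of English text. The function name and the character-based splitting are illustrative assumptions, not an official API:

```python
# Naive sketch: split a long text into chunks that should fit a token budget,
# using the rough "1 token ~= 4 characters of English" heuristic.
def split_for_budget(text: str, max_tokens: int = 2048, chars_per_token: int = 4) -> list[str]:
    max_chars = max_tokens * chars_per_token
    # Slice the text into consecutive pieces of at most max_chars characters.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

A real implementation would count tokens with the model's own tokenizer rather than estimating by characters, and would split on sentence or paragraph boundaries.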
Anthropic
Claude 3 Opus
- Input: $15 / 1M tokens
- Output: $75 / 1M tokens
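Putting the three price lists together, the per-request cost of a short answer is easy to estimate. The token counts below (200 input, 100 output) are illustrative assumptions for a short Q&A exchange, not figures from any of the linked pricing pages:

```python
# Published per-million-token prices from the sections above: (input $, output $).
PRICES_PER_MTOK = {
    "Mistral Large": (8.00, 24.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude 3 Opus": (15.00, 75.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: tokens times price, scaled from per-million."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Assumed short Q&A: 200 input tokens, 100 output tokens.
for model in PRICES_PER_MTOK:
    print(f"{model}: ${request_cost(model, 200, 100):.4f}")
```

At these assumed sizes, a single short question costs fractions of a cent on any of the three models, so the answer length (output tokens, billed at three to five times the input rate) dominates the bill long before the prompt does.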