AI Embedding Cost Calculator
Calculate the cost of generating and storing AI embeddings. Compare embedding models from OpenAI, Cohere, Voyage AI and Jina. Estimate vector database storage on Pinecone, Weaviate and pgvector. Essential for RAG pipeline budgeting.
How to Use the AI Embedding Cost Calculator
Enter the number of documents to embed, the average tokens per document, the monthly query volume and the average query length, then click Calculate. The calculator shows indexing cost, monthly query cost, storage cost and total month-1 spend across 5 embedding models, and the recommendation highlights the cheapest option. Storage estimates use Pinecone serverless pricing at $0.33 per GB per month.
- Enter corpus size: Number of documents and average tokens per document.
- Set query volume: Monthly queries and average query token length.
- Compare models: See index cost, query cost, storage and total for 5 models.
- Check recommendation: text-embedding-3-small is cheapest for 95 percent of use cases.
- Copy results: Copy the full cost breakdown for your team.
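The arithmetic behind these steps can be sketched in a few lines of Python. This is a minimal sketch, not the calculator's actual source: the function and field names are hypothetical, the prices and dimensions are copied from the pricing table in this article, and it assumes one vector per document (no chunking) and 1 GB = 1024³ bytes.

```python
# Hypothetical sketch of the month-1 estimate; not the calculator's real code.
# Values are (price in $ per million tokens, vector dimensions), taken from
# the pricing table in this article.
MODELS = {
    "text-embedding-3-small": (0.02, 1536),
    "text-embedding-3-large": (0.13, 3072),
    "embed-v4": (0.10, 1024),
    "voyage-3-large": (0.18, 1024),
    "jina-v3": (0.02, 1024),
}
STORAGE_USD_PER_GB_MONTH = 0.33  # Pinecone serverless rate quoted in the article
BYTES_PER_GB = 1024 ** 3         # assumption: binary gigabytes


def month_one_cost(docs, tokens_per_doc, queries, tokens_per_query, model):
    """Return index, query, storage and total cost in USD for the first month."""
    price_per_m, dims = MODELS[model]
    index_cost = docs * tokens_per_doc / 1e6 * price_per_m
    query_cost = queries * tokens_per_query / 1e6 * price_per_m
    # Assumption: one float32 vector (4 bytes per dimension) per document.
    storage_cost = docs * dims * 4 / BYTES_PER_GB * STORAGE_USD_PER_GB_MONTH
    return {
        "index": index_cost,
        "query": query_cost,
        "storage": storage_cost,
        "total": index_cost + query_cost + storage_cost,
    }


if __name__ == "__main__":
    # Example inputs: 1M documents of 500 tokens, 100k queries of 20 tokens.
    for name in MODELS:
        cost = month_one_cost(1_000_000, 500, 100_000, 20, name)
        print(f"{name:24s} total ${cost['total']:.2f}")
```

If documents are split into chunks before embedding, pass the chunk count and tokens per chunk instead of the document count, since each chunk produces its own vector.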
Embedding Pricing (June 2026)
Embedding models are dramatically cheaper than generative models because they only process input tokens and produce no output tokens. text-embedding-3-small at $0.02 per million tokens is 97 percent cheaper than GPT-5 Nano input pricing. The quality difference between OpenAI's small ($0.02) and large ($0.13) models is only about 4 percentage points on the MTEB benchmark. Batch API processing halves costs for non-real-time indexing.
| Model | Provider | $/M tokens | Dimensions | MTEB score |
|---|---|---|---|---|
| text-embedding-3-small | OpenAI | $0.02 | 1,536 | ~62% |
| text-embedding-3-large | OpenAI | $0.13 | 3,072 | ~66% |
| embed-v4 | Cohere | $0.10 | 1,024 | ~67% |
| voyage-3-large | Voyage AI | $0.18 | 1,024 | ~68% |
| jina-v3 | Jina | $0.02 | 1,024 | ~65% |
Sources: OpenAI Embedding Pricing · EmbeddingCost.com · PE Collective: Embedding Models Compared
Vector Database Storage Costs
Embedding costs are only half the equation: storing vectors in a database adds an ongoing monthly cost. Raw storage follows the formula vectors × dimensions × 4 bytes, because each dimension is stored as a 32-bit float. For 1 million documents at 1,536 dimensions, with one vector per document, that is approximately 5.7 GB. Pinecone serverless charges approximately $0.33 per GB per month. Self-hosted pgvector on existing Postgres has near-zero marginal cost.
For large corpora, cumulative storage cost often exceeds the one-time embedding generation cost within a few months. For example, 10 million documents at 3,072 dimensions consume about 114 GB, costing approximately $38 per month on Pinecone. Using the smaller 1,536-dimension model halves this. Matryoshka embeddings allow reducing dimensions with graceful quality degradation, further cutting storage.
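The storage formula is easy to check by hand. The sketch below reproduces the figures quoted in this section; the function names are illustrative, and it assumes float32 vectors, one vector per document, and 1 GB = 1024³ bytes (the convention that yields the 5.7 GB and 114 GB figures).

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


def storage_gb(num_vectors, dimensions):
    """Raw vector storage in GB: vectors x dimensions x 4 bytes."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32 / BYTES_PER_GB


def monthly_storage_cost(num_vectors, dimensions, usd_per_gb_month=0.33):
    """Monthly storage cost at a per-GB rate (default: the Pinecone rate quoted above)."""
    return storage_gb(num_vectors, dimensions) * usd_per_gb_month


if __name__ == "__main__":
    print(round(storage_gb(1_000_000, 1536), 1))             # 5.7
    print(round(storage_gb(10_000_000, 3072), 1))            # 114.4
    print(round(monthly_storage_cost(10_000_000, 3072), 2))  # 37.77
    print(round(monthly_storage_cost(10_000_000, 1536), 2))  # 18.88
```

The estimate covers raw vector bytes only. Because storage scales linearly with `dimensions`, truncating a Matryoshka embedding to half its length halves the result in the same way as switching from the 3,072-dimension model to the 1,536-dimension one.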
Competitor Gap Analysis
Few tools calculate embedding costs, and none combine multi-model comparison with vector database storage estimates in one calculator. Most developers estimate embedding costs manually or discover storage costs after deployment.
| Feature | Existing tools | LazyTools |
|---|---|---|
| Multi-model embedding comparison | Rare (1 tool) | 5 models, 4 providers |
| Vector DB storage estimate | No | Auto from dimensions |
| Query cost projection | No | Monthly query volume |
| Month-1 total (index + query + storage) | No | Combined view |
| Copy comparison | No | Full text report |
References
1. OpenAI: Embedding Pricing.
2. EmbeddingCost.com: Comprehensive Comparison.
3. PE Collective: Embedding Model Specs 2026.
4. Pinecone: Serverless Pricing.
Batch API Discounts for Embeddings
OpenAI's Batch API offers a 50 percent discount on embedding costs, so text-embedding-3-small drops to $0.01 per million tokens in batch mode. Batch processing is ideal for one-time corpus indexing, periodic re-indexing and non-urgent updates. Batch jobs complete within 24 hours.
Use standard pricing only for real-time query embedding. When a user types a search query, the query must be embedded immediately at the standard rate, while the corpus documents were already embedded in batch mode at half price. Voyage AI offers 33 percent batch discounts. Cohere offers batch pricing through their Coral platform.
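The split between batch-priced corpus embedding and standard-priced query embedding can be sketched as follows. The function names are hypothetical, and the discount is passed in as a parameter so the same code covers the 50 percent and 33 percent figures mentioned above.

```python
def embedding_cost(tokens, usd_per_million, batch_discount=0.0):
    """Cost in USD for embedding `tokens` tokens, with an optional batch discount (0.0 to 1.0)."""
    return tokens / 1e6 * usd_per_million * (1.0 - batch_discount)


def split_cost(corpus_tokens, query_tokens, usd_per_million, batch_discount=0.5):
    """Corpus is embedded in batch mode; queries are embedded in real time at the standard rate."""
    return {
        "corpus_batch": embedding_cost(corpus_tokens, usd_per_million, batch_discount),
        "queries_standard": embedding_cost(query_tokens, usd_per_million),
    }


if __name__ == "__main__":
    # Example: 1 billion corpus tokens and 2 million query tokens at $0.02 per million.
    costs = split_cost(1_000_000_000, 2_000_000, 0.02)
    print(f"corpus (batch):     ${costs['corpus_batch']:.2f}")      # $10.00
    print(f"queries (standard): ${costs['queries_standard']:.2f}")  # $0.04
```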
Choosing the Right Embedding Model
For 95 percent of RAG applications, text-embedding-3-small at $0.02 per million tokens is sufficient. The quality gap between the small and large models (62 percent versus 66 percent MTEB) rarely affects end-user experience in production retrieval systems. Upgrade to text-embedding-3-large only if your retrieval accuracy metrics prove you need it.
For specialised domains (legal, medical, scientific), Cohere embed-v4 and Voyage AI voyage-3-large lead quality benchmarks and excel at domain-specific semantic understanding. Their 5x to 9x cost premium over text-embedding-3-small is justified when incorrect retrieval has high consequences. Jina v3 at $0.02 per million tokens offers an excellent open-source alternative with strong multilingual support.
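One way to frame the trade-off is to compare each model's price multiple and MTEB gain against text-embedding-3-small. The sketch below uses only the approximate prices and scores from the pricing table in this article; the helper name is hypothetical, and MTEB points are a rough proxy rather than a measure of accuracy on your own data.

```python
# (price in $ per million tokens, approximate MTEB score in percent),
# copied from the pricing table in this article.
TABLE = {
    "text-embedding-3-small": (0.02, 62),
    "text-embedding-3-large": (0.13, 66),
    "embed-v4": (0.10, 67),
    "voyage-3-large": (0.18, 68),
    "jina-v3": (0.02, 65),
}


def compare_to_baseline(model, baseline="text-embedding-3-small"):
    """Return the price multiple and MTEB point gain of `model` relative to `baseline`."""
    price, score = TABLE[model]
    base_price, base_score = TABLE[baseline]
    return {
        "price_multiple": round(price / base_price, 2),
        "mteb_gain_points": score - base_score,
    }


if __name__ == "__main__":
    for name in TABLE:
        result = compare_to_baseline(name)
        print(f"{name:24s} {result['price_multiple']:>4}x  +{result['mteb_gain_points']} pts")
```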
RAG Pipeline Total Cost
A complete RAG pipeline has three cost layers: embedding generation, vector storage and generative inference. Embedding is typically the smallest cost because indexing is one-time. Storage is ongoing but modest. Generative inference (the LLM answering from retrieved chunks) dominates total spend. This calculator focuses on the embedding and storage layers. Use the AI Credit and Cost Calculator for generative inference costs.
For a typical enterprise RAG system with 100,000 documents, 50,000 monthly queries and Sonnet 4.6 inference, expect approximately $2 for indexing, $15 for monthly storage and $2,000+ for monthly inference. Optimising the inference model and prompt length therefore has roughly 100x more impact on total cost than choosing between embedding models.
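To see how lopsided the three layers are, the short sketch below computes each layer's share of month-1 spend. It takes the approximate figures quoted above at face value ($2, $15 and $2,000); the function name is illustrative.

```python
def cost_shares(layers):
    """Return each layer's fraction of the total cost."""
    total = sum(layers.values())
    return {name: cost / total for name, cost in layers.items()}


# Approximate month-1 figures quoted in this section (USD).
EXAMPLE_LAYERS = {"indexing": 2.0, "storage": 15.0, "inference": 2000.0}

if __name__ == "__main__":
    for name, share in cost_shares(EXAMPLE_LAYERS).items():
        print(f"{name:10s} {share:.1%}")
    # indexing   0.1%
    # storage    0.7%
    # inference  99.2%
```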