Medium severity | Ollama embedding API

Embedding vectors for the same input prompt differ numerically across Ollama versions (e.g., first values: [-0.234, 1.966,...] vs [-0.091, 1.581,...]). As a result, cosine similarities computed against vectors from another version are inconsistent, and the effect is most noticeable for longer prompts. No error is raised; the symptom is poor reproducibility in RAG and similarity search.
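One way to quantify the drift is to embed the same prompt under both versions and compare the two vectors directly. The sketch below is a minimal illustration, not part of the original report: `cosine_similarity` and `embedding_drift` are hypothetical helper names, and the 3-dimensional vectors are toy values rather than real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embedding_drift(old_vec, new_vec):
    """Summarise how far two embeddings of the same prompt have drifted."""
    return {
        "cosine": cosine_similarity(old_vec, new_vec),
        "max_abs_diff": max(abs(x - y) for x, y in zip(old_vec, new_vec)),
    }


# Toy vectors standing in for the same prompt embedded by two versions.
old_vec = [1.0, 2.0, 2.0]
new_vec = [1.0, 2.0, 1.0]
report = embedding_drift(old_vec, new_vec)
print(report)
```

If the cosine value for identical prompts is noticeably below 1.0, vectors produced by the two versions should probably not share an index.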

Root cause

Different Ollama versions (e.g., 0.4.6 vs 0.17.0) handle the nomic-embed-text model differently and produce different embedding vectors for identical inputs, likely because of changes in model loading, quantization, or the inference engine. In addition, the Nomic model versions (v1 and v1.5) have distinct latent spaces, so vectors from one are not comparable with vectors from the other.
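A possible mitigation, sketched here under stated assumptions, is to store the Ollama version and model name alongside every index and refuse to query it with embeddings from a different combination. `EmbeddingFingerprint` and `assert_same_fingerprint` are hypothetical names introduced for illustration. `fetch_ollama_version` assumes a server at Ollama's default local address that answers `GET /api/version` with a JSON object containing a `version` field.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingFingerprint:
    """Identifies the setup that produced a set of embeddings."""
    ollama_version: str
    model: str


def fetch_ollama_version(base_url="http://localhost:11434"):
    """Ask a running Ollama server for its version string (assumed endpoint)."""
    with urllib.request.urlopen(f"{base_url}/api/version", timeout=5) as resp:
        return json.load(resp)["version"]


def assert_same_fingerprint(stored, current):
    """Raise if the index was built with a different version or model."""
    if stored != current:
        raise RuntimeError(
            f"index built with {stored}, but current setup is {current}; "
            "re-embed the corpus before querying"
        )


# Example with the versions named in the report; no server is contacted here.
stored = EmbeddingFingerprint("0.4.6", "nomic-embed-text")
current = EmbeddingFingerprint("0.17.0", "nomic-embed-text")
try:
    assert_same_fingerprint(stored, current)
except RuntimeError as exc:
    print(f"mismatch detected: {exc}")
```

In practice `current` would be built from `fetch_ollama_version()` at query time, and a mismatch would trigger re-embedding of the corpus rather than a silent comparison across versions.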

Tags: ollama, nomic-embed-text, embeddings, inconsistency, version-mismatch, bug

Citations