Medium severity
Marqo search highlights
Highlight extraction returns an empty _highlights: [] or incomplete snippets for long documents (roughly 30-50 pages or more, or anything exceeding the configured size limits). Indexing may fail with request size errors, or search may return the documents with no highlights even though they contain matches.
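To confirm the symptom, it can help to list which hits came back without highlights. The sketch below assumes the search response is a dict with a "hits" list whose entries carry "_id" and "_highlights" keys; the helper name hits_missing_highlights is hypothetical and not part of Marqo.

```python
def hits_missing_highlights(response):
    """Return the _id of every hit whose _highlights is absent or empty."""
    return [
        hit.get("_id")
        for hit in response.get("hits", [])
        if not hit.get("_highlights")  # covers a missing key, None, and []
    ]


# Example response shaped like the symptom described above (made-up data).
sample_response = {
    "hits": [
        {"_id": "long-report", "_highlights": []},
        {"_id": "short-note", "_highlights": [{"content": "matched text"}]},
    ]
}
print(hits_missing_highlights(sample_response))
```

If most of the returned ids belong to long documents, the size-related cause described under Root cause is the likely one.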
Root cause
Marqo enforces size limits during indexing (configurable via MARQO_MAX_DOC_BYTES). Large documents either exceed this limit or cause embedding and highlight generation to fail because of token limits in the models. Highlights rely on precise matches in searchable fields, and match quality degrades as texts become oversized.
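One way to stay under the limit is to split a long document into smaller sub-documents before indexing. The following is a minimal sketch, not Marqo's own chunking: the function name split_document, the parent_id field, the content field name, and the 50,000-byte default are all assumptions chosen for illustration. Pick a budget below your own MARQO_MAX_DOC_BYTES setting.

```python
def split_document(doc_id, text, max_text_bytes=50_000, field="content"):
    """Split text on whitespace into sub-documents of at most max_text_bytes each.

    A single word longer than max_text_bytes is kept whole, so that one part
    can still exceed the budget.
    """
    parts, current, current_bytes = [], [], 0
    for word in text.split():
        word_bytes = len(word.encode("utf-8")) + 1  # +1 for the joining space
        if current and current_bytes + word_bytes > max_text_bytes:
            parts.append(" ".join(current))
            current, current_bytes = [], 0
        current.append(word)
        current_bytes += word_bytes
    if current:
        parts.append(" ".join(current))

    # parent_id (hypothetical field) lets results be grouped back by source document.
    return [
        {"_id": f"{doc_id}_part{i}", "parent_id": doc_id, field: part}
        for i, part in enumerate(parts)
    ]


docs = split_document("annual-report", "lorem ipsum " * 20_000)
print(len(docs), max(len(d["content"].encode("utf-8")) for d in docs))
```

The resulting list can then be passed to the indexing call in place of the single oversized document. Each sub-document is small enough to be embedded on its own, so a match can produce a highlight from the part that contains it.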
Marqo, highlight extraction, _highlights, long documents, large documents, chunking, tensor search