Medium severity
pgvector (PostgreSQL extension)
Approximate nearest neighbor (ANN) queries that use an IVFFlat index return lower recall after inserts, updates, or deletes: true top-k neighbors are missed, while an exact search (a high probes setting or a sequential scan) still finds them. The degradation worsens over time as more changes accumulate.
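One way to quantify the symptom is to compare the row ids returned by the approximate query with those returned by an exact query for the same vector. The sketch below is illustrative only: `recall_at_k` is a hypothetical helper name, and the id lists are made-up values standing in for the results of the two queries.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0  # nothing to find, so nothing was missed
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical ids: the approximate query missed one of the four true neighbors.
approx_ids = [1, 2, 3, 9]
exact_ids = [1, 2, 3, 4]
print(recall_at_k(approx_ids, exact_ids))
```

Tracking this ratio over a fixed set of sample queries, before and after a batch of writes, would show whether recall is actually drifting.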
Root cause
IVFFlat computes its centroids with k-means clustering at index build time and keeps them fixed afterward. Inserts and updates add vectors to the existing clusters but do not update the centroids, so clusters become imbalanced and recall degrades as the data distribution shifts. Source: [TigerData blog](https://www.tigerdata.com/blog/nearest-neighbor-indexes-what-are-ivfflat-indexes-in-pgvector-and-how-do-they-work) (2023).
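The mechanism can be illustrated with a toy inverted-file index in plain Python. This is a simplified sketch, not pgvector's implementation: `ToyIVF`, its methods, and all coordinates are hypothetical, and the centroids are set by hand rather than by k-means. New vectors that land between two stale centroids get split across lists, so a query that probes only one list misses a true neighbor.

```python
import math


class ToyIVF:
    """Toy inverted-file index whose centroids are fixed at build time."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.lists = [[] for _ in self.centroids]

    def _ranked_lists(self, vec):
        # List indices ordered by centroid distance to vec, nearest first.
        return sorted(range(len(self.centroids)),
                      key=lambda i: math.dist(vec, self.centroids[i]))

    def insert(self, vec):
        # The vector joins the nearest existing list; centroids never move.
        self.lists[self._ranked_lists(vec)[0]].append(vec)

    def search(self, query, k, probes=1):
        # Scan only the `probes` nearest lists, then rank their vectors exactly.
        candidates = [v for i in self._ranked_lists(query)[:probes]
                      for v in self.lists[i]]
        return sorted(candidates, key=lambda v: math.dist(query, v))[:k]


index = ToyIVF(centroids=[(0.0, 0.0), (10.0, 0.0)])
for v in [(-1.0, 0.0), (1.0, 0.0), (9.0, 0.0), (11.0, 0.0)]:
    index.insert(v)  # data present when the centroids were chosen

# Later inserts fall in a region that neither centroid represents.
index.insert((4.9, 0.0))  # assigned to list 0
index.insert((5.1, 0.0))  # assigned to list 1

query = (5.05, 0.0)
approx = index.search(query, k=2, probes=1)  # scans list 1 only
exact = index.search(query, k=2, probes=2)   # scans every list
print(approx)  # (4.9, 0.0) is missing from this result
print(exact)   # contains both (5.1, 0.0) and (4.9, 0.0)
```

In this toy setup, rebuilding the index with centroids recomputed from the current data would place a centroid near the new vectors and bring the one-probe result back in line with the exact one.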
pgvector, IVFFlat, recall, degradation, index, rebuild, centroids, vector, postgres