Getting same cluster ids across multiple faiss kmeans run? #4423
Unanswered
Nitish1814
asked this question in
Q&A
Replies: 1 comment
-
I don't believe there is a way to guarantee that an embedding is assigned to the same cluster across multiple Kmeans runs. The option to pass init centroids starts training from a fixed set of centroids, but once additional input is included the centroids are recalculated at every iteration, so there is no guarantee that an embedding ends up with the same centroid as before.
-
Suppose I have vector embeddings original = [e1, e2, e3] and I run faiss kmeans over them. I will get some set of centroids.
Now I add a few more embeddings, updated = [e1, e2, e3, e4, e5], and run faiss kmeans again over them.
So my question: is there a way to make kmeans.index.search() give the same cluster id to the same embedding e1 across multiple kmeans runs?
I have read about passing previously computed centroids as an initialiser to the current train call, but does it work? Or is it not even possible?
Thanks