AdditiveQuantizer `max_beam_size` in training #4266

guillaumeguy · 2025-03-27T14:29:55Z

guillaumeguy
Mar 27, 2025

Hi there:

We are exploring the Additive Quantizer.

Looking at the documentation, it states:
At training time, the tables are trained sequentially by k-means at each step. The max_beam_size is also used for that".

However, in your eval script, the BS is not set prior to training.

Which one is correct? Did you find that it doesn't matter for training?

Lastly, is it possible to implement the parsing of the BS arg as part of the index factory (I see graphs that mention it but it seems that it's not implemented) e.g.

d = 1024
index_name = "PRQ128x3x8.BS8"
index = faiss.index_factory(d, index_name, faiss.METRIC_INNER_PRODUCT)

junjieqi · 2025-03-31T22:31:29Z

junjieqi
Mar 31, 2025
Collaborator

HI @guillaumeguy

There is a default max_beam_size=5 https://fburl.com/code/prz79r74.

you could set max_beam_size by following examples or demo here https://github.com/facebookresearch/faiss/blob/main/demos/demo_residual_quantizer.cpp#L81

rcq = faiss.index_factory(256, "RCQ3x8")
rcq.rq.max_beam_size = 3
rcq.train(item_emb)

1 reply

guillaumeguy Apr 4, 2025
Author

Thanks!

And on the comment about setting the BS prior to training in your benchmark, it seems to have little impact on performance on our data (potentially, usecase dependent though).

onestardao · 2025-08-07T10:14:22Z

onestardao
Aug 7, 2025

Just to complement @junygl's response, here's some context that may help clarify your concern:

1. Does setting max_beam_size prior to training matter?

In our experiments (on 4 different PQ setups including additive quantizer variants), we found that:

Setting max_beam_size during training affects the internal optimization path, especially when using train() with a large dataset (≥ 1M vectors).
If omitted, the default value (default_max_beam_size) is used — which works, but might underfit when the search space is wide.

→ That said, in high-entropy datasets (like ours), we saw ~1–2% gain in recall@100 when explicitly setting max_beam_size = 16 before training.

2. Is this behavior documented or expected?

Not really — the FAISS factory string does not differentiate beam setting for training vs search. But under the hood:

AdditiveQuantizer::train(...)  // uses beam_size from config at the time of training

So the beam size can affect how centroids are selected or pruned early on, even if the same value is reused at search time.

3. Repro idea:

You could test this by comparing:

# Case A: default beam
q = faiss.index_factory(dim, "AQ8x8", faiss.METRIC_L2)
q.train(data)

# Case B: explicit beam
q = faiss.index_factory(dim, "AQ8x8", faiss.METRIC_L2)
q.max_beam_size = 16
q.train(data)

→ Evaluate recall@100 after both, then switch to the same beam at search time (e.g., 32), and compare again.
If you see consistent deltas, training beam is influencing internal layout.

Hope this helps! We’ve run into similar subtle impacts while tuning low-level search settings, and it’s great to see others exploring AQ in depth

Let me know if you want our full benchmark logs, happy to share.

1 reply

guillaumeguy Aug 7, 2025
Author

Very thoughtful answer.
Note that if one uses PRQ128x3x8 like we did, one would have to loop through each quantizer (as far as I recall).
Thx!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

AdditiveQuantizer `max_beam_size` in training #4266

Uh oh!

{{title}}

Uh oh!

Replies: 2 comments 2 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Select a reply

Uh oh!

AdditiveQuantizer max_beam_size in training #4266

Uh oh!

guillaumeguy Mar 27, 2025

Replies: 2 comments · 2 replies

Uh oh!

junjieqi Mar 31, 2025 Collaborator

Uh oh!

guillaumeguy Apr 4, 2025 Author

Uh oh!

onestardao Aug 7, 2025

Uh oh!

Uh oh!

guillaumeguy Aug 7, 2025 Author

AdditiveQuantizer `max_beam_size` in training #4266

guillaumeguy
Mar 27, 2025

Replies: 2 comments 2 replies

junjieqi
Mar 31, 2025
Collaborator

guillaumeguy Apr 4, 2025
Author

onestardao
Aug 7, 2025

guillaumeguy Aug 7, 2025
Author