Inserting logits processors into BatchGenerator in batch_generate by arthurhjorth · Pull Request #1008 · ml-explore/mlx-lm · GitHub

arthurhjorth · 2026-03-16T09:48:06Z

Tiny change: logits_processors is currently included in the signature of generate.batch_generate() but is unused. To actually use the processors in BatchGenerator, they need to be passed along when inserting prompts. This PR does that.

P.S. I said I'd do this a long time ago (#845). Apologies for the delay; life hit hard and got in the way.
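To illustrate the kind of gap described above, here is a minimal sketch in plain Python. ToyBatchGenerator, toy_batch_generate, and ban_token are hypothetical stand-ins, not mlx-lm's classes, and the per-prompt "list of lists" shape for logits_processors is an assumption made for this example. It only shows the pattern: an argument accepted by the outer function has no effect unless it is forwarded at insertion time.

```python
from typing import Callable, List, Optional, Sequence

# Assumed shape for this sketch: a processor maps (tokens_so_far, logits) -> logits.
LogitsProcessor = Callable[[List[int], List[float]], List[float]]


class ToyBatchGenerator:
    """Hypothetical stand-in; this is NOT mlx-lm's BatchGenerator."""

    def __init__(self) -> None:
        self.rows = []  # one (prompt_tokens, processors) pair per inserted prompt

    def insert(
        self,
        prompts: Sequence[List[int]],
        logits_processors: Optional[Sequence[List[LogitsProcessor]]] = None,
    ) -> None:
        # Default to "no processors" for every prompt when none are given.
        if logits_processors is None:
            logits_processors = [[] for _ in prompts]
        for prompt, procs in zip(prompts, logits_processors):
            self.rows.append((list(prompt), list(procs)))

    def step(self, raw_logits: Sequence[List[float]]) -> List[List[float]]:
        # Apply each row's processors, in order, to that row's logits.
        out = []
        for (tokens, procs), logits in zip(self.rows, raw_logits):
            for proc in procs:
                logits = proc(tokens, logits)
            out.append(list(logits))
        return out


def ban_token(token_id: int) -> LogitsProcessor:
    """Build a processor that masks one token id with -inf."""

    def processor(tokens: List[int], logits: List[float]) -> List[float]:
        masked = list(logits)
        masked[token_id] = float("-inf")
        return masked

    return processor


def toy_batch_generate(prompts, raw_logits, logits_processors=None):
    gen = ToyBatchGenerator()
    # The essential line: forward the argument instead of silently dropping it.
    gen.insert(prompts, logits_processors=logits_processors)
    return gen.step(raw_logits)


result = toy_batch_generate(
    prompts=[[1, 2], [3]],
    raw_logits=[[0.1, 0.9, 0.0], [0.5, 0.2, 0.3]],
    logits_processors=[[ban_token(1)], []],
)
```

In this toy, the first row has token 1 masked and the second row is left untouched, which is only possible because insert() received the processors.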

arthurhjorth · 2026-03-16T09:53:57Z

By the way, in the old PR I mentioned that the API for passing logits_processors to the generate calls is inconsistent. For example, in batch_generate() they are a named argument, whereas in single generate() you have to pass them in via **kwargs, i.e.

g = generate(model, tokenizer, prompts[0], **{'logits_processors': logits_processors[0]})

If you'd like to make this consistent, I'd be happy to add those changes to this PR too.

angeloskath

Moved logits_processors to a kwarg that is passed to BatchGenerator. As an aside, there is no need to make a dictionary to pass keyword arguments; passing them by name works fine.
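The aside about keyword arguments can be checked with a small sketch. toy_generate below is a hypothetical stand-in that only mimics a signature ending in **kwargs; it is not mlx-lm's generate(). It shows that unpacking a dictionary and passing the argument by name reach the function identically.

```python
def toy_generate(model, tokenizer, prompt, **kwargs):
    """Hypothetical stand-in: returns the extra keyword arguments it received."""
    return kwargs


# A do-nothing processor, used only as a payload for the call.
procs = [lambda tokens, logits: logits]

# Style from the earlier comment: build a dict and unpack it.
via_dict = toy_generate("model", "tokenizer", "hi", **{"logits_processors": procs})

# Style suggested in the review: pass the keyword by name.
via_name = toy_generate("model", "tokenizer", "hi", logits_processors=procs)

same = via_dict == via_name
```

Both calls deliver the same kwargs mapping, so the by-name form is the simpler choice whenever the key is known in advance.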

…l-explore#845)

Squash of ml-explore#845 (closed unmerged), 6 commits, applied on top of Patch 1 (case-project-v0.31.3.1).

Adds outlines-based JsonSchemaLogitsProcessor wired into both batch and single generation paths in mlx_lm.server; OpenAI-compatible response_format extraction (json_schema, json_object, both nested and flat shapes).

Files:
- mlx_lm/structured.py (new): StructuredProcessorCache, a per-tokenizer LRU cache of compiled outlines indices.
- mlx_lm/server.py: request parsing + processor integration; routes through the existing logits_processor pipeline (ml-explore#1008 plumbing already in v0.31.3) rather than reintroducing parallel infra.
- setup.py: adds outlines==1.2.12 dependency.

Conflict resolution vs PR ml-explore#845 base:
- generate.py: PR ml-explore#845's bb2f48d added logits_processors=logits_processors to gen.insert() in batch_generate(). In v0.31.3 batch_generate() receives logits_processors via **kwargs into BatchGenerator's constructor, and BatchGenerator.insert_segments falls back to self.logits_processors when not passed explicitly. PR's hunk would have been a NameError. Skipped.
- server.py batch path: kept v0.31.3's GenerationContext + insert_segments + state_machines architecture. Built the structured processor and merged into the per-segment logits_processors list rather than swapping to PR ml-explore#845's older insert(prompts, max_tokens) call shape.
- setup.py: PR pinned outlines==1.2.9 + outlines_core==0.2.14, which is impossible (1.2.9 requires outlines_core==0.2.11). Bumped to outlines==1.2.12 (transitively requires outlines_core==0.2.14) because 0.2.11's Metal kernel has a bfloat16 cast bug that crashes generation with `assigning to bfloat16_t from incompatible type 'float'`. 0.2.14's kernel uses `static_cast<T>(-INFINITY)`.

Smoke-test (no --mtp): /v1/chat/completions with response_format={type:json_schema,json_schema:{Person schema}} returns valid JSON `{"name":"John Doe","age":30,"city":"New York"}`, all required keys, age is int, finish_reason=stop. ✓

KNOWN LIMITATION: --mtp + response_format crashes with `ValueError: No next state found for the current state ... with token ID ...` from outlines's stateful FSM. MTP's draft-rejection rollback is not compatible with outlines's Guide.advance() linear-progression assumption. Workaround: run the server WITHOUT --mtp when using response_format. A proper fix would teach structured.py to snapshot and roll back guide state on draft rejection, a non-trivial follow-up that is not blocking this tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
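The snapshot-and-rollback idea mentioned under the known limitation can be sketched with a toy state machine. ToyGuide and apply_draft are hypothetical names invented for this illustration; they do not use or reproduce the outlines API or mlx_lm/structured.py, and the "state" here is just a position in a fixed token sequence. The sketch assumes that draft tokens are speculatively fed to the guide and that a later step reports how many of them were accepted.

```python
class ToyGuide:
    """Hypothetical linear FSM that accepts one fixed token sequence."""

    def __init__(self, allowed_sequence):
        self.allowed = list(allowed_sequence)
        self.position = 0

    def advance(self, token_id):
        # Mirrors the linear-progression assumption: only the next expected
        # token is a valid transition from the current state.
        if self.position >= len(self.allowed) or self.allowed[self.position] != token_id:
            raise ValueError(f"No next state found for token ID {token_id}")
        self.position += 1

    def snapshot(self):
        return self.position

    def restore(self, snap):
        self.position = snap


def apply_draft(guide, draft_tokens, num_accepted):
    """Advance speculatively, then rewind to the accepted prefix on rejection."""
    snap = guide.snapshot()
    for tok in draft_tokens:
        guide.advance(tok)
    if num_accepted < len(draft_tokens):
        guide.restore(snap)
        for tok in draft_tokens[:num_accepted]:
            guide.advance(tok)


# With rollback: three tokens drafted, only the first accepted.
guide = ToyGuide([10, 11, 12, 13])
apply_draft(guide, [10, 11, 12], num_accepted=1)
guide.advance(11)  # the token actually chosen after the rejection

# Without rollback: the guide is left past the rejected drafts and fails.
naive = ToyGuide([10, 11, 12, 13])
for tok in [10, 11, 12]:
    naive.advance(tok)
try:
    naive.advance(11)
    naive_failed = False
except ValueError:
    naive_failed = True
```

A real guide would need to snapshot richer state than an integer, which is presumably part of why the commit message calls the fix non-trivial.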

Inserting logits processors into BatchGenerator in batch_generate

a26629b

Move logits_processors to kwargs

e19322a

angeloskath approved these changes Mar 30, 2026

angeloskath merged commit bdeac59 into ml-explore:main Mar 31, 2026
2 checks passed

angeloskath merged 2 commits into ml-explore:main from arthurhjorth:logits_processors_for_batch_generate
