Sync random seed across ranks in distributed chat by kernelpool · Pull Request #801 · ml-explore/mlx-lm

kernelpool · 2026-01-23T11:57:19Z

This fixes an issue in distributed chat where sampling with a temperature > 0 leaves each rank with a different random state, so the ranks sample different tokens and diverge (often visible as artifacts in the output, shown below).

This fix follows the approach used in #741 and 1d76aab.

mlx.launch --verbose --backend jaccl --hostfile hosts-jaccl.json --env MLX_METAL_FAST_SYNCH=1 -- /Users/optimus/repo/mlx-lm/.venv/bin/mlx_lm.chat --model mlx-community/Qwen3-4B-Instruct-2507-8bit --temp 1.0
[INFO] Running /Users/optimus/repo/mlx-lm/.venv/bin/python /Users/optimus/repo/mlx-lm/.venv/bin/mlx_lm.chat --model mlx-community/Qwen3-4B-Instruct-2507-8bit --temp 1.0 
Fetching 10 files: 100% 10/10 [00:00<00:00, 106184.91it/s]
Download complete: : 0.00B [00:00, ?B/s]              
Fetching 11 files: 100% 11/11 [00:00<00:00, 38067.12it/s]
Download complete: : 0.00B [00:00, ?B/s]              
Fetching 10 files: 100% 10/10 [00:02<00:00,  4.07it/s]MB/s]                
Fetching 11 files: 100% 11/11 [00:49<00:00,  4.46s/it]     
Download complete: 100% 4.27G/4.27G [00:49<00:00, 86.9MB/s]00:00, 152MB/s]  
Download complete: : 15.9MB [00:52, 306kB/s] :00, 152MB/s]                
[INFO] Starting chat session with mlx-community/Qwen3-4B-Instruct-2507-8bit.
The command list:
- 'q' to exit
- 'r' to reset the chat
- 'h' to display these commands
>> hello!
Hello! � How can I assist you today?
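
The divergence described above can be illustrated without MLX. The following is a toy sketch only: it uses Python's standard `random` module in place of `mx.random`, and the helper `sample_tokens`, the probability vector, and the seeds are all hypothetical. It assumes every rank sees the same token distribution and differs only in its RNG state.

```python
import random


def sample_tokens(seed, probs, steps):
    """Draw `steps` token ids from a fixed distribution using a private RNG."""
    rng = random.Random(seed)
    ids = list(range(len(probs)))
    return [rng.choices(ids, weights=probs)[0] for _ in range(steps)]


# Hypothetical next-token distribution, identical on every rank.
probs = [0.4, 0.3, 0.2, 0.1]

# Ranks that share a seed draw the same tokens and stay in lockstep.
synced = [sample_tokens(0, probs, 32) for _rank in range(2)]

# Ranks with unrelated seeds draw different tokens, so their histories diverge.
unsynced = [sample_tokens(seed, probs, 32) for seed in (1, 2)]

print("synced ranks agree:  ", synced[0] == synced[1])
print("unsynced ranks agree:", unsynced[0] == unsynced[1])
```

With greedy decoding (temperature 0) no random draw is involved, which is consistent with the issue only appearing at temperature > 0.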

awni · 2026-01-23T14:14:20Z


    if group.size() > 1:
+        seed = mx.distributed.all_sum(mx.random.state[0]).view(mx.uint64).item()
+        mx.random.seed(seed)


Good catch! But this ignores the seed flag above. Wdyt about instead just setting a default seed and using that?

Ah, makes sense
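
As a rough illustration of the point raised in this thread, the sketch below mimics an all-reduce sum in plain Python. It is not the MLX implementation: `all_sum` here is a hypothetical stand-in for `mx.distributed.all_sum`, and the per-rank state values are made up. It assumes each rank starts with a different local RNG state.

```python
def all_sum(values):
    """Toy all-reduce: every rank receives the sum of all ranks' values."""
    total = sum(values)
    return [total for _ in values]


# Hypothetical per-rank RNG state words (one entry per rank).
local_state = [0x1234, 0xBEEF, 0x0042]

# After the reduction every rank holds the same number and can reseed with it.
seeds = all_sum(local_state)
print("seed on each rank:", seeds)

# The agreed seed is derived from the summed states, so under this scheme it
# is not the value a user would have passed on the command line.
```

This keeps the ranks in agreement, but the resulting seed comes from the reduction rather than from the seed flag, which is the concern in the review comment.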

awni · 2026-01-23T14:55:41Z

+        if args.seed is None:
+            mx.random.seed(0)


Nit: you can just set the DEFAULT_SEED=0 so the behavior is consistent in all cases. I think it's fine if the default behavior is seeded.
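
A minimal sketch of the suggestion, assuming an `argparse`-based entry point: only `DEFAULT_SEED` and the existence of a seed flag come from the discussion, while `build_parser` and `resolve_seed` are hypothetical names. The MLX call is left as a comment so the sketch runs without MLX installed.

```python
import argparse

DEFAULT_SEED = 0  # value suggested in the review


def build_parser():
    parser = argparse.ArgumentParser(description="Seed handling sketch")
    # A concrete default means every rank gets a seed even when the flag is omitted.
    parser.add_argument("--seed", type=int, default=DEFAULT_SEED, help="PRNG seed")
    return parser


def resolve_seed(argv):
    args = build_parser().parse_args(argv)
    # In the real script this is where mx.random.seed(args.seed) would be called.
    return args.seed


print(resolve_seed([]))               # flag omitted: falls back to DEFAULT_SEED
print(resolve_seed(["--seed", "7"]))  # flag given: the user's value is used
```

Because every rank is launched with the same arguments, each one resolves the same seed without any communication, and an explicit seed is still honoured.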

awni

Thx!

Sync random seed across ranks in distributed chat

25575c5

awni reviewed Jan 23, 2026

Use seed if provided

66d34d4

awni reviewed Jan 23, 2026

Set default seed

44ce6d4

awni approved these changes Jan 23, 2026

awni merged commit 12073b1 into ml-explore:main Jan 23, 2026
2 checks passed

awni merged 3 commits into ml-explore:main from kernelpool:sync-random-seed
