ops: handle multi-compute of the same weight (CORE-153) #13705
If the same weight is used multiple times within the same prefetch window, it should only apply compute state mutations once. Mark the weight as fully resident on the first pass accordingly.
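The behavior described above can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: `Weight`, `PrefetchWindow`, `prepare`, and the `mutations_applied` counter are hypothetical names invented for the example, and tracking weights by object identity is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Weight:
    """Hypothetical stand-in for an offloaded weight (not the project's class)."""
    name: str
    resident: bool = False      # True once fully resident for compute
    mutations_applied: int = 0  # how many times compute state was mutated


@dataclass
class PrefetchWindow:
    """Hypothetical window of weights prepared for upcoming compute."""
    seen: set = field(default_factory=set)

    def prepare(self, weight: Weight) -> Weight:
        key = id(weight)  # assumption: weights are tracked by identity
        if key in self.seen:
            # Same weight used again in this window: skip the mutations.
            return weight
        self.seen.add(key)
        weight.mutations_applied += 1  # compute state mutation, applied once
        weight.resident = True         # mark fully resident on the first pass
        return weight


# One weight used three times within a single window.
shared = Weight("ff.weight")
window = PrefetchWindow()
for _ in range(3):
    window.prepare(shared)
```

In this sketch the second and third uses return early, so the weight's state is mutated once per window no matter how many ops reference it.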
kijai left a comment:
Tested that this resolves the issue for me as well.
If the same weight is used multiple times within the same prefetch window, it should only apply compute state mutations once. Mark the weight as fully resident on the first pass accordingly.
Example Test Conditions:
RTX5090, Linux, LTX2.3 + Content Lora + KJNodes LTXV Chunk FF
"A group of old men in suits dance in a new york street."
Before: (image not preserved)
After: (image not preserved)
Regression Tests:
RTX5090, Linux, LTX2.0 T2V ✅
RTX5090, Linux, LTX2.0 T2V --disable-async-offload ✅
RTX5090, Linux, WAN2.2 ✅
RTX5090, Linux, Stable Cascade -> Flux 2 ✅