Make the ltx audio vae more native (CORE-76)#13486

Merged
comfyanonymous merged 1 commit into master from temp_pr on Apr 21, 2026

Conversation

@comfyanonymous
Member

Also set the dtype to fp32 only.

@alexisrolland alexisrolland changed the title Make the ltx audio vae more native. Make the ltx audio vae more native (CORE-76) Apr 21, 2026
@coderabbitai

coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough

This pull request refactors audio VAE handling by simplifying the AudioVAE class and integrating it into the broader VAE system. The AudioVAE constructor was simplified to accept only metadata, with device management logic removed. The VAE class was extended to detect and instantiate AudioVAE for LTX Audio vocoder checkpoints, establishing runtime parameters including memory estimation, channel configuration, and sample rate handling. Audio-related nodes were refactored to use a shared base class and derive configuration from the unified VAE interface, with updated sample-rate selection logic and tensor layout adjustments during decoding.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description 'Also set the dtype to fp32 only' is directly related to the changeset, specifically referencing the dtype restriction visible in comfy/sd.py. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title 'Make the ltx audio vae more native (CORE-76)' directly reflects the main objective of the PR: refactoring the LTX audio VAE implementation to be more native and fp32-only, as confirmed by the PR description and substantial code changes. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
comfy/ldm/lightricks/vae/audio_vae.py (1)

238-239: ⚠️ Potential issue | 🟡 Minor

Orphaned memory_required references the removed self.device_manager and will raise AttributeError if ever called.

This PR drops the ModelDeviceManager, so self.device_manager no longer exists on AudioVAE. The memory_required method at the bottom of the class still pokes at self.device_manager.patcher.model_size() and will crash if invoked. The AI summary even says this method was supposed to go — looks like it got left behind at the party after the lights came up.

In practice the VAE wrapper in comfy/sd.py routes memory estimation through its own memory_used_encode/memory_used_decode lambdas, so today this is dead weight rather than a live crash — but it's a dangling landmine for any future caller.

🧹 Suggested cleanup
-    def memory_required(self, input_shape):
-        return self.device_manager.patcher.model_size()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/ldm/lightricks/vae/audio_vae.py` around lines 238 - 239, The
memory_required method on AudioVAE still references
self.device_manager.patcher.model_size(), but ModelDeviceManager was removed;
either delete the memory_required method or change it to a safe stub (e.g.,
raise NotImplementedError or return a computed value from existing attributes)
so it no longer touches self.device_manager; update AudioVAE.memory_required
accordingly and note the VAE wrapper uses memory_used_encode/memory_used_decode
so callers should use those instead of relying on this method.
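
To make the failure mode above concrete, here is a minimal sketch. It uses hypothetical stand-in classes (LeftoverVAE, StubbedVAE) rather than the real AudioVAE, and only shows that a leftover method touching a removed attribute fails at call time, while a stub fails with an explicit message.

```python
class LeftoverVAE:
    """Hypothetical stand-in: device_manager was removed, the method was not."""

    def memory_required(self, input_shape):
        # Fails only when called, because the attribute no longer exists.
        return self.device_manager.patcher.model_size()


class StubbedVAE:
    """Hypothetical stand-in showing the safe-stub alternative."""

    def memory_required(self, input_shape):
        raise NotImplementedError(
            "use the VAE wrapper's memory_used_encode/memory_used_decode instead"
        )


def call_outcome(vae):
    """Return the name of the exception raised by memory_required, or 'ok'."""
    try:
        vae.memory_required((1, 2, 16))
    except Exception as exc:
        return type(exc).__name__
    return "ok"
```

Deleting the method outright, as the suggested cleanup does, is the simpler option when nothing calls it.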
🧹 Nitpick comments (2)
comfy_extras/nodes_lt_audio.py (2)

83-90: Decode layout + sample-rate sourcing look correct; optional simplification available.

VAE.decode returns [B, T, C] (channels-last) via its final movedim(1,-1), so .movedim(-1, 1) restores the [B, C, T] waveform layout that the ComfyUI audio dict expects.

Minor: audio_vae.first_stage_model.output_sample_rate could equivalently be audio_vae.audio_sample_rate_output (set in comfy/sd.py line 814 to the same value), which avoids reaching through first_stage_model. Not a correctness issue — just depends on whether you'd rather depend on the VAE wrapper's public surface or on the inner model.

♻️ Optional tidy
-        audio = audio_vae.decode(audio_latent).movedim(-1, 1).to(audio_latent.device)
-        output_audio_sample_rate = audio_vae.first_stage_model.output_sample_rate
+        audio = audio_vae.decode(audio_latent).movedim(-1, 1).to(audio_latent.device)
+        output_audio_sample_rate = audio_vae.audio_sample_rate_output
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_lt_audio.py` around lines 83 - 90, The decode and channel
reordering are correct (audio = audio_vae.decode(...).movedim(-1, 1)), but to
avoid reaching into the internal model use the VAE wrapper's public attribute:
replace sourcing sample rate from audio_vae.first_stage_model.output_sample_rate
with audio_vae.audio_sample_rate_output when building the io.NodeOutput so the
code relies on the VAE wrapper API rather than the inner first_stage_model.
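
The layout round trip described in this comment can be checked on shapes alone. The sketch below is a pure-Python illustration, not ComfyUI code: movedim_shape is a hypothetical helper that applies the single-dimension rule of torch.movedim to a shape tuple, and the sizes are made up.

```python
def movedim_shape(shape, source, destination):
    """Shape after moving one dimension, following torch.movedim's rule
    for a single source/destination pair (other dims keep their order)."""
    ndim = len(shape)
    src = source % ndim       # normalize negative indices
    dst = destination % ndim
    dims = list(shape)
    moved = dims.pop(src)
    dims.insert(dst, moved)
    return tuple(dims)


# Hypothetical sizes: batch 1, 2 channels, 48000 samples.
waveform = (1, 2, 48000)                    # [B, C, T]
decoded = movedim_shape(waveform, 1, -1)    # [B, T, C], the layout VAE.decode returns
restored = movedim_shape(decoded, -1, 1)    # back to [B, C, T]
```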

37-57: Neat delegation via super() — just flagging the argument reorder for future readers.

LTXVAudioVAEEncode keeps the node-level input order (audio, audio_vae) (matching the schema) while delegating to the base VAEEncodeAudio.execute(cls, vae, audio). The super().execute(audio_vae, audio) swap is correct and super() inside a classmethod resolves fine via the compiler-inserted __class__ cell.

One small ergonomic nit: if VAEEncodeAudio.execute ever grows a third parameter, this override will silently drop it. Consider *args, **kwargs forwarding if you want that to be future-proof — otherwise fine as-is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_lt_audio.py` around lines 37 - 57, The override
LTXVAudioVAEEncode.execute currently accepts (cls, audio, audio_vae) and calls
super().execute(audio_vae, audio), which will drop any future extra parameters;
change the method signature to def execute(cls, audio, audio_vae, *args,
**kwargs) and forward all extra arguments when delegating to the base by calling
super().execute(audio_vae, audio, *args, **kwargs) so the parameter reorder is
explicit and future-proof; reference LTXVAudioVAEEncode.execute and
VAEEncodeAudio.execute when making this change.
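
A minimal sketch of the delegation pattern discussed above, assuming hypothetical stand-in classes (BaseEncode, ReorderedEncode) instead of the real node classes. It shows zero-argument super() inside a classmethod, the argument swap, and the optional forwarding of extra arguments.

```python
class BaseEncode:
    """Hypothetical stand-in for the shared base node."""

    @classmethod
    def execute(cls, vae, audio, *args, **kwargs):
        return {"node": cls.__name__, "vae": vae, "audio": audio,
                "extra": (args, kwargs)}


class ReorderedEncode(BaseEncode):
    """Keeps its own (audio, audio_vae) input order, then delegates."""

    @classmethod
    def execute(cls, audio, audio_vae, *args, **kwargs):
        # Zero-argument super() works inside a classmethod; cls is passed along.
        return super().execute(audio_vae, audio, *args, **kwargs)


result = ReorderedEncode.execute("clip.wav", "ltx_vae", normalize=True)
```

Here normalize is an invented keyword used only to show that extra arguments survive the swap.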
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ef6e146e-e577-4eee-a7d9-b821ea6bec46

📥 Commits

Reviewing files that changed from the base of the PR and between c514890 and bfc70a7.

📒 Files selected for processing (4)
  • comfy/ldm/lightricks/vae/audio_vae.py
  • comfy/sd.py
  • comfy_extras/nodes_audio.py
  • comfy_extras/nodes_lt_audio.py

Comment thread comfy/sd.py
Comment on lines +809 to +825
            elif "vocoder.resblocks.0.convs1.0.weight" in sd or "vocoder.vocoder.resblocks.0.convs1.0.weight" in sd: # LTX Audio
                self.first_stage_model = comfy.ldm.lightricks.vae.audio_vae.AudioVAE(metadata=metadata)
                self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
                self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
                self.latent_channels = self.first_stage_model.latent_channels
                self.audio_sample_rate_output = self.first_stage_model.output_sample_rate
                self.autoencoder = self.first_stage_model.autoencoder # TODO: remove hack for ltxv custom nodes
                self.output_channels = 2
                self.pad_channel_value = "replicate"
                self.upscale_ratio = 4096
                self.downscale_ratio = 4096
                self.latent_dim = 2
                self.process_output = lambda audio: audio
                self.process_input = lambda audio: audio
                self.working_dtypes = [torch.float32]
                self.disable_offload = True
                self.extra_1d_channel = 16
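
As a worked example of the two memory-estimate lambdas in the excerpt above, the sketch below reproduces their arithmetic in plain Python. It assumes model_management.dtype_size returns the element size in bytes (4 for float32); the shapes are hypothetical and chosen only to make the numbers concrete.

```python
FP32_BYTES = 4  # assumed size of one float32 element in bytes


def encode_estimate(shape, dtype_bytes=FP32_BYTES):
    # Mirrors memory_used_encode: (shape[2] * 330) * dtype size
    return (shape[2] * 330) * dtype_bytes


def decode_estimate(shape, dtype_bytes=FP32_BYTES):
    # Mirrors memory_used_decode: (shape[2] * shape[3] * 87000) * dtype size
    return (shape[2] * shape[3] * 87000) * dtype_bytes


# Hypothetical shapes, not taken from the PR.
audio_shape = (1, 2, 441000)
latent_shape = (1, 8, 100, 16)

encode_bytes = encode_estimate(audio_shape)    # 441000 * 330 * 4
decode_bytes = decode_estimate(latent_shape)   # 100 * 16 * 87000 * 4
```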
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for any other call-sites that construct the VAE for LTX audio checkpoints.
rg -nP --type=py -C3 '\bvocoder\.resblocks\.0\.convs1\.0\.weight\b|audio_vae\.'

Repository: Comfy-Org/ComfyUI

Length of output: 4008


🏁 Script executed:

#!/bin/bash
# Check all places where VAE class is instantiated with state_dict
rg -nP --type=py "VAE\(" -A 3 | head -60

Repository: Comfy-Org/ComfyUI

Length of output: 4603


🏁 Script executed:

#!/bin/bash
# Look for other places that might load LTX audio or use the detection keys
rg -nP --type=py "load_state_dict|AudioVAE" -B 2 -A 2

Repository: Comfy-Org/ComfyUI

Length of output: 22218


🏁 Script executed:

#!/bin/bash
# Check if there are other loaders or entry points that might construct VAE with vocoder keys
rg -nP --type=py "class.*VAE.*Loader|def.*load.*vae" -i

Repository: Comfy-Org/ComfyUI

Length of output: 1254


🏁 Script executed:

#!/bin/bash
# Check the generic VAELoader implementation
sed -n '729,850p' nodes.py

Repository: Comfy-Org/ComfyUI

Length of output: 4756


🏁 Script executed:

#!/bin/bash
# Check if there's any special handling for audio VAE detection in the generic loader
rg -nP --type=py "class VAELoader" -A 120 nodes.py | head -150

Repository: Comfy-Org/ComfyUI

Length of output: 5211


🏁 Script executed:

#!/bin/bash
# Check the full VAE class __init__ to understand when detection happens
sed -n '500,900p' comfy/sd.py | grep -A 200 "class VAE"

Repository: Comfy-Org/ComfyUI

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Check if this code appears in git history or is newly added
git log --oneline -n 20 -- comfy/sd.py | head -10

Repository: Comfy-Org/ComfyUI

Length of output: 104


🏁 Script executed:

#!/bin/bash
# Check for any recent additions to the LTX audio detection
git log -p --all -S "vocoder.resblocks.0.convs1.0.weight" -- comfy/sd.py | head -100

Repository: Comfy-Org/ComfyUI

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Show the details of that commit to see what changed
git show bfc70a7 --stat | head -50

Repository: Comfy-Org/ComfyUI

Length of output: 3016


🏁 Script executed:

#!/bin/bash
# See the actual changes in that commit for sd.py
git show bfc70a7 -- comfy/sd.py | head -200

Repository: Comfy-Org/ComfyUI

Length of output: 6850


🏁 Script executed:

#!/bin/bash
# Confirm the exact lines of the LTX audio block in this PR
git show bfc70a7:comfy/sd.py | sed -n '809,825p'

Repository: Comfy-Org/ComfyUI

Length of output: 1293


LTX Audio detection succeeds but state dict loading will fail via generic VAELoader due to missing prefix remapping.

The detection at line 809 matches vocoder.resblocks.0.convs1.0.weight, but AudioVAE's submodule is named self.autoencoder. The specialized LTXVAudioVAELoader handles this by remapping the audio_vae. key prefix to autoencoder. before VAE construction. However, if a user loads the same checkpoint through the generic VAELoader (which calls VAE(sd=sd, ...) at nodes.py:830 without remapping), the state dict keys won't match the expected submodule structure. load_state_dict at line 850 will not raise an error; it will only log "Missing VAE keys" warnings and leave a non-functional audio VAE.

Either restrict the detection to trigger only when appropriate prefix remapping is applied, or add the same prefix normalization to this branch to make the generic loader path robust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/sd.py` around lines 809 - 825, The branch that detects LTX audio keys
(checking "vocoder.resblocks.0.convs1.0.weight") must normalize the state-dict
prefixes like LTXVAudioVAELoader does so AudioVAE's expected submodule name
(autoencoder) matches the keys; update the branch that constructs
comfy.ldm.lightricks.vae.audio_vae.AudioVAE to first remap sd keys replacing the
"audio_vae." (or "vocoder.vocoder." variants) prefix to "autoencoder." (or the
exact submodule names AudioVAE expects) before calling VAE/AudioVAE and
load_state_dict, or alternatively tighten the detection so it only triggers when
the remapping has already been applied; reference AudioVAE, LTXVAudioVAELoader,
VAELoader, VAE, load_state_dict and the sd dict when implementing the fix.
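
The prefix normalization suggested above could look roughly like the following sketch. remap_prefix is a hypothetical helper, the key names are illustrative, and placeholder strings stand in for tensors; the exact prefixes the real loader rewrites should be checked against LTXVAudioVAELoader.

```python
def remap_prefix(sd, old_prefix, new_prefix):
    """Return a new dict with old_prefix replaced by new_prefix on matching keys."""
    out = {}
    for key, value in sd.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        out[key] = value
    return out


# Hypothetical keys and prefixes; strings stand in for weight tensors.
sd = {
    "audio_vae.encoder.conv_in.weight": "w0",
    "vocoder.resblocks.0.convs1.0.weight": "w1",
}
remapped = remap_prefix(sd, "audio_vae.", "autoencoder.")
```

Keys that do not carry the old prefix pass through unchanged, so the detection key in the branch condition would still be present after remapping.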

Comment thread comfy/sd.py
Comment on lines +811 to +814
self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
self.latent_channels = self.first_stage_model.latent_channels
self.audio_sample_rate_output = self.first_stage_model.output_sample_rate
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

LTX Audio VAE autoencoder sampling rate

💡 Result:

The LTX Audio VAE autoencoder processes input waveforms converted to stereo audio at a sampling rate of 16 kHz. Mel-spectrograms are computed from this 16 kHz audio for encoding into latent representations. The decoder reconstructs mel-spectrograms at the same rate, which are then upsampled by a HiFi-GAN-based vocoder to a 24 kHz output waveform.

Citations:


🏁 Script executed:

# Find and examine the AudioVAE class around the mentioned lines
cd comfy && grep -n "class AudioVAE" sd.py && head -n 830 sd.py | tail -n 100

Repository: Comfy-Org/ComfyUI

Length of output: 43


🏁 Script executed:

# Check line 469 to see the default audio_sample_rate
sed -n '465,475p' comfy/sd.py

Repository: Comfy-Org/ComfyUI

Length of output: 628


🏁 Script executed:

# Find the VAEEncodeAudio implementation
fd "nodes_audio" --type f && grep -n "class VAEEncodeAudio" comfy_extras/nodes_audio.py

Repository: Comfy-Org/ComfyUI

Length of output: 163


🏁 Script executed:

# Look at the VAEEncodeAudio.execute method
grep -A 30 "class VAEEncodeAudio" comfy_extras/nodes_audio.py

Repository: Comfy-Org/ComfyUI

Length of output: 1076


🏁 Script executed:

sed -n '800,830p' comfy/sd.py

Repository: Comfy-Org/ComfyUI

Length of output: 2426


🏁 Script executed:

# Find where first_stage_model.output_sample_rate comes from
grep -n "output_sample_rate" comfy/sd.py | head -20

Repository: Comfy-Org/ComfyUI

Length of output: 154


🏁 Script executed:

# Search for LTX or audio VAE model references
grep -n "LTX\|latent_channels\|first_stage_model" comfy/sd.py | grep -A 2 -B 2 "811\|812\|813\|814"

Repository: Comfy-Org/ComfyUI

Length of output: 644


🏁 Script executed:

# Find the AudioVAE class definition
find . -name "audio_vae.py" -type f

Repository: Comfy-Org/ComfyUI

Length of output: 100


🏁 Script executed:

# Check the AudioVAE class and encode method
fd "audio_vae" -type f && cat $(fd "audio_vae" -type f)

Repository: Comfy-Org/ComfyUI

Length of output: 231


🏁 Script executed:

# Read the audio_vae.py file
cat comfy/ldm/lightricks/vae/audio_vae.py

Repository: Comfy-Org/ComfyUI

Length of output: 9012


🏁 Script executed:

# Also check for any sample_rate or output_sample_rate references in that file
grep -n "sample_rate\|output_sample_rate" comfy/ldm/lightricks/vae/audio_vae.py

Repository: Comfy-Org/ComfyUI

Length of output: 1266


🏁 Script executed:

# Search for AudioOobleckVAE or similar pattern in comfy/sd.py
grep -n "AudioOobleckVAE\|Oobleck" comfy/sd.py | head -20

Repository: Comfy-Org/ComfyUI

Length of output: 185


🏁 Script executed:

# Look for similar audio VAE branches to see the pattern
grep -B 5 -A 10 "audio_sample_rate = " comfy/sd.py | head -50

Repository: Comfy-Org/ComfyUI

Length of output: 2175


Set audio_sample_rate to match the LTX Audio autoencoder's actual sampling rate.

The LTX Audio branch sets audio_sample_rate_output from first_stage_model.output_sample_rate but leaves self.audio_sample_rate at the inherited default of 44100. Since VAEEncodeAudio.execute uses vae.audio_sample_rate to determine whether to resample input audio, and AudioVAE.encode defaults its sample_rate parameter to 44100, this mismatch causes inefficient double-resampling: the preprocessor resamples to 44100 (per VAEEncodeAudio), then the autoencoder internally resamples again to its actual native rate (16 kHz per the model specification).

Mirror the AudioOobleckVAE branch pattern by setting:

+                self.audio_sample_rate = int(self.first_stage_model.sample_rate)

This ensures the encode path resamples directly to the autoencoder's native rate in a single step.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
                 self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
                 self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
                 self.latent_channels = self.first_stage_model.latent_channels
                 self.audio_sample_rate_output = self.first_stage_model.output_sample_rate
+                self.audio_sample_rate = int(self.first_stage_model.sample_rate)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/sd.py` around lines 811 - 814, The audio pipeline is resampling twice
because self.audio_sample_rate remains at the 44100 default while
audio_sample_rate_output is set from first_stage_model.output_sample_rate; to
fix, set self.audio_sample_rate to the autoencoder's native input rate (i.e.,
assign self.audio_sample_rate = int(self.first_stage_model.sample_rate)
adjacent to where audio_sample_rate_output is set) so VAEEncodeAudio.execute and
AudioVAE.encode will resample directly to the model's native sample rate (mirror
the AudioOobleckVAE pattern).
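
The double-resampling concern can be illustrated without any audio library. In this sketch resample_hops is a hypothetical helper that only lists the rate conversions on the encode path; wrapper_rate stands for vae.audio_sample_rate, model_rate for the autoencoder's native rate, and the 48000 Hz input is an invented example.

```python
def resample_hops(input_rate, wrapper_rate, model_rate):
    """List the (from, to) resampling steps on the encode path.

    wrapper_rate models the node-level resample to vae.audio_sample_rate;
    model_rate models the rate the autoencoder resamples to internally.
    """
    hops = []
    current = input_rate
    for target in (wrapper_rate, model_rate):
        if current != target:
            hops.append((current, target))
            current = target
    return hops


# 16000 is the native rate cited in the comment above.
before = resample_hops(48000, wrapper_rate=44100, model_rate=16000)
after = resample_hops(48000, wrapper_rate=16000, model_rate=16000)
```

With the wrapper rate left at 44100 the input is converted twice; once both rates agree, a single conversion remains.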

@alexisrolland
Member

alexisrolland commented Apr 21, 2026

Tested both example workflows (t2v and i2v); the before/after comparison gives the same results.

@comfyanonymous comfyanonymous merged commit ad94d47 into master Apr 21, 2026
16 checks passed
@zwukong

zwukong commented Apr 22, 2026

The video VAE is loaded by VAELoader KJ; maybe it needs a conversion too.

Kosinkadink added a commit that referenced this pull request Apr 24, 2026
* fix: pin SQLAlchemy>=2.0 in requirements.txt (fixes #13036) (#13316)

* Refactor io to IO in nodes_ace.py (#13485)

* Bump comfyui-frontend-package to 1.42.12 (#13489)

* Make the ltx audio vae more native. (#13486)

* feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465)

* Support standalone LTXV audio VAEs (#13499)

* [Partner Nodes]  added 4K resolution for Veo models; added Veo 3 Lite model (#13330)

* feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* increase poll_interval from 5 to 9

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

* Bump comfyui-frontend-package to 1.42.14 (#13493)

* Add gpt-image-2 as version option (#13501)

* Allow logging in comfy app files. (#13505)

* chore: update workflow templates to v0.9.59 (#13507)

* fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504)

The tooltip on the resolution input states that 4K is not available for
veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the
lite combination. Selecting 4K with veo-3.0-generate-001 or
veo-3.0-fast-generate-001 would fall through and hit the upstream API
with an invalid request.

Broaden the guard to match the documented behavior and update the error
message accordingly.

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

* feat: RIFE and FILM frame interpolation model support (CORE-29) (#13258)

* initial RIFE support

* Also support FILM

* Better RAM usage, reduce FILM VRAM peak

* Add model folder placeholder

* Fix oom fallback frame loss

* Remove torch.compile for now

* Rename model input

* Shorter input type name

---------

* fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518)

On Windows with aimdo enabled, disable_weight_init.Linear uses lazy
initialization that sets weight and bias to None to avoid unnecessary
memory allocation. This caused a crash when copy_() was called on the
None weight attribute in Stable_Zero123.__init__.

Replace copy_() with direct torch.nn.Parameter assignment, which works
correctly on both Windows (aimdo enabled) and other platforms.

* Derive InterruptProcessingException from BaseException (#13523)

* bump manager version to 4.2.1 (#13516)

* ModelPatcherDynamic: force cast stray weights on comfy layers (#13487)

The mixed_precision ops can have input_scale parameters that are used
in tensor math but aren't a weight or bias, so they don't get proper VRAM
management. Treat these as force-castable parameters, like the non-comfy
weights, random params, and buffers already are.

* Update logging level for invalid version format (#13526)

* [Partner Nodes] add SD2 real human support (#13509)

* feat(api-nodes): add SD2 real human support

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* fix: add validation before uploading Assets

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* Add asset_id and group_id displaying on the node

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* extend poll_op to use instead of custom async cycle

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* added the polling for the "Active" status after asset creation

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* updated tooltip for group_id

* allow usage of real human in the ByteDance2FirstLastFrame node

* add reference count limits

* corrected price in status when input assets contain video

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* feat: SAM (segment anything) 3.1 support (CORE-34) (#13408)

* [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519)

* fix(api-nodes): fixed price badges, add new resolutions

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* proper calculate the total run cost when "n > 1"

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* chore: update workflow templates to v0.9.61 (#13533)

* chore: update embedded docs to v0.4.4 (#13535)

* add 4K resolution to Kling nodes (#13536)

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* Fix LTXV Reference Audio node (#13531)

* comfy-aimdo 0.2.14: Hotfix async allocator estimations (#13534)

This was doing an over-estimate of VRAM used by the async allocator when lots
of little small tensors were in play.

Also change the versioning scheme to == so we can roll forward aimdo without
worrying about stable regressions downstream in comfyUI core.

* Disable sageattention for SAM3 (#13529)

Causes NaNs

* execution: Add anti-cycle validation (#13169)

Currently, if the graph contains a cycle, validation just recurses infinitely,
hits a catch-all, and then throws a generic error against the output node
that seeded the validation. Instead, fail the offending cyclic node
chain and handle it as an error in its own right.

Co-authored-by: guill <jacob.e.segal@gmail.com>

* chore: update workflow templates to v0.9.62 (#13539)

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Co-authored-by: AustinMroz <austin@comfy.org>
Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com>
Co-authored-by: Matt Miller <matt@miller-media.com>
Co-authored-by: blepping <157360029+blepping@users.noreply.github.com>
Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>
@comfyanonymous comfyanonymous deleted the temp_pr branch April 26, 2026 00:52