Make the ltx audio vae more native (CORE-76) by comfyanonymous · Pull Request #13486 · Comfy-Org/ComfyUI · GitHub

comfyanonymous · 2026-04-21T02:58:44Z

Also set the dtype to fp32 only.

coderabbitai · 2026-04-21T03:05:27Z

📝 Walkthrough

This pull request refactors audio VAE handling by simplifying the AudioVAE class and integrating it into the broader VAE system. The AudioVAE constructor was simplified to accept only metadata, with device management logic removed. The VAE class was extended to detect and instantiate AudioVAE for LTX Audio vocoder checkpoints, establishing runtime parameters including memory estimation, channel configuration, and sample rate handling. Audio-related nodes were refactored to use a shared base class and derive configuration from the unified VAE interface, with updated sample-rate selection logic and tensor layout adjustments during decoding.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description 'Also set the dtype to fp32 only' is directly related to the changeset, specifically referencing the dtype restriction visible in comfy/sd.py.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title 'Make the ltx audio vae more native (CORE-76)' directly reflects the main objective of the PR—refactoring the LTX audio VAE implementation to be more native and fp32-only, as confirmed by the PR description and substantial code changes.


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

comfy/ldm/lightricks/vae/audio_vae.py (1)
238-239: ⚠️ Potential issue | 🟡 Minor

Orphaned memory_required references removed self.device_manager — will AttributeError if ever called.

This PR drops the ModelDeviceManager, so self.device_manager no longer exists on AudioVAE. The memory_required method at the bottom of the class still pokes at self.device_manager.patcher.model_size() and will crash if invoked. The AI summary even says this method was supposed to go — looks like it got left behind at the party after the lights came up.

In practice the VAE wrapper in comfy/sd.py routes memory estimation through its own memory_used_encode/memory_used_decode lambdas, so today this is dead weight rather than a live crash — but it's a dangling landmine for any future caller.
🧹 Suggested cleanup
-    def memory_required(self, input_shape):
-        return self.device_manager.patcher.model_size()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/ldm/lightricks/vae/audio_vae.py` around lines 238 - 239, The
memory_required method on AudioVAE still references
self.device_manager.patcher.model_size(), but ModelDeviceManager was removed;
either delete the memory_required method or change it to a safe stub (e.g.,
raise NotImplementedError or return a computed value from existing attributes)
so it no longer touches self.device_manager; update AudioVAE.memory_required
accordingly and note the VAE wrapper uses memory_used_encode/memory_used_decode
so callers should use those instead of relying on this method.
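
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the real class: `OrphanedAudioVAE`, `StubbedAudioVAE`, and `call_outcome` are hypothetical stand-ins, and the only thing taken from the review is that `memory_required` reads `self.device_manager`, which the constructor no longer sets.

```python
class OrphanedAudioVAE:
    """Hypothetical stand-in: the constructor no longer creates device_manager."""

    def __init__(self, metadata=None):
        self.metadata = metadata  # no device management attributes are assigned

    def memory_required(self, input_shape):
        # Leftover method: reads an attribute that is never assigned.
        return self.device_manager.patcher.model_size()


class StubbedAudioVAE(OrphanedAudioVAE):
    """Same stand-in, with the leftover method replaced by an explicit stub."""

    def memory_required(self, input_shape):
        raise NotImplementedError(
            "use the VAE wrapper's memory_used_encode/memory_used_decode instead"
        )


def call_outcome(vae):
    """Return the name of the exception raised by memory_required, or 'ok'."""
    try:
        vae.memory_required((1, 2, 16000))
    except Exception as exc:
        return type(exc).__name__
    return "ok"


print(call_outcome(OrphanedAudioVAE()))  # AttributeError
print(call_outcome(StubbedAudioVAE()))   # NotImplementedError
```

Either deleting the method or stubbing it turns an accidental `AttributeError` into an intentional, self-explaining failure.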

🧹 Nitpick comments (2)

comfy_extras/nodes_lt_audio.py (2)
83-90: Decode layout + sample-rate sourcing look correct; optional simplification available.

VAE.decode returns [B, T, C] (channels-last) via its final movedim(1,-1), so .movedim(-1, 1) restores the [B, C, T] waveform layout that the ComfyUI audio dict expects.

Minor: audio_vae.first_stage_model.output_sample_rate could equivalently be audio_vae.audio_sample_rate_output (set in comfy/sd.py line 814 to the same value), which avoids reaching through first_stage_model. Not a correctness issue — just depends on whether you'd rather depend on the VAE wrapper's public surface or on the inner model.
♻️ Optional tidy
-        audio = audio_vae.decode(audio_latent).movedim(-1, 1).to(audio_latent.device)
-        output_audio_sample_rate = audio_vae.first_stage_model.output_sample_rate
+        audio = audio_vae.decode(audio_latent).movedim(-1, 1).to(audio_latent.device)
+        output_audio_sample_rate = audio_vae.audio_sample_rate_output
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_lt_audio.py` around lines 83 - 90, The decode and channel
reordering are correct (audio = audio_vae.decode(...).movedim(-1, 1)), but to
avoid reaching into the internal model use the VAE wrapper's public attribute:
replace sourcing sample rate from audio_vae.first_stage_model.output_sample_rate
with audio_vae.audio_sample_rate_output when building the io.NodeOutput so the
code relies on the VAE wrapper API rather than the inner first_stage_model.
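
The layout round trip mentioned above ([B, C, T] to [B, T, C] and back) can be checked at the shape level. This is a pure-Python sketch that mimics what `torch.movedim` does to a shape when a single axis is moved; `movedim_shape` is a hypothetical helper, and the batch, channel, and sample counts are arbitrary assumptions.

```python
def movedim_shape(shape, source, destination):
    """Shape after moving one axis, mirroring torch.movedim for single int dims."""
    dims = list(shape)
    n = len(dims)
    size = dims.pop(source % n)         # take the axis out of its old position
    dims.insert(destination % n, size)  # put it back at the requested position
    return tuple(dims)


B, C, T = 1, 2, 48000                   # assumed example sizes
channels_first = (B, C, T)

# What the wrapper's final movedim(1, -1) does: channels move to the last axis.
channels_last = movedim_shape(channels_first, 1, -1)

# What the node's movedim(-1, 1) does: channels move back to axis 1.
restored = movedim_shape(channels_last, -1, 1)

print(channels_last)  # (1, 48000, 2)
print(restored)       # (1, 2, 48000)
```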
37-57: Neat delegation via super() — just flagging the argument reorder for future readers.

LTXVAudioVAEEncode keeps the node-level input order (audio, audio_vae) (matching the schema) while delegating to the base VAEEncodeAudio.execute(cls, vae, audio). The super().execute(audio_vae, audio) swap is correct and super() inside a classmethod resolves fine via the compiler-inserted __class__ cell.

One small ergonomic nit: if VAEEncodeAudio.execute ever grows a third parameter, this override will silently drop it. Consider *args, **kwargs forwarding if you want that to be future-proof — otherwise fine as-is.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_lt_audio.py` around lines 37 - 57, The override
LTXVAudioVAEEncode.execute currently accepts (cls, audio, audio_vae) and calls
super().execute(audio_vae, audio), which will drop any future extra parameters;
change the method signature to def execute(cls, audio, audio_vae, *args,
**kwargs) and forward all extra arguments when delegating to the base by calling
super().execute(audio_vae, audio, *args, **kwargs) so the parameter reorder is
explicit and future-proof; reference LTXVAudioVAEEncode.execute and
VAEEncodeAudio.execute when making this change.
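
The delegation pattern discussed in this nitpick can be sketched on its own. `BaseEncode` and `ReorderedEncode` below are hypothetical stand-ins for `VAEEncodeAudio` and `LTXVAudioVAEEncode`; the sketch only illustrates zero-argument `super()` inside a classmethod, the argument swap, and the optional `*args, **kwargs` forwarding.

```python
class BaseEncode:
    """Hypothetical base node: takes (vae, audio) in that order."""

    @classmethod
    def execute(cls, vae, audio, *args, **kwargs):
        return {"vae": vae, "audio": audio, "extra_kwargs": kwargs}


class ReorderedEncode(BaseEncode):
    """Hypothetical subclass: keeps the node-level order (audio, audio_vae)."""

    @classmethod
    def execute(cls, audio, audio_vae, *args, **kwargs):
        # Zero-argument super() works in a classmethod; the swap to the base
        # order (vae, audio) is explicit, and extra arguments pass through.
        return super().execute(audio_vae, audio, *args, **kwargs)


result = ReorderedEncode.execute("waveform", "my_vae", normalize=True)
print(result["vae"], result["audio"], result["extra_kwargs"])
# my_vae waveform {'normalize': True}
```

Whether the forwarding is worth the looser signature is a judgment call; without it, the override is simpler and any new base parameter fails loudly at the call site instead.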

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ef6e146e-e577-4eee-a7d9-b821ea6bec46

📥 Commits

Reviewing files that changed from the base of the PR and between c514890 and bfc70a7.

📒 Files selected for processing (4)

comfy/ldm/lightricks/vae/audio_vae.py
comfy/sd.py
comfy_extras/nodes_audio.py
comfy_extras/nodes_lt_audio.py

coderabbitai · 2026-04-21T03:05:30Z

+            elif "vocoder.resblocks.0.convs1.0.weight" in sd or "vocoder.vocoder.resblocks.0.convs1.0.weight" in sd: # LTX Audio
+                self.first_stage_model = comfy.ldm.lightricks.vae.audio_vae.AudioVAE(metadata=metadata)
+                self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
+                self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
+                self.latent_channels = self.first_stage_model.latent_channels
+                self.audio_sample_rate_output = self.first_stage_model.output_sample_rate
+                self.autoencoder = self.first_stage_model.autoencoder  # TODO: remove hack for ltxv custom nodes
+                self.output_channels = 2
+                self.pad_channel_value = "replicate"
+                self.upscale_ratio = 4096
+                self.downscale_ratio = 4096
+                self.latent_dim = 2
+                self.process_output = lambda audio: audio
+                self.process_input = lambda audio: audio
+                self.working_dtypes = [torch.float32]
+                self.disable_offload = True
+                self.extra_1d_channel = 16
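
To get a feel for the two memory estimates in this hunk, here is a small worked example. The formulas are copied from the diff; everything else is an assumption for illustration: `FP32_BYTES` stands in for `model_management.dtype_size(torch.float32)`, and the two shapes are made-up inputs, not values taken from the PR.

```python
FP32_BYTES = 4  # bytes per float32 element (stand-in for model_management.dtype_size)


def memory_used_encode(shape, dtype_bytes=FP32_BYTES):
    # Same formula as the diff: scales with the length of axis 2.
    return (shape[2] * 330) * dtype_bytes


def memory_used_decode(shape, dtype_bytes=FP32_BYTES):
    # Same formula as the diff: scales with the product of axes 2 and 3.
    return (shape[2] * shape[3] * 87000) * dtype_bytes


waveform_shape = (1, 2, 441000)  # assumed: batch, channels, samples
latent_shape = (1, 8, 100, 16)   # assumed 4-D latent shape

encode_bytes = memory_used_encode(waveform_shape)
decode_bytes = memory_used_decode(latent_shape)

print(f"encode estimate: {encode_bytes / 2**20:.1f} MiB")
print(f"decode estimate: {decode_bytes / 2**20:.1f} MiB")
```

Because `working_dtypes` is restricted to `torch.float32`, the per-element size in these estimates is fixed at 4 bytes for this branch.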


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for any other call-sites that construct the VAE for LTX audio checkpoints.
rg -nP --type=py -C3 '\bvocoder\.resblocks\.0\.convs1\.0\.weight\b|audio_vae\.'

Repository: Comfy-Org/ComfyUI

Length of output: 4008

🏁 Script executed:

#!/bin/bash
# Check all places where VAE class is instantiated with state_dict
rg -nP --type=py "VAE\(" -A 3 | head -60

Repository: Comfy-Org/ComfyUI

Length of output: 4603

🏁 Script executed:

#!/bin/bash
# Look for other places that might load LTX audio or use the detection keys
rg -nP --type=py "load_state_dict|AudioVAE" -B 2 -A 2

Repository: Comfy-Org/ComfyUI

Length of output: 22218

🏁 Script executed:

#!/bin/bash
# Check if there are other loaders or entry points that might construct VAE with vocoder keys
rg -nP --type=py "class.*VAE.*Loader|def.*load.*vae" -i

Repository: Comfy-Org/ComfyUI

Length of output: 1254

🏁 Script executed:

#!/bin/bash
# Check the generic VAELoader implementation
sed -n '729,850p' nodes.py

Repository: Comfy-Org/ComfyUI

Length of output: 4756

🏁 Script executed:

#!/bin/bash
# Check if there's any special handling for audio VAE detection in the generic loader
rg -nP --type=py "class VAELoader" -A 120 nodes.py | head -150

Repository: Comfy-Org/ComfyUI

Length of output: 5211

🏁 Script executed:

#!/bin/bash
# Check the full VAE class __init__ to understand when detection happens
sed -n '500,900p' comfy/sd.py | grep -A 200 "class VAE"

Repository: Comfy-Org/ComfyUI

Length of output: 43

🏁 Script executed:

#!/bin/bash
# Check if this code appears in git history or is newly added
git log --oneline -n 20 -- comfy/sd.py | head -10

Repository: Comfy-Org/ComfyUI

Length of output: 104

🏁 Script executed:

#!/bin/bash
# Check for any recent additions to the LTX audio detection
git log -p --all -S "vocoder.resblocks.0.convs1.0.weight" -- comfy/sd.py | head -100

Repository: Comfy-Org/ComfyUI

Length of output: 43

🏁 Script executed:

#!/bin/bash
# Show the details of that commit to see what changed
git show bfc70a7 --stat | head -50

Repository: Comfy-Org/ComfyUI

Length of output: 3016

🏁 Script executed:

#!/bin/bash
# See the actual changes in that commit for sd.py
git show bfc70a7 -- comfy/sd.py | head -200

Repository: Comfy-Org/ComfyUI

Length of output: 6850

🏁 Script executed:

#!/bin/bash
# Confirm the exact lines of the LTX audio block in this PR
git show bfc70a7:comfy/sd.py | sed -n '809,825p'

Repository: Comfy-Org/ComfyUI

Length of output: 1293

LTX Audio detection succeeds but state dict loading will fail via generic VAELoader due to missing prefix remapping.

The detection at line 809 matches vocoder.resblocks.0.convs1.0.weight, but AudioVAE's submodule is named self.autoencoder. The specialized LTXVAudioVAELoader handles this by remapping audio_vae. → autoencoder. before VAE construction. However, if a user loads the same checkpoint through the generic VAELoader (which calls VAE(sd=sd, ...) at nodes.py:830 without remapping), the state dict keys won't match the expected submodule structure. The load_state_dict call at line 850 will then fail quietly: it only logs "Missing VAE keys" warnings and leaves a non-functional audio VAE.

Either restrict the detection to trigger only when appropriate prefix remapping is applied, or add the same prefix normalization to this branch to make the generic loader path robust.
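
The second option (prefix normalization) could look roughly like the sketch below. It operates on a plain dict so it runs without a checkpoint. The detection keys and the `audio_vae.` → `autoencoder.` mapping come from this review; `looks_like_ltx_audio`, `remap_prefix`, and the `encoder.conv_in.weight` key are hypothetical and only illustrate the idea.

```python
DETECTION_KEYS = (
    "vocoder.resblocks.0.convs1.0.weight",
    "vocoder.vocoder.resblocks.0.convs1.0.weight",
)


def looks_like_ltx_audio(sd):
    """True if the state dict contains one of the LTX Audio detection keys."""
    return any(key in sd for key in DETECTION_KEYS)


def remap_prefix(sd, old_prefix, new_prefix):
    """Return a copy of sd with old_prefix replaced by new_prefix on matching keys."""
    remapped = {}
    for key, value in sd.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    return remapped


# Toy state dict; the values stand in for tensors.
sd = {
    "audio_vae.encoder.conv_in.weight": "tensor-a",  # hypothetical key name
    "vocoder.resblocks.0.convs1.0.weight": "tensor-b",
}

if looks_like_ltx_audio(sd):
    sd = remap_prefix(sd, "audio_vae.", "autoencoder.")

print(sorted(sd))
```

Keys that do not start with the old prefix are copied unchanged, so applying the remap to an already-normalized state dict is a no-op.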

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@comfy/sd.py` around lines 809 - 825, The branch that detects LTX audio keys (checking "vocoder.resblocks.0.convs1.0.weight") must normalize the state-dict prefixes like LTXVAudioVAELoader does so AudioVAE's expected submodule name (autoencoder) matches the keys; update the branch that constructs comfy.ldm.lightricks.vae.audio_vae.AudioVAE to first remap sd keys replacing the "audio_vae." (or "vocoder.vocoder." variants) prefix to "autoencoder." (or the exact submodule names AudioVAE expects) before calling VAE/AudioVAE and load_state_dict, or alternatively tighten the detection so it only triggers when the remapping has already been applied; reference AudioVAE, LTXVAudioVAELoader, VAELoader, VAE, load_state_dict and the sd dict when implementing the fix.

coderabbitai · 2026-04-21T03:05:30Z

+                self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
+                self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
+                self.latent_channels = self.first_stage_model.latent_channels
+                self.audio_sample_rate_output = self.first_stage_model.output_sample_rate


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

LTX Audio VAE autoencoder sampling rate

💡 Result:

The LTX Audio VAE autoencoder processes input waveforms converted to stereo audio at a sampling rate of 16 kHz. Mel-spectrograms are computed from this 16 kHz audio for encoding into latent representations. The decoder reconstructs mel-spectrograms at the same rate, which are then upsampled by a HiFi-GAN-based vocoder to a 24 kHz output waveform.

Citations:

1: https://arxiv.org/html/2601.03233v1

2: https://arxiv.org/pdf/2601.03233

3: https://github.com/Lightricks/LTX-2/blob/main/packages/ltx-core/README.md

4: https://videos.ltx.io/LTX-2/grants/LTX_2_Technical_Report_compressed.pdf

5: https://d6108366.hf-mirror.com/papers/2601.03233

6: https://www.aifasthub.com/papers/2601.03233

🏁 Script executed:

# Find and examine the AudioVAE class around the mentioned lines
cd comfy && grep -n "class AudioVAE" sd.py && head -n 830 sd.py | tail -n 100

Repository: Comfy-Org/ComfyUI

Length of output: 43

🏁 Script executed:

# Check line 469 to see the default audio_sample_rate
sed -n '465,475p' comfy/sd.py

Repository: Comfy-Org/ComfyUI

Length of output: 628

🏁 Script executed:

# Find the VAEEncodeAudio implementation
fd "nodes_audio" --type f && grep -n "class VAEEncodeAudio" comfy_extras/nodes_audio.py

Repository: Comfy-Org/ComfyUI

Length of output: 163

🏁 Script executed:

# Look at the VAEEncodeAudio.execute method
grep -A 30 "class VAEEncodeAudio" comfy_extras/nodes_audio.py

Repository: Comfy-Org/ComfyUI

Length of output: 1076

🏁 Script executed:

sed -n '800,830p' comfy/sd.py

Repository: Comfy-Org/ComfyUI

Length of output: 2426

🏁 Script executed:

# Find where first_stage_model.output_sample_rate comes from
grep -n "output_sample_rate" comfy/sd.py | head -20

Repository: Comfy-Org/ComfyUI

Length of output: 154

🏁 Script executed:

# Search for LTX or audio VAE model references
grep -n "LTX\|latent_channels\|first_stage_model" comfy/sd.py | grep -A 2 -B 2 "811\|812\|813\|814"

Repository: Comfy-Org/ComfyUI

Length of output: 644

🏁 Script executed:

# Find the AudioVAE class definition
find . -name "audio_vae.py" -type f

Repository: Comfy-Org/ComfyUI

Length of output: 100

🏁 Script executed:

# Check the AudioVAE class and encode method
fd "audio_vae" -type f && cat $(fd "audio_vae" -type f)

Repository: Comfy-Org/ComfyUI

Length of output: 231

🏁 Script executed:

# Read the audio_vae.py file
cat comfy/ldm/lightricks/vae/audio_vae.py

Repository: Comfy-Org/ComfyUI

Length of output: 9012

🏁 Script executed:

# Also check for any sample_rate or output_sample_rate references in that file
grep -n "sample_rate\|output_sample_rate" comfy/ldm/lightricks/vae/audio_vae.py

Repository: Comfy-Org/ComfyUI

Length of output: 1266

🏁 Script executed:

# Search for AudioOobleckVAE or similar pattern in comfy/sd.py
grep -n "AudioOobleckVAE\|Oobleck" comfy/sd.py | head -20

Repository: Comfy-Org/ComfyUI

Length of output: 185

🏁 Script executed:

# Look for similar audio VAE branches to see the pattern
grep -B 5 -A 10 "audio_sample_rate = " comfy/sd.py | head -50

Repository: Comfy-Org/ComfyUI

Length of output: 2175

Set audio_sample_rate to match the LTX Audio autoencoder's actual sampling rate.

The LTX Audio branch sets audio_sample_rate_output from first_stage_model.output_sample_rate but leaves self.audio_sample_rate at the inherited default of 44100. Since VAEEncodeAudio.execute uses vae.audio_sample_rate to determine whether to resample input audio, and AudioVAE.encode defaults its sample_rate parameter to 44100, this mismatch causes inefficient double-resampling: the preprocessor resamples to 44100 (per VAEEncodeAudio), then the autoencoder internally resamples again to its actual native rate (16 kHz per the model specification).

Mirror the AudioOobleckVAE branch pattern by setting:

+ self.audio_sample_rate = int(self.first_stage_model.sample_rate)

This ensures the encode path resamples directly to the autoencoder's native rate in a single step.
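
The double-resampling argument can be made concrete with a small sketch. `resample_steps` is a hypothetical helper that lists the rate conversions an input would go through if the wrapper first resamples to `vae.audio_sample_rate` and the autoencoder then resamples to its own rate. The 44100 default and the 16 kHz native rate are the figures quoted in this comment; the 48 kHz input is an assumption.

```python
def resample_steps(input_rate, wrapper_rate, native_rate):
    """List the (from, to) conversions applied before the audio reaches the encoder."""
    steps = []
    current = input_rate
    for target in (wrapper_rate, native_rate):
        if current != target:  # a stage only resamples when the rates differ
            steps.append((current, target))
            current = target
    return steps


INPUT_RATE = 48000   # assumed input file rate
NATIVE_RATE = 16000  # autoencoder rate quoted in the comment above

# Wrapper left at the inherited 44100 default: two conversions.
print(resample_steps(INPUT_RATE, 44100, NATIVE_RATE))
# [(48000, 44100), (44100, 16000)]

# Wrapper rate set to the native rate: one conversion.
print(resample_steps(INPUT_RATE, NATIVE_RATE, NATIVE_RATE))
# [(48000, 16000)]
```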

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

                 self.memory_used_encode = lambda shape, dtype: (shape[2] * 330) * model_management.dtype_size(dtype)
                 self.memory_used_decode = lambda shape, dtype: (shape[2] * shape[3] * 87000) * model_management.dtype_size(dtype)
                 self.latent_channels = self.first_stage_model.latent_channels
                 self.audio_sample_rate_output = self.first_stage_model.output_sample_rate
+                self.audio_sample_rate = int(self.first_stage_model.sample_rate)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@comfy/sd.py` around lines 811 - 814, The audio pipeline is resampling twice because self.audio_sample_rate remains at the 44100 default while audio_sample_rate_output is set from first_stage_model.output_sample_rate; to fix, set self.audio_sample_rate to the autoencoder's native rate too (i.e., assign self.audio_sample_rate = int(self.first_stage_model.sample_rate) adjacent to where audio_sample_rate_output is set) so VAEEncodeAudio.execute and AudioVAE.encode will resample directly to the model's native sample rate (mirror the AudioOobleckVAE pattern).

alexisrolland · 2026-04-21T04:50:06Z

Tested both example workflows (t2v and i2v); the before/after comparison gives the same results.

zwukong · 2026-04-22T04:41:11Z

The video VAE is loaded by VAELoader KJ; maybe it needs a conversion too.

Make the ltx audio vae more native.

bfc70a7

comfyanonymous requested review from Kosinkadink and guill as code owners April 21, 2026 02:58

comfyanonymous mentioned this pull request Apr 21, 2026

VocoderWithBWE: run forward pass in fp32. #13219

Open

alexisrolland changed the title ~~Make the ltx audio vae more native.~~ Make the ltx audio vae more native (CORE-76) Apr 21, 2026

coderabbitai Bot reviewed Apr 21, 2026

rattus128 approved these changes Apr 21, 2026

comfyanonymous merged commit ad94d47 into master Apr 21, 2026
16 checks passed

comfyanonymous deleted the temp_pr branch April 26, 2026 00:52
