Nemotron super support by angeloskath · Pull Request #992 · ml-explore/mlx-lm · GitHub

angeloskath · 2026-03-13T18:25:59Z

As title. Will add more as the port progresses.

Thump604 · 2026-03-14T18:44:17Z

Tested alongside PR #988 on Mac Studio M2 Ultra (128GB) with inferencerlabs/NVIDIA-Nemotron-3-Super-120B-A12B-MLX-4.5bit.

Combined branch (this PR + #988) loads and runs correctly:

Model loads without errors, 88 layers parsed from hybrid_override_pattern string
LatentMoE projection (4096 → 1024 → experts → 1024 → 4096) working correctly
512 routed experts with top-22 selection functional
48.7 tok/s generation, 68.2 GB peak RAM on M2 Ultra
Also tested via vllm-mlx serving stack: chat, tool calling, streaming all pass
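
For readers unfamiliar with the latent projection path, here is a rough pure-Python sketch of the shape flow (hidden → latent → experts → latent → hidden) with top-k routing. It is not the code from this PR: the tiny dimensions, the ReLU activation, the softmax over the selected scores, and the helper names (`matvec`, `rand_matrix`, `latent_moe_forward`) are all assumptions made for illustration. Only the `fc1_latent_proj` / `fc2_latent_proj` names and their (latent, hidden) / (hidden, latent) weight shapes come from the tests in this thread.

```python
import math
import random


def matvec(weight, vec):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def rand_matrix(rows, cols, rng):
    """Build a small random matrix as nested lists (illustrative weights only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def latent_moe_forward(x, params, top_k):
    # Assumption: the router scores experts from the full hidden vector.
    scores = matvec(params["router"], x)
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Assumption: softmax over the selected scores only (a simplification).
    peak = max(scores[i] for i in chosen)
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    gates = [e / total for e in exps]

    # hidden -> latent
    z = matvec(params["fc1_latent_proj"], x)
    # The selected experts run entirely in the latent space.
    mixed = [0.0] * len(z)
    for gate, idx in zip(gates, chosen):
        up, down = params["experts"][idx]
        h = [max(0.0, v) for v in matvec(up, z)]  # ReLU as a stand-in activation
        out = matvec(down, h)
        mixed = [m + gate * o for m, o in zip(mixed, out)]
    # latent -> hidden
    return matvec(params["fc2_latent_proj"], mixed)


rng = random.Random(0)
HIDDEN, LATENT, INTER, N_EXPERTS, TOP_K = 8, 4, 6, 4, 2  # toy sizes, not 4096/1024/512/22
params = {
    "router": rand_matrix(N_EXPERTS, HIDDEN, rng),
    "fc1_latent_proj": rand_matrix(LATENT, HIDDEN, rng),
    "fc2_latent_proj": rand_matrix(HIDDEN, LATENT, rng),
    "experts": [
        (rand_matrix(INTER, LATENT, rng), rand_matrix(LATENT, INTER, rng))
        for _ in range(N_EXPERTS)
    ],
}
x = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]
y = latent_moe_forward(x, params, TOP_K)
print(len(x), "->", LATENT, "->", len(y))
```

The point of the sketch is only that the expert matrices are sized by the latent width rather than the hidden width, which is where the savings with many routed experts would come from.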

Note: This PR requires #988 (SSM precision fix) to produce coherent output. Without it, the Metal decode kernel degenerates after ~15 tokens.

Minor note: the hybrid_override_pattern config field comes as a string from the HuggingFace config.json (not a list). Although the field is annotated List[str], iterating a Python string yields its characters one at a time, so each layer symbol is still read correctly. Works fine as-is.
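
A small sketch of why the string form behaves like the list form. The helper name `layer_types` is hypothetical and not part of nemotron_h.py; the pattern "ME*E" is the one used in the tests in this thread.

```python
def layer_types(pattern):
    """Return one block-type symbol per layer for a str or list pattern."""
    # Iterating a str yields single characters, so both forms agree
    # as long as every layer symbol is exactly one character long.
    return list(pattern)


from_string = layer_types("ME*E")
from_list = layer_types(["M", "E", "*", "E"])
print(from_string == from_list)
```

This equivalence would stop holding if a layer symbol ever needed more than one character, which is presumably why it is worth a test.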

Thump604 · 2026-03-14T19:10:09Z

Here are unit tests for the LatentMoE additions. 11 tests, all passing on mlx 0.31.1 / Python 3.12.

Tests cover:

ModelArgs parsing: moe_latent_size, layers_block_type → hybrid_override_pattern normalization, hybrid_override_pattern as string (HuggingFace config format), time_step_limit inf upper bound
NemotronHMoE forward pass: latent projection shapes (hidden→latent→experts→latent→hidden), no-projection fallback, fc1/fc2_latent_proj layer shapes, shared expert receiving original residuals (not latent-projected input)
sanitize(): MTP weight stripping
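
To make the layers_block_type → hybrid_override_pattern normalization concrete, here is a minimal sketch. The function name and the mapping table are assumptions: only the three entries exercised by the test (mamba → M, moe → E, attention → *) are shown, and the real implementation may handle more block types.

```python
# Hypothetical mapping, limited to the block types that appear in the tests.
BLOCK_TYPE_TO_SYMBOL = {"mamba": "M", "moe": "E", "attention": "*"}


def normalize_layers_block_type(layers_block_type):
    """Convert long block-type names into single-character pattern symbols."""
    return [BLOCK_TYPE_TO_SYMBOL[name] for name in layers_block_type]


pattern = normalize_layers_block_type(["mamba", "moe", "attention", "moe"])
print(pattern, len(pattern))
```

The length of the resulting pattern is what the test compares against num_hidden_layers.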

Test file: tests/test_nemotron_latentmoe.py

"""Tests for Nemotron-H LatentMoE support (PR #992).

Tests the additions to nemotron_h.py:
- ModelArgs: moe_latent_size, layers_block_type normalization, time_step_limit defaults
- NemotronHMoE: latent projection forward pass
- Model.sanitize: MTP weight stripping
"""
import unittest

import mlx.core as mx
import mlx.nn as nn

from mlx_lm.models.nemotron_h import Model, ModelArgs, NemotronHMoE


class TestModelArgsLatentMoE(unittest.TestCase):
    """Test ModelArgs parsing for Nemotron Super config fields."""

    def _base_args(self, **overrides):
        cfg = {
            "model_type": "nemotron_h",
            "vocab_size": 1000,
            "hidden_size": 128,
            "intermediate_size": 64,
            "num_hidden_layers": 4,
            "max_position_embeddings": 1000,
            "num_attention_heads": 4,
            "num_key_value_heads": 2,
            "attention_bias": False,
            "mamba_num_heads": 4,
            "mamba_head_dim": 32,
            "mamba_proj_bias": False,
            "ssm_state_size": 32,
            "conv_kernel": 4,
            "n_groups": 2,
            "time_step_min": 0.001,
            "mlp_bias": False,
            "layer_norm_epsilon": 1e-5,
            "use_bias": False,
            "use_conv_bias": True,
            "hybrid_override_pattern": ["M", "E", "*", "E"],
            "n_routed_experts": 8,
            "num_experts_per_tok": 2,
            "moe_intermediate_size": 64,
        }
        cfg.update(overrides)
        return ModelArgs(**cfg)

    def test_moe_latent_size_parsed(self):
        args = self._base_args(moe_latent_size=32)
        self.assertEqual(args.moe_latent_size, 32)

    def test_moe_latent_size_none_by_default(self):
        args = self._base_args()
        self.assertIsNone(args.moe_latent_size)

    def test_layers_block_type_normalization(self):
        args = self._base_args(
            hybrid_override_pattern=None,
            layers_block_type=["mamba", "moe", "attention", "moe"],
        )
        self.assertEqual(args.hybrid_override_pattern, ["M", "E", "*", "E"])
        self.assertEqual(args.num_hidden_layers, 4)

    def test_hybrid_override_pattern_string(self):
        args = self._base_args(hybrid_override_pattern="ME*E")
        self.assertEqual(len(args.hybrid_override_pattern), 4)
        self.assertEqual(list(args.hybrid_override_pattern), ["M", "E", "*", "E"])

    def test_time_step_limit_no_upper_bound(self):
        args = self._base_args(time_step_min=0.001)
        self.assertEqual(args.time_step_limit[0], 0.001)
        self.assertEqual(args.time_step_limit[1], float("inf"))

    def test_time_step_limit_explicit_overrides(self):
        args = self._base_args(time_step_limit=(0.01, 0.5), time_step_min=0.001)
        self.assertEqual(args.time_step_limit, (0.01, 0.5))


class TestNemotronHMoELatent(unittest.TestCase):

    def _make_config(self, moe_latent_size=None):
        return self._base_args(moe_latent_size=moe_latent_size)

    def _base_args(self, **overrides):
        cfg = {
            "model_type": "nemotron_h",
            "vocab_size": 1000,
            "hidden_size": 64,
            "intermediate_size": 32,
            "num_hidden_layers": 2,
            "max_position_embeddings": 512,
            "num_attention_heads": 4,
            "num_key_value_heads": 2,
            "attention_bias": False,
            "mamba_num_heads": 4,
            "mamba_head_dim": 16,
            "mamba_proj_bias": False,
            "ssm_state_size": 16,
            "conv_kernel": 4,
            "n_groups": 2,
            "time_step_min": 0.001,
            "mlp_bias": False,
            "layer_norm_epsilon": 1e-5,
            "use_bias": False,
            "use_conv_bias": True,
            "hybrid_override_pattern": ["E", "E"],
            "n_routed_experts": 4,
            "num_experts_per_tok": 2,
            "moe_intermediate_size": 32,
            "n_group": 1,
            "topk_group": 1,
            "routed_scaling_factor": 1.0,
            "norm_topk_prob": True,
        }
        cfg.update(overrides)
        return ModelArgs(**cfg)

    def test_latent_projection_shapes(self):
        config = self._make_config(moe_latent_size=16)
        moe = NemotronHMoE(config)
        mx.eval(moe.parameters())
        x = mx.random.normal((1, 1, 64))
        y = moe(x)
        mx.eval(y)
        self.assertEqual(y.shape, (1, 1, 64))

    def test_no_latent_projection(self):
        config = self._make_config(moe_latent_size=None)
        moe = NemotronHMoE(config)
        mx.eval(moe.parameters())
        x = mx.random.normal((1, 1, 64))
        y = moe(x)
        mx.eval(y)
        self.assertEqual(y.shape, (1, 1, 64))

    def test_latent_projection_has_layers(self):
        config = self._make_config(moe_latent_size=16)
        moe = NemotronHMoE(config)
        self.assertTrue(hasattr(moe, "fc1_latent_proj"))
        self.assertTrue(hasattr(moe, "fc2_latent_proj"))
        self.assertEqual(moe.fc1_latent_proj.weight.shape, (16, 64))
        self.assertEqual(moe.fc2_latent_proj.weight.shape, (64, 16))

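    def test_shared_expert_gets_original_input(self):
        # Smoke test: builds the MoE with one shared expert enabled and checks
        # the output shape only. It does not inspect which tensor the shared
        # expert actually receives.
        config = self._make_config(moe_latent_size=16)
        config.n_shared_experts = 1
        config.moe_shared_expert_intermediate_size = 32
        moe = NemotronHMoE(config)
        mx.eval(moe.parameters())
        x = mx.random.normal((1, 1, 64))
        y = moe(x)
        mx.eval(y)
        self.assertEqual(y.shape, (1, 1, 64))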


class TestSanitizeMTP(unittest.TestCase):

    def test_mtp_weights_stripped(self):
        config = ModelArgs(
            model_type="nemotron_h", vocab_size=100, hidden_size=64,
            intermediate_size=32, num_hidden_layers=2, max_position_embeddings=256,
            num_attention_heads=4, num_key_value_heads=2, attention_bias=False,
            mamba_num_heads=4, mamba_head_dim=16, mamba_proj_bias=False,
            ssm_state_size=16, conv_kernel=4, n_groups=2, time_step_min=0.001,
            mlp_bias=False, layer_norm_epsilon=1e-5, use_bias=False,
            use_conv_bias=True, hybrid_override_pattern=["*", "M"],
        )
        model = Model(config)
        weights = {
            "model.embed_tokens.weight": mx.zeros((100, 64)),
            "model.layers.0.norm.weight": mx.zeros((64,)),
            "mtp.layers.0.weight": mx.zeros((64, 64)),
            "mtp.head.weight": mx.zeros((100, 64)),
        }
        sanitized = model.sanitize(weights)
        self.assertNotIn("mtp.layers.0.weight", sanitized)
        self.assertNotIn("mtp.head.weight", sanitized)
        self.assertIn("model.embed_tokens.weight", sanitized)
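
For context on what the last test expects from sanitize(), here is a standalone sketch of prefix-based stripping. The function `strip_mtp_weights` is hypothetical and assumes MTP weights can be identified by an "mtp." key prefix, as in the test above; the real sanitize() may do more than this.

```python
def strip_mtp_weights(weights):
    """Drop every entry whose key starts with the assumed 'mtp.' prefix."""
    return {key: value for key, value in weights.items() if not key.startswith("mtp.")}


# Placeholder values stand in for arrays; only the keys matter here.
example = {
    "model.embed_tokens.weight": 0,
    "model.layers.0.norm.weight": 1,
    "mtp.layers.0.weight": 2,
    "mtp.head.weight": 3,
}
kept = strip_mtp_weights(example)
print(sorted(kept))
```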

angeloskath · 2026-03-15T22:36:21Z

Hi @Thump604, I don't see the same thing: #988 is not needed to produce coherent output. I am trying to find a way to gauge whether keeping the state in fp32 has any effect at all.

Contrary to #997, there is a speed regression from moving the SSM to fp32 because the batched path does not go through a custom kernel. I will likely still move it to fp32 in this PR just to be more compatible with other implementations.

Thump604 · 2026-03-16T00:48:51Z

Thanks for consolidating into this PR.

On the fp32 state question — I'll put together a comparison with before/after output from the 4.5-bit Nemotron quant on M2 Ultra. The degradation we observed was during autoregressive generation: output became incoherent after ~15 tokens with bf16 state, coherent with fp32. It's possible the effect is more pronounced with aggressive quantization (4.5-bit) than with higher-precision weights — I'll capture concrete samples and share them here.

Glad to hear you're planning to adopt fp32 regardless for cross-implementation compatibility.

angeloskath · 2026-03-16T01:15:22Z

I did try 4.5, 5, 6.5 and 8.5 bpw and it didn't really make a noticeable difference, so I would love to have a reproducible issue.

Thump604 · 2026-03-16T01:17:34Z

Here's the bf16 vs fp32 state comparison you asked about. Tested on M2 Ultra 128GB with the 5-bit Nemotron-3-Super-120B-A12B quant, temp=1.0, top_p=0.95, 200 tokens, 3 trials each. The monkey-patch reverts only the state precision (dt cast + output dtype + Metal kernel static_cast<T>), keeping the dt lower-bound-only clamp from this PR.

Prompt: "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots."

fp32 state (3/3 coherent)

All three trials immediately begin solving the equation with correct math:

Trial 1: Directly applies quadratic formula → correct roots (5/3 and -4)
Trial 2: Step-by-step with correct discriminant (289), correct coefficients
Trial 3: Structured step-by-step, correct discriminant calculation
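
For reference when judging the trials, the expected values can be checked with exact rational arithmetic. This is just a sanity check of the arithmetic in the prompt, written with the standard library; the variable names are arbitrary.

```python
from fractions import Fraction
from math import isqrt

a, b, c = 3, 7, -20
discriminant = b * b - 4 * a * c
sqrt_disc = isqrt(discriminant)
# The discriminant must be a perfect square for the roots to be rational.
assert sqrt_disc * sqrt_disc == discriminant

roots = [Fraction(-b + sign * sqrt_disc, 2 * a) for sign in (1, -1)]
# Substitute each root back into 3x^2 + 7x - 20 to confirm it evaluates to zero.
residuals = [a * r * r + b * r + c for r in roots]
print(discriminant, roots, residuals)
```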

bf16 state (1/3 failed, 1/3 noisy, 1/3 coherent)

Trial 1 — failure: Model never solves the equation. Instead generates additional prompt-like text:

If the equation has no real roots, explain why.

Calculate:
1. The discriminant (D)
2. The first root (x₁)
3. The second root (x₂)

Answer format:
Step-by-step explanation:
[your explanation]
...

Then meta-reasons about the format rather than answering.

Trial 3 — noisy: Blurts answer first, then generates user-like instructions ("I need to see the step-by-step solution clearly. Use the quadratic formula or factoring method.") before re-solving.

Trial 2 — correct: Coherent step-by-step solution.

Analysis

The bf16 failure mode is "prompt regeneration" — the model generates text that looks like additional user instructions rather than its own response, as if it loses track of the conversation boundary. This is consistent with SSM recurrence state precision loss: small errors compound across decode steps, causing the model's internal state to drift enough that it confuses the user/assistant boundary.

The effect is intermittent (1-2 out of 3 trials), not deterministic, and manifests as contextual confusion rather than gibberish. It may be more pronounced with aggressive quantization (5-bit weights amplify the bf16 state precision gap).

fp32 state shows no instances of this behavior across all trials.
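
To illustrate the compounding mechanism in isolation, here is a toy scalar recurrence whose state is rounded every step, once to float32 and once to an emulated bfloat16. This is not the Mamba kernel and says nothing about whether the effect is visible in real generations: the decay and drive constants, the step count, and the bit-level bf16 emulation are all assumptions chosen for illustration.

```python
import struct


def to_fp32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x):
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Uses round-to-nearest-even; intended for small finite inputs only.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def run_recurrence(round_state, steps, decay=0.999, drive=0.01):
    """Iterate h <- round(decay * h + drive), rounding the state each step."""
    h = 0.0
    for _ in range(steps):
        h = round_state(decay * h + drive)
    return h


STEPS = 200
reference = run_recurrence(lambda v: v, STEPS)  # double-precision reference
err_fp32 = abs(run_recurrence(to_fp32, STEPS) - reference)
err_bf16 = abs(run_recurrence(to_bf16, STEPS) - reference)
print(f"fp32 state error: {err_fp32:.3e}")
print(f"bf16 state error: {err_bf16:.3e}")
```

In this toy setting the bf16 state ends up much further from the reference than the fp32 state, because per-step updates smaller than the bf16 spacing are partly or wholly rounded away. Whether that translates into a behavioral difference for a full model is a separate empirical question.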

angeloskath · 2026-03-16T02:39:07Z

Well in all honesty I don't see a difference at all. Let me know if you can tell which is which without looking at the inference speed from the two runs below 🤷‍♂️

Run 1

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic equation 3x^2 + 7x -20 =0. Show steps: discriminant, roots using quadratic formula. Provide step-by-step. Should be straightforward. Ensure we show discriminant D = b^2 -4ac = 7^2 -4*3*(-20)=49+240=289. sqrt(D)=17. Then x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So solutions: (-7+17)/6 =10/6=5/3. (-7-17)/6 = -24/6 = -4. So roots: x =5/3, x = -4. Provide explanation. Also maybe factorization: (3x-5)(x+4)=0? Check: (3x-5)(x+4)=3x^2+12x-5x-20=3x^2+7x-20 correct. So roots x=5/3, x=-4. Provide steps.

</think>

**Quadratic equation**

\[
3x^{2}+7x-20=0
\]

We will solve it step‑by‑step using the quadratic formula and verify the result by factoring.

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

\[
a = 3,\qquad b = 7,\qquad c = -20 .
\]

---

### 2. Compute the discriminant  

\[
\Delta = b^{2}-4ac
       = 7^{2}-4(3)(-20)
       = 49 + 240
       = 289 .
\]

The discriminant is positive, so we expect two distinct real roots.

---

### 3. Take the square‑root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 4. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}
   =\frac{-7\pm 17}{2\cdot 3}
   =\frac{-7\pm 17}{6}.
\]

Now evaluate the two possibilities:

* **Plus sign**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Minus sign**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
\]

---

### 5. Verify by factoring (optional)

If the roots are \(x=\frac{5}{3}\) and \(x=-4\), the quadratic can be written as

\[
3\Bigl(x-\frac{5}{3}\Bigr)(x+4)=0
\]

Multiplying out:

\[
3\Bigl(x-\frac{5}{3}\Bigr)(x+4)=3\Bigl(x^{2}+4x-\frac{5}{3}x-\frac{20}{3}\Bigr)
=3x^{2}+12x-5x-20=3x^{2}+7x-20,
\]

which matches the original equation, confirming the solutions.

---

## Final answer

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]
==========
Prompt: 56 tokens, 238.093 tokens-per-sec
Generation: 737 tokens, 57.041 tokens-per-sec
Peak memory: 68.670 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x -20 =0. Provide step-by-step: compute discriminant D = b^2 -4ac = 7^2 -4*3*(-20) =49 +240 =289. sqrt D =17. Then roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(2*3) = (-7+17)/6 =10/6=5/3; (-7-17)/6 = -24/6 = -4. So roots x=5/3 and x=-4. Show working. Also mention factoring maybe: (3x-5)(x+4)=0? Let's check: (3x-5)(x+4)=3x^2+12x-5x-20 =3x^2+7x-20 correct. So roots: x=5/3, x=-4. Provide step-by-step. Ensure clarity.

</think>

**Quadratic equation**

\[
3x^{2}+7x-20=0
\]

We will solve it using the quadratic‑formula method, showing every intermediate step, including the discriminant.

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

\[
a = 3,\qquad b = 7,\qquad c = -20 .
\]

---

### 2. Compute the discriminant  

The discriminant \( \Delta \) tells us the nature of the roots:

\[
\Delta = b^{2}-4ac .
\]

Plug in the values:

\[
\begin{aligned}
\Delta &= 7^{2}-4\,(3)\,(-20) \\
       &= 49 - 4\cdot3\cdot(-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289 .
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \(\Delta>0\) and is a perfect square, we will obtain two distinct rational roots.

---

### 3. Take the square‑root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 4. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}.
\]

Insert \(b=7\), \(\sqrt{\Delta}=17\), and \(a=3\):

\[
x=\frac{-7\pm 17}{2\cdot 3}
   =\frac{-7\pm 17}{6}.
\]

Now evaluate the two possibilities.

* **Plus sign**  

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Minus sign**  

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
\]

---

### 5. Solution set  

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]

---

### 6. Quick check (optional)  

Factor the original quadratic using the roots:

\[
3x^{2}+7x-20 = 3\bigl(x-\tfrac{5}{3}\bigr)(x+4)
             = (3x-5)(x+4),
\]

which expands back to \(3x^{2}+7x-20\). The factorization confirms the roots are correct.
==========
Prompt: 56 tokens, 238.656 tokens-per-sec
Generation: 780 tokens, 57.256 tokens-per-sec
Peak memory: 68.673 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289 = 17^2. Then roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3 ≈1.6667. x2 = (-7 - 17)/6 = -24/6 = -4. So answer. Show steps. Provide explanation.

</think>

**Quadratic equation**

\[
3x^{2}+7x-20=0
\]

A quadratic \(ax^{2}+bx+c=0\) is solved with the quadratic formula  

\[
x=\frac{-b\pm\sqrt{b^{2}-4ac}}{2a},
\]

where the term under the square‑root, \(b^{2}-4ac\), is called the **discriminant** (Δ).

---

### 1. Identify the coefficients  

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant  

\[
\Delta = b^{2}-4ac = 7^{2}-4(3)(-20)
\]

\[
\begin{aligned}
7^{2} &= 49\\[2pt]
-4ac &= -4\cdot 3 \cdot (-20)= +240\\[2pt]
\Delta &= 49 + 240 = 289
\end{aligned}
\]

\[
\boxed{\Delta = 289 = 17^{2}}
\]

Because the discriminant is a perfect square, the roots will be rational numbers.

---

### 3. Apply the quadratic formula  

\[
x = \frac{-b\pm\sqrt{\Delta}}{2a}
   = \frac{-7\pm 17}{2\cdot 3}
   = \frac{-7\pm 17}{6}
\]

---

### 4. Compute the two possible values  

**First root (using +):**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}\approx 1.6667
\]

**Second root (using −):**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4
\]

---

## Solution

\[
\boxed{x_{1}= \frac{5}{3}\quad\text{or}\quad x_{2}= -4}
\]

Both satisfy the original equation:

- For \(x=\frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+\frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0\).

- For \(x=-4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48=0\).
==========
Prompt: 56 tokens, 230.539 tokens-per-sec
Generation: 737 tokens, 56.958 tokens-per-sec
Peak memory: 68.677 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2+7x-20=0. Provide step-by-step: compute discriminant D = b^2-4ac = 7^2-4*3*(-20)=49+240=289. sqrt(D)=17. Roots = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(2*3)=[-7+17]/6=10/6=5/3; [-7-17]/6=-24/6=-4. So solutions: x=5/3 and x=-4. Provide explanation. Also maybe factor? 3x^2+7x-20 = (3x-5)(x+4)? Let's check: (3x-5)(x+4)=3x^2+12x-5x-20=3x^2+7x-20 correct. So roots: x=5/3, x=-4.

Will provide step-by-step.

</think>

**Quadratic equation**

\[
3x^{2}+7x-20=0
\]

A quadratic \(ax^{2}+bx+c=0\) is solved with the quadratic formula  

\[
x=\frac{-b\pm\sqrt{b^{2}-4ac}}{2a},
\]

where the term under the square‑root, \(D=b^{2}-4ac\), is called the **discriminant**.

---

### 1. Identify the coefficients  

\[
a = 3,\qquad b = 7,\qquad c = -20.
\]

---

### 2. Compute the discriminant  

\[
\begin{aligned}
D &= b^{2}-4ac \\
  &= 7^{2}-4\cdot 3\cdot (-20) \\
  &= 49 - (-240) \\
  &= 49 + 240 \\
  &= 289.
\end{aligned}
\]

\(289\) is a perfect square: \(\sqrt{289}=17\).

---

### 3. Apply the quadratic formula  

\[
\begin{aligned}
x &= \frac{-b\pm\sqrt{D}}{2a} \\
  &= \frac{-7\pm 17}{2\cdot 3} \\
  &= \frac{-7\pm 17}{6}.
\end{aligned}
\]

Now evaluate the two possibilities:

* **Plus sign**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Minus sign**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
\]

---

### 4. Result  

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]

(You can verify by factoring: \(3x^{2}+7x-20 = (3x-5)(x+4)=0\), giving the same roots.)
==========
Prompt: 56 tokens, 236.719 tokens-per-sec
Generation: 657 tokens, 57.259 tokens-per-sec
Peak memory: 68.677 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x -20 =0. Compute discriminant: b^2 -4ac = 7^2 -4*3*(-20)=49 +240=289. sqrt =17. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 +17)/6 =10/6=5/3. x2 = (-7 -17)/6 = -24/6 = -4. Show steps. Provide explanation.

</think>

**Equation:**  

\[
3x^{2}+7x-20=0
\]

This is a quadratic equation of the form \(ax^{2}+bx+c=0\) with  

\[
a = 3,\qquad b = 7,\qquad c = -20 .
\]

---

### 1. Compute the discriminant  

The discriminant \( \Delta \) tells us how many real solutions the quadratic has:

\[
\Delta = b^{2}-4ac .
\]

Plug in the coefficients:

\[
\begin{aligned}
\Delta &= 7^{2} - 4(3)(-20) \\
       &= 49 - \bigl( -240 \bigr) \\
       &= 49 + 240 \\
       &= 289 .
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \(\Delta>0\) and is a perfect square, there are two distinct rational roots.

---

### 2. Take the square root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 3. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}.
\]

Insert \(b=7\), \(\sqrt{\Delta}=17\), and \(a=3\):

\[
x=\frac{-7\pm 17}{2\cdot 3}= \frac{-7\pm 17}{6}.
\]

---

### 4. Compute the two possible values  

*Using the plus sign:*  

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

*Using the minus sign:*  

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
\]

---

## Solution set  

\[
\boxed{\displaystyle x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]

Both values satisfy the original equation:

- For \(x=\frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+\frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0\).

- For \(x=-4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48=0\).

Thus the equation has the two real roots \(\displaystyle \frac{5}{3}\) and \(-4\).
==========
Prompt: 56 tokens, 238.276 tokens-per-sec
Generation: 725 tokens, 57.041 tokens-per-sec
Peak memory: 68.673 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 +7x -20 =0. Compute discriminant D = b^2 -4ac = 7^2 -4*3*(-20) =49 +240 =289. sqrt(D)=17. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ±17]/(6). So x1 = (-7+17)/6 =10/6=5/3. x2 = (-7-17)/6 = -24/6 = -4. So answer with steps.

We need to show discriminant calculation and both roots. Provide step-by-step.

</think>

**Step‑by‑step solution of \(3x^{2}+7x-20=0\)**  

---

### 1. Identify the coefficients  
For a quadratic \(ax^{2}+bx+c=0\):

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant \(D\)

\[
D = b^{2}-4ac
\]

\[
\begin{aligned}
D &= 7^{2} - 4\,(3)\,(-20)   \\
  &= 49 - (12)(-20)        \\
  &= 49 + 240              \\
  &= 289
\end{aligned}
\]

\(D = 289\) is a perfect square, so the roots will be rational.

---

### 3. Take the square‑root of the discriminant  

\[
\sqrt{D}= \sqrt{289}= 17
\]

---

### 4. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{D}}{2a}
\]

Insert the values:

\[
x = \frac{-7 \pm 17}{2\cdot 3}= \frac{-7 \pm 17}{6}
\]

---

### 5. Compute the two roots  

* **First root (using \(+\)):**  

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}
\]

* **Second root (using \(-\)):**  

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4
\]

---

### 6. Final answer  

\[
\boxed{x = \dfrac{5}{3}\quad\text{or}\quad x = -4}
\]

Both satisfy the original equation:

- For \(x=\frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+ \frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0\).

- For \(x=-4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48=0\).
==========
Prompt: 56 tokens, 237.646 tokens-per-sec
Generation: 690 tokens, 56.946 tokens-per-sec
Peak memory: 68.673 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x -20 =0. Compute discriminant: b^2 -4ac = 7^2 -4*3*(-20)=49 +240=289. sqrt=17. Roots: (-b ± sqrt)/2a = (-7 ±17)/(2*3)=(-7+17)/6 =10/6=5/3; (-7-17)/6 = -24/6 = -4. So roots: x = 5/3, -4.

We need to show step-by-step, discriminant calculation, both roots. Provide explanation.

</think>

**Equation**

\[
3x^{2}+7x-20=0
\]

This is a quadratic equation of the form \(ax^{2}+bx+c=0\) with  

\[
a=3,\qquad b=7,\qquad c=-20 .
\]

---

### 1. Compute the discriminant  

The discriminant \( \Delta \) tells us how many real solutions exist:

\[
\Delta = b^{2}-4ac .
\]

Plugging in the coefficients:

\[
\begin{aligned}
\Delta &= 7^{2} - 4\,(3)\,(-20)  \\
       &= 49 - ( -240 )          \\
       &= 49 + 240               \\
       &= 289 .
\end{aligned}
\]

Since \(\Delta > 0\) and is a perfect square (\(289 = 17^{2}\)), the equation has **two distinct rational roots**.

---

### 2. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}.
\]

Insert \(b=7\), \(\sqrt{\Delta}=17\), and \(a=3\):

\[
x = \frac{-7 \pm 17}{2\cdot 3}
   = \frac{-7 \pm 17}{6}.
\]

---

### 3. Compute each root  

- **Using the plus sign**:

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

- **Using the minus sign**:

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4.
\]

---

## Solution

\[
\boxed{x = \frac{5}{3}\quad \text{or}\quad x = -4}
\]

Both values satisfy the original equation:

- For \(x=\frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+ \frac{35}{3}-20 = \frac{75}{9}+ \frac{105}{9}-\frac{180}{9}=0\).

- For \(x=-4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48=0\).
==========
Prompt: 56 tokens, 236.769 tokens-per-sec
Generation: 683 tokens, 57.203 tokens-per-sec
Peak memory: 68.673 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289 = 17^2. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3 ≈1.6667. x2 = (-7 - 17)/6 = -24/6 = -4. So answer with steps. Provide explanation.

</think>

**Step‑by‑step solution of**  

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

\[
a = 3,\qquad b = 7,\qquad c = -20.
\]

---

### 2. Compute the discriminant  

The discriminant \( \Delta \) tells us the nature of the roots:

\[
\Delta = b^{2}-4ac.
\]

Plugging in the values:

\[
\begin{aligned}
\Delta &= 7^{2} - 4(3)(-20) \\
       &= 49 - ( -240) \\
       &= 49 + 240 \\
       &= 289.
\end{aligned}
\]

Since \(289 = 17^{2}\) is a perfect square, the equation will have two **real and rational** roots.

---

### 3. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}.
\]

Insert \(b = 7\), \(\sqrt{\Delta}=17\), and \(a = 3\):

\[
x = \frac{-7 \pm 17}{2\cdot 3} = \frac{-7 \pm 17}{6}.
\]

---

### 4. Compute the two roots  

- **Using the plus sign**  

\[
x_{1} = \frac{-7 + 17}{6} = \frac{10}{6} = \frac{5}{3}\approx 1.6667.
\]

- **Using the minus sign**  

\[
x_{2} = \frac{-7 - 17}{6} = \frac{-24}{6} = -4.
\]

---

### 5. Final answer  

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]

Both solutions satisfy the original equation:

- For \(x = \frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+\frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0\).

- For \(x = -4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48 = 0\).

Thus the quadratic equation \(3x^{2}+7x-20=0\) has the two real roots \(\displaystyle \frac{5}{3}\) and \(-4\).
==========
Prompt: 56 tokens, 237.612 tokens-per-sec
Generation: 763 tokens, 56.986 tokens-per-sec
Peak memory: 68.677 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x -20 =0. Provide step-by-step explanation, discriminant, roots. Should be clear. No special constraints. Provide final answer.

</think>

**Step‑by‑step solution of**  

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

\[
a = 3,\qquad b = 7,\qquad c = -20.
\]

---

### 2. Compute the discriminant  

The discriminant \( \Delta \) tells us the nature of the roots:

\[
\Delta = b^{2}-4ac.
\]

Plug in the values:

\[
\begin{aligned}
\Delta &= 7^{2}-4(3)(-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289.
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \( \Delta > 0\) and is a perfect square (\(289 = 17^{2}\)), the equation has two distinct real rational roots.

---

### 3. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}.
\]

Insert \(a, b,\) and \(\sqrt{\Delta}=17\):

\[
\begin{aligned}
x &= \frac{-7 \pm 17}{2\cdot 3} \\
  &= \frac{-7 \pm 17}{6}.
\end{aligned}
\]

---

### 4. Compute the two roots  

- **Using the plus sign** (\(+\)):

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

- **Using the minus sign** (\(-\)):

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4.
\]

---

### 5. Final answer  

\[
\boxed{x = \frac{5}{3}\quad \text{or}\quad x = -4}
\]

You can verify by substituting each root back into the original equation:

- For \(x = \frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+\frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0.\)

- For \(x = -4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48 =0.\)

Both satisfy the equation, confirming the solution is correct.
==========
Prompt: 56 tokens, 230.323 tokens-per-sec
Generation: 639 tokens, 57.328 tokens-per-sec
Peak memory: 68.673 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to explain step by step solving quadratic equation: 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289. sqrt discriminant = sqrt(289) = 17. Then roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(2*3) = (-7+17)/6 = 10/6 = 5/3; (-7-17)/6 = -24/6 = -4. So roots: x = 5/3 and x = -4. Show steps. Also mention factorization maybe: (3x-5)(x+4)=0? Let's check: (3x-5)(x+4) = 3x^2 +12x -5x -20 = 3x^2 +7x -20 correct. So factorization yields roots x = 5/3 and x = -4. Provide full working. Ensure step-by-step. Also mention discriminant. Provide final answer.

</think>

**Equation**

\[
3x^{2}+7x-20=0
\]

This is a quadratic equation of the form \(ax^{2}+bx+c=0\) with  

\[
a=3,\qquad b=7,\qquad c=-20 .
\]

---

### 1. Compute the discriminant  

\[
\Delta = b^{2}-4ac
\]

\[
\begin{aligned}
\Delta &= 7^{2}-4(3)(-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289 .
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \(\Delta>0\) and is a perfect square, the equation has two distinct rational roots.

---

### 2. Take the square‑root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 3. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}
\]

\[
\begin{aligned}
x &=\frac{-7\pm 17}{2\cdot 3} \\
  &=\frac{-7\pm 17}{6}.
\end{aligned}
\]

Now evaluate the two possibilities.

* **Plus sign**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Minus sign**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
\]

---

### 4. (Optional) Verify by factoring  

Since the roots are \(\frac{5}{3}\) and \(-4\),

\[
3x^{2}+7x-20 = 3\left(x-\frac{5}{3}\right)(x+4)
            = (3x-5)(x+4).
\]

Expanding \((3x-5)(x+4)\) gives \(3x^{2}+12x-5x-20 = 3x^{2}+7x-20\), confirming the factorization.

---

## Final Answer

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]
==========
Prompt: 56 tokens, 41.378 tokens-per-sec
Generation: 759 tokens, 57.083 tokens-per-sec
Peak memory: 68.670 GB
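
For anyone who wants to check the sampled outputs mechanically rather than by eye, here is a minimal sketch that recomputes the discriminant and both roots with exact rational arithmetic. The helper name `solve_quadratic_rational` is hypothetical and not part of mlx-lm; it only handles the perfect-square case that this prompt produces.

```python
from fractions import Fraction
from math import isqrt


def solve_quadratic_rational(a: int, b: int, c: int):
    """Solve ax^2 + bx + c = 0 exactly when the discriminant is a perfect square.

    Hypothetical helper for checking the generations above; not part of mlx-lm.
    """
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots")
    sqrt_disc = isqrt(disc)
    if sqrt_disc * sqrt_disc != disc:
        raise ValueError("discriminant is not a perfect square")
    # Fraction reduces 10/6 to 5/3 and -24/6 to -4 automatically.
    roots = (Fraction(-b + sqrt_disc, 2 * a), Fraction(-b - sqrt_disc, 2 * a))
    return disc, roots


disc, (x1, x2) = solve_quadratic_rational(3, 7, -20)
print(disc, x1, x2)  # 289 5/3 -4
```

Every generation in these runs reports the same discriminant (289) and the same roots (5/3 and -4), which matches this check.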

Run 2

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 +7x -20 =0. Compute discriminant: b^2-4ac = 7^2 -4*3*(-20)=49+240=289=17^2. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ±17]/(6). So x1 = (-7+17)/6 =10/6=5/3. x2 = (-7-17)/6 = -24/6 = -4. Provide step-by-step explanation.

We must comply with policy. It's fine.

</think>

**Step‑by‑step solution of the quadratic equation**

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  
For a quadratic \(ax^{2}+bx+c=0\):

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant \( \Delta \)

\[
\Delta = b^{2}-4ac
\]

\[
\Delta = 7^{2} - 4\,(3)\,(-20)
       = 49 - (-240)
       = 49 + 240
       = 289
\]

Since \(289 = 17^{2}\), the discriminant is a perfect square, so the roots will be rational.

---

### 3. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}
\]

Plug in the values:

\[
x = \frac{-7 \pm \sqrt{289}}{2\cdot 3}
   = \frac{-7 \pm 17}{6}
\]

---

### 4. Find the two roots  

- **Using the plus sign**  

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}
\]

- **Using the minus sign**  

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4
\]

---

### 5. Result  

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]

Both values satisfy the original equation:

- For \(x = \frac{5}{3}\): \(3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20 = 3\cdot\frac{25}{9}+\frac{35}{3}-20 = \frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0\).

- For \(x = -4\): \(3(-4)^{2}+7(-4)-20 = 3\cdot16-28-20 = 48-48 = 0\).

Thus the solution set is \(\displaystyle \left\{\frac{5}{3},\,-4\right\}\).
==========
Prompt: 56 tokens, 239.462 tokens-per-sec
Generation: 673 tokens, 58.342 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x -20 =0. Compute discriminant D = b^2 -4ac = 7^2 -4*3*(-20)=49 +240=289. sqrt(D)=17. Roots: x = [-b ± sqrt(D)]/(2a)=[-7 ± 17]/(6). So x1 = (-7+17)/6 =10/6=5/3. x2 = (-7-17)/6 = -24/6 = -4. So answer with steps.

We need to show discriminant calculation and both roots. Provide step-by-step explanation.

</think>

**Quadratic equation**

\[
3x^{2}+7x-20=0
\]

A quadratic in the form \(ax^{2}+bx+c=0\) has solutions

\[
x=\frac{-b\pm\sqrt{b^{2}-4ac}}{2a},
\]

where the expression under the square‑root, \(b^{2}-4ac\), is called the **discriminant** (often denoted \(\Delta\)).  

---

### 1. Identify the coefficients  

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant  

\[
\Delta = b^{2}-4ac
        = 7^{2}-4\,(3)\,(-20)
        = 49 - (12)(-20)
        = 49 + 240
        = 289.
\]

Because \(\Delta>0\) and is a perfect square, the equation has two distinct real rational roots.

---

### 3. Take the square‑root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17.
\]

---

### 4. Apply the quadratic formula  

\[
x = \frac{-b\pm\sqrt{\Delta}}{2a}
   = \frac{-7\pm 17}{2\cdot 3}
   = \frac{-7\pm 17}{6}.
\]

---

### 5. Separate the “\(+\)” and “\(-\)” cases  

* **Plus sign**  

  \[
  x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
  \]

* **Minus sign**  

  \[
  x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4.
  \]

---

## Solution set  

\[
\boxed{\,x = \frac{5}{3}\quad\text{or}\quad x = -4\,}
\]

Both values satisfy the original equation (you can check by substitution if desired).
==========
Prompt: 56 tokens, 42.914 tokens-per-sec
Generation: 605 tokens, 58.193 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289. sqrt discriminant = 17. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3. x2 = (-7 - 17)/6 = -24/6 = -4. Provide step-by-step. Also maybe factor: (3x -5)(x+4)=0. Show both. Provide explanation.

</think>

**Step‑by‑step solution of**  

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\):

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant  

\[
\Delta = b^{2}-4ac
\]

\[
\begin{aligned}
\Delta &= 7^{2}-4(3)(-20) \\
       &= 49-(-240) \\
       &= 49+240 \\
       &= 289
\end{aligned}
\]

Since \(\Delta>0\) the equation has two distinct real roots.

---

### 3. Take the square‑root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17
\]

---

### 4. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}
\]

\[
\begin{aligned}
x_1 &= \frac{-7+17}{2\cdot3}= \frac{10}{6}= \frac{5}{3}\\[4pt]
x_2 &= \frac{-7-17}{2\cdot3}= \frac{-24}{6}= -4
\end{aligned}
\]

---

### 5. (Optional) Verify by factoring  

Because the discriminant is a perfect square, the quadratic factors nicely:

\[
3x^{2}+7x-20=(3x-5)(x+4)=0
\]

Setting each factor to zero gives the same solutions:

\[
3x-5=0 \;\Rightarrow\; x=\frac{5}{3},\qquad
x+4=0 \;\Rightarrow\; x=-4
\]

---

## Final answer  

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]
==========
Prompt: 56 tokens, 240.497 tokens-per-sec
Generation: 595 tokens, 58.002 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 +7x -20 =0. Compute discriminant: b^2 -4ac = 7^2 -4*3*(-20) =49 +240 =289. sqrt =17. Roots: x = [-7 ± 17]/(2*3)=(-7+17)/6=10/6=5/3; (-7-17)/6 = -24/6 = -4. Provide step-by-step. Also mention discriminant positive, two real distinct roots. Provide full working. Ensure format.

</think>

**Step‑by‑step solution of**  

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  
For a quadratic \(ax^{2}+bx+c=0\):

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

### 2. Compute the discriminant \( \Delta \)  

\[
\Delta = b^{2}-4ac
\]

\[
\begin{aligned}
\Delta &= 7^{2}-4\cdot 3 \cdot (-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289
\end{aligned}
\]

The discriminant is **positive** (\(\Delta = 289 > 0\)), so the equation has two distinct real roots.

---

### 3. Take the square root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17
\]

---

### 4. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}
\]

Insert the values:

\[
x = \frac{-7 \pm 17}{2\cdot 3}= \frac{-7 \pm 17}{6}
\]

---

### 5. Separate the “+” and “–” cases  

- **Plus sign**  

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}
\]

- **Minus sign**  

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4
\]

---

### 6. Write the solutions  

\[
\boxed{x = \frac{5}{3}\quad \text{or}\quad x = -4}
\]

These are the two real roots of the equation \(3x^{2}+7x-20=0\).
==========
Prompt: 56 tokens, 240.827 tokens-per-sec
Generation: 571 tokens, 58.275 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 +7x -20 =0. Provide step-by-step, discriminant, roots. Use quadratic formula. Show discriminant b^2-4ac = 7^2 -4*3*(-20)=49+240=289. sqrt=17. Then x = [-7 ±17]/(2*3)=[-7+17]/6 =10/6=5/3, and [-7-17]/6 = -24/6 = -4. So roots: x=5/3 and x=-4. Provide explanation. Also maybe factorization: (3x-5)(x+4)=0. Provide steps. Ensure complete working.

</think>

**Given equation**

\[
3x^{2}+7x-20=0
\]

This is a quadratic equation of the form \(ax^{2}+bx+c=0\) with  

\[
a=3,\qquad b=7,\qquad c=-20 .
\]

We will solve it using the **quadratic formula**  

\[
x=\frac{-b\pm\sqrt{b^{2}-4ac}}{2a}.
\]

---

### 1. Compute the discriminant \( \Delta \)

\[
\Delta = b^{2}-4ac
      = 7^{2}-4(3)(-20)
      = 49-(-240)
      = 49+240
      = 289 .
\]

The discriminant is positive, so we expect two distinct real roots.

---

### 2. Take the square‑root of the discriminant

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 3. Plug into the quadratic formula

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}
  =\frac{-7\pm 17}{2\cdot 3}
  =\frac{-7\pm 17}{6}.
\]

Now evaluate the two possibilities:

* **Using the plus sign**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Using the minus sign**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4 .
\]

---

### 4. Result

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]

---

### (Optional) Verify by factoring

Since the roots are \(x=\frac{5}{3}\) and \(x=-4\),

\[
3x^{2}+7x-20 = 3\left(x-\frac{5}{3}\right)(x+4)
             = (3x-5)(x+4)=0,
\]

which indeed expands back to the original polynomial. Hence the solutions are correct.
==========
Prompt: 56 tokens, 42.859 tokens-per-sec
Generation: 622 tokens, 57.972 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x -20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289. sqrt = 17. Roots: (-b ± sqrt)/2a = (-7 ± 17)/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3. x2 = (-7 - 17)/6 = -24/6 = -4. Provide step-by-step explanation.

We must follow style? Not specified. Just give clear steps.

</think>

**Step 1: Write the quadratic in standard form**  
The equation is already in the form \(ax^{2}+bx+c=0\) with  

\[
a = 3,\qquad b = 7,\qquad c = -20 .
\]

**Step 2: Compute the discriminant**  
The discriminant \(D\) tells us how many real solutions the quadratic has and is given by  

\[
D = b^{2}-4ac .
\]

Plugging in the values:

\[
\begin{aligned}
D &= 7^{2} - 4(3)(-20) \\
  &= 49 - (-240) \\
  &= 49 + 240 \\
  &= 289 .
\end{aligned}
\]

Since \(D>0\) and is a perfect square, we will have two distinct rational roots.

**Step 3: Take the square‑root of the discriminant**  

\[
\sqrt{D}= \sqrt{289}=17 .
\]

**Step 4: Apply the quadratic formula**  

\[
x = \frac{-b \pm \sqrt{D}}{2a}.
\]

Substituting \(b=7\), \(\sqrt{D}=17\), and \(a=3\):

\[
x = \frac{-7 \pm 17}{2\cdot 3}= \frac{-7 \pm 17}{6}.
\]

**Step 5: Compute the two possible values**

*Using the plus sign*  

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

*Using the minus sign*  

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4.
\]

---

### Final answer
\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]
==========
Prompt: 56 tokens, 240.954 tokens-per-sec
Generation: 573 tokens, 58.092 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289 = 17^2. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3. x2 = (-7 - 17)/6 = -24/6 = -4. Provide step-by-step. Also could factor? 3x^2+7x-20 = (3x-5)(x+4)?? Let's check: (3x-5)(x+4) = 3x^2+12x-5x-20 = 3x^2+7x-20 yes. So roots 5/3 and -4. Provide explanation. Ensure step-by-step.

</think>

**Equation to solve**

\[
3x^{2}+7x-20=0
\]

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

- \(a = 3\)
- \(b = 7\)
- \(c = -20\)

---

### 2. Compute the discriminant  

\[
\Delta = b^{2}-4ac
\]

\[
\Delta = 7^{2}-4\cdot 3\cdot (-20)
        = 49 - (-240)
        = 49 + 240
        = 289
\]

The discriminant is a perfect square:

\[
\sqrt{\Delta}= \sqrt{289}=17
\]

---

### 3. Apply the quadratic formula  

\[
x=\frac{-b\pm\sqrt{\Delta}}{2a}
\]

Insert the values:

\[
x=\frac{-7\pm 17}{2\cdot 3}
   =\frac{-7\pm 17}{6}
\]

---

### 4. Find the two roots  

- **Using the plus sign**

\[
x_{1}= \frac{-7+17}{6}= \frac{10}{6}= \frac{5}{3}
\]

- **Using the minus sign**

\[
x_{2}= \frac{-7-17}{6}= \frac{-24}{6}= -4
\]

---

### 5. (Optional) Verify by factoring  

Because the discriminant is a perfect square, the quadratic can be factored:

\[
3x^{2}+7x-20=(3x-5)(x+4)=0
\]

Setting each factor to zero gives the same solutions:

\[
3x-5=0 \;\Rightarrow\; x=\frac{5}{3},\qquad
x+4=0 \;\Rightarrow\; x=-4.
\]

---

## Final answer  

\[
\boxed{x=\frac{5}{3}\quad\text{or}\quad x=-4}
\]
==========
Prompt: 56 tokens, 241.214 tokens-per-sec
Generation: 675 tokens, 58.180 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic equation 3x^2 + 7x -20 = 0. Provide step-by-step explanation, discriminant, roots. Should be straightforward. Use quadratic formula: x = [-b ± sqrt(b^2 - 4ac)]/(2a). Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289. sqrt = 17. So roots: (-7 ± 17)/(2*3) = (-7 + 17)/6 = 10/6 = 5/3; (-7 - 17)/6 = -24/6 = -4. So roots are x = 5/3 and x = -4. Show steps. Also maybe factoring: (3x -5)(x+4) = 0. Provide verification. Provide final answer.

</think>

**Equation:**  

\[
3x^{2}+7x-20=0
\]

We will solve it using the **quadratic formula** and show every intermediate step, including the discriminant.

---

### 1. Identify the coefficients  

For a quadratic \(ax^{2}+bx+c=0\),

\[
a = 3,\qquad b = 7,\qquad c = -20 .
\]

---

### 2. Compute the discriminant  

The discriminant \(\Delta\) tells us the nature of the roots:

\[
\Delta = b^{2}-4ac .
\]

Plug in the numbers:

\[
\begin{aligned}
\Delta &= 7^{2}-4(3)(-20) \\
       &= 49 - 4\cdot 3 \cdot (-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289 .
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \(\Delta > 0\) and is a perfect square, we expect **two distinct rational roots**.

---

### 3. Take the square root of the discriminant  

\[
\sqrt{\Delta}= \sqrt{289}=17 .
\]

---

### 4. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}.
\]

Substitute \(b=7\), \(\sqrt{\Delta}=17\), and \(a=3\):

\[
x = \frac{-7 \pm 17}{2\cdot 3}
   = \frac{-7 \pm 17}{6}.
\]

Now handle the “\(\pm\)” separately.

* **Plus sign** (\(+\)):

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

* **Minus sign** (\(-\)):

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4 .
\]

---

### 5. Write the solution set  

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4 }.
\]

---

### 6. (Optional) Verify by factoring  

If the roots are correct, the quadratic should factor as  

\[
3x^{2}+7x-20 = 3\left(x-\frac{5}{3}\right)(x+4) = (3x-5)(x+4).
\]

Expanding \((3x-5)(x+4)\):

\[
(3x-5)(x+4)=3x^{2}+12x-5x-20 = 3x^{2}+7x-20,
\]

which matches the original expression, confirming the roots.

---

**Summary of steps**

1. Identify \(a, b, c\).  
2. Compute \(\Delta = b^{2}-4ac\).  
3. Find \(\sqrt{\Delta}\).  
4. Plug into \(\displaystyle x=\frac{-b\pm\sqrt{\Delta}}{2a}\).  
5. Simplify the two possibilities to obtain the roots.  

The equation \(3x^{2}+7x-20=0\) therefore has the solutions \(x = \frac{5}{3}\) and \(x = -4\).
==========
Prompt: 56 tokens, 239.620 tokens-per-sec
Generation: 937 tokens, 57.772 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x -20 =0. Compute discriminant D = b^2 - 4ac = 7^2 -4*3*(-20)=49 +240=289. sqrt(289)=17. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3. x2 = (-7 - 17)/6 = -24/6 = -4. Provide step-by-step explanation. Ensure show discriminant calculation, both roots. Provide final answer.

</think>

**Equation to solve**

\[
3x^{2}+7x-20=0
\]

This is a quadratic equation of the form  

\[
ax^{2}+bx+c=0\qquad\text{with}\qquad a=3,\;b=7,\;c=-20 .
\]

---

### 1. Compute the discriminant  

The discriminant \( \Delta \) tells us the nature of the roots:

\[
\Delta = b^{2}-4ac .
\]

Plug in the coefficients:

\[
\begin{aligned}
\Delta &= 7^{2}-4\,(3)\,(-20) \\
       &= 49 - 4\cdot 3 \cdot (-20) \\
       &= 49 - (-240) \\
       &= 49 + 240 \\
       &= 289 .
\end{aligned}
\]

\[
\boxed{\Delta = 289}
\]

Since \( \Delta > 0 \) and is a perfect square, the equation has **two distinct rational roots**.

---

### 2. Apply the quadratic formula  

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}.
\]

First find \( \sqrt{\Delta} \):

\[
\sqrt{289}=17.
\]

Now substitute \( b=7,\;a=3,\;\sqrt{\Delta}=17 \):

\[
x = \frac{-7 \pm 17}{2\cdot 3}= \frac{-7 \pm 17}{6}.
\]

---

### 3. Compute the two roots  

**Root 1 (using the plus sign):**

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}.
\]

**Root 2 (using the minus sign):**

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4.
\]

---

### 4. Verify (optional)

- For \(x=\frac{5}{3}\):
  \[
  3\left(\frac{5}{3}\right)^{2}+7\left(\frac{5}{3}\right)-20
  =3\left(\frac{25}{9}\right)+\frac{35}{3}-20
  =\frac{75}{9}+\frac{105}{9}-\frac{180}{9}=0.
  \]

- For \(x=-4\):
  \[
  3(-4)^{2}+7(-4)-20
  =3(16)-28-20
  =48-48=0.
  \]

Both satisfy the original equation.

---

## Final answer

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]
==========
Prompt: 56 tokens, 241.884 tokens-per-sec
Generation: 771 tokens, 58.147 tokens-per-sec
Peak memory: 68.404 GB

mlx_lm.generate --model /tmp/nemotron-super-4bit --prompt "Explain step by step how to solve this equation: 3x² + 7x - 20 = 0. Show your complete working, including the discriminant calculation and both roots." --max-tokens 2048 --temp 1.0 --top-p 0.95
==========
We need to solve quadratic: 3x^2 + 7x - 20 = 0. Compute discriminant: b^2 - 4ac = 7^2 - 4*3*(-20) = 49 + 240 = 289 = 17^2. Roots: x = [-b ± sqrt(D)]/(2a) = [-7 ± 17]/(6). So x1 = (-7 + 17)/6 = 10/6 = 5/3 ≈1.666..., x2 = (-7 - 17)/6 = -24/6 = -4. So roots: x = 5/3 and x = -4. Show step-by-step.

We need to respond with explanation, showing discriminant calculation and both roots. Use proper formatting. Ensure step-by-step.

</think>

**Step 1: Identify the coefficients**  
The quadratic is in the standard form \(ax^{2}+bx+c=0\).

\[
a = 3,\qquad b = 7,\qquad c = -20
\]

---

**Step 2: Write the quadratic formula**  

\[
x=\frac{-b\pm\sqrt{b^{2}-4ac}}{2a}
\]

The term under the square‑root, \(b^{2}-4ac\), is called the **discriminant** (Δ).

---

**Step 3: Compute the discriminant**

\[
\Delta = b^{2}-4ac = 7^{2}-4(3)(-20)
\]

\[
\Delta = 49 - ( -240 ) = 49 + 240 = 289
\]

Since \(289 = 17^{2}\), the discriminant is a perfect square, so the roots will be rational.

---

**Step 4: Plug the discriminant into the formula**

\[
x = \frac{-7 \pm \sqrt{289}}{2\cdot 3}
   = \frac{-7 \pm 17}{6}
\]

---

**Step 5: Separate the “±” into the two possible roots**

* **Root 1 (using the plus sign):**

\[
x_{1}= \frac{-7 + 17}{6}= \frac{10}{6}= \frac{5}{3}\approx 1.666\ldots
\]

* **Root 2 (using the minus sign):**

\[
x_{2}= \frac{-7 - 17}{6}= \frac{-24}{6}= -4
\]

---

**Result**

The solutions of \(3x^{2}+7x-20=0\) are  

\[
\boxed{x = \frac{5}{3}\quad\text{or}\quad x = -4}
\]
==========
Prompt: 56 tokens, 240.511 tokens-per-sec
Generation: 607 tokens, 57.897 tokens-per-sec
Peak memory: 68.404 GB
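
The per-run stats are easier to compare once aggregated. The sketch below assumes the three-line trailer format shown in these logs (`Prompt: ...`, `Generation: ...`, `Peak memory: ... GB`); the regex and the `summarize` helper are illustrative, not an mlx-lm API. The sample text is copied from the last two runs above.

```python
import re
from statistics import mean, median

# Matches the three-line stats trailer as it appears in the logs above.
STATS_RE = re.compile(
    r"Prompt: (\d+) tokens, ([\d.]+) tokens-per-sec\s+"
    r"Generation: (\d+) tokens, ([\d.]+) tokens-per-sec\s+"
    r"Peak memory: ([\d.]+) GB"
)


def summarize(log_text: str) -> dict:
    """Aggregate throughput and memory over every stats trailer found in log_text."""
    runs = [
        {
            "prompt_tps": float(m[2]),
            "generation_tokens": int(m[3]),
            "generation_tps": float(m[4]),
            "peak_gb": float(m[5]),
        }
        for m in STATS_RE.finditer(log_text)
    ]
    if not runs:
        raise ValueError("no stats trailers found")
    return {
        "runs": len(runs),
        "median_prompt_tps": median(r["prompt_tps"] for r in runs),
        "mean_generation_tps": mean(r["generation_tps"] for r in runs),
        "max_peak_gb": max(r["peak_gb"] for r in runs),
    }


sample = """\
Prompt: 56 tokens, 241.884 tokens-per-sec
Generation: 771 tokens, 58.147 tokens-per-sec
Peak memory: 68.404 GB
==========
Prompt: 56 tokens, 240.511 tokens-per-sec
Generation: 607 tokens, 57.897 tokens-per-sec
Peak memory: 68.404 GB
"""
summary = summarize(sample)
print(summary)
```

The median is used for prompt throughput because the runs above swing between roughly 42 and 240 tokens-per-sec on the same 56-token prompt, while generation stays close to 58 tokens-per-sec.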

Here is the impact on speed.

M3 Ultra, 8k prompt, 4-bit

Before
Averages: prompt_tps=728.086, generation_tps=57.049, peak_memory=74.719

After
Averages: prompt_tps=688.754, generation_tps=55.717, peak_memory=76.979
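
To put the two averages side by side, here is a small sketch that computes the relative change per metric. The numbers are copied from the Before/After lines above; the `relative_change_pct` helper is hypothetical and only does the arithmetic.

```python
# Averages copied from the Before/After lines above.
before = {"prompt_tps": 728.086, "generation_tps": 57.049, "peak_memory": 74.719}
after = {"prompt_tps": 688.754, "generation_tps": 55.717, "peak_memory": 76.979}


def relative_change_pct(old: dict, new: dict) -> dict:
    """Percentage change of each metric in `new` relative to `old`."""
    return {key: (new[key] - old[key]) / old[key] * 100.0 for key in old}


changes = relative_change_pct(before, after)
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

That works out to roughly 5.4% lower prompt throughput, 2.3% lower generation throughput, and about 3.0% higher peak memory.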

nastya236

Thanks for adding this!!

angeloskath added 3 commits March 13, 2026 17:31

Start nemotron super port

66be75c

Remove unused mtp

4799bce

Fix the reading of the config

406d9df

angeloskath force-pushed the nemotron-super branch from 3f1de93 to 406d9df, March 14, 2026 00:31

angeloskath mentioned this pull request Mar 15, 2026

fix(ssm): correct SSM decode precision #988

Closed

Do the SSM in fp32

b28e70d

angeloskath marked this pull request as ready for review March 16, 2026 02:39

angeloskath requested review from andresy and nastya236 March 16, 2026 02:39

andresy approved these changes Mar 16, 2026

jundot mentioned this pull request Mar 16, 2026

[Bug/Model Support] Unexpected parameters error when loading Nemotron-3-Super-120B-A12B-MXFP4-MLX jundot/omlx#255

Closed

nastya236 approved these changes Mar 16, 2026

angeloskath merged commit 73c8550 into main Mar 16, 2026
2 checks passed

angeloskath deleted the nemotron-super branch March 16, 2026 17:59

Thump604 mentioned this pull request Mar 21, 2026

Fix SSM dt clamp default for Nemotron-H #1026

Merged

Nemotron super support #992
angeloskath merged 4 commits into main from nemotron-super

4 participants

Thump604 commented Mar 16, 2026

fp32 state (3/3 coherent)

bf16 state (1/3 failed, 1/3 noisy, 1/3 coherent)

Analysis
