
Refactor tokenizer error handling to use warnings instead of exceptio…#744

Merged
awni merged 3 commits into ml-explore:main from cubist38:fix/tool-call-token-warnings on Jan 9, 2026

cubist38 (Contributor) commented Jan 9, 2026

Description

Gracefully handle missing tool call tokens by replacing hard ValueError exceptions with warnings.

Previously, initialization would fail if tool call tokens were not found in the tokenizer vocabulary. This change allows the system to continue running while still notifying users via warnings.

Changes

  • Updated tool call token validation in TokenizerWrapper.__init__() to use warnings.warn() instead of raising a ValueError
  • Added import warnings to tokenizer_utils.py
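
The change listed above can be sketched roughly as follows. This is a minimal illustration of the warn-instead-of-raise pattern, not the actual code in tokenizer_utils.py: the helper name `check_tool_call_tokens`, the plain `vocab` dict, and the message text are all assumptions made for this example.

```python
import warnings


def check_tool_call_tokens(vocab, tool_call_start, tool_call_end):
    """Return True if both tool call tokens are in the vocabulary.

    Hypothetical helper: the name, arguments, and message are illustrative
    and are not taken from tokenizer_utils.py.
    """
    missing = [t for t in (tool_call_start, tool_call_end) if t not in vocab]
    if missing:
        # A strict check would raise ValueError here and abort loading.
        # Warning instead lets initialization continue.
        warnings.warn(f"Tool call tokens not found in vocabulary: {missing}")
        return False
    return True


# Toy vocabulary that lacks the tool call tokens.
vocab = {"<think>": 0, "</think>": 1}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = check_tool_call_tokens(vocab, "<tool_call>", "</tool_call>")

print(ok, len(caught))
```

Callers who prefer strict behavior can turn warnings back into exceptions with the standard `warnings.simplefilter("error")` filter.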

Motivation & Context

This change fixes the model loading failure for mlx-community/Ring-mini-linear-2.0-4bit, as reported in issue #140.

After this fix, the model loads successfully.

Verification

The following prompt and output demonstrate that tool calling still works correctly after the change.

Input

<role>SYSTEM</role># Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_weather", "description": "Get the weather for a given city", "parameters": {"type": "object", "properties": {"city": {"type": "string", "description": "The city to get the weather for"}}}, "required": ["city"]}}
</tools>

If none of the functions can be used, point it out. If the given question lacks the parameters required by the function, also point it out. 
If you need to use a function, for each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
<role>HUMAN</role>What is the weather in New York?<role>ASSISTANT</role><think>

Output

Okay, let me see. The user is asking about the weather in New York. I need to figure out which function to use here.

Looking at the provided tools, there's a function called get_weather that takes a city parameter. The required parameter is "city", and it's a string.

The user mentioned "New York" as the city, so I have all the necessary info. I should call get_weather with city set to "New York". No other parameters are needed since the function only requires the city. Let me structure the tool call correctly.

So the tool_call should be a JSON object with "name" as "get_weather" and "arguments" as {"city": "New York"}. Then wrap it in the <tool_call> tags as specified.
</think>

<tool_call>
{"name": "get_weather", "arguments": {"city": "New York"}}
</tool_call>
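
Because the tags appear as plain text in the output, a caller could recover the call with ordinary string matching, whether or not the tags are special tokens in the vocabulary. The following is a minimal sketch under that assumption; `extract_tool_calls` is a hypothetical helper written for this example and is not how mlx-lm itself parses tool calls.

```python
import json
import re

# Matches the body of each <tool_call>...</tool_call> block.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the JSON objects found inside <tool_call> tags (hypothetical helper)."""
    return [json.loads(body) for body in TOOL_CALL_RE.findall(text)]


# Tail of the model output shown above.
output = """</think>

<tool_call>
{"name": "get_weather", "arguments": {"city": "New York"}}
</tool_call>"""

calls = extract_tool_calls(output)
print(calls[0]["name"], calls[0]["arguments"])
```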

cubist38 (Contributor, Author) commented Jan 9, 2026

Hi @awni,
Could you take a look at this PR when you get a chance? It fixes an issue where some models break in an unexpected way during loading.

For example, mlx-community/Ring-mini-linear-2.0-4bit fails to load because it doesn’t define special tokens for tool calls. In these cases, raising an exception feels too strict: switching to a warning lets the model load normally, and tool calls still work as expected.

I think this makes the behavior a bit more robust overall. Thanks a lot!

awni (Member) left a comment

Thx!

awni merged commit a20eefd into ml-explore:main on Jan 9, 2026
2 checks passed