Fix Step 3.5 Flash model conversion #840
Yes, you can just check whether any of the original (un-mapped) keys are still in the weight keys. If they are, the weights haven't been converted to MLX format yet, so it's safe to apply the +1.0.
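A minimal sketch of that key-based check, in plain Python so it runs without any framework. The key names, the `KEY_MAP` substring mapping, and the `norm.weight` suffix are all hypothetical stand-ins, not the names used by the real checkpoint or converter; weights are plain lists of floats for illustration.

```python
# Hypothetical mapping from a substring found only in original (un-mapped)
# checkpoint keys to its MLX-format replacement. Real key names may differ.
KEY_MAP = {".moe.": ".mlp."}


def is_unconverted(weight_keys, key_map=KEY_MAP):
    """Return True if any original (un-mapped) key pattern is still present."""
    return any(old in key for key in weight_keys for old in key_map)


def sanitize(weights, key_map=KEY_MAP):
    """Apply the +1.0 offset and key renaming only to unconverted weights."""
    if not is_unconverted(weights, key_map):
        # Already in the converted format: leave the values untouched.
        return dict(weights)
    converted = {}
    for key, value in weights.items():
        if key.endswith("norm.weight"):  # assumed suffix for RMSNorm weights
            value = [v + 1.0 for v in value]
        for old, new in key_map.items():
            key = key.replace(old, new)
        converted[key] = value
    return converted


original = {
    "layers.0.moe.up.weight": [0.5],
    "layers.0.input_layernorm.weight": [0.0, 0.25],
}
once = sanitize(original)
twice = sanitize(once)  # second call is a no-op because no old keys remain
```

Because the decision depends only on key names, it never has to read the weight values themselves.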
Hi guys, just some feedback. This time I ran the 8-bit model I uploaded myself and the 4-bit model uploaded by kernelpool in many different chat situations rather than with the given test command. They all seem to have repetition problems: they may repeat certain words forever (almost 100% of the time when the question is long), and the repeated words are context-related. Sometimes a whole short sentence is repeated. It's not production-ready for now. I don't know why; I'm just a test user, sorry.
8bbec69 to 166ebb9
@awni What do you think about the < 0.5 check? Otherwise we need to re-upload the existing models.
What model parameters (temperature, etc.) are you using?
I don't think we should do it that way. It's somewhat brittle to the mean of the weights and also breaks lazy loading to some extent. I think we should just check for the presence of a pattern in the weight keys to determine whether it's already been converted. And I will re-upload the models; that's not so difficult. (But others will have to reconvert or re-download.)
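To illustrate the brittleness concern, here is a small sketch of a mean-based heuristic. It assumes the `< 0.5` check compares the mean of a norm weight against 0.5 (original weights stored as offsets near 0, converted ones near 1); the function name and all sample values are made up for illustration.

```python
from statistics import fmean


def looks_unconverted_by_mean(norm_weight, threshold=0.5):
    """Guess the format from the values: a mean below the threshold is
    treated as 'still an offset, needs +1.0'. Requires reading the data."""
    return fmean(norm_weight) < threshold


typical_original = [0.02, -0.01, 0.03]   # offsets clustered around 0
typical_converted = [1.02, 0.99, 1.03]   # scales clustered around 1
small_converted = [0.3, 0.45, 0.4]       # converted, but the scale is small

# The heuristic gets the typical cases right, but the third weight is
# misclassified as unconverted and would receive the +1.0 a second time.
results = [
    looks_unconverted_by_mean(w)
    for w in (typical_original, typical_converted, small_converted)
]
print(results)  # [True, False, True]
```

Unlike a check on key names, this one has to compute a statistic over the values, which is why it interferes with loading the weights lazily.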
I tried different temperatures (0.6 and 1) and top-p values (0.95 and 1); they all behave the same way.
I will re-upload the models as soon as this lands.
Yup, I confirm the old models do not work with the new commit (they output nonsense again). Re-upload required.
Fix to avoid applying the RMSNorm delta twice: once at conversion and again at load.
This simply reverts to the original approach from b8c4549. Maybe there's a better way?
More info: #836 (comment)
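For context, a small numeric sketch of why applying the delta twice matters. It assumes the checkpoint stores each RMSNorm weight as an offset `w` with an effective scale of `1 + w`; the `rms_norm` helper and the sample values are illustrative and not taken from the model code.

```python
import math


def rms_norm(x, scale, eps=1e-6):
    """Plain-Python RMSNorm: divide by the root mean square, then scale."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [s * v / rms for v, s in zip(x, scale)]


stored = [0.0, 0.0]                      # offsets as stored (assumed format)
x = [3.0, 4.0]

once = [w + 1.0 for w in stored]         # intended scale: 1 + w
twice = [w + 1.0 for w in once]          # bug: delta applied again at load

correct = rms_norm(x, once)
doubled = rms_norm(x, twice)

# With zero offsets the second +1.0 doubles every activation leaving the norm.
ratios = [d / c for d, c in zip(doubled, correct)]
```

With these values each entry of `doubled` is twice the corresponding entry of `correct`, and that kind of scaling error would repeat at every norm layer in the network.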