[Fix] Fix Method to Obtain Prefix Token ID #18317
anzr299 wants to merge 1 commit into pytorch:main from
Dr. CI (automatically generated, updates every 15 minutes): artifacts and rendered test results are at hud.pytorch.org/pr/pytorch/executorch/18317. As of commit 55fd4d1 with merge base 1925873: 6 workflows awaiting approval, 1 new failure.
Pull request overview
Fixes Llama evaluation behavior by returning a proper prefix token id (BOS when available) instead of incorrectly defaulting to the end-of-text/end-of-sequence token, aligning perplexity results with Hugging Face’s eval flow.
Changes:
- Update `prefix_token_id` to prefer `tokenizer.bos_id` when present.
- Preserve prior fallback behavior by using `eot_token_id` when BOS is unavailable.
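The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `StubTokenizer`, `EvalWrapperSketch`, and the `eos_id` attribute are hypothetical stand-ins, and only `prefix_token_id`, `eot_token_id`, and `tokenizer.bos_id` are names taken from the overview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubTokenizer:
    """Hypothetical tokenizer stand-in; only the two special-token IDs matter here."""

    eos_id: int  # assumed attribute name for the end-of-text ID
    bos_id: Optional[int] = None  # None models a tokenizer without a BOS token


class EvalWrapperSketch:
    """Hypothetical, heavily simplified stand-in for EagerEvalWrapper."""

    def __init__(self, tokenizer: StubTokenizer) -> None:
        self._tokenizer = tokenizer

    @property
    def eot_token_id(self) -> int:
        return self._tokenizer.eos_id

    @property
    def prefix_token_id(self) -> int:
        # Prefer BOS when the tokenizer exposes one.
        bos_id = getattr(self._tokenizer, "bos_id", None)
        if bos_id is not None:
            return bos_id
        # Otherwise keep the prior behavior and fall back to EOT.
        return self.eot_token_id


llama3_like = EvalWrapperSketch(StubTokenizer(eos_id=128001, bos_id=128000))
no_bos = EvalWrapperSketch(StubTokenizer(eos_id=128001))

print(llama3_like.prefix_token_id)  # 128000
print(no_bos.prefix_token_id)  # 128001
```

Checking `bos_id is not None` rather than truthiness is a deliberate choice in this sketch, so that a tokenizer whose BOS ID happens to be 0 would still be honored.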
Summary
Fix `prefix_token_id` to return the BOS token ID instead of the EOT token ID.
`EagerEvalWrapper.prefix_token_id` was incorrectly returning the EOT token ID. Since lm-eval prepends `prefix_token_id` to every evaluation sequence, this caused Llama 3's `<|end_of_text|>` (token 128001) to be used instead of `<|begin_of_text|>` (token 128000), resulting in higher perplexity scores.
Llama 3 8B Wikitext PPL before fix: 9.18
Llama 3 8B Wikitext PPL after fix: 7.793
The result after the fix matches the expected perplexity when evaluating the same model directly via Hugging Face.
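To make the effect of the prefix token concrete, here is a simplified sketch of how a prepended prefix ends up as the first token the model conditions on. The helper `make_scoring_pair` and the document token IDs are hypothetical, and lm-eval's actual windowing logic is more involved than this; only the two special-token IDs come from the summary.

```python
from typing import List, Tuple

BEGIN_OF_TEXT = 128000  # <|begin_of_text|> per the summary
END_OF_TEXT = 128001  # <|end_of_text|> per the summary


def make_scoring_pair(
    tokens: List[int], prefix_token_id: int
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: build model inputs and the targets to be scored.

    The inputs are the document shifted right by one position, with the
    prefix token filling the first slot, so every document token has
    some context to be predicted from.
    """
    inputs = [prefix_token_id] + tokens[:-1]
    targets = list(tokens)
    return inputs, targets


doc = [791, 4062, 14198]  # arbitrary illustrative token IDs

before_inputs, _ = make_scoring_pair(doc, END_OF_TEXT)
after_inputs, targets = make_scoring_pair(doc, BEGIN_OF_TEXT)

print(before_inputs)  # [128001, 791, 4062]
print(after_inputs)  # [128000, 791, 4062]
```

In this picture, the only difference before and after the fix is the first input token, which is consistent with the summary's explanation that conditioning on `<|end_of_text|>` rather than `<|begin_of_text|>` raised the measured perplexity.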