FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
Token frequency statistics computed on SlimPajama-627B, for use with FR-Spec (https://arxiv.org/abs/2502.14856). For more details, see https://github.com/thunlp/FR-Spec.
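As a rough illustration of what a frequency-ranked token list is, the sketch below counts token ids over a tiny toy corpus and keeps the k most frequent ones. It is not the script used to produce the statistics here; the function name `top_k_token_ids` and the toy data are hypothetical, and a real run over SlimPajama-627B would need a tokenizer and streaming over the dataset.

```python
from collections import Counter
from typing import Iterable, List


def top_k_token_ids(token_streams: Iterable[Iterable[int]], k: int) -> List[int]:
    """Return the k most frequent token ids, most frequent first."""
    counts: Counter = Counter()
    for stream in token_streams:
        # Each stream is one tokenized document (a sequence of token ids).
        counts.update(stream)
    return [token_id for token_id, _ in counts.most_common(k)]


# Toy corpus of already-tokenized documents (hypothetical ids).
toy_corpus = [[5, 7, 5, 9], [7, 5, 2]]
print(top_k_token_ids(toy_corpus, k=2))  # → [5, 7]
```

Here token 5 appears three times and token 7 twice, so they form the top-2 list.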
freq_32768.pt can be loaded with torch.load(); it contains a list of high-frequency tokens.
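A minimal loading sketch follows. It assumes the file sits in the working directory and that its entries are integer token ids (either a Python list or a tensor); the helper names `load_token_ids`, `to_int_list`, and `build_index_maps` are hypothetical and not part of FR-Spec. The index maps show one plausible way to translate between positions in the reduced list and ids in the full vocabulary.

```python
from typing import Dict, List, Tuple


def to_int_list(obj) -> List[int]:
    """Normalize a tensor-like object or a plain iterable to a list of ints."""
    values = obj.tolist() if hasattr(obj, "tolist") else list(obj)
    return [int(v) for v in values]


def load_token_ids(path: str = "freq_32768.pt") -> List[int]:
    """Load the high-frequency token list from disk (path is an assumption)."""
    import torch  # imported lazily so the other helpers work without PyTorch

    return to_int_list(torch.load(path, map_location="cpu"))


def build_index_maps(token_ids: List[int]) -> Tuple[List[int], Dict[int, int]]:
    """Build maps between reduced-list positions and full-vocabulary ids."""
    sub_to_full = list(token_ids)
    full_to_sub = {tok: i for i, tok in enumerate(sub_to_full)}
    return sub_to_full, full_to_sub


if __name__ == "__main__":
    ids = load_token_ids()
    sub_to_full, full_to_sub = build_index_maps(ids)
    print(len(sub_to_full), sub_to_full[:10])
```

Running the script prints how many token ids the file holds and the first ten entries, which is a quick way to check the assumed format before relying on it.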
config.json and pytorch_model.bin are identical to the files in https://huggingface.co/yuhuili/EAGLE-Qwen2-7B-Instruct and can be downloaded from that repository.