
Logits-Based Finetuning

• 🤗 Data • 🤗 ScienceLLaMA-3B • 🤗 ScienceLLaMA-1B • 🐱 Code • 📃 Paper

Logits-Based Finetuning integrates the strengths of supervised learning and knowledge distillation by combining teacher logits with ground-truth labels, preserving both correctness and linguistic diversity. This yields more reliable and effective training.

(Figure: method example)
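The combined objective can be sketched in PyTorch roughly as follows. This is an illustrative sketch, not the repository's implementation: the function name and the exact way the terms are mixed are assumptions, with the weights mirroring the `distill_alpha`, `distill_t`, and `distill_gamma` hyperparameters listed below.

```python
import torch
import torch.nn.functional as F

def logits_finetuning_loss(student_logits, teacher_logits, labels,
                           alpha=0.9, temperature=1.0, gamma=1.0):
    """Illustrative sketch: cross-entropy on ground-truth labels plus a
    temperature-scaled KL term against the teacher's logits."""
    vocab = student_logits.size(-1)
    # Supervised term: standard next-token cross-entropy on the labels.
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         labels.view(-1), ignore_index=-100)
    # Distillation term: KL between softened student and teacher distributions.
    # gamma is interpreted here as a scale on the teacher logits (an assumption).
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(gamma * teacher_logits / temperature, dim=-1)
    kl = F.kl_div(s, t, reduction="batchmean") * temperature ** 2
    # alpha balances the two terms, as in standard distillation objectives.
    return (1 - alpha) * ce + alpha * kl
```

The exact loss combination used in the paper may differ; see the training configs under `llamafactory/scripts/` for the settings actually used.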

Performance

(Figure: performance comparison)

Train

  • Installation
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
  • Run
# 1b
llamafactory-cli train llamafactory/scripts/llama3.2_1b_instruct_pkl_1300k_e1_warmup0.1_cosinelr1e-6_seed42_maxl2048_a0.9_t1.0_logp5_freqt_0_b1.0_r1.0.yaml
# 3b
llamafactory-cli train llamafactory/scripts/llama3.2_3b_instruct_pkl_1300k_e1_warmup0.1_cosinelr1e-6_seed42_maxl2048_a0.9_t1.0_logp5_freqt_0_b1.0_r1.0.yaml
  • Hyperparameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `use_distill` | bool | False | Whether to enable distillation. |
| `distill_alpha` | float | 0.9 | Balance weight for the distillation loss. |
| `distill_t` | float | 1.0 | Temperature for the distillation loss. |
| `distill_gamma` | float | 1.0 | Balance weight for the teacher model logits. |
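As a hedged illustration, these options might appear in a LLaMA-Factory training YAML roughly as follows; the values mirror the defaults and the `a0.9_t1.0` settings encoded in the recipe filenames above, and the surrounding config keys are omitted here.

```yaml
# Illustrative fragment only; see llamafactory/scripts/ for the full recipes.
use_distill: true
distill_alpha: 0.9    # balance weight for the distillation loss
distill_t: 1.0        # temperature for the distillation loss
distill_gamma: 1.0    # balance weight for the teacher model logits
```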

Evaluation

  • Installation
cd evaluation/latex2sympy
pip install -e .
cd ..
pip install -r requirements.txt 
pip install vllm==0.5.1 --no-build-isolation
pip install transformers==4.42.3
  • Run
bash evaluation/sh/eval.sh "qwen25-math-cot" $MODEL_NAME_OR_PATH

Citation

If you find this project useful in your research, please consider citing:

@article{li2025logits,
  title={Logits-Based Finetuning},
  author={Li, Jingyao and Yang, Senqiao and Wu, Sitong and Shi, Han and Zheng, Chuanyang and Xu, Hong and Jia, Jiaya},
  journal={arXiv preprint arXiv:2505.24461},
  year={2025}
}