Fine-Tuning a Custom LLM

  • Base Model: Qwen-2.5-1.5B-Instruct
  • GPU: RTX2060S
  • Framework
    • Unsloth (recommended for beginners) or
    • Hugging Face TRL
  • Process:
    • Quantization
    • LoRA (Low-Rank Adaptation): approximates the weight update with the product of two low-rank matrices that are trained on the new task while the pre-trained weights stay frozen. These adapter layers are where the SFT knowledge is learned.
    • PEFT (Parameter-Efficient Fine-Tuning): the family of methods, LoRA among them, that fine-tune only a small fraction of parameters instead of the full model.
    • SFT (Supervised Fine-Tuning): training the model on labeled prompt/response pairs.
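The LoRA idea above can be sketched in plain Python with toy sizes (real implementations such as PEFT or Unsloth do this on GPU tensors, but the math is the same): the frozen weight W gets a low-rank update ΔW = B·A, and only A and B are trained.

```python
import random

def matmul(X, Y):
    # naive matrix multiply: (m x k) @ (k x n) -> (m x n)
    m, k, n = len(X), len(Y), len(Y[0])
    return [[sum(X[i][p] * Y[p][j] for p in range(k)) for j in range(n)] for i in range(m)]

d, r = 8, 2  # hidden size d, LoRA rank r (r << d)
random.seed(0)
W = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(d)]  # frozen pre-trained weight
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]  # trainable, r x d
B = [[0.0] * r for _ in range(d)]                                  # trainable, d x r, zero-init

alpha = 16                     # LoRA scaling factor
delta = matmul(B, A)           # low-rank update ΔW = B @ A, shape d x d
W_adapted = [[W[i][j] + (alpha / r) * delta[i][j] for j in range(d)] for i in range(d)]

full_params = d * d            # parameters in the full weight matrix
lora_params = d * r + r * d    # parameters actually trained by LoRA
```

Because B starts at zero, the adapted weight equals the original at initialization, so fine-tuning begins exactly from the pre-trained model; the parameter saving (here 32 vs. 64) grows dramatically as d increases.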

2. Fine-Tune

  • LLM Quantization
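The core idea of quantization can be shown with a minimal sketch of symmetric int8 rounding (production loaders like bitsandbytes use more sophisticated 4-/8-bit schemes, but the map-to-integers-plus-scale principle is the same):

```python
def quantize_int8(weights):
    # map the largest magnitude to 127, then round every weight to an int8
    scale = max(abs(w) for w in weights) / 127
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # recover approximate float weights from the int8 codes
    return [v * scale for v in q]

w = [0.31, -1.20, 0.05, 0.88]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))  # bounded by half a scale step
```

Storing int8 codes plus one float scale per tensor (or per block) is what lets a model like Qwen-2.5-1.5B fit on an 8 GB card such as the RTX 2060 SUPER.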

3. The Data Fuel
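SFT data is typically stored one JSON object per line (JSONL) in the "messages" chat format that both TRL's SFTTrainer and Unsloth can consume; the record below is a hypothetical example, not from a real dataset:

```python
import json

# one hypothetical SFT training record in chat format
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA adds trainable low-rank matrices to a frozen model."},
    ]
}
line = json.dumps(record)  # one JSON object per line -> a JSONL training file
```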

4. The Training Loop
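Frameworks like TRL hide the loop behind a trainer class, but what runs underneath is always the same cycle: forward pass, loss, gradient, parameter update, repeated over the dataset. A framework-agnostic toy version (fitting y = 2x with plain SGD) makes the shape of that loop explicit:

```python
# toy dataset: y = 2x, so the loop should drive w toward 2.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, lr = 0.0, 0.05  # single trainable parameter, learning rate

for epoch in range(200):
    for x, y in data:
        pred = w * x                # forward pass
        grad = 2 * (pred - y) * x   # d(squared error)/dw
        w -= lr * grad              # optimizer step (plain SGD)
```

In real fine-tuning the forward pass is the LLM, the loss is next-token cross-entropy on the assistant turns, and the optimizer updates only the LoRA adapter parameters.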

5. Evaluation
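One simple evaluation signal is the exact-match rate of model answers against references on a held-out set; the sketch below is illustrative (real evaluations usually also track held-out loss/perplexity or task benchmarks):

```python
def exact_match_rate(predictions, references):
    # fraction of held-out examples where the model's answer matches exactly
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

# hypothetical held-out answers vs. references
preds = ["Paris", "4", "blue"]
refs  = ["Paris", "5", "blue"]
score = exact_match_rate(preds, refs)
```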