Fine-Tuning a Custom LLM

  • Base Model: Qwen-2.5-1.5B-Instruct
  • GPU: RTX2060S
  • Framework
    • Unsloth (recommended for beginners) or
    • Hugging Face TRL
  • Process:
    • Quantization
    • LoRA (Low-Rank Adaptation): approximates the weight update with the product of two low-rank matrices that are trained on the new task while the pre-trained weights stay frozen. These adapter layers are where the SFT knowledge is learned.
    • PEFT (Parameter-Efficient Fine-Tuning): the family of methods, LoRA among them, that fine-tune only a small fraction of parameters instead of the full model.
    • SFT (Supervised Fine-Tuning): training the model on labeled prompt/response pairs.
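The LoRA idea above can be sketched in plain Python with toy sizes (real implementations such as PEFT or Unsloth do this on GPU tensors, but the math is the same): the frozen weight W gets a low-rank update ΔW = B·A, and only A and B are trained.

```python
import random

def matmul(X, Y):
    # naive matrix multiply: (m x k) @ (k x n) -> (m x n)
    m, k, n = len(X), len(Y), len(Y[0])
    return [[sum(X[i][p] * Y[p][j] for p in range(k)) for j in range(n)] for i in range(m)]

d, r = 8, 2  # hidden size d, LoRA rank r (r << d)
random.seed(0)
W = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(d)]  # frozen pre-trained weight
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]  # trainable, r x d
B = [[0.0] * r for _ in range(d)]                                  # trainable, d x r, zero-init

alpha = 16                     # LoRA scaling factor
delta = matmul(B, A)           # low-rank update ΔW = B @ A, shape d x d
W_adapted = [[W[i][j] + (alpha / r) * delta[i][j] for j in range(d)] for i in range(d)]

full_params = d * d            # parameters in the full weight matrix
lora_params = d * r + r * d    # parameters actually trained by LoRA
```

Because B starts at zero, the adapted weight equals the original at initialization, so fine-tuning begins exactly from the pre-trained model; the parameter saving (here 32 vs. 64) grows dramatically as d increases.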

2. Fine-Tune

  • LLM Quantization
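The core idea of quantization can be shown with a minimal sketch of symmetric int8 rounding (production loaders like bitsandbytes use more sophisticated 4-/8-bit schemes, but the map-to-integers-plus-scale principle is the same):

```python
def quantize_int8(weights):
    # map the largest magnitude to 127, then round every weight to an int8
    scale = max(abs(w) for w in weights) / 127
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # recover approximate float weights from the int8 codes
    return [v * scale for v in q]

w = [0.31, -1.20, 0.05, 0.88]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))  # bounded by half a scale step
```

Storing int8 codes plus one float scale per tensor (or per block) is what lets a model like Qwen-2.5-1.5B fit on an 8 GB card such as the RTX 2060 SUPER.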

3. The Data Fuel
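SFT data is typically stored one JSON object per line (JSONL) in the "messages" chat format that both TRL's SFTTrainer and Unsloth can consume; the record below is a hypothetical example, not from a real dataset:

```python
import json

# one hypothetical SFT training record in chat format
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA adds trainable low-rank matrices to a frozen model."},
    ]
}
line = json.dumps(record)  # one JSON object per line -> a JSONL training file
```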

4. The Training Loop
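Frameworks like TRL hide the loop behind a trainer class, but what runs underneath is always the same cycle: forward pass, loss, gradient, parameter update, repeated over the dataset. A framework-agnostic toy version (fitting y = 2x with plain SGD) makes the shape of that loop explicit:

```python
# toy dataset: y = 2x, so the loop should drive w toward 2.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, lr = 0.0, 0.05  # single trainable parameter, learning rate

for epoch in range(200):
    for x, y in data:
        pred = w * x                # forward pass
        grad = 2 * (pred - y) * x   # d(squared error)/dw
        w -= lr * grad              # optimizer step (plain SGD)
```

In real fine-tuning the forward pass is the LLM, the loss is next-token cross-entropy on the assistant turns, and the optimizer updates only the LoRA adapter parameters.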

5. Evaluation
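One simple evaluation signal is the exact-match rate of model answers against references on a held-out set; the sketch below is illustrative (real evaluations usually also track held-out loss/perplexity or task benchmarks):

```python
def exact_match_rate(predictions, references):
    # fraction of held-out examples where the model's answer matches exactly
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

# hypothetical held-out answers vs. references
preds = ["Paris", "4", "blue"]
refs  = ["Paris", "5", "blue"]
score = exact_match_rate(preds, refs)
```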