- Base Model: Qwen-2.5-1.5B-Instruct
- GPU: RTX2060S
- Framework:
- Unsloth (recommended for beginners), or
- Hugging Face TRL
- Process:
- Quantization
- LoRA (Low-Rank Adaptation): represents the weight update with two low-rank matrices, which adapt the frozen pre-trained model to the new task. These adapter layers are where the SFT knowledge is learned.
- PEFT (Parameter-Efficient Fine-Tuning): training only a small fraction of the parameters (e.g., LoRA adapters) while the base model stays frozen.
- SFT (Supervised Fine-Tuning): training the model on labeled prompt-response pairs.
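The LoRA idea above can be sketched numerically. This is an illustrative NumPy toy (the shapes, rank, and scaling factor are assumptions, not Qwen's real dimensions): the frozen weight `W` stays untouched, only the two low-rank factors `A` and `B` would be trained.

```python
import numpy as np

# Hypothetical shapes for illustration only
d, r = 1024, 8          # hidden size d, LoRA rank r
alpha = 16              # LoRA scaling factor

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))         # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor (r x d)
B = np.zeros((d, r))                    # trainable, zero-init so W' == W at the start

# Effective weight after adaptation: W' = W + (alpha / r) * B @ A
W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size            # parameters if we fine-tuned W directly
lora_params = A.size + B.size   # parameters LoRA actually trains
print(lora_params / full_params)  # 0.015625 -> ~1.6% of the full matrix
```

Because `B` is zero-initialized, the adapted model starts out identical to the base model; training then moves only `A` and `B`, which is why LoRA fits on a small GPU like the RTX 2060S.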
3. Fine-Tune
- LLM Quantization
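A minimal sketch of one common quantization scheme, per-tensor absmax int8, in plain NumPy (illustrative only; real LLM quantizers like bitsandbytes use more refined block-wise variants):

```python
import numpy as np

def quantize_absmax(w: np.ndarray):
    """Scale weights into [-127, 127] and store them as int8."""
    scale = np.abs(w).max() / 127.0          # one scale per tensor
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # toy weight tensor
q, scale = quantize_absmax(w)

print(q.nbytes / w.nbytes)   # 0.25 -> 4x less memory than float32
err = np.abs(dequantize(q, scale) - w).max()
print(err <= scale)          # rounding error is bounded by the scale
```

This memory saving is what lets a 1.5B-parameter model fit in the 8 GB of an RTX 2060S; 4-bit schemes push the same idea further.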