Fine-tune a 7B-class model (Mistral-7B or Llama-3 8B) using LoRA on a domain of your choice (customer support, medical Q&A, code review, etc.). Use at least 500 training examples. Evaluate the fine-tuned model against the base model on a custom eval set of 20 examples using an LLM judge. Deploy the merged model behind a vLLM server. The fine-tuned model must achieve a win rate of at least 60% on your eval set.
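When training on prompt/response examples, the loss is usually meant to fall only on the response tokens. A common convention is to give prompt tokens the label -100, the default `ignore_index` of PyTorch's cross-entropy loss. The sketch below illustrates that convention in plain Python so the expected label layout can be checked by eye. The helper names and the toy token IDs are hypothetical, and this is not the implementation of any particular data collator.

```python
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy skips by default


def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


def count_supervised(labels):
    """Number of positions that still contribute to the loss."""
    return sum(1 for label in labels if label != IGNORE_INDEX)


# Toy example with made-up token IDs (assumption: prompt comes first).
prompt_ids = [101, 7592, 2129]
response_ids = [2064, 1045, 2393, 102]
input_ids = prompt_ids + response_ids

labels = mask_prompt_labels(input_ids, len(prompt_ids))
print(labels)                    # [-100, -100, -100, 2064, 1045, 2393, 102]
print(count_supervised(labels))  # 4
```

A quick sanity check on a real batch follows the same idea: every position before the response starts should be -100, and the count of supervised positions should equal the response length.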
Use DataCollatorForCompletionOnlyLM and verify that labels are -100 for prompt tokens.
Call model.merge_and_unload() after training to merge the LoRA weights into the base model before deploying to vLLM. vLLM doesn't natively support unmerged LoRA for all model types.
$ python fine_tune.py
Loading model: mistralai/Mistral-7B-Instruct-v0.3 in 4-bit...
Trainable parameters: 20,185,088 (0.29% of total)
Training on 500 examples, 3 epochs...
Epoch 1/3: loss=1.24 lr=1.8e-4
Epoch 2/3: loss=0.89 lr=1.2e-4
Epoch 3/3: loss=0.71 lr=2.1e-5
Peak GPU memory: 14.2 GB
Saved checkpoint: ./checkpoints/mistral-7b-lora-customer-support
$ python eval.py
Evaluating 20 examples (base model vs fine-tuned)...
Fine-tuned wins: 14/20 (70%)
Ties: 3/20 (15%)
Base wins: 3/20 (15%)
Win+tie rate: 85%
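The figures above can be reproduced from a list of per-example judge verdicts. The following is a minimal sketch, assuming each verdict is one of three hypothetical labels (`"fine_tuned"`, `"base"`, `"tie"`) and that ties stay in the denominator, which matches the 14/20 = 70% reported above. The function name and the threshold parameter are illustrative, not part of any eval library.

```python
from collections import Counter

VALID_VERDICTS = {"fine_tuned", "base", "tie"}  # hypothetical label names


def summarize_verdicts(verdicts, threshold=0.60):
    """Tally judge verdicts and compare the win rate to the pass threshold."""
    if not verdicts:
        raise ValueError("verdicts must not be empty")
    counts = Counter(verdicts)
    unknown = set(counts) - VALID_VERDICTS
    if unknown:
        raise ValueError(f"unexpected verdict labels: {sorted(unknown)}")

    total = len(verdicts)
    win_rate = counts["fine_tuned"] / total  # ties count against the win rate
    win_tie_rate = (counts["fine_tuned"] + counts["tie"]) / total
    return {
        "wins": counts["fine_tuned"],
        "ties": counts["tie"],
        "losses": counts["base"],
        "win_rate": win_rate,
        "win_tie_rate": win_tie_rate,
        "passes": win_rate >= threshold,
    }


# Verdict counts taken from the sample output above.
verdicts = ["fine_tuned"] * 14 + ["tie"] * 3 + ["base"] * 3
summary = summarize_verdicts(verdicts)
print(f"Win rate: {summary['win_rate']:.0%}, passes: {summary['passes']}")
```

If ties were excluded from the denominator instead, the same run would score 14/17, so it is worth stating which definition the 60% requirement uses.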