A version of Mistral-7B fine-tuned by Snorkel, using PairRM to rank candidate responses and Direct Preference Optimization (DPO) to adapt and refine the model.
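For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-pair DPO loss, not Snorkel's actual training code. The function name `dpo_loss`, the default `beta=0.1`, and the example log-probabilities are illustrative assumptions. In a pipeline like the one described, a ranker such as PairRM would decide which response in each pair counts as "chosen" and which as "rejected".

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair (hypothetical helper).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an assumed value, not one taken from the source.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Illustrative numbers only: the policy has shifted probability toward the
# chosen response relative to the reference, so the loss drops below log(2).
example = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(example < math.log(2))
```

When the policy and reference agree on both responses, the loss equals log(2); it falls as the policy prefers the chosen response more strongly than the reference does.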
Fine-tuning | Snorkel Mistral PairRM DPO can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
On-demand Deployment | On-demand deployments give you dedicated GPUs for Snorkel Mistral PairRM DPO using Fireworks' reliable, high-performance system with no rate limits. |