Snorkel Mistral PairRM DPO is a fine-tuned version of the Mistral-7B model developed by Snorkel, using PairRM for response ranking and Direct Preference Optimization (DPO) for model adaptation and refinement.
| Feature | Description |
| --- | --- |
| Fine-tuning | Snorkel Mistral PairRM DPO can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand deployment | On-demand deployments let you run Snorkel Mistral PairRM DPO on dedicated GPUs with Fireworks' high-performance serving stack, with high reliability and no rate limits. |
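As a minimal sketch of querying the model, the snippet below builds a request body for Fireworks' OpenAI-compatible chat completions endpoint. The model ID string is an assumption; check the model page for the exact slug, and the network call is left commented out since it requires an API key.

```python
import json

# Assumed model slug; verify the exact ID on the Fireworks model page.
MODEL_ID = "accounts/fireworks/models/snorkel-mistral-pairrm-dpo"

def build_chat_request(user_message: str, max_tokens: int = 256) -> dict:
    """Build a JSON body for an OpenAI-compatible /chat/completions call."""
    return {
        "model": MODEL_ID,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_chat_request("Summarize DPO in one sentence.")
print(json.dumps(payload, indent=2))

# To actually send it (requires a Fireworks API key):
# import os, requests
# resp = requests.post(
#     "https://api.fireworks.ai/inference/v1/chat/completions",
#     headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
#     json=payload,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI schema, existing OpenAI client libraries can also be pointed at the Fireworks base URL instead of hand-building requests.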