Snorkel Mistral PairRM DPO is a fine-tuned version of the Mistral-7B model developed by Snorkel. It uses PairRM to rank candidate responses and Direct Preference Optimization (DPO) to adapt the model to the preferred ones.
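To make the DPO step concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is an illustration of the general technique, not Snorkel's training code. The function name `dpo_loss`, its arguments, and the default `beta=0.1` are assumptions chosen for the example. The inputs are summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names and the beta default are illustrative assumptions.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; larger means the policy favors "chosen" more.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a numerically stable way for either sign of margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference model, the margin is zero and the loss equals log 2. The loss falls as the policy raises the relative likelihood of the chosen response. In a pipeline like the one described here, a ranker such as PairRM would supply which response in each pair counts as chosen.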
Snorkel Mistral PairRM DPO can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
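As a rough illustration of why LoRA is efficient, the sketch below compares the number of trainable parameters in one full weight matrix with the number in a low-rank LoRA update for that matrix. The helper name `lora_param_counts` and the rank of 8 are hypothetical choices for the example; the rank and target layers Fireworks actually uses are not stated here.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    LoRA freezes the original d_out x d_in matrix W and trains two small
    matrices B (d_out x rank) and A (rank x d_in) whose product is added to W.
    """
    full = d_out * d_in           # parameters updated by full fine-tuning
    lora = rank * (d_out + d_in)  # parameters in B plus parameters in A
    return full, lora


# Example: a 4096 x 4096 projection (Mistral-7B's hidden size is 4096)
# with an assumed rank of 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)
```

For this matrix the low-rank update trains 65,536 parameters instead of 16,777,216, a 256-fold reduction. Because the adapter is so small, it is cheap to store and to load alongside a shared base model.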
On-demand deployments give you dedicated GPUs for Snorkel Mistral PairRM DPO using Fireworks' reliable, high-performance system with no rate limits.
Snorkel
32768
Available
$0.2