Here are key areas to troubleshoot for custom model deployments:

1. Deployment hanging or crashing

Common causes:

  • Missing model files, especially when using Hugging Face models
  • Symlinked files not uploaded correctly
  • Outdated firectl version

Solutions:

  • Download models without symlinks using:
    huggingface-cli download model_name --local-dir=/path --local-dir-use-symlinks=False
    
  • Update firectl to the latest version (a quick pre-flight check is sketched below)
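
If a deployment still hangs or crashes, it can help to verify the local copy before uploading. The commands below are a minimal sketch: the model path is a placeholder, and the exact firectl subcommand for printing the version may vary by release, so check firectl --help.

    # List any symlinks left in the downloaded model directory; if anything
    # is printed, re-download with --local-dir-use-symlinks=False.
    find /path/to/model -type l

    # Check which firectl build is installed before updating it.
    firectl version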

2. LoRA adapters vs full models

  • Compatibility: a LoRA adapter can only be deployed on the base model it was trained against.
  • Performance: inference may be slightly slower with a LoRA adapter, but output quality should remain close to the original model.
  • Troubleshooting quality drops:
    • Check model configuration
    • Review conversation template
    • Add echo: true to requests to inspect the exact prompt the model receives (see the example below)
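
To inspect what the model actually receives, a debug request can look roughly like the sketch below. It assumes the Fireworks chat completions endpoint; the account and model names are placeholders, and echo support can vary by endpoint, so consult the API reference if the field is rejected.

    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "accounts/<your-account>/models/<your-model>",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 16,
        "echo": true
      }'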

3. Performance optimization factors

Consider adjusting the following for improved performance:

  • Accelerator count and type
  • Long-prompt settings to handle long or complex inputs (a sketch of both follows below)
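
As a rough illustration, these settings are typically applied when creating or updating a deployment. This is a sketch only: the flag names below are assumptions, so confirm the exact names with firectl create deployment --help before relying on them.

    # Hypothetical flags: request two accelerators of a given type and enable
    # long-prompt optimization; verify each flag against firectl's --help output.
    firectl create deployment accounts/<your-account>/models/<your-model> \
      --accelerator-type="NVIDIA_H100_80GB" \
      --accelerator-count=2 \
      --long-prompt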