GLM 5.2 is live! Opus-level intelligence at open-source rates. Pay per token on serverless. Try it today.

Model Library
/Qwen/Qwen2 7B Instruct
Quen Logo Mark

Qwen2 7B Instruct

Ready
model path:accounts/fireworks/models/qwen2-7b-instruct

Qwen2 7B Instruct is a 7-billion-parameter instruction-tuned language model developed by the Qwen team. Optimized for following instructions, it excels at tasks like question answering, dialogue generation, and summarization. The model is designed to provide accurate and contextually appropriate responses, making it suitable for a wide range of natural language processing applications.

Qwen2 7B Instruct API Features

Fine-tuning

Docs

Qwen2 7B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use Qwen2 7B Instruct on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Qwen2 7B Instruct FAQs

What is Qwen2 7B Instruct and who developed it?

Qwen2 7B Instruct is an instruction-tuned language model developed by Qwen, a team at Alibaba Cloud. It is based on the Qwen2 architecture and optimized for general-purpose tasks such as question answering, summarization, dialogue generation, and reasoning.

What applications and use cases does Qwen2 7B Instruct excel at?

Qwen2 7B Instruct is tuned for:

  • Instruction-following and dialogue agents
  • Text summarization and generation
  • Code generation (e.g., HumanEval, MBPP, MultiPL-E)
  • Reasoning and mathematics
  • Multilingual tasks, especially English and Chinese
What is the maximum context length for Qwen2 7B Instruct?

131,072 tokens when using YaRN extrapolation techniques. Default context length is 32,768 tokens.

What is the usable context window for Qwen2 7B Instruct?

The supported context length for this model is 32.8K tokens on Fireworks. For longer sequences, YaRN must be explicitly configured using rope_scaling in the model config.

What are known failure modes of Qwen2 7B Instruct?

The model may underperform on long context processing without proper configuration of YaRN. On short inputs, performance can degrade if YaRN is enabled unnecessarily. Some benchmarks (e.g., GPQA) show slightly weaker performance compared to larger models.

Does Qwen2 7B Instruct support streaming responses and function-calling schemas?

Streaming responses and function calling are not supported for this model.

Is fine-tuning supported for Qwen2 7B Instruct?

Yes. Fireworks supports LoRA-based fine-tuning for this model.

What rate limits apply on the shared endpoint?

On-demand deployments are supported with no rate limits.

What license governs commercial use of Qwen2 7B Instruct?

Qwen2 7B Instruct is released under the Apache 2.0 license, which permits commercial use and modifications.

Metadata

State
Ready
Created on
6/6/2024
Kind
Base model
Provider
Qwen

Specification

Calibrated
No
Mixture-of-Experts
No
Parameters
7.61B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
32.7k tokens
Function Calling
Not supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported