DeepSeek Coder is a series of code language models, each trained from scratch on 2T tokens with a composition of 87% code and 13% natural language in both English and Chinese. DeepSeek Coder 6.7B Base is a 6.7B-parameter model with Multi-Head Attention, trained on 2 trillion tokens using a 16K window and an additional fill-in-the-blank task.
| Feature | Description |
| --- | --- |
| Fine-tuning (Docs) | DeepSeek Coder 7B Base can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand deployment (Docs) | On-demand deployments give you dedicated GPUs for DeepSeek Coder 7B Base using Fireworks' reliable, high-performance system with no rate limits. |
DeepSeek Coder 7B Base is a base language model developed by DeepSeek AI as part of its DeepSeek Coder family. The model is trained from scratch on 2 trillion tokens, with a composition of 87% code and 13% natural language in English and Chinese. It uses a fill-in-the-blank auxiliary task during training.
This model is optimized for code-focused tasks such as code generation, code completion, and fill-in-the-middle infilling.
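Because the model was trained with a fill-in-the-blank objective, it can complete code in the middle of a file, not just at the end. The sketch below shows fill-in-the-middle prompting with the open `deepseek-ai/deepseek-coder-6.7b-base` checkpoint from Hugging Face; the sentinel tokens follow the DeepSeek Coder repository's example and should be verified against the tokenizer before use.

```python
# Minimal fill-in-the-middle (FIM) sketch with the open Hugging Face checkpoint.
# The FIM sentinel tokens are taken from the DeepSeek Coder repository's example.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-6.7b-base", device_map="auto"
)

# The model generates the code that belongs where the "hole" token sits.
prompt = (
    "<｜fim▁begin｜>def quick_sort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "    pivot = arr[0]\n"
    "    left, right = [], []\n"
    "<｜fim▁hole｜>\n"
    "        if arr[i] < pivot:\n"
    "            left.append(arr[i])\n"
    "        else:\n"
    "            right.append(arr[i])\n"
    "    return quick_sort(left) + [pivot] + quick_sort(right)<｜fim▁end｜>"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens (the infilled middle section).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```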
The model supports a context length of 4,096 tokens.
The full 4,096-token context window is available on Fireworks' on-demand deployments, which provide dedicated GPU access without rate limits.
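As a hedged sketch of calling the model through Fireworks' OpenAI-compatible completions endpoint while keeping the prompt plus generated tokens inside the 4,096-token window; the model identifier string below is an assumption, so confirm the exact value on the model page.

```python
# Hedged sketch: text completion against Fireworks' OpenAI-compatible REST API.
import os
import requests

API_URL = "https://api.fireworks.ai/inference/v1/completions"
MODEL = "accounts/fireworks/models/deepseek-coder-7b-base"  # assumed identifier
CONTEXT_WINDOW = 4096

prompt = "# Write a Python function that reverses a singly linked list\n"
max_new_tokens = 256  # leave the rest of the 4,096-token window for the prompt

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json={"model": MODEL, "prompt": prompt, "max_tokens": max_new_tokens},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```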
The model has roughly 6.7 billion parameters, rounded up to 7B in the model name.
Fireworks supports LoRA-based fine-tuning of this model on dedicated GPUs.
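For intuition on what LoRA-style fine-tuning does, here is a minimal, illustrative sketch using the open `deepseek-ai/deepseek-coder-6.7b-base` checkpoint with the Hugging Face `peft` library. This is not Fireworks' managed fine-tuning workflow (see the fine-tuning docs for that), and the hyperparameters are arbitrary example values.

```python
# Illustrative only: attach LoRA adapters to the open checkpoint with peft.
# Fireworks' managed fine-tuning handles this server-side.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```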
Token metering is based on combined input and output tokens.
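As a small sketch of what that means in practice: the completions response includes a `usage` block, and the billed amount is the sum of its input and output counts. The field names below follow the standard OpenAI-style schema and are assumed to match what Fireworks returns; confirm against the API reference.

```python
# Sketch: summing metered tokens from an OpenAI-compatible completions response.
def billed_tokens(response_json: dict) -> int:
    usage = response_json["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"]

example = {"usage": {"prompt_tokens": 120, "completion_tokens": 256, "total_tokens": 376}}
print(billed_tokens(example))  # 376: input and output tokens are billed together
```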