Create Batch Inference Job

POST /v1/accounts/{account_id}/batchInferenceJobs

Example request:
curl --request POST \
  --url https://api.fireworks.ai/v1/accounts/{account_id}/batchInferenceJobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "displayName": "<string>",
  "model": "<string>",
  "inputDatasetId": "<string>",
  "outputDatasetId": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "precision": "PRECISION_UNSPECIFIED",
  "continuedFromJobName": "<string>"
}'

Example response (200):
{
  "name": "<string>",
  "displayName": "<string>",
  "createTime": "2023-11-07T05:31:56Z",
  "createdBy": "<string>",
  "state": "JOB_STATE_UNSPECIFIED",
  "status": {
    "code": "OK",
    "message": "<string>"
  },
  "model": "<string>",
  "inputDatasetId": "<string>",
  "outputDatasetId": "<string>",
  "inferenceParameters": {
    "maxTokens": 123,
    "temperature": 123,
    "topP": 123,
    "n": 123,
    "extraBody": "<string>",
    "topK": 123
  },
  "updateTime": "2023-11-07T05:31:56Z",
  "precision": "PRECISION_UNSPECIFIED",
  "jobProgress": {
    "percent": 123,
    "epoch": 123,
    "totalInputRequests": 123,
    "totalProcessedRequests": 123,
    "successfullyProcessedRequests": 123,
    "failedRequests": 123,
    "outputRows": 123,
    "inputTokens": 123,
    "outputTokens": 123,
    "cachedInputTokenCount": 123
  },
  "continuedFromJobName": "<string>"
}
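
The same request expressed in Python, as a minimal sketch using the requests library. The account ID, API key variable, model name, dataset IDs, and inference parameter values below are illustrative placeholders, not values taken from this page.

import os
import requests

ACCOUNT_ID = "my-account"                     # placeholder account ID
API_KEY = os.environ["FIREWORKS_API_KEY"]     # auth token for the Bearer header

url = f"https://api.fireworks.ai/v1/accounts/{ACCOUNT_ID}/batchInferenceJobs"

payload = {
    "displayName": "nightly-eval",                    # illustrative name
    "model": "accounts/fireworks/models/my-model",    # placeholder model name
    "inputDatasetId": "my-input-dataset",             # placeholder dataset IDs
    "outputDatasetId": "my-output-dataset",
    "inferenceParameters": {
        "maxTokens": 1024,    # illustrative values, not documented defaults
        "temperature": 0.7,
        "topP": 0.95,
        "n": 1,
        "topK": 40,
    },
    # "precision": "FP8",  # optional; if unspecified, a default is chosen based on the model
}

resp = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
)
resp.raise_for_status()
job = resp.json()
print(job.get("name"), job.get("state"))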

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

account_id
string
required

The account ID.

Query Parameters

batchInferenceJobId
string

ID of the batch inference job.
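
If you want to supply the optional batchInferenceJobId query parameter, it can be passed alongside the JSON body. The sketch below reuses url, API_KEY, and payload from the Python example above; the ID value is a hypothetical placeholder, and how the server uses it (for example, as the ID assigned to the new job) should be confirmed against the API's behavior.

import requests

resp = requests.post(
    url,  # same endpoint as the create sketch above
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    params={"batchInferenceJobId": "my-batch-job-001"},  # hypothetical job ID
    json=payload,
)
resp.raise_for_status()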

Body

application/json

displayName
string

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET

status
object

model
string

The name of the model to use for inference. This is required unless continuedFromJobName is specified.

inputDatasetId
string

The name of the dataset used for inference. This is required unless continuedFromJobName is specified.

outputDatasetId
string

The name of the dataset used for storing the results. This will also contain the error file.

inferenceParameters
object

Parameters controlling the inference process.

precision
enum<string>
default:PRECISION_UNSPECIFIED

The precision with which the model should be served. If PRECISION_UNSPECIFIED, a default will be chosen based on the model.

Available options:
PRECISION_UNSPECIFIED,
FP16,
FP8,
FP8_MM,
FP8_AR,
FP8_MM_KV_ATTN,
FP8_KV,
FP8_MM_V2,
FP8_V2,
FP8_MM_KV_ATTN_V2,
NF4,
FP4,
BF16,
FP4_BLOCKSCALED_MM,
FP4_MX_MOE

jobProgress
object

Progress of the job, including request and token counts.

continuedFromJobName
string

The resource name of the batch inference job that this job continues from. Used for lineage tracking to understand job continuation chains.
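
Per the model and inputDatasetId descriptions above, both fields may be omitted when continuedFromJobName is set. Below is a minimal sketch of creating a continuation job, reusing url and API_KEY from the earlier Python example; the job resource name is a placeholder.

import requests

continuation_payload = {
    "displayName": "nightly-eval-continued",  # illustrative name
    # model and inputDatasetId are omitted; they are only required when
    # continuedFromJobName is not specified (see the field descriptions above).
    "continuedFromJobName": "accounts/my-account/batchInferenceJobs/previous-job-id",  # placeholder
}

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json=continuation_payload,
)
resp.raise_for_status()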

Response

200 - application/json

A successful response.

name
string

displayName
string

createTime
string<date-time>

The creation time of the batch inference job.

createdBy
string

The email address of the user who initiated this batch inference job.

state
enum<string>
default:JOB_STATE_UNSPECIFIED

JobState represents the state an asynchronous job can be in.

Available options:
JOB_STATE_UNSPECIFIED,
JOB_STATE_CREATING,
JOB_STATE_RUNNING,
JOB_STATE_COMPLETED,
JOB_STATE_FAILED,
JOB_STATE_CANCELLED,
JOB_STATE_DELETING,
JOB_STATE_WRITING_RESULTS,
JOB_STATE_VALIDATING,
JOB_STATE_DELETING_CLEANING_UP,
JOB_STATE_PENDING,
JOB_STATE_EXPIRED,
JOB_STATE_RE_QUEUEING,
JOB_STATE_CREATING_INPUT_DATASET

status
object

model
string

The name of the model to use for inference. This is required unless continuedFromJobName is specified.

inputDatasetId
string

The name of the dataset used for inference. This is required unless continuedFromJobName is specified.

outputDatasetId
string

The name of the dataset used for storing the results. This will also contain the error file.

inferenceParameters
object

Parameters controlling the inference process.

updateTime
string<date-time>

The time the batch inference job was last updated.

precision
enum<string>
default:PRECISION_UNSPECIFIED

The precision with which the model should be served. If PRECISION_UNSPECIFIED, a default will be chosen based on the model.

Available options:
PRECISION_UNSPECIFIED,
FP16,
FP8,
FP8_MM,
FP8_AR,
FP8_MM_KV_ATTN,
FP8_KV,
FP8_MM_V2,
FP8_V2,
FP8_MM_KV_ATTN_V2,
NF4,
FP4,
BF16,
FP4_BLOCKSCALED_MM,
FP4_MX_MOE

jobProgress
object

Progress of the job, including request and token counts.
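
A small sketch of reading these progress fields from a returned job object (for example, the job dict from the Python create sketch above); the field names follow the response schema on this page.

progress = job.get("jobProgress", {})
print(
    f"{progress.get('percent', 0)}% complete: "
    f"{progress.get('successfullyProcessedRequests', 0)} succeeded, "
    f"{progress.get('failedRequests', 0)} failed, "
    f"{progress.get('totalProcessedRequests', 0)} of "
    f"{progress.get('totalInputRequests', 0)} requests processed"
)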

continuedFromJobName
string

The resource name of the batch inference job that this job continues from. Used for lineage tracking to understand job continuation chains.
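
Once created, a job moves through the states listed above until it reaches a terminal state. The sketch below polls the job by fetching it again, reusing API_KEY and the job dict from the earlier Python example. Two assumptions are made and should be verified: that the returned name is a full resource path that can be appended to https://api.fireworks.ai/v1/, and that a GET endpoint for a single job exists at that path (it is not documented on this page). The set of states treated as terminal is inferred from their names.

import time
import requests

# Assumption: job["name"] is a full resource path such as
# "accounts/<account_id>/batchInferenceJobs/<job_id>".
job_name = job["name"]

# Assumption: these states are terminal, inferred from their names.
TERMINAL_STATES = {
    "JOB_STATE_COMPLETED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
}

while True:
    # Assumed GET endpoint mirroring the create path; not documented on this page.
    r = requests.get(
        f"https://api.fireworks.ai/v1/{job_name}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    r.raise_for_status()
    job = r.json()
    state = job.get("state")
    percent = job.get("jobProgress", {}).get("percent")
    print(f"state={state} progress={percent}%")
    if state in TERMINAL_STATES:
        break
    time.sleep(30)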