pugio
yesterday at 4:20 AM
Hah, I've been wrestling with this ALL DAY. Another example of Phenomenal Cosmic Powers (AI) combined with itty bitty docs (typical of Google). The main endpoint ("https://generativelanguage.googleapis.com/v1beta/models/gemi...") doesn't even have actual REST documentation in the API. The Python API has 3 different versions of the same types. One of the main ones (`GenerateContentRequest`) isn't available in the newest path (`google.genai.types`) so you need to find it in an older version, but then you start getting version mismatch errors, and then pydantic errors, until you finally decide to just cross your fingers and submit raw JSON, only to get opaque API errors.
So, if anybody else is frustrated and not finding anything online about this, here are a few things I learned, specifically for structured output generation (which is a main use case for batching) - the individual request JSON should resolve to this:
```json
{
"request": {
"contents": [
{
"parts": [
{
"text": "Give me the main output please"
}
]
}
],
"system_instruction": {
"parts": [
{
"text": "You are a main output maker."
}
]
},
"generation_config": {
"response_mime_type": "application/json",
"response_json_schema": {
"type": "object",
"properties": {
"output1": {
"type": "string"
},
"output2": {
"type": "string"
}
},
"required": [
"output1",
"output2"
]
}
}
},
"metadata": {
"key": "my_id"
}
}
```
To get actual structured output, don't just do `generation_config.response_schema`, you need to include the mime-type, and the key should be `response_json_schema`. Any other combination will either throw opaque errors or won't trigger Structured Output (and will contain the usual LLM intros "I'm happy to do this for you...").
So you upload a .jsonl file with the above JSON, and then you try to submit it for a batch job. If something is wrong with your file, you'll get a "400" and no other info. If something is wrong with the request submission you'll get a 400 with "Invalid JSON payload received. Unknown name \"file_name\" at 'batch.input_config.requests': Cannot find field."
I got the above error endless times when trying their exact sample code:
```
BATCH_INPUT_FILE='files/123456' # File ID
curl https://generativelanguage.googleapis.com/v1beta/models/gemi... \
-X POST \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" \
-d "{
'batch': {
'display_name': 'my-batch-requests',
'input_config': {
'requests': {
'file_name': ${BATCH_INPUT_FILE}
}
}
}
}"
```
Finally got the job submission working via the python api (`file_batch_job = client.batches.create()`), but remember, if something is wrong with the file you're submitting, they won't tell you what, or how.
TheTaytay
yesterday at 8:39 AM
Thank you for posting this! (When I run into errors with posted sample code, I spend WAY too long assuming it’s my fault.)