Override the generated LlmapiChatCompletionResponse so legacy top-level fields (tools_checksum, tool_call_events, inference_id, …) and cost-on-usage (usage.cost_micro_usd, usage.credits_used) remain readable by SDK consumers that haven’t migrated to the new portal envelope.
The pre-migration usage shape: standard OpenAI token counts plus the portal’s cost/credit fields all on one flat object. The new schema splits these — OpenAI tokens stay in usage, portal cost fields move to the portal envelope — so this type no longer appears in the generated client.
Portal carries non-OpenAI fields scoped to the portal under a single key so they don’t collide with the embedded SDK type’s custom JSON marshaling.
Portal carries non-OpenAI fields scoped to the portal under a single key so they don’t collide with the embedded SDK type’s custom JSON marshaling.
Input can be a simple text string or an array of messages for multi-turn conversations. When continuing after client tool calls, pass the messages array from the previous response.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide .
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
Model ID used to generate the response, like gpt-4o or o3. OpenAI offers a wide range of models with different capabilities, performance characteristics, and price points. Refer to the model guide to browse and compare available models.
The parameters the functions accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.