Local shell
Enable agents to run commands in a local shell.
Local shell is a tool that allows agents to run shell commands locally on a machine you or the user provides. It's designed to work with Codex CLI and codex-mini-latest. Commands are executed inside your own runtime, so you are fully in control of which commands actually run: the API only returns the instructions; it does not execute them on OpenAI infrastructure.
Local shell is available through the Responses API for use with codex-mini-latest. It is not available on other models or via the Chat Completions API.
Running arbitrary shell commands can be dangerous. Always sandbox execution or enforce strict allow/deny lists before forwarding a command to the system shell.
See Codex CLI for a reference implementation.
How it works
The local shell tool enables agents to run in a continuous loop with access to a terminal. The model sends shell commands, which your code executes on a local machine and then returns the output back to the model. This allows the model to complete the build-test-run loop without additional intervention by a user.

As part of your code, you'll need to implement a loop that listens for local_shell_call output items and executes the commands they contain. We strongly recommend sandboxing the execution of these commands to prevent any unexpected commands from being executed.
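For illustration, the JSON shape of a local_shell_call output item looks roughly like the dictionary below. The values are hypothetical and this is a sketch rather than a complete schema, but the field names match those used in the example workflow later on.

# Illustrative local_shell_call item (sketch; values are hypothetical)
shell_call = {
    "type": "local_shell_call",
    "call_id": "call_abc123",        # echo this back in your local_shell_call_output
    "action": {
        "type": "exec",              # the action to perform
        "command": ["ls", "-la"],    # the command, already split into argv tokens
        "working_directory": None,   # optional working directory
        "env": {},                   # extra environment variables to set
        "timeout_ms": 10000,         # advisory timeout; enforce your own limits
    },
}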
Integrating the local shell tool
These are the high-level steps you need to follow to integrate the local shell tool in your application:
1. Send a request to the model: include the local_shell tool as part of the available tools.
2. Receive a response from the model: check if the response has any local_shell_call items. This tool call contains an action like exec with a command to execute.
3. Execute the requested action: run the corresponding action through your code in the computer or container environment.
4. Return the action output: after executing the action, return the command output and metadata like the status code to the model.
5. Repeat: send a new request with the updated state as a local_shell_call_output, and repeat this loop until the model stops requesting actions or you decide to stop.
Example workflow
Below is a minimal (Python) example showing the request/response loop. For brevity, error handling and security checks are omitted—do not execute untrusted commands in production without additional safeguards.
import os
import subprocess

from openai import OpenAI

client = OpenAI()

# 1) Create the initial response request with the tool enabled
response = client.responses.create(
    model="codex-mini-latest",
    tools=[{"type": "local_shell"}],
    input=[
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "List files in the current directory"}],
        }
    ],
)

while True:
    # 2) Look for a local_shell_call in the model's output items
    shell_calls = [item for item in response.output if item.type == "local_shell_call"]
    if not shell_calls:
        # No more commands; the assistant is done.
        break

    call = shell_calls[0]
    args = call.action

    # 3) Execute the command locally (here we just trust the command!)
    #    The command is already split into argv tokens.
    completed = subprocess.run(
        args.command,
        cwd=args.working_directory or os.getcwd(),
        env={**os.environ, **(args.env or {})},
        capture_output=True,
        text=True,
        timeout=(args.timeout_ms / 1000) if args.timeout_ms else None,
    )

    output_item = {
        "type": "local_shell_call_output",
        "call_id": call.call_id,
        "output": completed.stdout + completed.stderr,
    }

    # 4) Send the output back to the model to continue the conversation
    response = client.responses.create(
        model="codex-mini-latest",
        tools=[{"type": "local_shell"}],
        previous_response_id=response.id,
        input=[output_item],
    )

# Print the assistant's final answer
final_message = next(
    item for item in response.output if item.type == "message" and item.role == "assistant"
)
print(final_message.content[0].text)
Best practices
- Sandbox or containerize execution. Consider using Docker, firejail, or a jailed user account.
- Impose resource limits (time, memory, network). The timeout_ms provided by the model is only a hint; enforce your own limits.
- Filter or scrutinize high-risk commands (e.g. rm, curl, network utilities), as sketched after this list.
- Log every command and its output for auditability and debugging.
Error handling
If the command fails on your side (non-zero exit code, timeout, etc.), you can still send a local_shell_call_output; include the error message in the output field. The model can choose to recover or try executing a different command. If you send malformed data (e.g. a missing call_id), the API returns a standard 400 validation error.
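For example, a timed-out command can still be reported back to the model as a local_shell_call_output. This sketch assumes the same call and args variables as in the example workflow above; the error text is illustrative.

import subprocess

try:
    completed = subprocess.run(
        args.command,
        capture_output=True,
        text=True,
        timeout=(args.timeout_ms / 1000) if args.timeout_ms else None,
    )
    output_text = completed.stdout + completed.stderr
except subprocess.TimeoutExpired:
    # Report the failure instead of crashing the loop; the model may retry differently.
    output_text = "Error: command timed out before completing."

output_item = {
    "type": "local_shell_call_output",
    "call_id": call.call_id,
    "output": output_text,
}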