Environment simulation for evaluations¶
When evaluating agents that rely on external dependencies — such as APIs, databases, or third-party services — running those tools live during testing can be slow, costly, or unreliable. The Environment Simulator lets you safely intercept these tool calls during agent execution and replace them with controlled, deterministic responses, without modifying the agent itself. This approach can fill a critical gap in the agent improvement loop, allowing you to create hermetic, offline test runs that isolate your agent logic for reliable scoring.
Overall, this feature lets you:
- Test how an agent handles API errors or edge-case responses.
- Run evaluations offline, without access to live backends.
- Generate realistic mock responses automatically using an LLM.
- Produce reproducible test runs by seeding probabilistic injections.
The Environment Simulator integrates with ADK's tool execution pipeline via the before_tool_callback hook or the plugin system, so no changes to your agent code are required.
How it works¶
While User Simulation drives the conversation forward, Environment Simulation provides the stable backend. At a high level, the Environment Simulator sits between your agent and its tools. When the agent calls a tool, the simulator intercepts the call and decides whether to return a synthetic response — either a predefined injection or an LLM-generated mock — or to let the real tool execute.
The decision logic follows this order for each configured tool:
- Injection configs are checked first, in order. If a matching injection is found (based on argument matching and probability), its error or response is returned immediately.
- Mock strategy is used as a fallback if no injection config applies. The simulator calls an LLM to generate a realistic response based on the tool's schema and any stateful context.
- No-op (None) is returned if the tool is not in the simulator config, allowing the real tool to execute normally.
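The decision order above can be sketched in plain Python. This is an illustrative model only, not the ADK implementation: Injection, ToolSimulation, and simulate_tool_call are hypothetical names, the LLM mock is replaced by a stub function, and the behavior for a configured tool with no mock strategy is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Args = Dict[str, Any]
Response = Dict[str, Any]


@dataclass
class Injection:
    """Hypothetical stand-in for one injection config."""
    applies: Callable[[Args], bool]  # Argument match plus probability check.
    result: Response                 # Injected error or fixed response.


@dataclass
class ToolSimulation:
    """Hypothetical stand-in for one tool's simulation config."""
    injections: List[Injection] = field(default_factory=list)
    mock: Optional[Callable[[Args], Response]] = None  # Stub for the LLM mock.


def simulate_tool_call(
    configs: Dict[str, ToolSimulation], tool_name: str, args: Args
) -> Optional[Response]:
    """Return a synthetic response, or None to let the real tool execute."""
    simulation = configs.get(tool_name)
    if simulation is None:
        return None  # Tool is not configured: the real tool runs.
    for injection in simulation.injections:  # Checked first, in list order.
        if injection.applies(args):
            return injection.result
    if simulation.mock is not None:  # Fallback when no injection applies.
        return simulation.mock(args)
    return None  # Assumption: no mock strategy means the real tool runs.


configs = {
    "get_item": ToolSimulation(
        injections=[
            Injection(
                applies=lambda args: args.get("item_id") == "ITEM-404",
                result={"error_code": 404, "error_message": "Item not found."},
            )
        ],
        mock=lambda args: {"item_id": args["item_id"], "name": "stub item"},
    )
}

injected = simulate_tool_call(configs, "get_item", {"item_id": "ITEM-404"})
mocked = simulate_tool_call(configs, "get_item", {"item_id": "ITEM-001"})
passthrough = simulate_tool_call(configs, "send_email", {"to": "a@example.com"})
```

In this sketch, the first call returns the injected error, the second falls back to the stub mock, and the third returns None because send_email is not configured.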
Integration¶
The EnvironmentSimulationFactory class provides two integration points:
- create_callback(): Returns an async callable suitable for use as a before_tool_callback on any LlmAgent.
- create_plugin(): Returns an EnvironmentSimulationPlugin instance that integrates with the ADK plugin system.
Using as a callback¶
The following example shows how to attach the environment simulation to an agent as its before_tool_callback.
from google.adk.agents import LlmAgent
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

# Always return a 503 error when the agent calls get_user_profile.
config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="get_user_profile",
            injection_configs=[
                InjectionConfig(
                    injected_error=InjectedError(
                        injected_http_error_code=503,
                        error_message="Service temporarily unavailable.",
                    )
                )
            ],
        )
    ]
)

# get_user_profile is assumed to be a tool function defined elsewhere.
agent = LlmAgent(
    name="my_agent",
    model="gemini-2.5-flash",
    tools=[get_user_profile],
    before_tool_callback=EnvironmentSimulationFactory.create_callback(config),
)
Using as a plugin¶
The following example shows how to register the environment simulation as an ADK plugin.
from google.adk.apps import App
from google.adk.tools.environment_simulation import EnvironmentSimulationFactory
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

# Let the simulator generate mock responses for search_products.
config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        )
    ]
)

# my_agent is assumed to be an agent defined elsewhere.
app = App(
    agent=my_agent,
    plugins=[EnvironmentSimulationFactory.create_plugin(config)],
)
Configuration reference¶
You can configure the Environment Simulator with a set of dataclasses. The following sections provide a detailed reference for each configuration object.
EnvironmentSimulationConfig¶
The top-level configuration object.
| Field | Type | Default | Description |
|---|---|---|---|
| tool_simulation_configs | List[ToolSimulationConfig] | required | One entry per tool to simulate. Must not be empty, and tool names must be unique. |
| simulation_model | str | "gemini-2.5-flash" | The LLM used for tool connection analysis and mock response generation. |
| simulation_model_configuration | GenerateContentConfig | thinking enabled | LLM generation config for internal simulator calls. |
| environment_data | str \| None | None | Optional environment context (e.g., a JSON database snapshot) passed to mock strategies to generate more realistic responses. |
| tracing | str \| None | None | Tracing data (e.g., a prior agent run trace in JSON string format) to provide historical context. |
ToolSimulationConfig¶
Defines how a single named tool should be simulated.
| Field | Type | Default | Description |
|---|---|---|---|
| tool_name | str | required | Must match the tool's registered name exactly. |
| injection_configs | List[InjectionConfig] | [] | Zero or more injection configs, checked in order before the mock strategy. |
| mock_strategy_type | MockStrategy | MOCK_STRATEGY_UNSPECIFIED | Fallback strategy when no injection is triggered. |
InjectionConfig¶
Controls a single synthetic response that can be injected into a tool call.
Exactly one of injected_error or injected_response must be set.
| Field | Type | Default | Description |
|---|---|---|---|
| injected_error | InjectedError \| None | None | Error to return (mutually exclusive with injected_response). |
| injected_response | Dict[str, Any] \| None | None | Fixed response dict to return (mutually exclusive with injected_error). |
| injection_probability | float | 1.0 | Probability [0.0, 1.0] that this injection fires. |
| match_args | Dict[str, Any] \| None | None | If set, the injection only fires when the tool's arguments contain all key-value pairs in match_args. |
| injected_latency_seconds | float | 0.0 | Artificial delay (≤ 120 s) added before returning the injection result. |
| random_seed | int \| None | None | Seed for the probability check, enabling deterministic injection behavior. |
InjectedError¶
Defines an HTTP-style error response.
| Field | Type | Description |
|---|---|---|
| injected_http_error_code | int | HTTP status code to surface as "error_code" in the tool response. |
| error_message | str | Human-readable message surfaced as "error_message" in the tool response. |
MockStrategy¶
Enum controlling how the simulator generates responses when no injection fires.
| Value | Description |
|---|---|
| MOCK_STRATEGY_TOOL_SPEC | Uses the tool's schema and stateful context to prompt an LLM to generate a realistic response. |
| MOCK_STRATEGY_TRACING | (Deprecated) Please use MOCK_STRATEGY_TOOL_SPEC with tracing input. |
Injection mode¶
Use injection configs to test specific failure or edge-case scenarios.
Injections are evaluated in list order; the first one whose match_args
criteria are met (and whose probability check passes) is applied.
Injecting errors¶
The following example injects an error with a specific error code and message in place of the tool's result.
from google.adk.tools.environment_simulation.environment_simulation_config import (
    InjectedError,
    InjectionConfig,
    ToolSimulationConfig,
)

ToolSimulationConfig(
    tool_name="charge_payment",
    injection_configs=[
        InjectionConfig(
            injected_error=InjectedError(
                injected_http_error_code=402,
                error_message="Payment declined.",
            )
        )
    ],
)
The agent will receive {"error_code": 402, "error_message": "Payment declined."} instead of a real tool result, allowing you to evaluate how the agent handles payment failures.
Injecting fixed responses¶
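To return a fixed success payload instead of an error, set injected_response on the InjectionConfig to the dict the tool should return, for example injected_response={"quantity": 0, "available": False}.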
Conditional injection with argument matching¶
Use match_args to inject only when specific arguments are passed.
InjectionConfig(
    match_args={"item_id": "ITEM-404"},
    injected_error=InjectedError(
        injected_http_error_code=404,
        error_message="Item not found.",
    ),
)
Here, the error is injected only when the tool is called with
item_id="ITEM-404". All other calls pass through to the next injection config
or to the mock strategy.
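The matching rule (the tool's arguments must contain every key-value pair in match_args) can be sketched as a subset check. This is a minimal illustration, not the simulator's code: args_match is a hypothetical helper, and comparing values with plain equality is an assumption.

```python
from typing import Any, Dict, Optional


def args_match(
    match_args: Optional[Dict[str, Any]], call_args: Dict[str, Any]
) -> bool:
    """Return True when call_args contains every pair in match_args."""
    if match_args is None:
        return True  # No criteria set: every call matches.
    # Assumption: values are compared with ==; extra call arguments are ignored.
    return all(
        key in call_args and call_args[key] == value
        for key, value in match_args.items()
    )


criteria = {"item_id": "ITEM-404"}
hit = args_match(criteria, {"item_id": "ITEM-404", "quantity": 2})
miss = args_match(criteria, {"item_id": "ITEM-001", "quantity": 2})
unconditional = args_match(None, {"item_id": "ITEM-001"})
```

Under these assumptions, extra arguments such as quantity do not prevent a match, while a differing or missing item_id does.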
Probabilistic injection¶
Set injection_probability to a value between 0.0 and 1.0 to simulate flaky
behavior. For reproducible test runs, pin the random outcome with random_seed.
InjectionConfig(
    injection_probability=0.3,
    random_seed=42,
    injected_error=InjectedError(
        injected_http_error_code=500,
        error_message="Internal server error.",
    ),
)
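To see why a seed makes a probabilistic injection reproducible, the sketch below models the probability check with Python's random module. It is a simplified illustration: injection_outcomes is a hypothetical helper, and the assumption that each injection config draws from its own random generator, seeded once, may not match how the simulator consumes random_seed internally.

```python
import random
from typing import List, Optional


def injection_outcomes(
    probability: float, seed: Optional[int], calls: int
) -> List[bool]:
    """Model which of `calls` consecutive tool calls trigger the injection."""
    # Assumption: one generator per injection config, seeded once.
    rng = random.Random(seed)
    # random() returns a float in [0.0, 1.0), so 1.0 always fires, 0.0 never.
    return [rng.random() < probability for _ in range(calls)]


first_run = injection_outcomes(0.3, seed=42, calls=10)
second_run = injection_outcomes(0.3, seed=42, calls=10)
always = injection_outcomes(1.0, seed=None, calls=5)
never = injection_outcomes(0.0, seed=None, calls=5)
```

With the same seed, first_run and second_run contain the same sequence of outcomes, which is the property that makes a seeded evaluation run repeatable.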
Injecting latency¶
Use injected_latency_seconds to simulate slow backend responses, useful for
testing timeout handling or user experience under degraded conditions.
InjectionConfig(
    injected_latency_seconds=5.0,
    injected_response={"result": "slow but successful"},
)
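The effect of injected latency on timeout handling can be modeled with asyncio. The sketch below is a standalone illustration that does not use the simulator: respond_with_latency and call_with_timeout are hypothetical helpers, raising ValueError for an out-of-range delay is an assumption about how the 120 s limit might be enforced, and the delays are shortened so the example finishes quickly.

```python
import asyncio
from typing import Any, Dict

MAX_LATENCY_SECONDS = 120.0  # Upper bound stated in the configuration reference.


async def respond_with_latency(
    response: Dict[str, Any], latency_seconds: float
) -> Dict[str, Any]:
    """Wait for the artificial delay, then return the injected response."""
    if not 0.0 <= latency_seconds <= MAX_LATENCY_SECONDS:
        # Assumption: out-of-range delays are rejected up front.
        raise ValueError("latency must be between 0 and 120 seconds")
    await asyncio.sleep(latency_seconds)
    return response


async def call_with_timeout(latency: float, timeout: float) -> Dict[str, Any]:
    """Model a caller that gives up when the tool takes too long."""
    payload = {"result": "slow but successful"}
    try:
        return await asyncio.wait_for(
            respond_with_latency(payload, latency), timeout=timeout
        )
    except asyncio.TimeoutError:
        return {"error": "timeout"}


within_budget = asyncio.run(call_with_timeout(latency=0.01, timeout=1.0))
timed_out = asyncio.run(call_with_timeout(latency=0.2, timeout=0.05))
```

The first call completes inside its timeout and returns the injected payload; the second exceeds it, so the caller's timeout branch runs instead.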
Combining multiple injection configs¶
Multiple injection configs on a single tool are checked in order. You can combine them to test multiple scenarios:
ToolSimulationConfig(
    tool_name="get_inventory",
    injection_configs=[
        # Always report zero stock for a specific out-of-stock SKU
        InjectionConfig(
            match_args={"sku": "OOS-001"},
            injected_response={"quantity": 0, "available": False},
        ),
        # Randomly fail 20% of the time for all other items
        InjectionConfig(
            injection_probability=0.2,
            random_seed=7,
            injected_error=InjectedError(
                injected_http_error_code=503,
                error_message="Inventory service unavailable.",
            ),
        ),
    ],
)
Mock strategy mode¶
When you want the simulator to generate plausible responses automatically —
rather than returning hand-crafted values — use MOCK_STRATEGY_TOOL_SPEC.
The simulator uses an LLM to:
- Analyze the schemas of all tools the agent has access to, and identify stateful dependencies between them (e.g., a create_order tool produces an order_id that get_order consumes).
- Track a state store of IDs and resources created during the session.
- Generate a response that is consistent with the tool's schema and the current state, returning a 404-style error if a consuming tool requests a resource that was never created.
from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="create_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="get_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
        ToolSimulationConfig(
            tool_name="cancel_order",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ]
)
With this config, the simulator will automatically generate an order_id when
create_order is mocked, and use it to return consistent results (or a
not-found error) when get_order or cancel_order are subsequently called.
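The consistency this provides can be pictured with a small in-memory state store. The sketch below is a deterministic stand-in, not the simulator itself: the real mock strategy uses an LLM to produce responses, and OrderStateStore, its methods, and the ORD-0001 ID format are all hypothetical.

```python
import itertools
from typing import Any, Dict, List


class OrderStateStore:
    """Hypothetical stand-in for the session state the simulator tracks."""

    def __init__(self) -> None:
        self._orders: Dict[str, Dict[str, Any]] = {}
        self._ids = itertools.count(1)

    def _not_found(self, order_id: str) -> Dict[str, Any]:
        # 404-style error for resources that were never created.
        return {"error_code": 404, "error_message": f"Order {order_id} not found."}

    def create_order(self, items: List[str]) -> Dict[str, Any]:
        """Producer tool: creates a resource and records its ID."""
        order_id = f"ORD-{next(self._ids):04d}"
        self._orders[order_id] = {
            "order_id": order_id,
            "items": items,
            "status": "created",
        }
        return dict(self._orders[order_id])

    def get_order(self, order_id: str) -> Dict[str, Any]:
        """Consumer tool: succeeds only for IDs created in this session."""
        order = self._orders.get(order_id)
        return dict(order) if order is not None else self._not_found(order_id)

    def cancel_order(self, order_id: str) -> Dict[str, Any]:
        """Consumer tool that also updates the stored state."""
        order = self._orders.get(order_id)
        if order is None:
            return self._not_found(order_id)
        order["status"] = "cancelled"
        return dict(order)


store = OrderStateStore()
created = store.create_order(["P-001"])
fetched = store.get_order(created["order_id"])
cancelled = store.cancel_order(created["order_id"])
missing = store.get_order("ORD-9999")
```

Here the ID returned by create_order resolves in later calls, while an ID that was never created produces a 404-style error, mirroring the behavior described above.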
Providing environment data¶
Pass domain-specific context through environment_data to make mock responses
more realistic. This can be a JSON string representing a snapshot of your
database or any structured context the LLM should use when generating responses.
import json

from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

# Snapshot of the data the mocked tool should appear to be backed by.
db_snapshot = {
    "products": [
        {"id": "P-001", "name": "Wireless Headphones", "price": 79.99, "stock": 12},
        {"id": "P-002", "name": "USB-C Hub", "price": 34.99, "stock": 0},
    ],
    "warehouse_location": "US-WEST-2",
}

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    environment_data=json.dumps(db_snapshot),
)
The LLM will use this data to return product names, prices, and stock levels that match your domain, rather than generating arbitrary placeholder values.
Providing tracing data¶
Pass traces from prior runs of the agent through tracing, as a JSON string, to give the LLM historical context and make mock responses more realistic.
import json

from google.adk.tools.environment_simulation.environment_simulation_config import (
    EnvironmentSimulationConfig,
    MockStrategy,
    ToolSimulationConfig,
)

# Trace of a prior agent run, including the tool calls it made.
agent_traces = [
    {
        "invocation_id": "inv-001",
        "user_content": {
            "role": "user",
            "parts": [{"text": "Search for high-end headphones"}],
        },
        "intermediate_data": {
            "tool_uses": [
                {
                    "name": "search_products",
                    "args": {"query": "high-end headphones"},
                    "response": {
                        "products": [
                            {"id": "P-123", "name": "Premium Wireless ANC Headphones"}
                        ]
                    },
                }
            ]
        },
    }
]

config = EnvironmentSimulationConfig(
    tool_simulation_configs=[
        ToolSimulationConfig(
            tool_name="search_products",
            mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
        ),
    ],
    tracing=json.dumps(agent_traces),
)
The LLM will use the recorded tool calls and responses as historical context, so mock responses stay consistent with what the agent saw in earlier runs rather than being arbitrary placeholder values.
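Before passing a trace to the simulator, it can help to check which tool calls it actually covers. The helper below is a hypothetical convenience, not part of ADK; it assumes the trace has the same shape as the example above (a JSON list of invocations, each with intermediate_data.tool_uses).

```python
import json
from typing import Any, Dict, List


def recorded_tool_calls(tracing_json: str) -> List[Dict[str, Any]]:
    """List the tool calls recorded in a JSON trace string."""
    calls: List[Dict[str, Any]] = []
    for invocation in json.loads(tracing_json):
        # Assumption: tool calls live under intermediate_data.tool_uses.
        tool_uses = invocation.get("intermediate_data", {}).get("tool_uses", [])
        for use in tool_uses:
            calls.append(
                {
                    "name": use["name"],
                    "args": use.get("args", {}),
                    "response": use.get("response"),
                }
            )
    return calls


sample_trace = json.dumps(
    [
        {
            "invocation_id": "inv-001",
            "intermediate_data": {
                "tool_uses": [
                    {
                        "name": "search_products",
                        "args": {"query": "high-end headphones"},
                        "response": {"products": [{"id": "P-123"}]},
                    }
                ]
            },
        },
        {"invocation_id": "inv-002"},  # An invocation without tool calls.
    ]
)

calls = recorded_tool_calls(sample_trace)
traced_tools = sorted({call["name"] for call in calls})
```

Comparing traced_tools against the tool names in your simulation config is a quick way to spot tools that have no recorded history.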
Mixing injections and mock strategy¶
Injection configs and a mock strategy can be combined on the same tool. Injections are always checked first; the mock strategy fires only when no injection applies.
ToolSimulationConfig(
    tool_name="send_notification",
    injection_configs=[
        # Always fail for a known-bad recipient
        InjectionConfig(
            match_args={"recipient_id": "INVALID"},
            injected_error=InjectedError(
                injected_http_error_code=400,
                error_message="Invalid recipient.",
            ),
        ),
    ],
    # For all other recipients, generate a plausible success response
    mock_strategy_type=MockStrategy.MOCK_STRATEGY_TOOL_SPEC,
)