⚡ DriftPA

OpenEnv 0.2.1 · Track 3.2 Personalized Tasks · Patronus AI Partner Track

Trains LLM agents to act as personal executive assistants in a world that changes mid-task. The agent manages cascading real-life conflicts — calendar clashes, urgent emails, dinner bookings, ride scheduling — while four failure modes fire without warning.

−9.55 Untrained mean reward

+22.0 Optimal episode reward

31 pts Training gap

24,000 GRPO rollouts (H100)

Four Novel Mechanics

Schema Drift: API field names change mid-episode (party_size → guests). The agent must call list_tools() to discover the new schema or be penalised.

Time Pressure: Tasks expire if not resolved within N steps. The boss email, for example, expires at step 4, and missing it triggers a cascade.

Irreversible Actions: reply_message, book_restaurant, and book_ride cannot be undone. A wrong commit creates cascade failures.

Policy Drift: The cancellation window tightens from 2 hr to 4 hr after the drift. A late cancellation counts as a policy violation.
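As a rough illustration of how an agent-side wrapper might cope with Schema Drift and Policy Drift, here is a minimal Python sketch. It is not the environment's own code. The helper names `adapt_payload` and `cancellation_allowed`, the `ALIASES` table, and the idea that list_tools() yields a flat list of accepted field names are all assumptions made for the example. Only the party_size → guests rename and the 2 hr / 4 hr windows come from the descriptions above, and reading the window as "minimum notice before the booking" is likewise an assumption.

```python
from typing import Any, Dict, Iterable, List, Set

# Hypothetical alias groups: field names the wrapper treats as interchangeable.
# Only party_size/guests is taken from the text; extend as new drifts are seen.
ALIASES: List[Set[str]] = [{"party_size", "guests"}]


def adapt_payload(
    payload: Dict[str, Any],
    current_fields: Iterable[str],
    aliases: List[Set[str]] = ALIASES,
) -> Dict[str, Any]:
    """Rename payload keys to match the field names currently advertised.

    `current_fields` stands in for whatever list_tools() reports for the tool
    (assumed here to be a flat list of field names).
    """
    current = set(current_fields)
    adapted: Dict[str, Any] = {}
    for key, value in payload.items():
        if key in current:
            adapted[key] = value
            continue
        # Find an alias of this key that the current schema accepts.
        replacement = next(
            (
                name
                for group in aliases
                if key in group
                for name in sorted(group)
                if name in current
            ),
            None,
        )
        if replacement is None:
            raise KeyError(f"no current field matches {key!r}")
        adapted[replacement] = value
    return adapted


def cancellation_allowed(hours_until_booking: float, drifted: bool) -> bool:
    """Check a cancellation against the window (assumed: minimum notice)."""
    window_hours = 4.0 if drifted else 2.0
    return hours_until_booking >= window_hours
```

With these helpers, a booking payload written against the old schema, such as `{"party_size": 4}`, would be rewritten to use `guests` once list_tools() reports the new field name, and a cancellation three hours ahead would pass before the drift but fail after it.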

API Endpoints

GET /health - liveness check

POST /reset - start episode {"seed": 0}

POST /step - take action {"action": {"tool_name": "list_tools", "payload": {}}}

GET /state - episode metadata
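For scripted use, the same four endpoints can be wrapped in a few lines of standard-library Python. This is a sketch under assumptions rather than an official client: the helper names `build_request`, `step_request`, and `send` are hypothetical, the base URL must be supplied by the caller, and `send` assumes the server answers with a JSON body, which the endpoint list does not state.

```python
import json
from typing import Any, Dict, Optional
from urllib import request


def build_request(
    base_url: str,
    method: str,
    path: str,
    body: Optional[Dict[str, Any]] = None,
) -> request.Request:
    """Build (but do not send) an HTTP request for one of the endpoints."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if body is not None else {}
    url = base_url.rstrip("/") + path
    return request.Request(url, data=data, headers=headers, method=method)


def step_request(
    base_url: str,
    tool_name: str,
    payload: Optional[Dict[str, Any]] = None,
) -> request.Request:
    """Wrap a tool call in the {"action": {...}} envelope used by POST /step."""
    action = {"tool_name": tool_name, "payload": payload or {}}
    return build_request(base_url, "POST", "/step", {"action": action})


def send(req: request.Request, timeout: float = 10.0) -> Any:
    """Send a request and decode the reply (assumed to be JSON)."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A typical session would then be `send(build_request(base, "POST", "/reset", {"seed": 0}))` followed by `send(step_request(base, "list_tools"))`, where `base` is wherever the environment server is reachable.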

Quick Start

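# BASE_URL is a placeholder: set it to the address where the environment server is running
curl "$BASE_URL/health"
curl -X POST "$BASE_URL/reset" -H "Content-Type: application/json" -d '{"seed": 0}'
curl -X POST "$BASE_URL/step" -H "Content-Type: application/json" -d '{"action": {"tool_name": "list_tools", "payload": {}}}'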

Links

✓ Health Check ⌥ GitHub ▶ Training Notebook