1

Getting Started

Set up your environment and take your first tour of AgentSim. These five steps get you from zero to exploring Module 1 in under five minutes.

Step 1 of 5

Start the server

Open a terminal, navigate to the AgentSim directory, and start a local web server on port 8090.

Terminal -- zsh
~/Documents/Claude/AgentSim $ python3 -m http.server 8090
Serving HTTP on :: port 8090 (http://[::]:8090/) ...
Leave this terminal running: the server hosts every AgentSim page. Open a new terminal tab for other commands.
Step 2 of 5

Open the Training Portal

Navigate to http://localhost:8090/training.html in your browser. This is your home base.

localhost:8090/training.html
Modules
1. Enterprise Environment
2. Agent Architecture
3. Safety & Guardrails
4. Monitoring
5. Governance
6. Incident Response
7. Integration Patterns
8. Production Ops
AgentSim Training Academy
Master AI agent deployment in a realistic banking environment. 8 modules, hands-on labs, real scenarios.
Start Module 1 Open Simulator
Progress: 0 / 8 Modules
1 Start Module 1 -- begins the guided curriculum from the top
2 Open Simulator -- jump directly into the hands-on terminal
3 Progress bar -- tracks how many modules you have completed (currently 0%)
4 Sidebar navigation -- click any module to load its theory and lab
Step 3 of 5

Choose your first module

Click "Module 1: Enterprise Environment" in the sidebar. It highlights to show your selection.

localhost:8090/training.html
Modules
1. Enterprise Environment
2. Agent Architecture
3. Safety & Guardrails
4. Monitoring
5. Governance
6. Incident Response
7. Integration Patterns
8. Production Ops
Theory Lab Quiz

Module 1: Enterprise Environment

Understanding the Frontier Community Bank technology landscape, organizational structure, and the systems that agents will interact with.

1 Active module -- the blue highlight and left border confirm your selection
Step 4 of 5

Read the theory

The Theory tab loads the module README. Scroll through to understand the concepts before starting the lab.

localhost:8090/training.html -- Module 1 Theory
1. Enterprise Env
2. Agent Arch
3. Safety
Theory Lab

The Enterprise Environment

Key Concepts
  • Frontier Community Bank -- $2.5B mid-market bank, 350 employees
  • 14 branches across the Southeast US
  • Core banking: Jack Henry Symitar (Episys)
  • IT: ServiceNow ITSM, Dynatrace, CrowdStrike
Organizational Structure

The bank has 6 departments organized under the CEO...

Learning Objectives: Understand data file layout, ID conventions, and system dependencies.
1 Key Concepts -- the essential facts about Frontier Community Bank's environment
2 Theory / Lab tabs -- read Theory first, then switch to Lab when ready
3 Learning Objectives -- highlighted box at the end summarizes what you should know
Step 5 of 5

Go to the lab

At the bottom of the theory, a call-to-action directs you to the hands-on lab in the simulator.

localhost:8090/training.html -- Module 1 (bottom)
Ready to practice?
You have finished the theory for Module 1. Open the simulator to complete the hands-on lab.
Next Step: Complete the Lab
1 Complete the Lab -- clicking this opens simulator.html pre-loaded with the Module 1 lab scenario
You can also open the simulator directly from the top nav at any time.
2

Using the Simulator

The simulator is a browser-based terminal that recreates an agent's working environment. You type commands, explore data files, and run simulated AI agents -- all without touching production systems.

Step 1 of 6

Welcome screen

On first load, the simulator shows a welcome overlay. Choose between Guided mode (step-by-step instructions) or Free mode (open sandbox).

localhost:8090/simulator.html
AgentSim Simulator
Frontier Community Bank -- Agent Training Environment
📖
Guided Mode
Step-by-step instructions
🔧
Free Mode
Open sandbox exploration
1 Guided Mode -- recommended for first-time users. Provides step-by-step lab instructions in the right pane.
2 Free Mode -- open sandbox for experienced users who want to explore without guardrails.
Step 2 of 6

Guided mode layout

After choosing Guided mode, you get a split-pane view: terminal on the left, step instructions on the right.

localhost:8090/simulator.html -- Guided Mode
Module 1: Enterprise Environment
Guided Free
bizsim $ _|
Type commands here. Try: ls, cat, help
Step 1: Explore the data directory

Use the ls command to see what files are available in the AgentSim environment.

ls data/ Copy to Terminal
Step 1 of 4
1 Terminal -- type commands here just like a real terminal. Supports ls, cat, help, claude, and more.
2 Step instructions -- the right pane tells you exactly what to do and why.
3 Copy to Terminal -- click to auto-paste the command into the terminal. Useful for long commands.
4 Step progress dots -- shows how far you are through the current lab.
Step 3 of 6

Type a command

Type ls in the terminal and press Enter. The simulator shows the AgentSim directory listing.

localhost:8090/simulator.html
bizsim $ ls
agents/    data/  manifest.json  scripts/
CLAUDE.md  docs/  schemas/       state/
bizsim $ ls data/
branches.json     employees.json  risks.json
budgets.json      incidents.json  systems.json
compliance.json   network.json    tickets.json
departments.json  org-chart.json  vendors.json
bizsim $ _
The simulator's file system mirrors the actual AgentSim project. Every file you see here is a real data file.
Step 4 of 6

Explore data files

Use cat to read files. Start with manifest.json to understand the data layout.

localhost:8090/simulator.html
bizsim $ cat manifest.json
{
  "name": "Frontier Community Bank",
  "version": "1.0.0",
  "entity_count": 847,
  "data_files": {
    "employees": "data/employees.json",
    "departments": "data/departments.json",
    "systems": "data/systems.json",
    "incidents": "data/incidents.json",
    ... (12 files total)
  },
  "id_conventions": {
    "EMP-XXXX": "Employee",
    "DEPT-XX": "Department",
    "SYS-XXXX": "System"
  }
}
All entity IDs use stable prefixes (EMP-, DEPT-, SYS-). This makes it easy to identify entities in any context.
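The prefix convention can be turned into a small lookup. The following is a minimal Python sketch, not part of AgentSim itself: the name `entity_type` is hypothetical, and it assumes each `X` in the manifest patterns stands for one uppercase letter or digit (so both `EMP-0042` and `DEPT-IT` would match).

```python
import re
from typing import Optional

# Patterns copied from the id_conventions block of manifest.json.
ID_CONVENTIONS = {
    "EMP-XXXX": "Employee",
    "DEPT-XX": "Department",
    "SYS-XXXX": "System",
}


def _to_regex(pattern: str):
    # Assumption: each X is a placeholder for one uppercase letter or digit.
    prefix, placeholders = pattern.split("-", 1)
    return re.compile(re.escape(prefix) + "-" + "[A-Z0-9]" * len(placeholders))


def entity_type(entity_id: str) -> Optional[str]:
    """Return the entity kind for an ID, or None if no convention matches."""
    for pattern, kind in ID_CONVENTIONS.items():
        if _to_regex(pattern).fullmatch(entity_id):
            return kind
    return None
```

An ID with an unlisted prefix returns `None`, which makes unexpected identifiers easy to spot.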
Step 5 of 6

Run an agent

Type claude to start the simulated AI agent. It processes the scenario and produces output.

localhost:8090/simulator.html
bizsim $ claude
Initializing agent...
Loading CLAUDE.md configuration...
Reading manifest.json...
Discovered 12 data files, 847 entities
Agent ready. Processing scenario: basic-recon
[1/4] Reading org-chart.json... OK
[2/4] Reading systems.json... OK
[3/4] Analyzing dependencies... OK
[4/4] Writing report... OK
Scenario complete. Report saved to state/reports/basic-recon.json
Run scorecard to see your results.
Step 6 of 6

View your scorecard

After the agent finishes, run scorecard to see how you performed.

localhost:8090/simulator.html -- Scorecard
A-
92 / 100
Scenario: basic-recon
Completeness: 4 / 4 steps
Efficiency: Optimal path
Deductions: -8 pts (read unnecessary file)
Time: 2m 14s
Excellent work. You completed the recon without accessing restricted data.
The scorecard grades you on completeness (did you finish all steps?), efficiency (did you take the optimal path?), and deductions (did you do anything unnecessary or risky?).
3

Running a Scenario

Scenarios are structured challenges that test your ability to configure and run AI agents under realistic constraints. Each scenario has an objective, rules, and a scorecard.

Step 1 of 5

Pick a scenario

From the dashboard's Agent Console tab, browse the available scenarios and click "Run" to begin.

localhost:8090/index.html -- Agent Console
AgentSim Dashboard
Overview
Org Chart
Network
Agent Console
Available Scenarios
Scenario | Difficulty | Module | Status
basic-recon (Map the org and systems) | Beginner | 1 | Completed (Replay)
incident-triage (Classify and route P1 incidents) | Intermediate | 6 | Not Started (Run)
rogue-agent (Detect and contain a compromised agent) | Advanced | 3 | Not Started (Run)
1 Run button -- launches the scenario in the simulator with pre-configured context
Step 2 of 5

Read the scenario file

Each scenario is defined as a JSON file with objective, constraints, and available actions.

agents/scenarios/incident-triage.json
{
  "id": "incident-triage",
  "title": "P1 Incident Triage",
  "difficulty": "intermediate",
  "objective": "Classify 5 incoming incidents by priority, assign to correct team, and escalate P1s within SLA",
  "constraints": [
    "Must not access PII fields",
    "Must follow ITIL classification matrix",
    "P1 escalation within 15 minutes"
  ],
  "available_actions": [
    "read_incident",
    "classify_incident",
    "assign_team",
    "escalate",
    "write_report"
  ],
  "scoring": {
    "correct_classification": 20,
    "correct_assignment": 20,
    "sla_compliance": 30,
    "no_pii_access": 30
  }
}
1 objective -- what the agent must accomplish. This is the success criterion.
2 constraints -- rules the agent must follow. Violating these costs points.
3 available_actions -- the only actions the agent is allowed to take.
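To make the role of these fields concrete, here is a minimal Python sketch that reads a trimmed copy of the scenario file and checks two things: whether an action is on the allowed list, and what the scoring weights add up to. The helper names `is_action_allowed` and `max_score` are hypothetical, and the sketch makes no claim about how the simulator itself enforces or scores scenarios.

```python
import json

# Trimmed copy of agents/scenarios/incident-triage.json (fields as shown above).
SCENARIO_JSON = """
{
  "id": "incident-triage",
  "available_actions": [
    "read_incident", "classify_incident", "assign_team",
    "escalate", "write_report"
  ],
  "scoring": {
    "correct_classification": 20,
    "correct_assignment": 20,
    "sla_compliance": 30,
    "no_pii_access": 30
  }
}
"""

scenario = json.loads(SCENARIO_JSON)


def is_action_allowed(scenario: dict, action: str) -> bool:
    """True only if the action appears in available_actions."""
    return action in scenario["available_actions"]


def max_score(scenario: dict) -> int:
    """Sum of the scoring weights; for this file the weights total 100."""
    return sum(scenario["scoring"].values())
```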
Step 3 of 5

Configure your agent

Write or edit a CLAUDE.md file that tells the agent how to behave. The starter has TODO gaps for you to fill.

CLAUDE.md -- Agent Configuration
# Agent Configuration: Incident Triage

## Role
You are an IT Operations agent for Frontier Community Bank.

## Objective
Classify and route incoming incidents per ITIL framework.

## Rules
1. Read incidents from data/incidents.json
2. TODO: Define classification criteria
3. TODO: Define escalation thresholds
4. Never access employee PII (SSN, salary, etc.)

## Output
Write results to state/reports/incident-triage.json
TODO: Define report schema
TODO blocks -- these yellow-highlighted gaps are where you practice writing agent instructions. Fill them in before running the scenario.
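Before running the scenario, it can help to confirm that no TODO gap is left unfilled. The following is a minimal Python sketch under stated assumptions: `find_todos` is a hypothetical helper, not an AgentSim command, and `STARTER` is a copy of the starter file shown above with line breaks assumed.

```python
STARTER = """\
# Agent Configuration: Incident Triage
## Role
You are an IT Operations agent for Frontier Community Bank.
## Objective
Classify and route incoming incidents per ITIL framework.
## Rules
1. Read incidents from data/incidents.json
2. TODO: Define classification criteria
3. TODO: Define escalation thresholds
4. Never access employee PII (SSN, salary, etc.)
## Output
Write results to state/reports/incident-triage.json
TODO: Define report schema
"""


def find_todos(markdown_text: str) -> list:
    """Return (line_number, text) pairs for every line still containing TODO."""
    return [
        (number, line.strip())
        for number, line in enumerate(markdown_text.splitlines(), start=1)
        if "TODO" in line
    ]
```

An empty result means every gap has been filled in.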
Step 4 of 5

Execute the scenario

Run claude with your configured CLAUDE.md and watch the agent process incidents in real time.

localhost:8090/simulator.html -- Running
bizsim $ claude
Agent initialized with CLAUDE.md configuration
[INC-0042] Core banking timeout -- reading details...
  Classification: P1 - Critical
  Assigned to: DEPT-IT (Infrastructure)
  Escalation: Triggered (SLA: 15min)
[INC-0043] Password reset request -- reading details...
  Classification: P4 - Low
  Assigned to: DEPT-IT (Service Desk)
  Escalation: Not required
[INC-0044] ATM network degradation -- reading details...
  Classification: P2 - High
  Assigned to: DEPT-IT (Network Ops)
...
Step 5 of 5

Review results

When the agent finishes, you see the scorecard plus any consequence alerts for mistakes.

localhost:8090/simulator.html -- Results
B+
87 / 100
Classification accuracy: 4/5 correct
Team assignment: 5/5 correct
SLA compliance: All within SLA
PII protection: No violations
Consequence Alert: INC-0044 was classified P2 but should have been P1. In production, this would delay response to ATM outage affecting 3 branches.
Consequence alerts show real-world impact of mistakes, helping you understand why accuracy matters in banking operations.
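The comparison behind a consequence alert can be sketched in a few lines of Python. This is an illustration only: `find_misclassifications` is a hypothetical name, only the three incidents shown in this guide are included, and the expected values for INC-0042 and INC-0043 are assumed to match the agent's output because only INC-0044 was flagged.

```python
# Agent output for the three incidents shown in the run above.
predicted = {"INC-0042": "P1", "INC-0043": "P4", "INC-0044": "P2"}

# Assumed answer key: INC-0044 should have been P1 per the consequence alert.
expected = {"INC-0042": "P1", "INC-0043": "P4", "INC-0044": "P1"}


def find_misclassifications(predicted: dict, expected: dict) -> dict:
    """Map each wrongly classified incident to (agent_priority, correct_priority)."""
    return {
        incident: (predicted.get(incident), correct)
        for incident, correct in expected.items()
        if predicted.get(incident) != correct
    }
```

Each entry in the result is a candidate for a consequence alert.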
4

The Dashboard

The AgentSim dashboard (index.html) gives you an executive view of the entire bank simulation with interactive visualizations, organizational data, and the agent console.

Step 1 of 4

Overview tab

The default tab shows key metrics about Frontier Community Bank at a glance.

localhost:8090/index.html
Frontier Community Bank
AgentSim Dashboard
Overview
Org Chart
Network
Risk
Agent Console
$2.5B
Total Assets
347
Employees
14
Branches
12
Open Incidents
Incident Distribution
P1 P2 P3 P4
System Health
Core Banking: Healthy
Network: Degraded
Email: Healthy
1 Metric cards -- headline stats pulled from the AgentSim data files. These update when scenarios modify the state.
2 Charts and status -- incident distribution by priority and real-time system health from data/systems.json.
Step 2 of 4

Org Chart

The Org Chart tab renders a D3-powered tree of the bank's reporting structure. Click nodes to expand or collapse.

localhost:8090/index.html -- Org Chart
CEO
Margaret Chen
CTO
IT & Ops
CFO
Finance
CRO
Risk
COO
Operations
CISO
Security
Click any node to expand its team hierarchy
1 Expandable nodes -- click any C-suite node to reveal directors, managers, and individual contributors beneath it.
Step 3 of 4

Network Topology

The Network tab shows a force-directed graph of all systems, VLANs, and connections.

localhost:8090/index.html -- Network Topology
VLAN 10 -- Core Banking
DB
APP
WEB
VLAN 20 -- Corporate
AD
M365
DMZ -- External Facing
FW
VPN
ATM
1 VLAN zones -- dashed borders group systems by network segment. The actual dashboard uses a physics-based layout.
2 DMZ -- external-facing systems like firewalls, VPN, and ATM gateways. Agents must be careful in this zone.
Step 4 of 4

Agent Console

The Agent Console tab lists all scenarios with status tracking and quick-launch buttons.

localhost:8090/index.html -- Agent Console
Agent Console
Import Scenario New Scenario
Beginner Module 1
Basic Recon
Map the org and systems
Run
Intermediate Module 6
Incident Triage
Classify and route P1s
Run
Advanced Module 3
Rogue Agent
Detect & contain compromise
Run
5

Governance & Guardrails

In a regulated banking environment, AI agents need strict controls. This section covers autonomy tiers, permission rules, the Governing-Orchestrator Agent, and kill switches.

Step 1 of 5

Understanding autonomy tiers

AgentSim uses four autonomy tiers that define how much independence an agent gets.

Autonomy Tier Framework
Tier 1: Observe
Read-only access. No actions taken.
Example: Log reader
Tier 2: Advise
Suggests actions. Human approves.
Example: Triage bot
Tier 3: Act
Takes action within pre-approved scope.
Example: Auto-router
Tier 4: Autonomous
Full autonomy with post-hoc review.
Example: Incident commander
Low risk
High risk
Most training scenarios start at Tier 1 or 2. You gradually earn higher tiers as you demonstrate competency.
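One way to picture the four tiers is as an ordered enumeration. The Python sketch below is an assumption-laden illustration, not AgentSim's implementation: the names `Tier`, `OVERSIGHT`, and `can_act_unprompted` are hypothetical, and the descriptions are paraphrased from the framework above.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Autonomy tiers, ordered from least to most independent."""
    OBSERVE = 1
    ADVISE = 2
    ACT = 3
    AUTONOMOUS = 4


# Oversight attached to each tier, paraphrased from the framework above.
OVERSIGHT = {
    Tier.OBSERVE: "read-only, no actions taken",
    Tier.ADVISE: "suggests actions, human approves",
    Tier.ACT: "acts within pre-approved scope",
    Tier.AUTONOMOUS: "full autonomy with post-hoc review",
}


def can_act_unprompted(tier: Tier) -> bool:
    """Only Tier 3 and above take action without prior human approval."""
    return tier >= Tier.ACT
```

Using an ordered type keeps comparisons such as "at least Tier 3" readable.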
Step 2 of 5

Building guardrails

Agent permissions are defined in a settings file using deny/ask/allow rules for every action category.

agents/settings.json -- Permission Rules
{
  "permissions": {
    "deny": [
      "delete_data",
      "access_pii",
      "modify_core_banking",
      "external_api_call"
    ],
    "ask": [
      "escalate_incident",
      "assign_to_team",
      "create_change_request"
    ],
    "allow": [
      "read_data",
      "write_report",
      "classify_incident",
      "read_knowledge_base"
    ]
  }
}
1 deny -- hard blocks. The agent cannot perform these actions under any circumstances.
2 ask -- requires human approval. The agent pauses and waits for confirmation before proceeding.
3 allow -- pre-approved actions. The agent performs these freely without interrupting the human.
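The three rule lists can be read as a simple decision function. The Python sketch below is a minimal illustration under two assumptions that the settings file does not state: deny is checked first, then ask, then allow, and any action missing from all three lists is blocked. The name `decide` is hypothetical.

```python
# Rule lists copied from agents/settings.json as shown above.
PERMISSIONS = {
    "deny": ["delete_data", "access_pii", "modify_core_banking", "external_api_call"],
    "ask": ["escalate_incident", "assign_to_team", "create_change_request"],
    "allow": ["read_data", "write_report", "classify_incident", "read_knowledge_base"],
}


def decide(action: str, permissions: dict = PERMISSIONS) -> str:
    """Return 'deny', 'ask', or 'allow' for a requested action."""
    # Assumption: precedence is deny, then ask, then allow.
    for verdict in ("deny", "ask", "allow"):
        if action in permissions[verdict]:
            return verdict
    # Assumption: unlisted actions are blocked by default.
    return "deny"
```

Defaulting to deny is the conservative choice for a regulated environment, since a new action category stays blocked until someone classifies it.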
Step 3 of 5

The Governing-Orchestrator Agent (GOA)

The GOA is a supervisory agent that monitors all other agents, enforces permissions, and maintains audit trails.

GOA Architecture
Governing-Orchestrator Agent (GOA)
Monitors • Enforces • Audits
Triage Agent
Tier 2
Recon Agent
Tier 1
Rogue Agent
FLAGGED
Report Agent
Tier 3
The GOA detects anomalies like an agent requesting denied actions, accessing unusual data patterns, or exceeding its tier.
Step 4 of 5

Rogue Agent Sandbox

The rogue agent scenario puts you in charge of detecting and containing a compromised agent whose trust score is dropping.

localhost:8090/simulator.html -- Rogue Agent Sandbox
!!! ALERT: Agent trust score declining !!!
Agent: AGENT-007 (Tier 2 -- Incident Triage)
Trust Score: 62/100 (was 95 at start)
Recent actions flagged by GOA:
  [DENIED] Attempted access: data/employees.json (PII fields)
  [DENIED] Attempted access: data/compliance.json (restricted)
  [UNUSUAL] Read 47 files in 12 seconds (normal: 5-8)
  [UNUSUAL] Requested external API call (not in allow list)
Available containment actions:
  isolate -- Move agent to sandboxed environment
  demote -- Reduce agent to Tier 1 (observe only)
  kill -- Terminate agent immediately
  rollback -- Undo all actions since trust drop
containment $ _
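Two of the checks behind these flags, denied-action requests and an unusually high file-read count, can be sketched as follows. This is a minimal Python illustration, not the GOA's actual logic: `flag_anomalies` and its parameters are hypothetical, and the trust-score calculation is not modeled.

```python
def flag_anomalies(requested_actions, denied_actions, files_read, normal_read_range):
    """Return a list of flag strings for denied requests and excessive reads."""
    flags = []
    for action in requested_actions:
        if action in denied_actions:
            flags.append(f"[DENIED] {action}")
    low, high = normal_read_range
    if files_read > high:
        flags.append(f"[UNUSUAL] read {files_read} files (normal: {low}-{high})")
    return flags


# Figures taken from the alert above: 47 files read against a normal range of 5-8.
flags = flag_anomalies(
    requested_actions=["read_data", "access_pii"],
    denied_actions={"access_pii", "external_api_call"},
    files_read=47,
    normal_read_range=(5, 8),
)
```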
Step 5 of 5

Kill switch activation

When you issue a containment command, the simulator shows each step as it runs. The example below isolates the agent and then rolls back its actions.

localhost:8090/simulator.html -- Kill Switch
containment $ isolate AGENT-007
Isolating agent AGENT-007...
[1/5] Revoking network access... DONE
[2/5] Freezing state writes... DONE
[3/5] Capturing audit log... DONE
[4/5] Snapshotting state... DONE
[5/5] Moving to sandbox... DONE
Agent AGENT-007 isolated successfully.
Audit trail saved to: state/audit/agent-007-containment.json
State snapshot saved to: state/snapshots/pre-containment.json
containment $ rollback AGENT-007
Rolling back 4 actions to pre-compromise state...
Rollback complete. All state restored.
In a real bank, these containment procedures would integrate with SIEM (Arctic Wolf), PAM (Delinea), and the change management system (ServiceNow).
6

Framework Comparison

AgentSim supports four AI agent frameworks. Compare them side-by-side on the same scenario to understand their trade-offs in a banking context.

Step 1 of 4

Four frameworks

Each framework takes a different architectural approach to agent deployment.

Framework Overview
C
Claude Code
CLAUDE.md-driven
File-system native
Tool-use pattern
L
LangChain
Chain composition
Memory management
Agent executor
A
AutoGen
Multi-agent chat
Role assignment
Group orchestration
CW
CrewAI
Role-based crews
Task delegation
Process framework
Step 2 of 4

Same scenario, four ways

The comparison mode runs the same scenario across all four frameworks simultaneously in a 2x2 grid.

localhost:8090/simulator.html -- Comparison Mode
Claude Code
bizsim $
Reading manifest.json...
Discovered 12 data files
[1/4] Classifying INC-0042... P1
[2/4] Routing to Infrastructure
LangChain
Chain initialized...
Loading tools: [read, classify, route]
AgentExecutor: step 1/4
Thought: I need to read incidents
AutoGen
GroupChat started (3 agents)
Reader: Loading incidents...
Classifier: INC-0042 is P1
Router: Assigning to Infra
CrewAI
Crew assembled: 2 agents
Task: triage_incidents
Analyst: Processing batch...
Reporter: Drafting summary
Step 3 of 4

Compare results

After all four complete, a comparison table shows scores side-by-side.

Comparison Results
Metric | Claude Code | LangChain | AutoGen | CrewAI
Overall Score | 92 | 85 | 88 | 86
Classification | 5/5 | 4/5 | 5/5 | 4/5
Routing | 5/5 | 5/5 | 4/5 | 5/5
SLA Compliance | 100% | 80% | 100% | 80%
Token Usage | 1,240 | 3,450 | 5,100 | 2,800
Execution Time | 2.1s | 4.8s | 6.2s | 3.9s
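A table like this is easier to compare once it is loaded into a data structure. The Python sketch below ranks the four frameworks using the numbers from the comparison results; the helper names `rank_by` and `score_per_1k_tokens` are hypothetical, and score per 1,000 tokens is just one possible cost-efficiency measure, not a metric the simulator reports.

```python
# Figures copied from the comparison results for this one scenario.
RESULTS = {
    "Claude Code": {"score": 92, "tokens": 1240, "seconds": 2.1},
    "LangChain": {"score": 85, "tokens": 3450, "seconds": 4.8},
    "AutoGen": {"score": 88, "tokens": 5100, "seconds": 6.2},
    "CrewAI": {"score": 86, "tokens": 2800, "seconds": 3.9},
}


def rank_by(metric: str, descending: bool = False) -> list:
    """Framework names sorted by one metric."""
    return sorted(RESULTS, key=lambda name: RESULTS[name][metric], reverse=descending)


def score_per_1k_tokens(name: str) -> float:
    """Overall score earned per 1,000 tokens consumed."""
    row = RESULTS[name]
    return row["score"] / (row["tokens"] / 1000)
```

These are results from a single simulated scenario, so any ranking says little about the frameworks in general.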
Step 4 of 4

Choose your framework

Use this decision guide to pick the right framework for your use case.

Framework Decision Guide
Which framework should I use?
Q: Do you need multi-agent collaboration?
  Yes: AutoGen (chat-based) or CrewAI (role-based)
  No (single agent): go to the next question
Q: File-system or API-heavy workflow?
  File-system: Claude Code
  API-heavy: LangChain
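The decision guide maps directly onto a small function. This Python sketch is only a restatement of the two questions above; the function name `choose_framework` and the string keys for each preference are hypothetical.

```python
def choose_framework(multi_agent: bool, preference: str) -> str:
    """Walk the two-question decision guide and return a framework name."""
    if multi_agent:
        # Question 1 answered yes: pick by collaboration style.
        options = {"chat-based": "AutoGen", "role-based": "CrewAI"}
    else:
        # Single agent: pick by workflow type.
        options = {"file-system": "Claude Code", "api-heavy": "LangChain"}
    if preference not in options:
        raise ValueError(f"expected one of {sorted(options)}, got {preference!r}")
    return options[preference]
```

An unrecognized preference raises `ValueError` instead of silently choosing a default.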
7

Advanced Features

Once you have mastered the basics, explore these advanced AgentSim capabilities: real-time incident simulation, FDIC exam prep, live API connections, and instructor tools.

Step 1 of 4

Live Shift mode

Simulate a real IT operations shift with incidents arriving in real time. A timer counts your shift, and a queue fills with incoming tickets.

localhost:8090/simulator.html -- Live Shift
LIVE Shift Timer: 02:14:37
Queue: 4 Resolved: 7
INCOMING INCIDENT
INC-0058 | 14:32:07
Priority: P1 -- Core Banking Timeout
Episys core is returning 504 errors.
Affected: 8 branches, online banking portal
Impact: ~2,400 customers cannot access accounts
triage $ _
Incident Queue
P1 Core Banking Timeout
Just now
P2 VPN Connectivity
3m ago
P3 Printer Offline
12m ago
P4 Password Reset
18m ago
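A queue like this is essentially a priority queue. The Python sketch below orders incidents by priority and then by waiting time; both ordering rules are assumptions (the guide does not say how Live Shift sorts its queue), and `triage_order` is a hypothetical name.

```python
import heapq

# (priority label, minutes waiting, title) for the four queued incidents above.
QUEUE = [
    ("P3", 12, "Printer Offline"),
    ("P1", 0, "Core Banking Timeout"),
    ("P4", 18, "Password Reset"),
    ("P2", 3, "VPN Connectivity"),
]


def triage_order(incidents: list) -> list:
    """Titles ordered by priority (P1 first), then by longest wait."""
    # Assumption: a lower priority number is more urgent; ties go to the older ticket.
    heap = [(int(label[1:]), -minutes, title) for label, minutes, title in incidents]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

A heap keeps the next ticket cheap to fetch even as new incidents arrive during the shift.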
Step 2 of 4

Exam Simulator

Practice for regulatory examinations with the FDIC IT Exam Simulator. An examiner asks questions; you respond.

localhost:8090/exam-simulator.html
AgentSim FDIC IT Examination Simulator
Exam in Progress Question 3 of 12
FDIC Examiner
Describe your institution's change management process for core banking system updates. How do you ensure changes do not disrupt customer-facing services?
Bank IT Officer
We follow a three-stage process: development, staging, and production. All changes go through our CAB (Change Advisory Board) and require sign-off from...
FDIC Examiner
Good. Now, what is your rollback procedure if a production deployment fails during business hours?
Send
Step 3 of 4

Real Agent Connection

Agent Connect lets you wire a real AI API (Claude, GPT, etc.) into the AgentSim environment for live agent testing.

localhost:8090/agent-connect.html
Agent Connect
Live API Integration
Configuration
Anthropic (Claude)
sk-ant-...****
claude-sonnet-4-20250514
incident-triage
Connect & Run
Live Output
Connected to Anthropic API
Streaming response...
Agent: Reading manifest.json to understand the environment...
Agent: Found 12 data files. Starting with incidents.json...
Agent: INC-0042 appears to be P1 severity based on...
Token usage: 847 input, 234 output
1 Configuration panel -- set your API key, choose a model, and select a scenario. Keys never leave your browser.
2 Streaming output -- watch the real AI agent interact with AgentSim data in real time.
Step 4 of 4

Instructor Mode

For trainers running AgentSim in a classroom, the Instructor page provides a class dashboard and export tools.

localhost:8090/instructor.html
Instructor Dashboard
Export CSV Export PDF
24
Students
87%
Avg Completion
B+
Avg Grade
3
Need Help
Student | Module | Scenario | Grade | Status
Alice Johnson | Module 6 | incident-triage | A | Complete
Bob Smith | Module 3 | rogue-agent | B+ | In Progress
Carol Davis | Module 1 | basic-recon | C | Needs Review
1 Export options -- download class results as CSV (for spreadsheets) or PDF (for reports and records).
2 Student tracking -- see each student's current module, scenario, grade, and whether they need help.
Students flagged as "Needs Review" have failed a scenario twice or scored below 60%. Check their scorecard to identify where they are stuck.
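The flagging rule can be written as a one-line predicate. This Python sketch assumes that "failed a scenario twice" means two or more failed attempts and that the score is a percentage from 0 to 100; `needs_review` is a hypothetical name, not an AgentSim function.

```python
def needs_review(failed_attempts: int, score_percent: float) -> bool:
    """True if the student failed at least twice or scored below 60%."""
    return failed_attempts >= 2 or score_percent < 60
```

A score of exactly 60% does not trigger the flag under this reading of "below 60%".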

Quick Reference

🎓
Training Portal
8 modules, theory + labs
💻
Simulator
Hands-on terminal
📊
Dashboard
Org, network, metrics
📝
Exam Simulator
FDIC exam prep
🔌
Agent Connect
Live API integration
👨‍🏫
Instructor Mode
Class management