Engine Architecture
The SimSwarm engine (simswarm/) is a pure-Python, async, framework-free simulation core.
It has no infrastructure dependencies — no database, no Celery, no HTTP framework — so the
library can be driven standalone or wrapped by the SaaS worker. The only external calls are
to an OpenAI-compatible chat endpoint via simswarm.llm.LLMClient.
The public surface is simswarm.engine.Engine, which orchestrates rounds, agents, and
environments. The plain dataclasses it operates on live in simswarm/types.py.
Core types
simswarm/types.py defines everything as plain dataclasses (no framework deps):
- Agent: id, name, persona (a system-prompt string), environments (list of env names it acts in), belief_state, config (AgentActivityConfig), and a memory list.
- BeliefState: positions (topic → [-1, 1]), confidence (topic → [0, 1]), trust (author name → [0, 1]), and exposure_history (a set of content hashes).
- Action / ActionResult: an agent's intended action and an environment's response.
- ActionRecord: the logged row appended to the chat log (round, agent, action_type, platform, action_args, success, action_result).
- SimulationConfig: the full run input (seed_text, goal, entities, environments, rounds, concurrency, variables, scheduled_events, enrichment).
- EngineConfig: engine-level knobs (max_memory_rounds=20, concurrency=32, context_budget=16384, flush_interval=10, checkpoint_interval=50).
- SimulationResult: the output, consisting of chat_log, graph_data (a GraphSnapshot), trajectories, market_data, and raw_state.
- Tool: an action exposed by an environment as an LLM tool, with to_openai_schema().
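To make the shapes above concrete, here is a minimal sketch of three of these dataclasses. BeliefState and EngineConfig use the field names and defaults listed above. The Tool fields (name, description, parameters) are assumptions made for illustration and are not taken from simswarm/types.py; the returned dict follows the standard OpenAI chat-completions function-tool shape.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class BeliefState:
    # Field names follow the list above; value ranges are noted per field.
    positions: dict[str, float] = field(default_factory=dict)   # topic -> [-1, 1]
    confidence: dict[str, float] = field(default_factory=dict)  # topic -> [0, 1]
    trust: dict[str, float] = field(default_factory=dict)       # author name -> [0, 1]
    exposure_history: set[str] = field(default_factory=set)     # content hashes


@dataclass
class EngineConfig:
    # Defaults as listed above.
    max_memory_rounds: int = 20
    concurrency: int = 32
    context_budget: int = 16384
    flush_interval: int = 10
    checkpoint_interval: int = 50


@dataclass
class Tool:
    # ASSUMPTION: these three fields are hypothetical, chosen for illustration.
    name: str
    description: str
    parameters: dict[str, Any] = field(default_factory=dict)

    def to_openai_schema(self) -> dict[str, Any]:
        # Standard OpenAI "function" tool wrapper around a JSON Schema object.
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": self.parameters,
                    "required": list(self.parameters),
                },
            },
        }
```

For example, Tool("post", "Publish a post", {"text": {"type": "string"}}).to_openai_schema() yields a dict that can be passed in the tools list of a chat request.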
Constructing the engine
class Engine:
    def __init__(self, fast_llm: LLMClient, smart_llm: LLMClient,
                 engine_config: EngineConfig | None = None): ...
Two LLM clients are injected. The fast client drives the per-round agent loop (it makes
the bulk of the calls); the smart client is reserved for the heavier offline analysis
steps (entity/relation/persona extraction and report writing) that run outside Engine.run.
The round loop
Engine.run(config, on_progress=None, on_round=None) is the heart of the engine. Setup:
- _create_environments(config.environments) instantiates one environment per EnvironmentConfig (social, market, economic). If none are configured it defaults to a single SocialEnvironment.
- _create_agents(config.entities, env_names) turns each Entity into an Agent. The persona is seeded inline as f"You are {entity.name}. {entity.summary}", and every agent is granted access to every environment.
- A Bridge (see below), the chat_log, the snapshots list, and an asyncio.Semaphore(config.concurrency) are created.
- belief_topic is derived once from the goal: (config.goal or "topic").strip()[:200] or "topic". Belief dynamics treat the whole sim as a single topic.
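The two inline expressions used during setup are easy to misread, so here they are as small standalone functions. The function names derive_belief_topic and seed_persona are hypothetical wrappers added for illustration; only the expressions inside them come from the description above.

```python
from typing import Optional


def derive_belief_topic(goal: Optional[str]) -> str:
    # A missing goal falls back to "topic"; the text is stripped, capped at
    # 200 characters, and an empty result falls back to "topic" again.
    return (goal or "topic").strip()[:200] or "topic"


def seed_persona(name: str, summary: str) -> str:
    # Persona string seeded from an entity's name and summary.
    return f"You are {name}. {summary}"
```

Note the second fallback: a goal made only of whitespace is truthy, so it survives the first `or`, becomes an empty string after strip(), and is only then replaced by "topic".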
Then, for each round 1..config.rounds:
- Inject scheduled events: bridge.inject_scheduled(config.scheduled_events, round_num) queues any ScheduledEvent whose round matches.
- Gather observations: for every agent, collect one Observation per environment it belongs to (env.get_observations(agent)), plus a bridge digest of cross-environment events, plus a scenario observation rendering config.variables if present. These are stored in agent_observations[agent.id] before any LLM call, so all agents observe the same pre-step world state (synchronous within a round).
- Concurrency-gated agent steps: step_agent is defined as a coroutine and run for all agents via asyncio.gather. Each invocation acquires the semaphore (async with semaphore:) so at most config.concurrency LLM calls are in flight at once. Inside:
  - Build the tool list by union-ing env.get_tools() across the agent's environments and converting each to an OpenAI schema (Tool.to_openai_schema()).
  - Build the message list with build_context(agent, obs) and call self.fast_llm.chat(messages, tools=tool_schemas).
  - For each returned tool call, resolve the owning environment with _find_env_for_action, build an Action, execute it (env.execute_action), and append an ActionRecord capturing success and action_result.
  - Append a memory line f"Round {n}: {action}({args})", then trim memory to the last max_memory_rounds.
  - If the LLM returned no tool calls, a synthetic do_nothing ActionRecord is logged.
- Belief update: gather a {post_id: (likes, dislikes)} lookup by calling env.current_engagement() on any environment that exposes it, then apply_belief_updates(agents, round_records, belief_topic, likes_lookup=...). See Belief formulation.
- Tick: env.tick() on every environment (advances current_round, recomputes metrics, queues virality/price-move/metric-change events).
- Bridge events: collect env.publish_events() from all environments and hand them to bridge.receive_events(...) for next round's digests.
- Snapshot: append a RoundSnapshot with metrics={"actions": <count>}.
- Callbacks: await on_round(round_num, chat_log) and await on_progress(round_num, config.rounds, metrics) if provided.
- Clear: bridge.clear() empties pending events so digests don't accumulate.
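The observe-then-act part of a round can be sketched in isolation. This is a toy illustration under stated assumptions, not the engine's code: ToyAgent, fake_llm_chat, and the dict-shaped records are hypothetical stand-ins for Agent, fast_llm.chat, and ActionRecord, and a plain list plays the role of the social feed. It shows the three behaviours described above: observations snapshotted before any call, a synthetic do_nothing record when no tool call comes back, and memory trimmed to the last max_memory_rounds entries.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    # Hypothetical stand-in for Agent: just an id and a memory list.
    id: str
    memory: list[str] = field(default_factory=list)


async def fake_llm_chat(agent_id: str, observation: str) -> list[tuple[str, dict]]:
    # Stand-in for fast_llm.chat: returns (tool_name, args) pairs.
    await asyncio.sleep(0)  # yield control, as a real network call would
    if agent_id == "quiet":
        return []  # this agent never calls a tool
    return [("post", {"text": f"saw {observation}"})]


async def run_round(round_num: int, agents: list[ToyAgent], feed: list[str],
                    max_memory_rounds: int = 20) -> list[dict]:
    # Snapshot every agent's view BEFORE any LLM call is made.
    observations = {a.id: list(feed) for a in agents}
    records: list[dict] = []

    async def step_agent(agent: ToyAgent) -> None:
        obs = observations[agent.id]
        calls = await fake_llm_chat(agent.id, f"{len(obs)} posts")
        if not calls:
            # No tool call returned: log a synthetic do_nothing record.
            records.append({"round": round_num, "agent": agent.id,
                            "action_type": "do_nothing", "success": True})
            return
        for name, args in calls:
            feed.append(args["text"])  # state mutates as each call resolves
            records.append({"round": round_num, "agent": agent.id,
                            "action_type": name, "action_args": args,
                            "success": True})
            agent.memory.append(f"Round {round_num}: {name}({args})")
        # Keep only the most recent max_memory_rounds memory lines.
        agent.memory[:] = agent.memory[-max_memory_rounds:]

    await asyncio.gather(*(step_agent(a) for a in agents))
    return records
```

Running one round with two active agents and a feed holding a single seed post makes both agents report that they saw one post, even though the feed has grown by the time the second agent's call resolves.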
After the last round, run returns a SimulationResult. Its graph_data is built inline via
build_graph(list(config.entities), chat_log); no LLM relations are included at this stage,
because those are merged on-pod by the job runner after Engine.run returns (see
Graph build). Its raw_state carries the final agents, environments, and snapshots.
Action → environment routing
_find_env_for_action(action_name, environments, agent) walks the agent's environments in
order and returns the first one whose get_tools() set contains a tool named action_name.
If no environment claims the action, it falls back to the agent's first environment (or
"unknown"). Because the first matching environment wins, a tool name shared by two
environments always routes to whichever one comes first in the agent's list, so tool
names should be unique across environments.
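The routing rule can be sketched as follows. This is a simplified illustration, not the engine's implementation: ToyEnv is a hypothetical stand-in whose get_tools() returns bare tool names rather than Tool objects, and the sketch assumes "unknown" is returned only when the agent has no environments at all.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    # Hypothetical environment: get_tools() returns tool names only.
    name: str
    tool_names: tuple[str, ...]

    def get_tools(self) -> tuple[str, ...]:
        return self.tool_names


def find_env_for_action(action_name: str,
                        environments: dict[str, ToyEnv],
                        agent_envs: list[str]) -> str:
    # Walk the agent's environments in order; the first match wins.
    for env_name in agent_envs:
        env = environments.get(env_name)
        if env is not None and action_name in env.get_tools():
            return env_name
    # Fallback: the agent's first environment.
    # ASSUMPTION: "unknown" is used when the agent has no environments.
    return agent_envs[0] if agent_envs else "unknown"
```

With a social environment exposing post and like, and a market environment exposing buy and like, an agent listed as [social, market] has like routed to social, which is the collision the paragraph above warns about.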
Concurrency model
There is exactly one semaphore, sized by config.concurrency (defaulting to
EngineConfig.concurrency = 32). All agents for a round are dispatched at once via
asyncio.gather, but at most config.concurrency of them hold the semaphore, and therefore
an in-flight LLM request, at any moment. Observations are computed up front for the whole
round, so an agent never sees another agent's same-round action; cross-round visibility is
what drives the dynamics. Environment state mutations from execute_action happen as each
agent's tool calls resolve, but because the feed each agent saw was snapshotted before the
gather, the round is effectively simultaneous from each agent's perspective.
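The gather-plus-semaphore pattern can be checked with a small standalone experiment. In this sketch, run_gated is a hypothetical helper and asyncio.sleep stands in for the LLM request; the function dispatches every step at once and reports the highest number of steps that were inside the semaphore at the same time.

```python
import asyncio


async def run_gated(n_agents: int, concurrency: int) -> int:
    # One semaphore shared by every agent step in the round.
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def step_agent(i: int) -> None:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the LLM request
            in_flight -= 1

    # All steps are scheduled together; the semaphore does the gating.
    await asyncio.gather(*(step_agent(i) for i in range(n_agents)))
    return peak
```

Calling asyncio.run(run_gated(10, 3)) returns the peak number of simultaneous holders, which cannot exceed the semaphore size however many agents are dispatched.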
The cross-environment bridge
simswarm/bridge.py decouples environments from each other. Each environment publishes
typed Events (viral_post, price_move, policy_change, metric_change); the Bridge
collects them and, in the next round, renders a per-agent digest of events whose source
is an environment the agent is not directly in (get_digest filters out same-source events
so an agent isn't told twice about its own platform). _format_event renders human-readable
one-liners, e.g. [Social] Trending: "..." by <author> or [Market] <q> moved up to 63%.
Scheduled events are injected with source="scheduled".
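The digest filter can be sketched like this. Only the method names receive_events, get_digest, and clear come from the description above; the Event fields (type, source, text), the ToyBridge class, and the signature of get_digest are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # ASSUMPTION: field names are hypothetical.
    type: str    # e.g. "viral_post", "price_move"
    source: str  # name of the publishing environment, or "scheduled"
    text: str    # already-rendered one-line description


class ToyBridge:
    def __init__(self) -> None:
        self.pending: list[Event] = []

    def receive_events(self, events: list[Event]) -> None:
        self.pending.extend(events)

    def get_digest(self, agent_envs: list[str]) -> list[str]:
        # Drop events whose source is an environment the agent is already in.
        return [e.text for e in self.pending if e.source not in agent_envs]

    def clear(self) -> None:
        self.pending.clear()
```

Under this filter, an event with source="scheduled" reaches every agent, since no agent's environment list contains that name.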
The output adapter
simswarm/adapter.py is the contract bridge to the SaaS worker. adapt_chat_log and
adapt_graph_data serialize the dataclasses to the exact {...} / {nodes, edges, metadata} shapes the frontend consumes (agent_id stays a string). adapt_structured
assembles the final results dict by merging an LLM brief/verdict/findings with the
deterministic signals from build_story_signals(...) — see Story signals.
FINDING_COLORS supplies fallback accent colors when the LLM omits them.
Sweeps
Engine.run_sweep(sweep, on_progress=None) expands a ScenarioSweep into configs via
generate_sweep_configs and runs them sequentially, returning
list[tuple[key, SimulationResult]]. See Sweeps.
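As a rough illustration of the expand-then-run-sequentially pattern, the sketch below assumes a sweep is a base set of variables plus a grid of alternative values per variable, expanded as a Cartesian product. That structure, the key format, and the names expand_grid and run_sweep_sequentially are all hypothetical; the actual ScenarioSweep and generate_sweep_configs may differ.

```python
import itertools
from typing import Any, Awaitable, Callable


def expand_grid(base: dict[str, Any],
                grid: dict[str, list[Any]]) -> list[tuple[str, dict[str, Any]]]:
    # One config per combination of grid values, keyed by its overrides.
    names = sorted(grid)
    configs = []
    for values in itertools.product(*(grid[n] for n in names)):
        overrides = dict(zip(names, values))
        key = ",".join(f"{n}={v}" for n, v in overrides.items())
        configs.append((key, {**base, **overrides}))
    return configs


async def run_sweep_sequentially(
    base: dict[str, Any],
    grid: dict[str, list[Any]],
    run_one: Callable[[dict[str, Any]], Awaitable[Any]],
) -> list[tuple[str, Any]]:
    results = []
    for key, config in expand_grid(base, grid):
        # Each run is awaited before the next starts: no overlap between runs.
        results.append((key, await run_one(config)))
    return results
```

Here run_one plays the role of Engine.run, and the returned list mirrors the list of (key, result) pairs described above.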