Simulation Architecture¶

This page explains the design decisions behind idfkit's simulation module and why certain approaches were chosen.

Subprocess Execution¶

idfkit runs EnergyPlus as a subprocess rather than linking to its libraries directly. This approach has several benefits:

Isolation — Each simulation runs in its own directory with clean state
Compatibility — Works with any EnergyPlus version (8.9+)
Robustness — Crashes in EnergyPlus don't crash your Python process
Simplicity — No C++ bindings or version-specific compilation needed

The simulate() function:

Creates an isolated temporary directory
Copies the model (as IDF) and weather file
Injects Output:SQLite if not present
Invokes the EnergyPlus executable
Returns a SimulationResult with access to all outputs

from idfkit.simulation import simulate

result = simulate(model, "weather.epw", design_day=True)
print(f"Outputs in: {result.run_dir}")

SQLite Over ESO¶

idfkit's result parsers focus on the SQLite output database rather than the traditional ESO/MTR text files. The SQLite format:

Contains all simulation data in one queryable file
Provides structured access to time-series and tabular data
Is faster to parse than text formats
Includes metadata (environments, variables, units)

The module automatically ensures Output:SQLite is present in your model, so you don't need to add it manually.

# All time-series data is accessible via SQL queries
ts = result.sql.get_timeseries(
    variable_name="Zone Mean Air Temperature",
    key_value="ZONE 1",
)

# Tabular reports (normally in HTML) are also in SQLite
tables = result.sql.get_tabular_data("AnnualBuildingUtilityPerformanceSummary")

What About ESO/HTML?¶

The following output formats are not parsed because SQLite provides the same data more reliably:

Format	Alternative
ESO/MTR (time-series)	`result.sql.get_timeseries()`
HTML (tabular reports)	`result.sql.get_tabular_data()`
EIO (metadata)	SQLite metadata tables

If you have a specific need for these formats, please open an issue.

Lazy Loading¶

SimulationResult uses lazy loading — output files are only parsed when you access them:

result = simulate(model, weather)  # Fast: just runs EnergyPlus

# These are lazy — parsed on first access:
result.errors  # Parses ERR file
result.sql  # Opens SQLite database
result.variables  # Parses RDD file

This keeps memory usage low and startup fast, especially for batch simulations where you might only need specific outputs.

Model Immutability¶

The simulate() function copies your model before simulation:

result = simulate(model, weather)

# model is unchanged — Output:SQLite was added to a copy
assert "Output:SQLite" not in model

This ensures:

Your original model isn't mutated
Multiple simulations can run concurrently with the same base model
No unexpected side effects

EnergyPlus Discovery¶

idfkit auto-discovers EnergyPlus installations using a priority chain:

Explicit path — Pass energyplus_dir to simulate() or find_energyplus()
Environment variable — Set ENERGYPLUS_DIR
System PATH — Looks for energyplus executable
Platform defaults:
- macOS: /Applications/EnergyPlus-*/
- Linux: /usr/local/EnergyPlus-*/
- Windows: C:\EnergyPlusV*/

When multiple versions are found in the default directories, the most recent version is selected.

from idfkit.simulation import find_energyplus

config = find_energyplus()
print(f"Version: {config.version}")
print(f"Path: {config.executable}")

Concurrent Execution¶

For parametric studies, simulate_batch() runs simulations in parallel using a thread pool:

from idfkit.simulation import simulate_batch, SimulationJob

jobs = [
    SimulationJob(model=variant1, weather="weather.epw", label="case-1"),
    SimulationJob(model=variant2, weather="weather.epw", label="case-2"),
]

batch = simulate_batch(jobs, max_workers=4)

Each simulation runs in its own subprocess and directory, so there are no conflicts between concurrent runs.

Async Execution¶

The async simulation API (async_simulate, async_simulate_batch, async_simulate_batch_stream) provides non-blocking counterparts to the sync API using Python's asyncio module.

Why Async?¶

The sync API blocks the calling thread during subprocess.run(). This is fine for scripts but problematic when:

Running inside an async web server (FastAPI, aiohttp)
Mixing simulations with other async I/O (network, database)
Wanting streaming progress without callbacks

How It Works¶

The async runner replaces subprocess.run() with asyncio.create_subprocess_exec(). All preparation steps (model copy, directory setup, cache lookup) are synchronous and fast — only the EnergyPlus subprocess execution is truly async.

Preprocessing (ExpandObjects, Slab, Basement) uses subprocess.run() internally. Rather than rewriting the entire preprocessor stack, these are delegated to a thread via asyncio.to_thread() so they don't block the event loop.

Concurrency Model¶

API	Concurrency mechanism
`simulate_batch()`	`ThreadPoolExecutor` with `max_workers`
`async_simulate_batch()`	`asyncio.Semaphore` with `max_concurrent`

Both achieve the same effect: limiting the number of concurrent EnergyPlus subprocesses to avoid overwhelming the system.

Streaming¶

async_simulate_batch_stream() uses an asyncio.Queue to decouple producer tasks from the consumer's async for loop. Events arrive in completion order. Breaking out of the loop cancels remaining tasks.

from idfkit.simulation import async_simulate_batch_stream

async for event in async_simulate_batch_stream(jobs, max_concurrent=4):
    print(f"[{event.completed}/{event.total}] {event.label}")