
Weather data

idfkit.weather provides everything you need to attach realistic weather to a model: a bundled index of ~55,000 climate.onebuilding.org TMYx datasets across ~17,300 unique physical stations, an EPW/DDY downloader with local caching, ASHRAE design-day injection, and a Nominatim-backed geocoder.

When to use

  • You need an .epw weather file for a simulation.
  • You're picking the nearest weather station to a building site (by name, address, or coordinates).
  • You need to inject sizing design days into a model so EnergyPlus can autosize HVAC.
  • You're building a batch over multiple climates.

Quick start

from idfkit.weather import StationIndex, WeatherDownloader

index = StationIndex.load()  # local, instant
results = index.search("chicago ohare")
station = results[0].station

downloader = WeatherDownloader()
files = downloader.download(station)
print(files.epw, files.ddy)

Core API

from idfkit.weather import (
    StationIndex,  # bundled station index
    WeatherStation,  # a single station
    WeatherDownloader,  # EPW/DDY fetcher with cache
    geocode,  # address → (lat, lon)
    detect_location,  # heuristic IP-based location
    DesignDayManager,  # DDY parser
    DesignDayType,  # ASHRAE 90.1 design conditions
    apply_ashrae_sizing,  # one-call ASHRAE design day injection
)

Searching for a station

By name (fuzzy)

index = StationIndex.load()
hits = index.search("chicago ohare")
for hit in hits[:3]:
    print(hit.score, hit.station.display_name)

search is whitespace-tokenised and case-insensitive. Use it for "I know roughly what the station is called."

By address (geocode + nearest)

from idfkit.weather import StationIndex, geocode

index = StationIndex.load()
lat, lon = geocode("350 Fifth Avenue, New York, NY")
hits = index.nearest(lat, lon)
for hit in hits[:3]:
    print(f"{hit.station.display_name}: {hit.distance_km:.0f} km")

# Or one-liner:
results = index.nearest(*geocode("350 Fifth Avenue, New York, NY"))

geocode uses Nominatim (OpenStreetMap) — no API key, but rate-limited to 1 req/second. For batch geocoding, sleep between calls or pre-compute.
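One way to combine both suggestions (sleeping and pre-computing) is a small wrapper that caches repeated addresses and spaces out new lookups. This is a minimal sketch, not part of idfkit: make_polite_geocoder is a hypothetical helper, and the demo uses a stand-in geocoder with a fixed fake clock so it runs offline. In real use you would pass idfkit.weather.geocode as geocode_fn.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Coords = Tuple[float, float]


def make_polite_geocoder(
    geocode_fn: Callable[[str], Coords],
    min_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[str], Coords]:
    """Wrap a geocoder: cache repeated addresses, space out new lookups."""
    cache: Dict[str, Coords] = {}
    last_call: List[Optional[float]] = [None]  # time of the previous real lookup

    def polite(address: str) -> Coords:
        # Normalise case and whitespace so trivially different spellings share a key.
        key = " ".join(address.lower().split())
        if key in cache:
            return cache[key]
        if last_call[0] is not None:
            wait = min_interval_s - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        result = geocode_fn(address)
        last_call[0] = clock()
        cache[key] = result
        return result

    return polite


# Offline demo: a stand-in geocoder (hypothetical) that records its calls.
lookups: List[str] = []


def fake_geocode(address: str) -> Coords:
    lookups.append(address)
    return (40.748, -73.986)


slept: List[float] = []
polite = make_polite_geocoder(fake_geocode, clock=lambda: 0.0, sleep=slept.append)
first = polite("350 Fifth Avenue, New York, NY")
again = polite("350 fifth avenue,  new york, ny")  # cache hit, no lookup
other = polite("1 Main St")  # new address, so the wrapper waits first
```

Injecting clock and sleep keeps the wrapper testable without real delays; the defaults use time.monotonic and time.sleep.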

By coordinates

hits = index.nearest(41.978, -87.904, limit=10)

By distance

# All stations within 200 km of a point
hits = index.nearest(41.0, -73.5, max_distance_km=200)
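To make the limit and max_distance_km arguments concrete, here is a minimal sketch of a nearest-station query built on the haversine great-circle distance. The source does not say how idfkit computes distances internally, so treat this as an illustration of the idea only. The function names and the (name, lat, lon) tuples are hypothetical stand-ins, not the WeatherStation API.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(stations, lat, lon, limit=5, max_distance_km=None):
    """Rank (name, lat, lon) tuples by distance; return (distance_km, name) pairs."""
    ranked = sorted(
        (haversine_km(lat, lon, s_lat, s_lon), name) for name, s_lat, s_lon in stations
    )
    if max_distance_km is not None:
        ranked = [(d, n) for d, n in ranked if d <= max_distance_km]
    return ranked[:limit]


# Illustrative points only, not real stations.
demo_stations = [("B", 0.0, 3.0), ("A", 0.0, 1.0), ("C", 10.0, 10.0)]
within_200 = nearest(demo_stations, 0.0, 0.0, max_distance_km=200)
top_two = nearest(demo_stations, 0.0, 0.0, limit=2)
```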

Inspecting a station

station = hits[0].station
station.display_name  # "Chicago O'Hare AP, IL, USA"
station.country  # "USA"
station.state  # "IL"
station.wmo  # "725300" (string, preserves leading zeros)
station.latitude, station.longitude
station.elevation  # metres above sea level
station.timezone  # hours offset from GMT, e.g. -6.0
station.source  # dataset source, e.g. "TMYx.2009-2023"
station.url  # download URL for the ZIP (EPW + DDY)

Multiple WeatherStation entries can share the same wmo — each TMYx year-range variant is a separate entry.
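If you want a single entry per WMO ID, one option is to group entries by wmo and keep the variant with the most recent year range. The sketch below assumes the source string ends in a start-end year pair, as in the "TMYx.2009-2023" example above; entries without a year range sort last. StationStub is a hypothetical stand-in carrying only the two attributes used here, and the other source strings are illustrative.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StationStub:
    """Hypothetical stand-in with the two WeatherStation attributes used here."""

    wmo: str
    source: str


def end_year(source):
    """Last year of a 'TMYx.2009-2023'-style source, or -1 if there is no year range."""
    match = re.search(r"(\d{4})-(\d{4})$", source)
    return int(match.group(2)) if match else -1


def newest_per_wmo(stations):
    """Map each WMO ID to the entry whose source has the latest end year."""
    groups = defaultdict(list)
    for station in stations:
        groups[station.wmo].append(station)
    return {
        wmo: max(group, key=lambda s: end_year(s.source))
        for wmo, group in groups.items()
    }


demo = [
    StationStub("725300", "TMYx"),
    StationStub("725300", "TMYx.2007-2021"),
    StationStub("725300", "TMYx.2009-2023"),
    StationStub("000001", "TMYx"),
]
picked = newest_per_wmo(demo)
```

With real data you would pass index.stations instead of the demo list.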

Downloading EPW + DDY

from pathlib import Path

from idfkit.weather import WeatherDownloader

downloader = WeatherDownloader(cache_dir=Path("~/.cache/idfkit/weather").expanduser())
files = downloader.download(station)  # downloads if absent, returns cached path otherwise
files.epw  # Path
files.ddy
files.station

WeatherDownloader is idempotent — repeated download() calls hit the cache. To force re-download, pass force=True.

The cache directory defaults to $XDG_CACHE_HOME/idfkit/weather on Linux, ~/Library/Caches/idfkit/weather on macOS, and %LOCALAPPDATA%\idfkit\weather on Windows.
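The per-platform defaults can be sketched as a small resolver, which is handy if you want to locate or clear the cache from your own tooling. This is an illustration, not idfkit's implementation: default_cache_dir is a hypothetical helper. The ~/.cache fallback follows the XDG specification for an unset XDG_CACHE_HOME, and the Windows fallback to AppData\Local is an assumption.

```python
import os
import sys
from pathlib import Path


def default_cache_dir(platform=sys.platform, env=None, home=None):
    """Resolve the weather cache directory following the defaults described above."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform == "darwin":
        base = home / "Library" / "Caches"
    elif platform.startswith("win"):
        local = env.get("LOCALAPPDATA")
        # Assumed fallback when LOCALAPPDATA is unset.
        base = Path(local) if local else home / "AppData" / "Local"
    else:
        xdg = env.get("XDG_CACHE_HOME")
        # XDG spec: fall back to ~/.cache when the variable is unset or empty.
        base = Path(xdg) if xdg else home / ".cache"
    return base / "idfkit" / "weather"


linux_xdg = default_cache_dir("linux", {"XDG_CACHE_HOME": "/tmp/c"}, "/home/u")
linux_plain = default_cache_dir("linux", {}, "/home/u")
mac = default_cache_dir("darwin", {}, "/Users/u")
```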

Design days

EnergyPlus autosizes HVAC equipment from "design day" conditions: near-extreme days taken from long-term climate statistics (for example, the 99.6% heating and 0.4% cooling conditions), not typical ones. They live in DDY files; apply_ashrae_sizing injects them into your model in one call:

from idfkit import load_idf
from idfkit.weather import StationIndex, apply_ashrae_sizing

doc = load_idf("building.idf")
station = StationIndex.load().search("chicago ohare")[0].station
added = apply_ashrae_sizing(doc, station, standard="90.1")
print(f"Added {len(added)} design days")

standard="90.1" adds Heating 99.6% + Cooling 1% DB + Cooling 1% WB; the default standard="general" adds Heating 99.6% + Cooling 0.4% DB. Those are the only two presets.

Lower-level access — download a station's DDY, inspect it, and inject selected percentiles into a model:

from idfkit.weather import DesignDayManager, DesignDayType

ddm = DesignDayManager.from_station(station)
heating = ddm.get(DesignDayType.HEATING_99_6)  # IDFObject | None
cooling = ddm.get(DesignDayType.COOLING_DB_0_4)
added = ddm.apply_to_model(  # injects per the preset args
    doc,
    heating="99.6%",
    cooling="0.4%",
    include_wet_bulb=True,
)

DesignDayType members include HEATING_99_6, HEATING_99, COOLING_DB_0_4/_1/_2, COOLING_WB_*, COOLING_ENTH_*, DEHUMID_*, HUMIDIFICATION_99_6/_99, HTG_WIND_*, WIND_*. Use ddm.summary() to print everything the DDY classifies. NoDesignDaysError is raised by ddm.raise_if_empty() when a station's DDY contains no usable design days (rare, but possible for incomplete TMYx entries).

Detecting the user's location

For interactive tooling that wants a sensible default:

from idfkit.weather import StationIndex, detect_location

lat, lon = detect_location()  # IP-based geolocation, best-effort
nearest = StationIndex.load().nearest(lat, lon)[0]
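Because detection is best-effort, interactive tools usually want a fallback. The source does not document how detect_location signals failure, so this minimal sketch guards against both an exception and a None result, and also rejects out-of-range coordinates. location_or_default is a hypothetical helper, and the detectors in the demo are stand-ins; in real use you would pass idfkit.weather.detect_location.

```python
from typing import Callable, Optional, Tuple

Coords = Tuple[float, float]


def location_or_default(
    detect_fn: Callable[[], Optional[Coords]],
    default: Coords,
) -> Coords:
    """Return detected coordinates, or the default if detection fails."""
    try:
        found = detect_fn()
    except Exception:
        # Deliberately broad: any failure (offline, blocked, timeout) uses the default.
        return default
    if found is None:
        return default
    lat, lon = found
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return default
    return (lat, lon)


# Stand-in detectors (hypothetical) covering the three outcomes.
def _works() -> Coords:
    return (41.978, -87.904)


def _offline() -> Coords:
    raise OSError("network unreachable")


def _unknown() -> Optional[Coords]:
    return None


FALLBACK = (40.748, -73.986)
ok = location_or_default(_works, FALLBACK)
failed = location_or_default(_offline, FALLBACK)
missing = location_or_default(_unknown, FALLBACK)
```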

Refreshing the index

The bundled index ages well — TMYx datasets don't change often. To refresh from upstream:

from idfkit.weather import StationIndex

index = StationIndex.load()
if index.check_for_updates():
    index = StationIndex.refresh()  # re-download + rebuild

refresh() requires the idfkit[weather] extra, which installs openpyxl (the upstream TMYx catalogue is published as .xlsx).

Common mistakes

geocoding in a tight loop

for addr in addresses:
    lat, lon = geocode(addr)  # Nominatim rate-limits, will start failing

sleep, or pre-compute

import time

coords = []
for addr in addresses:
    coords.append(geocode(addr))
    time.sleep(1.0)

simulating without design days

doc = load_idf("building.idf")
simulate(doc, "weather.epw")  # autosizing fails: no design days in the model

apply design days first

station = StationIndex.load().search("chicago ohare")[0].station
apply_ashrae_sizing(doc, station, standard="90.1")
simulate(doc, "weather.epw")

assuming display_name is unique

station_by_name = {s.display_name: s for s in index.stations}
# Collisions when multiple TMYx year-ranges exist for the same WMO ID.

key on (wmo, source) if you need uniqueness

station_by_key = {(s.wmo, s.source): s for s in index.stations}