Download API

Weather file downloading and caching.

WeatherDownloader

`idfkit.weather.download.WeatherDownloader`

Download and cache weather files from climate.onebuilding.org.

Downloaded ZIP archives are extracted and cached locally so that subsequent requests for the same station and dataset are served from disk without a network call.

Examples:

from idfkit.weather import StationIndex, WeatherDownloader

station = StationIndex.load().search("chicago ohare")[0].station
downloader = WeatherDownloader()
files = downloader.download(station)
print(files.epw)

Parameters:

Name	Type	Description	Default
`cache_dir`	`Path \| None`	Override the default cache directory.	`None`
`max_age`	`timedelta \| float \| None`	Maximum age of cached files before re-downloading. Can be a timedelta or a number of seconds. If `None` (default), cached files never expire.	`None`
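
The two accepted forms of `max_age` can be illustrated without the library. The following is a minimal standard-library sketch of the normalisation and staleness check described above; `to_seconds` and `is_stale` are illustrative names and are not part of idfkit.

```python
from __future__ import annotations

import os
import tempfile
import time
from datetime import timedelta
from pathlib import Path


def to_seconds(max_age: timedelta | float | None) -> float | None:
    """Normalise max_age to seconds; None means 'never expires'."""
    if max_age is None:
        return None
    if isinstance(max_age, timedelta):
        return max_age.total_seconds()
    return float(max_age)


def is_stale(path: Path, max_age_seconds: float | None) -> bool:
    """Return True when path is missing or older than the allowed age."""
    if max_age_seconds is None:
        return False
    if not path.exists():
        return True
    return time.time() - path.stat().st_mtime > max_age_seconds


# Demonstration: a placeholder file whose mtime is pushed two hours back.
with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "station.zip"
    cached.write_bytes(b"")
    two_hours_ago = time.time() - 7200
    os.utime(cached, (two_hours_ago, two_hours_ago))

    stale_after_one_hour = is_stale(cached, to_seconds(timedelta(hours=1)))
    stale_after_one_day = is_stale(cached, to_seconds(86400))
    stale_without_limit = is_stale(cached, to_seconds(None))
```

With a one-hour limit the two-hour-old file counts as stale, with a one-day limit (given as plain seconds) it does not, and with no limit it never does.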

Note

Extracted `.ddy` files are rewritten in place to drop `SizingPeriod:DesignDay` objects whose numeric fields contain non-numeric placeholder tokens (e.g. `N`, `N/A`). These appear in some upstream archives when source data is unavailable and would otherwise cause EnergyPlus to fail with a type-constraint error.

Note

The cache has no size limit. For CI/CD environments with limited disk space, consider calling `clear_cache` periodically or setting a `max_age` to force re-downloads of stale files.
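
To make the effect of clearing the cache concrete, here is a small sketch that reproduces the same idea on a temporary directory: only the `files/` subdirectory is removed, and the cache root itself is left in place. The helper name `clear_files` and the directory names used in the demonstration are placeholders, not idfkit API.

```python
import shutil
import tempfile
from pathlib import Path


def clear_files(cache_dir: Path) -> None:
    """Remove the files/ subdirectory of cache_dir, if it exists."""
    files_dir = cache_dir / "files"
    if files_dir.exists():
        shutil.rmtree(files_dir)


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    # Placeholder layout: files/<wmo>/<stem>/<stem>.zip
    archive = cache_dir / "files" / "123456" / "example" / "example.zip"
    archive.parent.mkdir(parents=True)
    archive.write_bytes(b"")

    existed_before = archive.exists()
    clear_files(cache_dir)
    files_dir_exists_after = (cache_dir / "files").exists()
    cache_root_kept = cache_dir.exists()

    # Calling it again on an already empty cache is a no-op.
    clear_files(cache_dir)
```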

Source code in src/idfkit/weather/download.py

class WeatherDownloader:
    """Download and cache weather files from climate.onebuilding.org.

    Downloaded ZIP archives are extracted and cached locally so that
    subsequent requests for the same station and dataset are served from
    disk without a network call.

    Examples:
        ```python
        from idfkit.weather import StationIndex, WeatherDownloader

        station = StationIndex.load().search("chicago ohare")[0].station
        downloader = WeatherDownloader()
        files = downloader.download(station)
        print(files.epw)
        ```

    Args:
        cache_dir: Override the default cache directory.
        max_age: Maximum age of cached files before re-downloading.
            Can be a [timedelta][datetime.timedelta] or a number of seconds.
            If ``None`` (default), cached files never expire.

    Note:
        Extracted ``.ddy`` files are rewritten in place to drop
        ``SizingPeriod:DesignDay`` objects whose numeric fields contain
        non-numeric placeholder tokens (e.g. ``N``, ``N/A``). These appear
        in some upstream archives when source data is unavailable and
        would otherwise cause EnergyPlus to fail with a type-constraint
        error.

    Note:
        The cache has no size limit. For CI/CD environments with limited disk
        space, consider using [clear_cache][idfkit.weather.download.WeatherDownloader.clear_cache] periodically or setting
        a ``max_age`` to force re-downloads of stale files.
    """

    __slots__ = ("_cache_dir", "_max_age_seconds")

    def __init__(
        self,
        cache_dir: Path | None = None,
        max_age: timedelta | float | None = None,
    ) -> None:
        self._cache_dir = cache_dir or default_cache_dir()
        if max_age is None:
            self._max_age_seconds: float | None = None
        elif isinstance(max_age, timedelta):
            self._max_age_seconds = max_age.total_seconds()
        else:
            self._max_age_seconds = float(max_age)

    def _is_stale(self, path: Path) -> bool:
        """Check if a cached file is older than max_age."""
        if self._max_age_seconds is None:
            return False
        if not path.exists():
            return True
        age = time.time() - path.stat().st_mtime
        return age > self._max_age_seconds

    @overload
    def download(self, station: WeatherStation) -> WeatherFiles: ...
    @overload
    def download(self, station: WeatherStation, *, only: None) -> WeatherFiles: ...
    @overload
    def download(self, station: WeatherStation, *, only: Iterable[str]) -> PartialWeatherFiles: ...
    def download(
        self,
        station: WeatherStation,
        *,
        only: Iterable[str] | None = None,
    ) -> WeatherFiles | PartialWeatherFiles:
        """Download and extract weather files for *station*.

        If the files are already cached and not stale, no network request is made.

        Args:
            station: The weather station to download files for.
            only: If given, extract only members whose suffix matches one of
                these values (e.g. ``{".epw"}`` or ``[".epw", ".ddy"]``).
                Each entry is normalised to a lowercase suffix with a leading
                dot (``"epw"`` and ``".EPW"`` both match ``.epw`` members).
                When ``None`` (default), every member of the archive is
                extracted and the result is required to contain a ``.epw``
                and a ``.ddy``.

        Returns:
            [WeatherFiles][idfkit.weather.download.WeatherFiles] for a full
            extraction, or
            [PartialWeatherFiles][idfkit.weather.download.PartialWeatherFiles]
            when ``only=`` is set.

        Raises:
            RuntimeError: If the download or extraction fails, or if a full
                extraction is missing a required ``.epw`` or ``.ddy`` file.
        """
        # Derive a cache subdirectory from the ZIP filename
        zip_filename = station.url.rsplit("/", maxsplit=1)[-1]
        stem = zip_filename.removesuffix(".zip")
        station_dir = self._cache_dir / "files" / str(station.wmo) / stem
        zip_path = station_dir / zip_filename

        # Download if not cached or if stale
        if not zip_path.exists() or self._is_stale(zip_path):
            station_dir.mkdir(parents=True, exist_ok=True)
            logger.info("Downloading weather data for %s (WMO %s)", station.display_name, station.wmo)
            try:
                req = Request(station.url, headers={"User-Agent": _USER_AGENT})  # noqa: S310
                with urlopen(req, timeout=120) as resp:  # noqa: S310
                    zip_path.write_bytes(resp.read())
            except (HTTPError, URLError, TimeoutError, OSError) as exc:
                msg = f"Failed to download weather data from {station.url}: {exc}"
                raise RuntimeError(msg) from exc
        else:
            logger.debug("Cache hit for station %s (WMO %s)", station.display_name, station.wmo)

        only_set = _normalise_suffixes(only)
        self._ensure_extracted(zip_path, station_dir, only_set)

        epw_path = self._find_file(station_dir, ".epw")
        ddy_path = self._find_file(station_dir, ".ddy")
        stat_path = self._find_file(station_dir, ".stat")

        if ddy_path is not None:
            sanitize_ddy_file(ddy_path)

        if only_set is not None:
            return PartialWeatherFiles(
                epw=epw_path,
                ddy=ddy_path,
                stat=stat_path,
                zip_path=zip_path,
                station=station,
            )

        # Full-extract path: EPW and DDY are required.
        if epw_path is None:
            msg = f"No .epw file found in downloaded archive for {station.display_name}"
            raise RuntimeError(msg)
        if ddy_path is None:
            msg = f"No .ddy file found in downloaded archive for {station.display_name}"
            raise RuntimeError(msg)
        return WeatherFiles(
            epw=epw_path,
            ddy=ddy_path,
            stat=stat_path,
            zip_path=zip_path,
            station=station,
        )

    @staticmethod
    def _ensure_extracted(
        zip_path: Path,
        station_dir: Path,
        only: frozenset[str] | None,
    ) -> None:
        """Extract members from *zip_path* into *station_dir*.

        If *only* is ``None``, every member is extracted (matching the
        historical ``extractall`` behaviour). Otherwise, only members whose
        lowercased suffix is in *only* are extracted. A member is skipped if
        an up-to-date copy already exists on disk (mtime ≥ ZIP mtime).
        """
        try:
            with zipfile.ZipFile(zip_path) as zf:
                # Compare against the ZIP's mtime rather than ``_is_stale`` —
                # ``zipfile`` preserves archive-internal timestamps, so the
                # extracted file's mtime can be arbitrarily old.
                zip_mtime = zip_path.stat().st_mtime
                for member in zf.namelist():
                    suffix = Path(member).suffix.lower()
                    if only is not None and suffix not in only:
                        continue
                    target = station_dir / Path(member).name
                    if target.exists() and target.stat().st_mtime >= zip_mtime:
                        continue
                    zf.extract(member, station_dir)
        except zipfile.BadZipFile as exc:
            msg = f"Downloaded file is not a valid ZIP archive: {zip_path}"
            raise RuntimeError(msg) from exc

    def get_epw(self, station: WeatherStation) -> Path:
        """Download and return the path to the EPW file.

        Extracts the full archive. To skip extraction of unwanted members,
        call ``download(station, only={".epw"}).epw`` directly.
        """
        return self.download(station).epw

    def get_ddy(self, station: WeatherStation) -> Path:
        """Download and return the path to the DDY file.

        Extracts the full archive. To skip extraction of unwanted members,
        call ``download(station, only={".ddy"}).ddy`` directly.
        """
        return self.download(station).ddy

    def _resolve_filename(self, filename: str, index: StationIndex | None) -> WeatherStation:
        """Resolve an EPW filename to a station, raising on failure."""
        if index is None:
            from .index import StationIndex as _StationIndex

            index = _StationIndex.load()
        stations = index.get_by_filename(filename)
        if not stations:
            msg = f"No weather station found for filename: {filename!r}"
            raise ValueError(msg)
        return stations[0]

    def get_epw_by_filename(
        self,
        filename: str,
        *,
        index: StationIndex | None = None,
    ) -> Path:
        """Download and return the EPW path for an EPW filename.

        Resolves the canonical EPW filename to a station via
        [StationIndex.get_by_filename][idfkit.weather.index.StationIndex.get_by_filename]
        and downloads the corresponding weather files.

        Args:
            filename: EPW filename or stem (with or without extension).
            index: A pre-loaded station index.  If ``None``, loads the
                default index via
                [StationIndex.load][idfkit.weather.index.StationIndex.load].

        Raises:
            ValueError: If the filename does not match any station.
        """
        return self.get_epw(self._resolve_filename(filename, index))

    def get_ddy_by_filename(
        self,
        filename: str,
        *,
        index: StationIndex | None = None,
    ) -> Path:
        """Download and return the DDY path for an EPW filename.

        Same as
        [get_epw_by_filename][idfkit.weather.download.WeatherDownloader.get_epw_by_filename]
        but returns the DDY file path.

        Args:
            filename: EPW filename or stem (with or without extension).
            index: A pre-loaded station index.

        Raises:
            ValueError: If the filename does not match any station.
        """
        return self.get_ddy(self._resolve_filename(filename, index))

    def clear_cache(self) -> None:
        """Remove all cached weather files.

        This removes the entire ``files/`` subdirectory within the cache,
        which contains all downloaded ZIP archives and extracted files.
        """
        files_dir = self._cache_dir / "files"
        if files_dir.exists():
            shutil.rmtree(files_dir)

    @staticmethod
    def _find_file(directory: Path, suffix: str) -> Path | None:
        """Find the first file with the given suffix in *directory*."""
        for p in directory.iterdir():
            if p.suffix.lower() == suffix.lower() and p.is_file():
                return p
        return None

`download(station, *, only=None)`

download(station: WeatherStation) -> WeatherFiles

download(
    station: WeatherStation, *, only: None
) -> WeatherFiles

download(
    station: WeatherStation, *, only: Iterable[str]
) -> PartialWeatherFiles

Download and extract weather files for `station`.

If the files are already cached and not stale, no network request is made.
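
Whether a request is served from disk depends on where the archive is expected to live. The sketch below restates the path derivation from the source listing as a standalone function (Python 3.9+ for `str.removesuffix`). The function name `cache_paths`, the URL, and the WMO number are made-up placeholders.

```python
from pathlib import Path


def cache_paths(cache_dir: Path, url: str, wmo: str) -> tuple[Path, Path]:
    """Return (station_dir, zip_path) for an archive URL."""
    # The last URL segment is the ZIP filename; its stem names the folder.
    zip_filename = url.rsplit("/", maxsplit=1)[-1]
    stem = zip_filename.removesuffix(".zip")
    station_dir = cache_dir / "files" / str(wmo) / stem
    return station_dir, station_dir / zip_filename


station_dir, zip_path = cache_paths(
    Path("cache"),
    "https://example.invalid/region/Example_Station.123456_TMYx.zip",
    "123456",
)
```

Under these assumptions the archive would be stored at `cache/files/123456/Example_Station.123456_TMYx/Example_Station.123456_TMYx.zip`, and its extracted members would sit next to it in the same directory.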

Parameters:

Name	Type	Description	Default
`station`	`WeatherStation`	The weather station to download files for.	required
`only`	`Iterable[str] \| None`	If given, extract only members whose suffix matches one of these values (e.g. `{".epw"}` or `[".epw", ".ddy"]`). Each entry is normalised to a lowercase suffix with a leading dot (`"epw"` and `".EPW"` both match `.epw` members). When `None` (default), every member of the archive is extracted and the result is required to contain a `.epw` and a `.ddy`.	`None`
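
The suffix matching described for `only` can be sketched as follows. This is an illustration of the documented behaviour, not the library's own helper: `normalise_suffixes` and `select_members` are hypothetical names, and the archive is built in memory with placeholder member names.

```python
from __future__ import annotations

import io
import zipfile
from collections.abc import Iterable
from pathlib import Path


def normalise_suffixes(only: Iterable[str] | None) -> frozenset[str] | None:
    """Lowercase each entry and make sure it has exactly one leading dot."""
    if only is None:
        return None
    return frozenset("." + entry.lower().lstrip(".") for entry in only)


def select_members(names: Iterable[str], only: frozenset[str] | None) -> list[str]:
    """Keep members whose lowercased suffix is requested; None keeps all."""
    return [n for n in names if only is None or Path(n).suffix.lower() in only]


# Build a small in-memory archive with placeholder member names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("Example.epw", "Example.DDY", "Example.stat", "Example.rain"):
        zf.writestr(name, "")

with zipfile.ZipFile(buffer) as zf:
    wanted = select_members(zf.namelist(), normalise_suffixes(["epw", ".DDY"]))
    everything = select_members(zf.namelist(), normalise_suffixes(None))
```

Here `wanted` contains only the `.epw` and `.DDY` members, while `everything` contains all four. Note that passing a bare string such as `"epw"` instead of a collection would be iterated character by character in this sketch, so a set or list is the safer choice.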

Returns:

Type	Description
`WeatherFiles \| PartialWeatherFiles`	`WeatherFiles` for a full extraction, or `PartialWeatherFiles` when `only=` is set.

Raises:

Type	Description
`RuntimeError`	If the download or extraction fails, or if a full extraction is missing a required `.epw` or `.ddy` file.

Source code in src/idfkit/weather/download.py

def download(
    self,
    station: WeatherStation,
    *,
    only: Iterable[str] | None = None,
) -> WeatherFiles | PartialWeatherFiles:
    """Download and extract weather files for *station*.

    If the files are already cached and not stale, no network request is made.

    Args:
        station: The weather station to download files for.
        only: If given, extract only members whose suffix matches one of
            these values (e.g. ``{".epw"}`` or ``[".epw", ".ddy"]``).
            Each entry is normalised to a lowercase suffix with a leading
            dot (``"epw"`` and ``".EPW"`` both match ``.epw`` members).
            When ``None`` (default), every member of the archive is
            extracted and the result is required to contain a ``.epw``
            and a ``.ddy``.

    Returns:
        [WeatherFiles][idfkit.weather.download.WeatherFiles] for a full
        extraction, or
        [PartialWeatherFiles][idfkit.weather.download.PartialWeatherFiles]
        when ``only=`` is set.

    Raises:
        RuntimeError: If the download or extraction fails, or if a full
            extraction is missing a required ``.epw`` or ``.ddy`` file.
    """
    # Derive a cache subdirectory from the ZIP filename
    zip_filename = station.url.rsplit("/", maxsplit=1)[-1]
    stem = zip_filename.removesuffix(".zip")
    station_dir = self._cache_dir / "files" / str(station.wmo) / stem
    zip_path = station_dir / zip_filename

    # Download if not cached or if stale
    if not zip_path.exists() or self._is_stale(zip_path):
        station_dir.mkdir(parents=True, exist_ok=True)
        logger.info("Downloading weather data for %s (WMO %s)", station.display_name, station.wmo)
        try:
            req = Request(station.url, headers={"User-Agent": _USER_AGENT})  # noqa: S310
            with urlopen(req, timeout=120) as resp:  # noqa: S310
                zip_path.write_bytes(resp.read())
        except (HTTPError, URLError, TimeoutError, OSError) as exc:
            msg = f"Failed to download weather data from {station.url}: {exc}"
            raise RuntimeError(msg) from exc
    else:
        logger.debug("Cache hit for station %s (WMO %s)", station.display_name, station.wmo)

    only_set = _normalise_suffixes(only)
    self._ensure_extracted(zip_path, station_dir, only_set)

    epw_path = self._find_file(station_dir, ".epw")
    ddy_path = self._find_file(station_dir, ".ddy")
    stat_path = self._find_file(station_dir, ".stat")

    if ddy_path is not None:
        sanitize_ddy_file(ddy_path)

    if only_set is not None:
        return PartialWeatherFiles(
            epw=epw_path,
            ddy=ddy_path,
            stat=stat_path,
            zip_path=zip_path,
            station=station,
        )

    # Full-extract path: EPW and DDY are required.
    if epw_path is None:
        msg = f"No .epw file found in downloaded archive for {station.display_name}"
        raise RuntimeError(msg)
    if ddy_path is None:
        msg = f"No .ddy file found in downloaded archive for {station.display_name}"
        raise RuntimeError(msg)
    return WeatherFiles(
        epw=epw_path,
        ddy=ddy_path,
        stat=stat_path,
        zip_path=zip_path,
        station=station,
    )

WeatherFiles

`idfkit.weather.download.WeatherFiles` `dataclass`

Paths to a fully extracted weather bundle.

Returned by `WeatherDownloader.download(station)` (no `only=`). `epw` and `ddy` are guaranteed non-`None`: if either is missing from the archive, `download` raises `RuntimeError`.

Attributes:

Name	Type	Description
`epw`	`Path`	Path to the `.epw` file.
`ddy`	`Path`	Path to the `.ddy` file.
`stat`	`Path \| None`	Path to the `.stat` file, or `None` if not included.
`zip_path`	`Path`	Path to the original downloaded ZIP archive.
`station`	`WeatherStation`	The station this download corresponds to.

Source code in src/idfkit/weather/download.py

@dataclass(frozen=True)
class WeatherFiles:
    """Paths to a fully extracted weather bundle.

    Returned by ``WeatherDownloader.download(station)`` (no ``only=``).
    ``epw`` and ``ddy`` are guaranteed non-``None`` — a missing one raises
    during download.

    Attributes:
        epw: Path to the ``.epw`` file.
        ddy: Path to the ``.ddy`` file.
        stat: Path to the ``.stat`` file, or ``None`` if not included.
        zip_path: Path to the original downloaded ZIP archive.
        station: The station this download corresponds to.
    """

    epw: Path
    ddy: Path
    stat: Path | None
    zip_path: Path
    station: WeatherStation

`epw` `instance-attribute`

`ddy` `instance-attribute`

`stat` `instance-attribute`

`zip_path` `instance-attribute`

`station` `instance-attribute`

PartialWeatherFiles

`idfkit.weather.download.PartialWeatherFiles` `dataclass`

Paths to a selectively extracted weather bundle.

Returned by `WeatherDownloader.download(station, only=...)`. Any field whose suffix was not requested and not already cached on disk will be `None`.
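
Because any of the three path fields may be `None`, callers generally need to check before use. The sketch below shows one way to do that with a stand-in dataclass that has the same optional fields. `PartialBundle` and `available` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PartialBundle:
    """Stand-in with the same optional path fields (hypothetical name)."""

    epw: Path | None
    ddy: Path | None
    stat: Path | None


def available(bundle: PartialBundle) -> dict[str, Path]:
    """Map suffix to path for the fields that are actually present."""
    candidates = {".epw": bundle.epw, ".ddy": bundle.ddy, ".stat": bundle.stat}
    return {suffix: path for suffix, path in candidates.items() if path is not None}


# Only the EPW was requested, and nothing else was cached.
bundle = PartialBundle(epw=Path("Example.epw"), ddy=None, stat=None)
present = available(bundle)
```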

Attributes:

Name	Type	Description
`epw`	`Path \| None`	Path to the `.epw` file, or `None` if not extracted.
`ddy`	`Path \| None`	Path to the `.ddy` file, or `None` if not extracted.
`stat`	`Path \| None`	Path to the `.stat` file, or `None` if not extracted.
`zip_path`	`Path`	Path to the original downloaded ZIP archive.
`station`	`WeatherStation`	The station this download corresponds to.

Source code in src/idfkit/weather/download.py

@dataclass(frozen=True)
class PartialWeatherFiles:
    """Paths to a selectively extracted weather bundle.

    Returned by ``WeatherDownloader.download(station, only=...)``. Any field
    whose suffix was not requested *and* not already cached on disk will be
    ``None``.

    Attributes:
        epw: Path to the ``.epw`` file, or ``None`` if not extracted.
        ddy: Path to the ``.ddy`` file, or ``None`` if not extracted.
        stat: Path to the ``.stat`` file, or ``None`` if not extracted.
        zip_path: Path to the original downloaded ZIP archive.
        station: The station this download corresponds to.
    """

    epw: Path | None
    ddy: Path | None
    stat: Path | None
    zip_path: Path
    station: WeatherStation
