How far back can we store data? How big can the database get?
Current setup (Forecast API + Air Quality API)
| Source | Parameter | Max past | What you get |
|---|---|---|---|
| Open-Meteo Forecast | `past_days` | 92 days | Archived forecast (not observations) for the last ~3 months |
| Open-Meteo Air Quality | `past_days` | 92 days | Same; hourly AQ aggregated to daily in our app |
So with the current code:
- You can request at most 92 days in the past per call.
- Our default is `past_days=3` and `forecast_days=14`, so we fetch 3 days back and 14 days ahead.
- History grows by repeatedly calling `/metrics` or `/forecast`: each call upserts that window into the DB. So if you call every day, you keep sliding the window and you effectively keep about the last 92 days of data per location (older dates are never re-fetched with the Forecast API).
Bottom line (current): With the Forecast API alone, you can realistically keep up to 92 days of history per location. The DB can hold more rows, but you cannot fetch more past days from this API.
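As a minimal sketch of the limit described above, the following builds a Forecast API request URL and rejects a `past_days` value above 92. The helper name `forecast_url`, the chosen `daily` variables, and the example coordinates are assumptions for illustration, not the app's actual code.

```python
from urllib.parse import urlencode

FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
MAX_PAST_DAYS = 92  # the past_days limit stated in the table above


def forecast_url(latitude, longitude, past_days=3, forecast_days=14):
    """Build a Forecast API URL (hypothetical helper; defaults mirror the text)."""
    if not 0 <= past_days <= MAX_PAST_DAYS:
        raise ValueError(f"past_days must be between 0 and {MAX_PAST_DAYS}")
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "past_days": past_days,
        "forecast_days": forecast_days,
        # Assumed selection of daily variables; the app may request others.
        "daily": "temperature_2m_max,temperature_2m_min",
        "timezone": "auto",
    }
    return f"{FORECAST_URL}?{urlencode(params)}"


default_url = forecast_url(52.52, 13.41)
```

Calling `forecast_url(52.52, 13.41, past_days=93)` raises `ValueError`, which makes the 92-day ceiling explicit before any request is sent.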
How big can the database get?
- Schema: One row per `(location_id, date)` in `daily_metrics`.
- Rough size: 4 locations × 365 days ≈ 1,460 rows/year; 10 years ≈ 14,600 rows. Row size is small (a few hundred bytes). PostgreSQL handles this easily (millions of rows is fine).
- Practical limit: Disk and backup size, not “number of days”. So the database can be as big as you want as long as you have a data source that provides older dates (see below).
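The rough-size arithmetic above can be reproduced with a short calculation. The 300-byte row size is an assumed stand-in for "a few hundred bytes", and the estimate ignores leap days, indexes, and table overhead.

```python
ASSUMED_ROW_BYTES = 300  # assumption: stands in for "a few hundred bytes"


def estimate_rows(locations, years, rows_per_day=1):
    """Approximate row count, ignoring leap days."""
    return locations * 365 * years * rows_per_day


daily_rows = estimate_rows(4, 10)                    # one row per location per day
hourly_rows = estimate_rows(4, 10, rows_per_day=24)  # if an hourly table is kept too
approx_mb = daily_rows * ASSUMED_ROW_BYTES / 1_000_000

print(daily_rows, hourly_rows, round(approx_mb, 2))
```

Under these assumptions, ten years of daily data for four locations comes to a few megabytes, which is consistent with disk and backups being the practical limit rather than row count.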
Going further back: Historical Weather API
Open-Meteo has a Historical Weather API (archive) that goes back to 1940:
- Endpoint: `https://archive-api.open-meteo.com/v1/archive`
- Parameters: `latitude`, `longitude`, `start_date`, `end_date` (e.g. `2020-01-01` to `2024-12-31`), plus `daily=...` (and optionally `hourly=...`).
- Data: Reanalysis (ERA5, ERA5-Land, etc.): 1940–present at 0.1°–0.25°; from 2017 onward also 9 km resolution (IFS). Slight delay (e.g. a few days) for recent dates.
- Use case: Backfill years of daily (or hourly) data into `daily_metrics` (and optionally an hourly table) so you can compare weekdays, weekends, seasons, and later holidays over many years.
So:
- Forecast API: last 92 days only.
- Historical (Archive) API: 1940 to present (with a small recent delay).
- Database: Can store as many days as you feed it; size is not the limiting factor.
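To make the archive parameters concrete, here is a small sketch that assembles an archive request URL from a date range. The helper name `archive_url`, the example coordinates, and the chosen daily variables are assumptions; only the endpoint and parameter names come from the list above.

```python
from datetime import date
from urllib.parse import urlencode

ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def archive_url(latitude, longitude, start, end, daily_vars):
    """Build an archive URL for an inclusive date range (hypothetical helper)."""
    if start > end:
        raise ValueError("start must not be after end")
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start.isoformat(),  # YYYY-MM-DD
        "end_date": end.isoformat(),
        "daily": ",".join(daily_vars),
    }
    return f"{ARCHIVE_URL}?{urlencode(params)}"


example_url = archive_url(
    52.52, 13.41,
    date(2020, 1, 1), date(2024, 12, 31),
    ["temperature_2m_max", "temperature_2m_min"],  # assumed variable selection
)
```

Unlike the Forecast API's relative `past_days`, the archive takes absolute dates, so a backfill can target any window back to 1940.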
Summary
| Question | Answer |
|---|---|
| How many days back can we fetch with the current Forecast API? | 92 days (past_days max). |
| How big can our history be with the current setup? | Up to 92 days per location, unless you add a backfill. |
| How big can the database get? | As large as you need; Postgres and disk are the only limits. |
| Can we store years of history? | Yes, by adding a backfill that calls the Historical Weather API (/v1/archive) in chunks (e.g. by year or month) and upserts into daily_metrics (and optionally air quality if available for that API). |
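Chunking "by year" as mentioned in the last row could look like the sketch below. The generator name `yearly_chunks` is hypothetical, and the real backfill may split its range differently (for example by month).

```python
from datetime import date


def yearly_chunks(start, end):
    """Yield (chunk_start, chunk_end) pairs covering [start, end], split at year ends."""
    cursor = start
    while cursor <= end:
        chunk_end = min(date(cursor.year, 12, 31), end)
        yield cursor, chunk_end
        cursor = date(cursor.year + 1, 1, 1)  # first day of the next year


chunks = list(yearly_chunks(date(2022, 6, 15), date(2024, 3, 1)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "to", chunk_end.isoformat())
```

Each pair can then be passed as `start_date` and `end_date` to one archive request, keeping individual responses small and letting a failed chunk be retried on its own.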
Backfill and scheduled refresh (implemented)
- Backfill (1990–today): `POST /backfill` starts a background job that fetches historical daily weather from the Open-Meteo Archive API for all locations and upserts into `daily_metrics`. Optional query params: `from_date`, `to_date` (default: 1990-01-01 to today). The request returns immediately (202); the job runs in the background. Air quality is not available from the archive API, so only weather fields are backfilled for old dates.
- Scheduled refresh (4×/day): The backend runs a job every 6 hours that fetches current forecast + air quality for all locations (all countries) and upserts into the DB. So new data is stored regularly without manual “Load forecast data” clicks.
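One way a weather-only upsert can coexist with air quality values already stored is sketched below. It uses an in-memory SQLite database as a stand-in for PostgreSQL (the `ON CONFLICT ... DO UPDATE` form shown is accepted by both), and the column names `temp_max` and `pm2_5` are assumptions, not the actual `daily_metrics` schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real PostgreSQL database
conn.execute(
    """
    CREATE TABLE daily_metrics (
        location_id INTEGER NOT NULL,
        date        TEXT    NOT NULL,
        temp_max    REAL,   -- hypothetical weather column
        pm2_5       REAL,   -- hypothetical air quality column
        PRIMARY KEY (location_id, date)
    )
    """
)

# The upsert names only weather columns, so air quality values are left untouched.
UPSERT_WEATHER = """
    INSERT INTO daily_metrics (location_id, date, temp_max)
    VALUES (?, ?, ?)
    ON CONFLICT (location_id, date) DO UPDATE SET temp_max = excluded.temp_max
"""

# A recent row that already has both weather and air quality.
conn.execute("INSERT INTO daily_metrics VALUES (1, '2024-05-01', 18.0, 9.5)")

conn.execute(UPSERT_WEATHER, (1, "2024-05-01", 19.2))  # updates the existing row
conn.execute(UPSERT_WEATHER, (1, "1990-01-01", 2.4))   # inserts an old date

rows = conn.execute(
    "SELECT date, temp_max, pm2_5 FROM daily_metrics ORDER BY date"
).fetchall()
```

After both upserts, the 1990 row has a weather value and a NULL air quality value, while the 2024 row keeps its air quality value and picks up the new temperature.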
Daily vs hourly: one request, both stored
You do not need separate “hourly” API calls to store hourly data.
- Open-Meteo Forecast API returns both daily and hourly in a single request. When we call it with `forecast_days=14`, we get:
  - 14 days of daily aggregates (sunrise, sunset, solar sum, T min/max, wind, rain, etc.),
  - and the full hourly time series for that same 14-day window (temperature, solar, wind, rain, etc. per hour).
- Scheduled job (4×/day): Each run calls the forecast API once per location, then:
  - upserts daily into `daily_metrics`,
  - upserts hourly into `hourly_metrics` (from the same response).
- Dashboard “Load forecast data”: Same thing: one forecast request per location (for the selected country), then both daily and hourly are persisted in the background.
So multiple requests per day (e.g. 4×/day) are enough: each request already contains the full 14-day hourly series. No extra “hourly-only” requests are required to store or refresh hourly data.
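The "one response, two tables" idea can be illustrated by splitting a single forecast payload into daily and hourly rows. The helper name `split_forecast_response` is hypothetical, and the sample payload is a hand-written, truncated illustration of the response layout (parallel arrays keyed by variable name alongside a `time` array), not real data.

```python
def split_forecast_response(location_id, payload):
    """Turn one forecast payload into (daily_rows, hourly_rows) lists of dicts."""

    def to_rows(block):
        times = block.get("time", [])
        fields = [name for name in block if name != "time"]
        return [
            {"location_id": location_id, "time": t,
             **{name: block[name][i] for name in fields}}
            for i, t in enumerate(times)
        ]

    return to_rows(payload.get("daily", {})), to_rows(payload.get("hourly", {}))


# Hand-written sample with made-up values, shaped like a forecast response.
sample = {
    "daily": {"time": ["2024-05-01"], "temperature_2m_max": [19.2]},
    "hourly": {
        "time": ["2024-05-01T00:00", "2024-05-01T01:00"],
        "temperature_2m": [11.0, 10.6],
    },
}

daily_rows, hourly_rows = split_forecast_response(1, sample)
```

The two lists can then be upserted into `daily_metrics` and `hourly_metrics` respectively, so a single request per location feeds both tables.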