solar/docs/TRADING_PLATFORM.md

Quantitative Trading Platform Roadmap

The platform uses solar production knowledge (forecasts and stored hourly/daily metrics per location) and stored electricity prices (day-ahead €/MWh from Utilitarian Spot / ENTSO-E) to support quantitative trading strategies. All pulled prices are persisted in the electricity_prices table for historical reference and backtesting.


History vs expected earnings

| Concept | Data source | API | Use case |
| --- | --- | --- | --- |
| Historical earnings | Stored hourly_metrics + electricity_prices | GET /trading/backtest with from_date/to_date in the past | Backtest: “What would I have earned if I sold at day-ahead price?” |
| Expected earnings | Live forecast (Open-Meteo) + stored electricity_prices | GET /trading/expected | “What do I expect to earn over the next N days?” (no need to pre-fill hourly_metrics) |

  • History: Uses only the DB. Date range = past days (e.g. last 7 or 14 days). Same backtest logic; the frontend labels it “Historical earnings”.
  • Expected: Uses the live forecast for the location + stored prices for the zone. Returns daily expected revenue and kWh for the next 1–14 days. The frontend shows “Expected earnings” for the next 7 days by default.

How to extend the framework

  1. New strategy
    Add a function in backend/app/trading.py (e.g. run_backtest_threshold(..., min_price_eur_mwh=...)). Call it from a new endpoint in main.py (e.g. GET /trading/backtest-threshold). Frontend: new section or toggle on the Trading page.

  2. New data source
    Add a fetcher (in the style of app/spot_prices.py), store its output in a new or existing table, then use it in trading.py and expose it via a new or existing endpoint.

  3. New metric (e.g. Sharpe, drawdown)
    Compute in trading.py inside the backtest (or a dedicated analytics helper), add fields to the backtest response; frontend displays them in the summary cards or table.

  4. Multi-location / portfolio
    New endpoint that accepts multiple location_ids, runs backtest per location, aggregates (e.g. sum revenue, weighted average). Frontend: multi-select locations and show combined + per-location breakdown.
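
As an illustration of item 1, the threshold strategy could look roughly like the sketch below. It is not the actual code in backend/app/trading.py: the `HourValue` record and the in-memory `rows` argument are assumptions made to keep the example self-contained (the real function would presumably load the series from the DB), and only the name `run_backtest_threshold` and the `min_price_eur_mwh` parameter come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HourValue:
    """One hour of the production value series (hypothetical record type)."""
    time_utc: datetime
    estimated_kwh: float
    price_eur_mwh: float


def run_backtest_threshold(rows: list[HourValue], min_price_eur_mwh: float) -> dict:
    """Value only the hours priced at or above the threshold."""
    daily: dict[str, float] = {}
    for row in rows:
        if row.price_eur_mwh < min_price_eur_mwh:
            continue  # below the threshold: this hour earns nothing
        # Convert kWh to MWh before multiplying by a EUR/MWh price.
        value_eur = row.estimated_kwh / 1000.0 * row.price_eur_mwh
        day = row.time_utc.date().isoformat()
        daily[day] = daily.get(day, 0.0) + value_eur
    return {"daily": daily, "total_revenue_eur": sum(daily.values())}


# Two sample hours: only the 100 EUR/MWh hour passes an 80 EUR/MWh threshold.
rows = [
    HourValue(datetime(2024, 6, 1, 11, tzinfo=timezone.utc), 10.0, 50.0),
    HourValue(datetime(2024, 6, 1, 12, tzinfo=timezone.utc), 10.0, 100.0),
]
result = run_backtest_threshold(rows, min_price_eur_mwh=80.0)
print(result)
```

Returning both the daily breakdown and the total keeps the response shape close to what the existing backtest is described as returning, so the frontend could reuse its summary cards.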


Current state (foundation)

  • Data
    • Solar: Stored in daily_metrics and hourly_metrics per location (irradiance, estimated kWh from configurable kWp/PR).
    • Prices: Stored in electricity_prices per zone (BE, NL, FR, DE_LU, …); filled every 6h and via POST /prices/refresh.
  • Location ↔ zone: get_zone_for_location(location_id) maps location to ENTSO-E zone (via country). Locations in countries without a zone have no price data.
  • Trading module (app/trading.py)
    • Production value series: For a location and date range, join hourly production (estimated kWh from shortwave_radiation) with stored prices by hour; output (time_utc, estimated_kwh, price_eur_mwh, value_eur).
    • Backtest: “Sell production at day-ahead price” — daily revenue and total (€) over a date range.
  • API
    • GET /trading/production-value?location_id=...&from_date=...&to_date=... — aligned series (kWh, price, value per hour).
    • GET /trading/backtest?location_id=...&from_date=...&to_date=... — daily PnL and totals (historical).
    • GET /trading/expected?location_id=...&days_ahead=7 — expected daily revenue and kWh for the next N days (live forecast + stored prices).
    • GET /trading/battery-backtest?zone=...&from_date=...&to_date=...&capacity_kwh=...&power_kw=...&efficiency=... — battery arbitrage backtest (charge cheap / discharge dear); see docs/BATTERY_TRADING.md.
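
The production value series described above can be sketched as follows. The kWh formula is an assumption (a common simplified PV yield model: mean hourly irradiance in W/m² divided by 1000, times kWp, times performance ratio); this document only states that kWh is estimated from shortwave_radiation with a configurable kWp/PR. The function names and the dict-based inputs are also illustrative, not the module's real interface.

```python
from datetime import datetime, timezone


def estimate_kwh(shortwave_radiation_wm2: float, kwp: float, pr: float) -> float:
    """Assumed model: hourly kWh = (irradiance / 1000 W/m2) * kWp * PR."""
    return shortwave_radiation_wm2 / 1000.0 * kwp * pr


def production_value_series(hourly: dict, prices: dict, kwp: float = 5.0, pr: float = 0.8) -> list:
    """Inner-join hourly radiation with prices on the UTC hour.

    hourly: {time_utc: shortwave_radiation in W/m2}
    prices: {time_utc: day-ahead price in EUR/MWh}
    Hours without a stored price are skipped.
    """
    series = []
    for time_utc in sorted(hourly):
        if time_utc not in prices:
            continue
        kwh = estimate_kwh(hourly[time_utc], kwp, pr)
        price = prices[time_utc]
        series.append({
            "time_utc": time_utc,
            "estimated_kwh": kwh,
            "price_eur_mwh": price,
            "value_eur": kwh / 1000.0 * price,  # kWh -> MWh, then EUR
        })
    return series


t11 = datetime(2024, 6, 1, 11, tzinfo=timezone.utc)
t12 = datetime(2024, 6, 1, 12, tzinfo=timezone.utc)
series = production_value_series({t11: 500.0, t12: 250.0}, {t11: 80.0})
print(series)
```

The join is deliberately an inner join: an hour with production but no stored price contributes nothing, which matches the note that locations without a zone have no price data.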

What's next (phased)

Phase 1: Data & analytics (done / in place)

  • Store all electricity prices in DB (electricity_prices).
  • Map location → zone; align hourly production with prices.
  • Production value series and simple backtest (revenue = production × price).
  • Optional: cache or materialize “production value” by (location, date) for fast dashboards.

Phase 2: Strategies & backtesting

  • Battery arbitrage backtest: Charge on cheapest hours, discharge on dearest; returns revenue, cost, profit, cycles, avg spread (see docs/BATTERY_TRADING.md and GET /trading/battery-backtest).
  • Strategy registry: Define strategies (e.g. “sell at day-ahead”, “threshold: only value hours above X €/MWh”) with parameters.
  • Backtest API: Run a strategy over a date range; return PnL, Sharpe, max drawdown, daily curve.
  • Multi-location / portfolio: Backtest across several locations (same zone or multiple zones) with optional weights.
  • Benchmarks: Compare vs “always sell” or fixed price.
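
The Sharpe and max drawdown figures mentioned for the backtest API could be computed from the daily PnL curve along these lines. This is a sketch under stated assumptions: the risk-free rate is taken as zero, the Sharpe ratio is computed on daily PnL in EUR rather than on percentage returns, and annualization uses 365 periods because solar production runs every calendar day. The function names are illustrative.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(daily_pnl: list[float], periods_per_year: int = 365) -> float:
    """Annualized Sharpe on daily PnL, with the risk-free rate assumed zero."""
    if len(daily_pnl) < 2:
        return 0.0  # sample standard deviation needs at least two points
    sd = stdev(daily_pnl)
    if sd == 0:
        return 0.0  # constant PnL: ratio is undefined, report 0
    return mean(daily_pnl) / sd * math.sqrt(periods_per_year)


def max_drawdown(daily_pnl: list[float]) -> float:
    """Largest peak-to-trough drop of the cumulative PnL curve, in EUR.

    The curve starts at 0, so an initial loss counts as a drawdown.
    """
    peak = 0.0
    cumulative = 0.0
    worst = 0.0
    for pnl in daily_pnl:
        cumulative += pnl
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


pnl = [10.0, -4.0, -3.0, 5.0, -1.0]
print(sharpe_ratio(pnl), max_drawdown(pnl))
```

For a pure “sell at day-ahead” strategy daily PnL is rarely negative, so drawdown is mostly informative for strategies that also buy, such as battery arbitrage.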

Phase 3: Signals & paper trading

  • Signals API: “Given today's forecast and today's prices, what does the strategy suggest?” (e.g. expected revenue per hour, recommended position).
  • Paper trading: Record hypothetical trades (time, zone, side, volume, price) and maintain running PnL.
  • Dashboard: Strategy performance, daily PnL, simple exposure view.
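
A paper-trading record with a running PnL might be modelled as below. Nothing here exists in the codebase yet: `PaperTrade` and `PaperLedger` are hypothetical names, the fields mirror the tuple listed above (time, zone, side, volume, price), and the running PnL is a plain cash-flow sum that ignores the mark-to-market value of any open position.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class PaperTrade:
    """One hypothetical trade; volume in MWh, price in EUR/MWh."""
    time_utc: datetime
    zone: str
    side: str  # "buy" or "sell"
    volume_mwh: float
    price_eur_mwh: float

    def cash_flow_eur(self) -> float:
        # Selling brings cash in, buying pays cash out.
        sign = 1.0 if self.side == "sell" else -1.0
        return sign * self.volume_mwh * self.price_eur_mwh


@dataclass
class PaperLedger:
    """Append-only list of trades plus the running cash PnL."""
    trades: list[PaperTrade] = field(default_factory=list)
    running_pnl_eur: float = 0.0

    def record(self, trade: PaperTrade) -> float:
        if trade.side not in ("buy", "sell"):
            raise ValueError(f"unknown side: {trade.side!r}")
        self.trades.append(trade)
        self.running_pnl_eur += trade.cash_flow_eur()
        return self.running_pnl_eur


ledger = PaperLedger()
ledger.record(PaperTrade(datetime(2024, 6, 1, 3, tzinfo=timezone.utc), "BE", "buy", 1.0, 40.0))
ledger.record(PaperTrade(datetime(2024, 6, 1, 18, tzinfo=timezone.utc), "BE", "sell", 1.0, 90.0))
print(ledger.running_pnl_eur)
```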

Phase 4: Live execution (later)

  • Integration with exchange/broker (e.g. EPEX, Nord Pool) for day-ahead or intraday.
  • Risk limits (position, exposure by zone).
  • Monitoring, alerts, audit log.

Quantitative database: what is stored

All input data that feeds the trading logic is persisted in PostgreSQL. The app behaves as a real quantitative database: time series are stored once and reused for backtests, analytics, and reporting.

| Table | Contents | Used for |
| --- | --- | --- |
| daily_metrics | Per location, per date: solar (MJ/m²), temp min/max, sun times, wind, rain, weather code, optional AQ (PM2.5, AQI). | History charts, daily KPIs, backtest input (via hourly when available). |
| hourly_metrics | Per location, per hour (UTC): shortwave_radiation (W/m²), temp, wind, rain, etc. | Production value series, backtest (kWh × price per hour). |
| electricity_prices | Per zone, per time (UTC): day-ahead price €/MWh. | Backtest, expected earnings, price comparison, production value. |
| backtest_daily | Per location, per date: revenue_eur, kwh (materialized “sell at day-ahead” result). | Fast backtest API: serve from cache when date range is fully covered; else compute and backfill. |

Other derived outputs (production value series, expected earnings) are computed on demand. Backtest uses backtest_daily when the requested date range is fully covered; otherwise it computes from hourly_metrics + electricity_prices and upserts into backtest_daily for the next request.
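
The “serve from cache when fully covered, else compute and backfill” rule can be sketched as below. A plain dict stands in for the backtest_daily table and `compute_day` stands in for the hourly_metrics + electricity_prices join; both are placeholders, and recomputing the whole range on a partial miss (rather than only the missing days) is an assumption about the behaviour.

```python
from datetime import date, timedelta


def date_range(from_date: date, to_date: date) -> list[date]:
    """All calendar days from from_date to to_date, inclusive."""
    days = (to_date - from_date).days
    return [from_date + timedelta(days=i) for i in range(days + 1)]


def get_backtest(cache: dict, from_date: date, to_date: date, compute_day) -> dict:
    """Return {day: row} for the range, using the cache only on full coverage."""
    wanted = date_range(from_date, to_date)
    if all(day in cache for day in wanted):
        return {day: cache[day] for day in wanted}  # fully covered: no recompute
    result = {}
    for day in wanted:
        row = compute_day(day)
        cache[day] = row  # upsert so the next request can be served from cache
        result[day] = row
    return result


calls = []


def compute_day(day: date) -> dict:
    calls.append(day)  # track how often the expensive path runs
    return {"revenue_eur": 1.0, "kwh": 10.0}


cache: dict = {}
first = get_backtest(cache, date(2024, 6, 1), date(2024, 6, 3), compute_day)
second = get_backtest(cache, date(2024, 6, 1), date(2024, 6, 2), compute_day)
print(len(calls))
```

The second request covers a sub-range of the first, so it is answered from the cache without calling `compute_day` again.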


Price forecasting: train a model and evaluate

Because day-ahead price data is often only available for today/tomorrow, we can train a model to predict prices for D+2 … D+7 and use later actuals (once stored in electricity_prices) as the reference.

Ground truth

  • Actual price = the value we store in electricity_prices once it becomes available (e.g. after the market clears, or when we fetch a historical day). For a given (zone, time_utc) there is exactly one actual. Align each prediction with the same (zone, time_utc) and compare.

How to evaluate: two complementary views

| View | What it answers | Example |
| --- | --- | --- |
| Accuracy vs actual | “How close is the prediction to the real price?” | Within x% of actual (e.g. ±10%), or MAPE, MAE. Good for model selection and tuning. |
| Benchmark vs average | “Would using this prediction have been better than a simple rule?” | Compare revenue (or PnL) if we had used predicted price vs revenue if we had used average price (e.g. rolling 7-day mean) or “last known price”. Good for economic value and trading usefulness. |

Recommendation: use both.

  1. Accuracy (e.g. within x% of actual)

    • Define a “good forecast”, e.g. |predicted − actual| / actual ≤ 0.10 (within 10%), or use per-hour MAPE.
    • Lets you tune the model and compare architectures (e.g. persistence vs average vs ML).
    • Does not by itself tell you if the forecast would have made money.
  2. Benchmark (vs average or naive)

    • Compare expected revenue (or backtest PnL) under two rules:
      • Rule A: use model prediction for price when actual is missing.
      • Rule B: use average price (e.g. same hour previous 7 days, or daily average).
    • If Rule A consistently beats Rule B, the model is economically useful; if not, a simple average may be enough.

So: accuracy metrics for “is the forecast right?”, benchmark comparison for “is it worth using?”.
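
The accuracy view can be computed with a few lines once predictions and actuals are paired on (zone, time_utc). The sketch below is illustrative: the function name and the list-of-pairs input are assumptions. It divides by the absolute value of the actual and skips hours where the actual is exactly zero, because day-ahead prices can be zero or negative and the relative error is otherwise undefined or sign-flipped.

```python
def accuracy_metrics(pairs: list[tuple[float, float]], tolerance: float = 0.10) -> dict:
    """MAE, MAPE and share of points within tolerance.

    pairs: (predicted, actual) prices in EUR/MWh for matching (zone, time_utc).
    """
    if not pairs:
        return {"mae": None, "mape": None, "within_tolerance": None}
    abs_errors = [abs(p - a) for p, a in pairs]
    # Relative error is only defined for non-zero actuals.
    rel_errors = [abs(p - a) / abs(a) for p, a in pairs if a != 0]
    mape = sum(rel_errors) / len(rel_errors) if rel_errors else None
    within = (
        sum(1 for e in rel_errors if e <= tolerance) / len(rel_errors)
        if rel_errors
        else None
    )
    return {
        "mae": sum(abs_errors) / len(abs_errors),
        "mape": mape,
        "within_tolerance": within,
    }


pairs = [(108.0, 100.0), (95.0, 100.0), (50.0, 40.0), (10.0, 0.0)]
metrics = accuracy_metrics(pairs)
print(metrics)
```

MAE stays meaningful around zero prices, which is one reason to report it next to MAPE rather than relying on a percentage alone.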

Practical setup (minimal)

  • Store predictions: e.g. table price_predictions (model_id, zone, time_utc, value_eur_mwh, created_at). When you run the model (batch or on schedule), insert predictions for D+1 … D+7.
  • Evaluate later: once electricity_prices has the actual for that (zone, time_utc), join and compute:
    • Accuracy: MAPE, MAE, % of points within x% of actual.
    • Benchmark: For the same period, compare backtest “revenue if we used predicted price” vs “revenue if we used rolling average price” (or last-known). Optionally use median price instead of mean to reduce impact of spikes.
  • Average price for the benchmark can be: same hour previous N days, or day-ahead average for that zone over a rolling window. Keep it simple and document it (e.g. “benchmark = 7-day rolling mean by hour”).

This gives you a clear path: train → store predictions → when actuals arrive, score accuracy and benchmark; iterate on the model and thresholds (e.g. x%) as needed.
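
One way to score the benchmark is sketched below, using “7-day rolling mean by hour” as the baseline. All function names are illustrative, and the comparison criterion is an assumption: each rule's revenue estimate is compared with the revenue realized at actual prices, and the rule with the smaller gap is taken as the better one. Other criteria (for example PnL of decisions driven by each price series) would fit the same structure.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def rolling_mean_by_hour(actuals: dict, time_utc: datetime, rolling_days: int = 7):
    """Mean actual price at the same hour over the previous N days, or None."""
    history = [
        actuals[time_utc - timedelta(days=d)]
        for d in range(1, rolling_days + 1)
        if time_utc - timedelta(days=d) in actuals
    ]
    return mean(history) if history else None


def revenue_eur(kwh_by_hour: dict, price_by_hour: dict) -> float:
    """Sum of kWh / 1000 * EUR/MWh over the hours present in both inputs."""
    return sum(
        kwh / 1000.0 * price_by_hour[t]
        for t, kwh in kwh_by_hour.items()
        if t in price_by_hour
    )


def benchmark_report(kwh_by_hour: dict, predicted: dict, actuals: dict, rolling_days: int = 7) -> dict:
    """Compare Rule A (model price) and Rule B (rolling mean) with realized revenue."""
    baseline = {}
    for t in kwh_by_hour:
        value = rolling_mean_by_hour(actuals, t, rolling_days)
        if value is not None:
            baseline[t] = value
    realized = revenue_eur(kwh_by_hour, actuals)
    return {
        "realized_eur": realized,
        "model_error_eur": abs(revenue_eur(kwh_by_hour, predicted) - realized),
        "baseline_error_eur": abs(revenue_eur(kwh_by_hour, baseline) - realized),
    }


# One evaluated hour: 7 prior days at 50 EUR/MWh, actual 100, model predicted 90.
noon = datetime(2024, 6, 8, 12, tzinfo=timezone.utc)
actuals = {noon - timedelta(days=d): 50.0 for d in range(1, 8)}
actuals[noon] = 100.0
report = benchmark_report({noon: 1000.0}, {noon: 90.0}, actuals)
print(report)
```

Swapping `mean` for `statistics.median` in `rolling_mean_by_hour` gives the median variant suggested above for reducing the impact of price spikes.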

Data and UI: how much to load vs show

  • Lookback (rolling mean): The baseline and benchmark use a rolling window (e.g. 7 days). You can use 10 days for a smoother series if desired (rolling_days=10). Backend always loads only the lookback it needs for the requested date range.
  • What to show in the UI: To keep the screen focused, show only the last 3 days for evaluation (accuracy + benchmark) and the next 3 days for predictions. The backend still computes the baseline with 7 (or 10) days of history; the UI simply requests a 3-day range:
    • Forecast eval: from_date=today-3, to_date=today (last 3 days of metrics).
    • Forecast benchmark: same 3-day range.
    • Predictions / expected: request days_ahead=7 from the API if you need 7 days for other uses, but in the Trading view display only the next 3 days (e.g. slice the daily array to 3).
  • So: load what you need for the chart (3 days); the backend uses 7–10 days of history for the rolling mean. There is no need to “load 10 and show 3” in the sense of over-fetching, because the 10 days are internal to the baseline computation.
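
The split between what the UI requests and what the backend loads can be made concrete with a few date helpers. These are hypothetical helpers, not existing code; only the parameter names from_date, to_date and rolling_days and the 3-day and 7/10-day figures come from the text above.

```python
from datetime import date, timedelta


def eval_range(today: date, ui_days: int = 3) -> dict:
    """Query parameters the UI sends for forecast eval and benchmark."""
    return {
        "from_date": (today - timedelta(days=ui_days)).isoformat(),
        "to_date": today.isoformat(),
    }


def lookback_start(from_date: date, rolling_days: int = 7) -> date:
    """Earliest day the backend must load to build the rolling mean at from_date."""
    return from_date - timedelta(days=rolling_days)


def visible_days(daily: list, ui_days: int = 3) -> list:
    """Slice a longer days_ahead response down to what the Trading view shows."""
    return daily[:ui_days]


today = date(2024, 5, 10)
params = eval_range(today)
first_loaded = lookback_start(date.fromisoformat(params["from_date"]), rolling_days=10)
shown = visible_days(list(range(7)))
print(params, first_loaded, shown)
```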

Design principles

  • Single source of truth: Prices and production are stored once; trading layer only reads and joins.
  • Location-scoped: Strategies run per location (or portfolio of locations); zone comes from config.
  • Backtest-first: New strategies are backtested on stored data before paper or live.
  • Transparent: All inputs (production, prices) and outputs (value, PnL) are traceable via existing APIs and DB.

File reference

| Area | Files |
| --- | --- |
| Config / zones | backend/app/config.py: ELECTRICITY_PRICE_ZONES, COUNTRY_TO_SPOT_ZONE, get_zone_for_location |
| Prices storage | backend/app/spot_prices.py, backend/app/store.py (upsert/get), backend/app/jobs.py (refresh) |
| Trading logic | backend/app/trading.py: production value series, backtest (with materialized read/backfill), expected earnings |
| Materialized | backend/app/models.py: BacktestDaily; store.py: get_backtest_daily, upsert_backtest_daily |
| API | backend/app/main.py: /prices/*, /trading/production-value, /trading/backtest, /trading/expected |
| Roadmap | docs/TRADING_PLATFORM.md (this file) |