forecasting.data

forecasting.data.loader

Module: loader.py

Description: Data loading logic that converts Databento MBO records into contiguous NumPy arrays for the C++ engine.

class forecasting.data.loader.DataLoader(data_path: str, start_time: str = None, end_time: str = None, timezone: str = 'UTC')

Bases: object

load(instrument_id: int = None, n_events: int = None) SimulationData

Loads data from a file (or S3), filters it to a single instrument, and prepares contiguous arrays.

Parameters:
  • instrument_id (int, optional) – Specific ID to filter for. If None, defaults to the most active instrument in the file.

  • n_events (int, optional) – Maximum number of events to load. If None, loads all events. Useful for memory-efficient sampling.
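
The defaulting behaviour of the two parameters can be sketched in plain Python. This is an illustration only, not the loader's actual implementation: the select_events helper and the tuple-based event representation are hypothetical, "most active" is assumed to mean the instrument with the highest event count, and the n_events cap is assumed to apply after filtering.

```python
from collections import Counter


def select_events(events, instrument_id=None, n_events=None):
    """Hypothetical helper: events is a list of (instrument_id, payload) tuples."""
    if not events:
        raise ValueError("no events to select from")
    if instrument_id is None:
        # Assumption: "most active" means the instrument with the most events.
        counts = Counter(inst for inst, _ in events)
        instrument_id = counts.most_common(1)[0][0]
    # Keep only the events belonging to the chosen instrument.
    selected = [event for event in events if event[0] == instrument_id]
    if n_events is not None:
        # Assumption: the cap is applied after filtering, not before.
        selected = selected[:n_events]
    return instrument_id, selected


events = [(1, "a"), (2, "b"), (2, "c"), (1, "d"), (2, "e")]
chosen_id, chosen = select_events(events, n_events=2)
```

Here instrument 2 has the most events, so it is chosen by default and its first two events are kept.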

Returns:

A dataclass containing contiguous (C-style) NumPy arrays ready for zero-copy access by the C++ engine.

Return type:

SimulationData
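
Why contiguity matters can be shown with NumPy alone. The sketch below uses a hypothetical three-field record layout (the real Databento MBO schema has more fields) and np.ascontiguousarray as one way to obtain a dense buffer; whether the loader uses this exact call is not stated here.

```python
import numpy as np

# Hypothetical record layout, used only for illustration.
record_dtype = np.dtype([("price", "<i8"), ("size", "<u4"), ("order_id", "<u8")])
records = np.zeros(4, dtype=record_dtype)
records["price"] = [100, 101, 102, 103]

# Selecting one field of a structured array yields a strided view:
# consecutive prices are separated by the full record size.
price_view = records["price"]

# np.ascontiguousarray copies the view once into a dense C-ordered buffer,
# which native code can then read through a plain pointer and length.
prices = np.ascontiguousarray(price_view)

print(price_view.flags["C_CONTIGUOUS"], prices.flags["C_CONTIGUOUS"])
```

The one-time copy is the price paid so that later access from native code needs no further copying or stride handling.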

class forecasting.data.loader.SimulationData(instrument_id: int, n_events: int, actions: numpy.ndarray, sides: numpy.ndarray, prices: numpy.ndarray, sizes: numpy.ndarray, order_ids: numpy.ndarray, ts_recvs: numpy.ndarray, flags: numpy.ndarray)

Bases: object

Holds prepared, contiguous memory arrays for the simulation engine. Also keeps a reference to the original records for Python-side access if needed.

actions: numpy.ndarray
flags: numpy.ndarray
instrument_id: int
n_events: int
order_ids: numpy.ndarray
prices: numpy.ndarray
sides: numpy.ndarray
sizes: numpy.ndarray
ts_recvs: numpy.ndarray
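
Before handing such a container to native code, a sanity check along the following lines may be useful. This is a sketch under stated assumptions: DemoData is a hypothetical stand-in with only a subset of the fields above, check_arrays is not part of the package, and the rule that every array has exactly n_events elements is assumed rather than documented.

```python
from dataclasses import dataclass, fields

import numpy as np


@dataclass
class DemoData:
    """Hypothetical stand-in with a subset of SimulationData's fields."""
    instrument_id: int
    n_events: int
    prices: np.ndarray
    sizes: np.ndarray


def check_arrays(data):
    """Return names of array fields that are mis-sized or not C-contiguous."""
    bad = []
    for f in fields(data):
        value = getattr(data, f.name)
        if isinstance(value, np.ndarray):
            # Assumption: every array holds exactly n_events elements.
            if len(value) != data.n_events or not value.flags["C_CONTIGUOUS"]:
                bad.append(f.name)
    return bad


good = DemoData(1, 3, np.arange(3, dtype=np.int64), np.ones(3, dtype=np.uint32))
# Slicing with a step produces a strided view, which the check should flag.
strided = DemoData(1, 3, np.arange(6, dtype=np.int64)[::2], np.ones(3, dtype=np.uint32))
```

With these inputs, check_arrays(good) returns an empty list, while check_arrays(strided) reports the prices field.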