The HierarchicalForecast package contains utility functions to wrangle
and visualize hierarchical series datasets. The `aggregate` function of
the module lets you create a hierarchy from categorical variables
representing the structure levels; it also returns the aggregation
constraints matrix S.
In addition, HierarchicalForecast ensures compatibility of its
reconciliation methods with other popular machine-learning libraries via
its external forecast adapters that transform output base forecasts from
external libraries into a compatible data frame format.
Aggregate Function
aggregate
Utils Aggregation Function. Aggregates the bottom-level series contained in the DataFrame
`df` according to the levels defined in the `spec` list.
| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with columns [time_col, *target_cols], the columns to aggregate, and optionally exog_vars. |
| spec | list | | List of levels. Each element of the list should contain a list of columns of df to aggregate. |
| exog_vars | Optional | None | |
| sparse_s | bool | False | Return S_df as a sparse pandas DataFrame. |
| id_col | str | unique_id | Column that will identify each series after aggregation. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| id_time_col | Optional | None | Column that will identify each timestep after temporal aggregation. If provided, aggregate will operate temporally. |
| target_cols | Sequence | ('y',) | List of columns that contain the targets to aggregate. |
| Returns | tuple | | Hierarchically structured series. |
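
To make the output of a cross-sectional aggregation more concrete, the following is a minimal pandas sketch of what such an aggregation computes: the stacked series for every level, the tags per level, and a summing matrix. It is an illustration only, not the library's implementation. The toy data, the helper `level_ids`, the `/` separator, and the assumption that the last entry of `spec` is the bottom level are all choices made for this example.

```python
import pandas as pd

# Toy bottom-level data: two states in one country, two timesteps each.
df = pd.DataFrame({
    "Country": ["Australia"] * 4,
    "State": ["NSW", "NSW", "VIC", "VIC"],
    "ds": [1, 2, 1, 2],
    "y": [10.0, 12.0, 5.0, 7.0],
})
# Assumption for this sketch: the last entry of spec is the bottom level.
spec = [["Country"], ["Country", "State"]]


def level_ids(frame, cols):
    """Hypothetical helper: join a level's columns into one id, e.g. 'Australia/NSW'."""
    return frame[cols].astype(str).agg("/".join, axis=1)


# Aggregated series: one groupby-sum per level, stacked into a long frame.
Y_df = pd.concat(
    [
        df.assign(unique_id=level_ids(df, cols))
        .groupby(["unique_id", "ds"], as_index=False)["y"]
        .sum()
        for cols in spec
    ],
    ignore_index=True,
)

# Summing matrix: one row per series at any level, one column per bottom series.
bottom = df[spec[-1]].drop_duplicates()
bottom_ids = level_ids(bottom, spec[-1]).to_numpy()
rows, tags = {}, {}
for cols in spec:
    ids = level_ids(bottom, cols).to_numpy()
    tags["/".join(cols)] = pd.unique(ids)
    for uid in tags["/".join(cols)]:
        # 1.0 where the bottom series belongs to the aggregate `uid`.
        rows[uid] = (ids == uid).astype(float)
S_df = pd.DataFrame(list(rows.values()), index=list(rows), columns=bottom_ids)
```

Each row of `S_df` marks which bottom-level series add up to the series named by that row, so multiplying it by the bottom-level values at one timestep reproduces every aggregated value in `Y_df`.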
aggregate_temporal
Utils Aggregation Function for Temporal aggregations. Aggregates the bottom-level timesteps contained in the DataFrame
`df` according to the temporal levels defined in the `spec` dictionary.
| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with columns [time_col, target_cols] and the columns to aggregate. |
| spec | dict | | Dictionary of temporal levels. Each key should be a string, with its value giving the number of bottom-level timesteps contained in the aggregation. |
| exog_vars | Optional | None | |
| sparse_s | bool | False | Return S_df as a sparse pandas DataFrame. |
| id_col | str | unique_id | Column that will identify each series after aggregation. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| id_time_col | str | temporal_id | Column that will identify each timestep after aggregation. |
| target_cols | Sequence | ('y',) | List of columns that contain the targets to aggregate. |
| aggregation_type | str | local | If 'local', the aggregation is performed on the timestamps of each time series independently. If 'global', the aggregation is performed on the unique timestamps of all time series. |
| Returns | tuple | | Temporally hierarchically structured series. |
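
The role of the `spec` dictionary can be illustrated with a small pandas sketch that sums consecutive blocks of bottom-level timesteps, numbering the timesteps of each series independently (the 'local' behaviour described above). This is a conceptual illustration under stated assumptions, not the library's implementation: the toy data and the `name-index` label format for `temporal_id` are hypothetical, and the library may align or label the blocks differently.

```python
import pandas as pd

# One series with eight quarterly observations, using integer timesteps.
df = pd.DataFrame({
    "unique_id": ["A"] * 8,
    "ds": list(range(1, 9)),
    "y": [float(v) for v in range(1, 9)],
})
# Each value is the number of bottom-level timesteps in one aggregated timestep.
spec = {"year": 4, "semester": 2, "quarter": 1}

frames = []
for name, k in spec.items():
    # Number the timesteps of each series independently, then cut them
    # into consecutive blocks of k bottom-level timesteps.
    block = df.groupby("unique_id").cumcount() // k
    agg = (
        df.assign(block=block)
        .groupby(["unique_id", "block"], as_index=False)
        .agg(ds=("ds", "last"), y=("y", "sum"))
    )
    # Hypothetical label format for the aggregated timestep.
    agg["temporal_id"] = [f"{name}-{b + 1}" for b in agg["block"]]
    frames.append(agg.drop(columns="block"))

Y_te = pd.concat(frames, ignore_index=True)
```

With eight quarters, the sketch produces two yearly, four semi-annual, and eight quarterly rows for series `A`.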
make_future_dataframe
Create future dataframe for forecasting.
| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with ids, times and values for the exogenous regressors. |
| freq | Union | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
| h | int | | Forecast horizon. |
| id_col | str | unique_id | Column that identifies each series. |
| time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
| Returns | FrameT | | DataFrame with future values. |
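
As a rough picture of what a future dataframe contains, the sketch below builds the next `h` timesteps after the last observed timestep of each series. It assumes integer timesteps and an integer `freq`; the helper name `future_frame_sketch` is hypothetical and this is not the library's code.

```python
import pandas as pd


def future_frame_sketch(df, freq, h, id_col="unique_id", time_col="ds"):
    """Hypothetical helper: the next h timesteps after each series' last one."""
    last = df.groupby(id_col)[time_col].max()
    rows = [
        {id_col: uid, time_col: t + step * freq}
        for uid, t in last.items()
        for step in range(1, h + 1)
    ]
    return pd.DataFrame(rows)


# Series A ends at timestep 2, series B at timestep 5.
df = pd.DataFrame({"unique_id": ["A", "A", "B"], "ds": [1, 2, 5], "y": [1.0, 2.0, 3.0]})
future = future_frame_sketch(df, freq=1, h=2)
```

Each series continues from its own last timestep, so series with different end points get different future timesteps.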
get_cross_temporal_tags
Get cross-temporal tags.
| | Type | Default | Details |
|---|---|---|---|
| df | Union | | DataFrame with temporal ids. |
| tags_cs | dict | | Tags for the cross-sectional hierarchies. |
| tags_te | dict | | Tags for the temporal hierarchies. |
| sep | str | // | Separator for the cross-temporal tags. |
| id_col | str | unique_id | Column that identifies each series. |
| id_time_col | str | temporal_id | Column that identifies each (aggregated) timestep. |
| cross_temporal_id_col | str | cross_temporal_id | Column that will identify each cross-temporal aggregation. |
| Returns | tuple | | DataFrame with cross-temporal ids. |
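
One plausible reading of how cross-temporal tags combine the two tag dictionaries is sketched below in plain Python: every cross-sectional level is paired with every temporal level, and the ids are joined with `sep`. The example tags and the exact key and id formats are assumptions made for illustration; the library's actual output may differ.

```python
from itertools import product

# Hypothetical cross-sectional and temporal tags.
tags_cs = {
    "Country": ["Australia"],
    "Country/State": ["Australia/NSW", "Australia/VIC"],
}
tags_te = {"year": ["year-1"], "semester": ["semester-1", "semester-2"]}
sep = "//"

# Pair each cross-sectional level with each temporal level, then pair their ids.
tags_ct = {
    f"{level_cs}{sep}{level_te}": [
        f"{cs}{sep}{te}" for cs, te in product(ids_cs, ids_te)
    ]
    for (level_cs, ids_cs), (level_te, ids_te) in product(
        tags_cs.items(), tags_te.items()
    )
}
```

Here `tags_ct` has one key per pair of levels, for example `Country//year`, whose single id is `Australia//year-1`.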
Hierarchical Visualization
HierarchicalPlot
Hierarchical Plot

This class contains a collection of matplotlib visualization methods, suited for small to medium-sized hierarchical series.

Parameters:
- `S`: DataFrame with the summing matrix of size `(base, bottom)`; see the `aggregate` function.
- `tags`: dict[str, np.ndarray], hierarchical aggregation indexes, where each key is a level and its value contains the tags associated with that level.
- `S_id_col`: str='unique_id', column that identifies each aggregation.
plot_summing_matrix
Summation Constraints plot

This method plots the hierarchical aggregation constraints matrix S.

Returns:
- `fig`: matplotlib.figure.Figure, figure object containing the plot of the summing matrix.
plot_series
Single Series plot

Parameters:
- `series`: str, string identifying the 'unique_id' of the any-level series to plot.
- `Y_df`: DataFrame, hierarchically structured series. It contains columns ['unique_id', 'ds', 'y'] and may also contain model columns.
- `models`: list[str], strings identifying the model columns to plot.
- `level`: list[float] in 0-100, confidence levels for the prediction intervals available in `Y_df`.
- `id_col`: str='unique_id', column that identifies each series.
- `time_col`: str='ds', column that identifies each timestep; its values can be timestamps or integers.
- `target_col`: str='y', column that contains the target.

Returns:
- `fig`: matplotlib.figure.Figure, figure object containing the plot of the single series.
plot_hierarchically_linked_series
Hierarchically Linked Series plot

Parameters:
- `bottom_series`: str, string identifying the 'unique_id' of the bottom-level series to plot.
- `Y_df`: DataFrame, hierarchically structured series. It contains columns ['unique_id', 'ds', 'y'] and model columns.
- `models`: list[str], strings identifying the model columns to plot.
- `level`: list[float] in 0-100, confidence levels for the prediction intervals available in `Y_df`.
- `id_col`: str='unique_id', column that identifies each series.
- `time_col`: str='ds', column that identifies each timestep; its values can be timestamps or integers.
- `target_col`: str='y', column that contains the target.

Returns:
- `fig`: matplotlib.figure.Figure, figure object containing the plots of the hierarchically linked series.
plot_hierarchical_predictions_gap
Hierarchical Predictions Gap plot

Parameters:
- `Y_df`: DataFrame, hierarchically structured series. It contains columns ['unique_id', 'ds', 'y'] and model columns.
- `models`: list[str], strings identifying the model columns to plot.
- `xlabel`: str, label for the plot's x axis.
- `ylabel`: str, label for the plot's y axis.
- `id_col`: str='unique_id', column that identifies each series.
- `time_col`: str='ds', column that identifies each timestep; its values can be timestamps or integers.
- `target_col`: str='y', column that contains the target.

Returns:
- `fig`: matplotlib.figure.Figure, figure object containing the plot of the aggregated predictions at different levels of the hierarchical structure.
External Forecast Adapters
samples_to_quantiles_df
Transform Random Samples into HierarchicalForecast input.

Auxiliary function to create a compatible HierarchicalForecast input `Y_hat_df` dataframe.

Parameters:
- `samples`: numpy array. Samples from the forecast distribution, of shape [n_series, n_samples, horizon].
- `unique_ids`: string list. Unique identifiers for each time series.
- `dates`: datetime list. List of forecast dates.
- `quantiles`: float list in [0., 1.]. Alternative to `level`; quantiles to estimate from the y distribution.
- `level`: int list in [0, 100]. Probability levels for prediction intervals.
- `model_name`: string. Name of the forecasting model.
- `id_col`: str='unique_id', column that identifies each series.
- `time_col`: str='ds', column that identifies each timestep; its values can be timestamps or integers.
- `backend`: str='pandas', backend to use for the output dataframe, either 'pandas' or 'polars'.

Returns:
- `quantiles`: float list in [0., 1.]. Quantiles to estimate from the y distribution.
- `Y_hat_df`: DataFrame. Base quantile forecasts with columns ds and models to reconcile, indexed by unique_id.
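
The transformation described above can be sketched with NumPy and pandas: levels are converted to quantiles, the quantiles are estimated across the sample axis, and the result is flattened into a long dataframe. This is a minimal sketch under stated assumptions rather than the library's implementation; the simulated samples, the model name `model`, and the `model-q-…` column names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
unique_ids = ["A", "B"]
dates = [1, 2, 3]  # integer timesteps, used here for simplicity
# Simulated samples with shape [n_series, n_samples, horizon].
samples = rng.normal(size=(len(unique_ids), 500, len(dates)))

level = [80]
# A level-lv interval spans the quantiles 0.5 - lv/200 and 0.5 + lv/200.
quantiles = sorted(q for lv in level for q in (0.5 - lv / 200, 0.5 + lv / 200))

# Estimate quantiles across the sample axis: [n_quantiles, n_series, horizon].
q_values = np.quantile(samples, quantiles, axis=1)

Y_hat_df = pd.DataFrame({
    "unique_id": np.repeat(unique_ids, len(dates)),
    "ds": np.tile(dates, len(unique_ids)),
    "model": samples.mean(axis=1).ravel(),  # mean forecast per series and timestep
})
for q, vals in zip(quantiles, q_values):
    # Hypothetical column naming; the library's own column names may differ.
    Y_hat_df[f"model-q-{q:.2f}"] = vals.ravel()
```

An 80% level maps to the 0.10 and 0.90 quantiles, so the frame ends up with one mean column and two quantile columns per model.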
