geostep.power.run_power_analysis

geostep.power.run_power_analysis(historical_data: DataFrame, geo_col: str, date_col: str, kpi_col: str, effect_sizes: List[float] = [0.01, 0.02, 0.03, 0.05], test_weeks_list: List[int] = [4, 6, 8, 10, 12], pre_period_weeks: int = 8, n_sims: int = 500, alpha: float = 0.05) → DataFrame

Runs a simulation-based power analysis on historical data, estimating the probability of detecting each candidate effect size at each candidate test duration.
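
To make the idea of simulated power concrete, here is a minimal, standard-library sketch of a Monte Carlo power estimate. It is not geostep's implementation: the function name `simulated_power`, the normally distributed KPI, the `n_per_group` and `cv` parameters, and the two-sample z-test are all assumptions chosen for illustration. Only `n_sims` and `alpha` mirror parameters documented on this page.

```python
import random
from statistics import NormalDist, mean, stdev


def simulated_power(effect, n_per_group=50, n_sims=500, alpha=0.05, cv=0.1, seed=0):
    """Estimate power as the share of simulated tests that reject the null.

    Hypothetical setup: the KPI is normal with mean 1.0 and standard
    deviation `cv`; the treated group's mean is lifted by `effect`.
    """
    rng = random.Random(seed)
    # Two-sided critical value for the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(1.0, cv) for _ in range(n_per_group)]
        treated = [rng.gauss(1.0 + effect, cv) for _ in range(n_per_group)]
        # Standard error of the difference in means.
        se = (stdev(control) ** 2 / n_per_group + stdev(treated) ** 2 / n_per_group) ** 0.5
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


# Power for a few relative lifts, using the same defaults for n_sims and alpha.
for lift in (0.01, 0.03, 0.05):
    print(lift, simulated_power(lift))
```

In this sketch the estimate rises with the effect size and stays near `alpha` when there is no true effect. Raising `n_sims` reduces the sampling noise of the estimate at the cost of runtime.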

Parameters:
  • historical_data (pd.DataFrame) – Historical data. Must contain the columns named by geo_col, date_col, and kpi_col.

  • geo_col (str) – Name of the geographic identifier column.

  • date_col (str) – Name of the date column. The data should be at weekly or daily granularity.

  • kpi_col (str) – Name of the Key Performance Indicator column to be measured.

  • effect_sizes (list of float, optional) – Effect sizes to simulate, expressed as fractions (e.g., 0.03 for a 3% lift). Defaults to [0.01, 0.02, 0.03, 0.05].

  • test_weeks_list (list of int, optional) – Test durations, in weeks, to simulate. Defaults to [4, 6, 8, 10, 12].

  • pre_period_weeks (int, optional) – Number of weeks used for the pre-period baseline. Defaults to 8.

  • n_sims (int, optional) – Number of simulations to run for each scenario. Higher values give more stable power estimates but take longer to run. Defaults to 500.

  • alpha (float, optional) – Significance level for the test (Type I error rate). Defaults to 0.05.

Returns:

A DataFrame summarizing the statistical power for each combination of effect size and test duration.

Return type:

pd.DataFrame
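
Example:

The following is a usage sketch based only on the signature above. The helper `make_weekly_history`, the column names `geo`, `week`, and `sales`, and the synthetic values are hypothetical. Whether 20 geos and 52 weeks of history are enough for a meaningful analysis is not stated on this page. The call is guarded so the snippet still runs when pandas or geostep is not installed.

```python
import datetime as dt
import itertools
import random


def make_weekly_history(geos, n_weeks, seed=0):
    """Build hypothetical weekly KPI rows, one per (geo, week)."""
    rng = random.Random(seed)
    start = dt.date(2023, 1, 2)  # a Monday
    rows = []
    for geo in geos:
        base = rng.uniform(500.0, 1500.0)  # geo-level baseline
        for w in range(n_weeks):
            rows.append({
                "geo": geo,
                "week": start + dt.timedelta(weeks=w),
                "sales": round(base * rng.uniform(0.9, 1.1), 2),
            })
    return rows


rows = make_weekly_history([f"geo_{i:02d}" for i in range(20)], n_weeks=52)

# Same values as the documented defaults, passed explicitly for clarity.
effect_sizes = [0.01, 0.02, 0.03, 0.05]
test_weeks_list = [4, 6, 8, 10, 12]

# Every (effect size, test duration) pair; n_sims applies to each scenario.
scenarios = list(itertools.product(effect_sizes, test_weeks_list))
print(len(scenarios), "scenarios")

try:
    import pandas as pd
    from geostep.power import run_power_analysis
except ImportError:
    power_df = None  # pandas or geostep not available; skip the call
else:
    history = pd.DataFrame(rows)
    history["week"] = pd.to_datetime(history["week"])
    power_df = run_power_analysis(
        historical_data=history,
        geo_col="geo",
        date_col="week",
        kpi_col="sales",
        effect_sizes=effect_sizes,
        test_weeks_list=test_weeks_list,
        pre_period_weeks=8,
        n_sims=500,
        alpha=0.05,
    )
    print(power_df)
```

With the default lists there are 4 × 5 = 20 combinations of effect size and test duration, so at 500 simulations each the run covers 10,000 simulated tests. Reducing `n_sims` or shortening either list is the simplest way to cut runtime during exploration.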