geostep.power.run_power_analysis
- geostep.power.run_power_analysis(historical_data: DataFrame, geo_col: str, date_col: str, kpi_col: str, effect_sizes: List[float] = [0.01, 0.02, 0.03, 0.05], test_weeks_list: List[int] = [4, 6, 8, 10, 12], pre_period_weeks: int = 8, n_sims: int = 500, alpha: float = 0.05) → DataFrame
Runs a simulation-based power analysis on historical data to estimate the probability of detecting each effect size at each test duration.
- Parameters:
historical_data (pd.DataFrame) – DataFrame with historical data. Must contain the columns named by geo_col, date_col, and kpi_col.
geo_col (str) – Name of the geographic identifier column.
date_col (str) – Name of the date column. The data should be at weekly or daily granularity.
kpi_col (str) – Name of the Key Performance Indicator column to be measured.
effect_sizes (list of float, optional) – Effect sizes to simulate, expressed as fractions (e.g., 0.03 for a 3% lift). Defaults to [0.01, 0.02, 0.03, 0.05].
test_weeks_list (list of int, optional) – Test durations, in weeks, to simulate. Defaults to [4, 6, 8, 10, 12].
pre_period_weeks (int, optional) – The number of weeks to use for the pre-period baseline. Defaults to 8.
n_sims (int, optional) – The number of simulations to run for each scenario. Higher values give more precise power estimates but take longer to run. Defaults to 500.
alpha (float, optional) – The significance level for the test (Type I error rate). Defaults to 0.05.
- Returns:
A DataFrame summarizing the statistical power for each combination of effect size and test duration.
- Return type:
pd.DataFrame
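
The idea behind a simulation-based power estimate can be sketched with the standard library alone. The block below is an illustrative approximation, not geostep's implementation: the helper names estimate_power and power_grid, the single-geo baseline list, the z-test with the baseline standard deviation treated as known, and the output keys are all assumptions made for the example. It injects a relative lift into simulated test-period weeks, counts how often a two-sided test at level alpha rejects, and repeats this for every combination of effect size and test duration.

```python
import random
from statistics import NormalDist, mean, stdev


def estimate_power(baseline, effect_size, test_weeks, n_sims=500, alpha=0.05, seed=0):
    """Estimate power by Monte Carlo simulation (hypothetical helper).

    baseline: historical weekly KPI values for a single geo (assumed input shape).
    """
    rng = random.Random(seed)
    mu, sigma = mean(baseline), stdev(baseline)
    # Two-sided critical value for a z-test at significance level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the test-window mean, treating sigma as known.
    se = sigma / test_weeks ** 0.5
    detections = 0
    for _ in range(n_sims):
        # Simulate the test period with the relative lift applied to the mean.
        lifted_mean = mu * (1 + effect_size)
        sample = [rng.gauss(lifted_mean, sigma) for _ in range(test_weeks)]
        z = (mean(sample) - mu) / se
        if abs(z) > z_crit:
            detections += 1
    # Power is the fraction of simulations in which the lift was detected.
    return detections / n_sims


def power_grid(baseline, effect_sizes, test_weeks_list, n_sims=500, alpha=0.05):
    """Return one row per (effect size, test duration) pair (hypothetical keys)."""
    rows = []
    for effect in effect_sizes:
        for weeks in test_weeks_list:
            power = estimate_power(baseline, effect, weeks, n_sims, alpha)
            rows.append({"effect_size": effect, "test_weeks": weeks, "power": power})
    return rows


if __name__ == "__main__":
    # Made-up weekly KPI values, used only to exercise the sketch.
    history = [100, 102, 98, 101, 99, 103, 97, 100]
    for row in power_grid(history, [0.01, 0.03, 0.05], [4, 8, 12]):
        print(row)
```

With an effect size of 0, the estimated power should sit near alpha, which is a quick sanity check on any simulation of this kind. Raising n_sims narrows the sampling error of each estimate at the cost of runtime.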