geostep.designer.StratifiedRandomizationDesigner
- class geostep.designer.StratifiedRandomizationDesigner(num_groups: int = 2, n_strata: int = 4, seed: int = 42)
Methods
__init__([num_groups, n_strata, seed]) – Initialize designer with configuration.
design(df, geo_col, strat_vars) – Assigns geographic units to groups using stratified randomization.
enable_monitoring([enabled]) – Enable or disable performance monitoring.
get_metrics() – Get performance and execution metrics.
monitor_operation(operation_name[, ...]) – Context manager for monitoring operations.
post_process_design(design_df) – Post-process design results.
prepare_data([df, geo_col, strat_vars]) – Prepare data for stratified randomization.
set_metrics_collector(collector) – Set the metrics collector for this instance.
validate_inputs([df, geo_col, strat_vars]) – Validate inputs for stratified randomization design.
- __init__(num_groups: int = 2, n_strata: int = 4, seed: int = 42)
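Initialize the designer with its configuration: the number of groups (num_groups), the number of strata (n_strata), and the random seed (seed).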
- validate_inputs(df: DataFrame = None, geo_col: str = None, strat_vars: List[str] = None, **kwargs) -> None
Validate inputs for stratified randomization design.
- Parameters:
df (pd.DataFrame) – DataFrame containing geographic units and stratification variables.
geo_col (str) – The name of the column with unique geographic identifiers.
strat_vars (list of str) – A list of column names to use for stratification.
- Raises:
ValidationError – If input validation fails.
DesignError – If num_groups is not 2 or n_strata is invalid.
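The checks documented above can be sketched in plain Python. This is only an illustration under stated assumptions, not geostep's code: check_inputs is a hypothetical helper, the two exception classes are local stand-ins for the library's own, rows are plain dicts standing in for DataFrame rows, and the rule used for an "invalid" n_strata (fewer than 1, or more strata than units) is a guess.

```python
class ValidationError(ValueError):
    """Local stand-in for the library's ValidationError (assumption)."""


class DesignError(ValueError):
    """Local stand-in for the library's DesignError (assumption)."""


def check_inputs(rows, geo_col, strat_vars, num_groups=2, n_strata=4):
    """Hypothetical helper mirroring the documented Raises conditions."""
    # Documented: DesignError if num_groups is not 2.
    if num_groups != 2:
        raise DesignError(f"num_groups must be 2, got {num_groups}")
    # Assumption: n_strata is "invalid" when below 1 or above the unit count.
    if n_strata < 1 or n_strata > len(rows):
        raise DesignError(f"n_strata={n_strata} is invalid for {len(rows)} units")

    # Every row must carry the identifier and all stratification variables.
    required = [geo_col, *strat_vars]
    for row in rows:
        missing = [col for col in required if col not in row]
        if missing:
            raise ValidationError(f"missing columns: {missing}")

    # Documented: geo_col holds unique geographic identifiers.
    ids = [row[geo_col] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValidationError(f"duplicate identifiers in {geo_col!r}")


rows = [{"geo": f"g{i}", "population": 100 * i} for i in range(8)]
check_inputs(rows, "geo", ["population"])  # passes silently, returns None
```

Checking the configuration before the data means a misconfigured designer fails the same way regardless of what data it is given.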
- prepare_data(df: DataFrame = None, geo_col: str = None, strat_vars: List[str] = None, **kwargs) -> None
Prepare data for stratified randomization.
- Parameters:
df (pd.DataFrame) – DataFrame containing geographic units and stratification variables.
geo_col (str) – The name of the column with unique geographic identifiers.
strat_vars (list of str) – A list of column names to use for stratification.
- design(df: DataFrame, geo_col: str, strat_vars: List[str]) -> DataFrame
Assigns geographic units to groups using stratified randomization.
- Parameters:
df (pd.DataFrame) – DataFrame containing geographic units and stratification variables.
geo_col (str) – The name of the column with unique geographic identifiers.
strat_vars (list of str) – A list of column names to use for stratification.
- Returns:
The input DataFrame with ‘stratum’ and ‘assignment’ columns added.
- Return type:
pd.DataFrame
- Raises:
ValidationError – If input validation fails.
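The general technique that design describes, stratified randomization, can be illustrated with a minimal standard-library sketch. It is not geostep's implementation. The function name stratified_assign, the plain-dict rows standing in for DataFrame rows, the use of a single stratification variable, the rank-based binning into equal-sized strata, and the integer group labels 0 and 1 are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_assign(rows, geo_col, strat_var, n_strata=4, num_groups=2, seed=42):
    """Bin units into rank-based strata, then randomize group labels within each."""
    rng = random.Random(seed)  # seeded generator makes the design reproducible

    # Rank units by the stratification variable; break ties by identifier.
    ordered = sorted(rows, key=lambda r: (r[strat_var], r[geo_col]))
    n = len(ordered)

    # Map each rank to a stratum index in 0 .. n_strata - 1 (equal-sized bins).
    strata = defaultdict(list)
    for rank, row in enumerate(ordered):
        strata[rank * n_strata // n].append(row)

    result = []
    for stratum, members in sorted(strata.items()):
        # Shuffle within the stratum, then deal labels round-robin so group
        # sizes differ by at most one unit per stratum.
        rng.shuffle(members)
        for i, row in enumerate(members):
            result.append({**row, "stratum": stratum, "assignment": i % num_groups})
    return result


# Hypothetical data: 16 geographic units with one stratification variable.
geos = [{"geo": f"g{i:02d}", "population": 1000 * (i + 1)} for i in range(16)]
design = stratified_assign(geos, "geo", "population")
```

Shuffling within each stratum and then dealing labels in turn keeps the groups balanced inside every stratum, while the seeded generator decides which unit lands in which group.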