Chi-square test
chi_square_test(data, target_column, columns=None)
Perform a Chi-square test of independence between a target variable and one or more other variables.
Input data should be categorical. Continuous or otherwise non-categorical variables should be discretized or binned before using this function, because the Chi-square test is not directly applicable to continuous variables.
The test assumes that the observed frequencies in each category are independent.
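The description above can be illustrated with a minimal, standard-library-only sketch of what a Chi-square test of independence computes. This is not the eis_toolkit implementation: the helper names `bin_values` and `chi_square_statistic`, the sample data, and the 0.5 bin edge are all hypothetical. The sketch bins a continuous variable first, then computes the Pearson statistic and degrees of freedom from the resulting contingency table, without any continuity correction.

```python
from bisect import bisect_right
from collections import Counter


def bin_values(values, edges, labels):
    """Discretize continuous values (hypothetical helper).

    `edges` must be sorted and `labels` must have len(edges) + 1 entries;
    a value equal to an edge falls into the upper bin.
    """
    return [labels[bisect_right(edges, v)] for v in values]


def chi_square_statistic(x, y):
    """Return (statistic, degrees_of_freedom) for two categorical sequences.

    Hypothetical helper: plain Pearson statistic, no continuity correction.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    observed = Counter(zip(x, y))  # cell counts of the contingency table
    row_totals = Counter(x)
    col_totals = Counter(y)
    statistic = 0.0
    for r, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            # Count expected in this cell if the two variables were independent.
            expected = row_total * col_total / n
            statistic += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Made-up example data: a categorical target and a continuous variable.
target = ["ore", "ore", "ore", "ore", "barren", "barren", "barren", "barren"]
grade = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4]

# Bin the continuous variable before testing, as the description requires.
grade_class = bin_values(grade, edges=[0.5], labels=["low", "high"])

statistic, dof = chi_square_statistic(target, grade_class)
print(statistic, dof)  # 2.0 1
```

A p-value would then come from the upper tail of the Chi-square distribution with `dof` degrees of freedom, for example via `scipy.stats.chi2.sf(statistic, dof)`. Library routines may apply a continuity correction to 2x2 tables by default, so their statistic can differ from this uncorrected value.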
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | Dataframe containing the input data. | required |
target_column | str | Variable against which independence of other variables is tested. | required |
columns | Optional[Sequence[str]] | Variables that are tested against the variable in target_column. If None, every column is used. | None |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, float]] | Test statistics, p-value and degrees of freedom for each variable. |
Raises:
Type | Description |
---|---|
EmptyDataFrameException | Input Dataframe is empty. |
InvalidParameterValueException | Invalid column is input. |
Source code in eis_toolkit/exploratory_analyses/chi_square_test.py