# Statistics Resources

This guide contains all of the ASC's statistics resources. If you do not see a topic, suggest it through the suggestion box on the Statistics home page.

## Binomial Logistic Regression

A binomial logistic regression (or logistic regression for short) is used when the outcome variable being predicted is dichotomous (e.g., yes/no, pass/fail). This model can be used with any number of independent variables, which may be categorical or continuous.

### Assumptions

In addition to the two requirements mentioned above (a dichotomous outcome variable and categorical or continuous predictors), the model assumes:

1. Independence of observations
2. Categories of the outcome variable must be mutually exclusive and exhaustive
3. Linear relationship between continuous variables and the logit transformation of the outcome variable
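Assumption 3 refers to the logit (log-odds) transformation: the model is linear not in the probability itself but in the log of the odds. A minimal Python sketch of the transform (not part of the SPSS workflow; the function names here are our own):

```python
import math

def logit(p):
    """Log-odds (logit) of a probability p, where 0 < p < 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse logit: maps any real number back to a probability."""
    return 1 / (1 + math.exp(-x))

# The logit maps probabilities in (0, 1) onto the whole real line,
# which is the scale on which the model is assumed to be linear.
print(logit(0.5))                       # 0.0 (even odds)
print(round(inv_logit(logit(0.8)), 6))  # 0.8 (the transform is invertible)
```

Assumption 3 says this logit, not the raw probability, should be a linear function of each continuous predictor.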

### Running Logistic Regression in SPSS

1. Analyze > Regression > Binary Logistic...
2. Move the dichotomous outcome variable to the "Dependent" box.
3. Move all predictor variables into the "Covariates" box (ignoring the "Previous" and "Next" options).
• Click on the "Categorical" button to define the categories of any categorical predictor variables.
4. Click on the "Options" button to select additional statistics and plots you want included with your output.
5. Click "OK" to run the test.
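SPSS handles the estimation behind that dialog, but the underlying idea can be sketched outside SPSS. The toy example below fits a one-predictor logistic regression by simple gradient ascent on the log-likelihood (a sketch only: SPSS uses full maximum-likelihood estimation, and the data here are made up):

```python
import math

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit y ~ intercept + b1*x by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1 / (1 + math.exp(-(b0 + b1 * xi)))  # predicted probability
            g0 += (yi - p)        # gradient w.r.t. the intercept
            g1 += (yi - p) * xi   # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up data: the outcome becomes more likely as x increases.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(x, y)
print(b1 > 0)        # True: higher x means higher odds of y = 1
print(math.exp(b1))  # Exp(B), the odds ratio for a one-unit change in x
```

The fitted slope `b1` is what SPSS reports as B in the "Variables in the Equation" table, and `math.exp(b1)` is the Exp(B) column discussed below.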

### Interpreting Output

• Model Summary
  • Cox & Snell R Square - a pseudo-R-square measure of explained variation; roughly analogous to R-square in linear regression, but it cannot reach a maximum of 1
  • Nagelkerke R Square - an adjustment of Cox & Snell that can reach 1; generally the preferred measure of explained variation
• Omnibus Tests of Model Coefficients
  • Provides the results of the chi-square test used to assess the significance of the overall model
• Classification Table
  • Provides a measure of the accuracy of the model
    • percentage accuracy in classification (PAC) - the percentage of cases correctly classified with the predictor variables added
    • sensitivity - the percentage of cases that had the observed characteristic and were correctly predicted by the model
    • specificity - the percentage of cases that did not have the observed characteristic and were correctly predicted by the model
    • positive predictive value - the percentage of correctly predicted cases with the observed characteristic out of the total number of cases predicted as having the characteristic
    • negative predictive value - the percentage of correctly predicted cases without the observed characteristic out of the total number of cases predicted as not having the characteristic
• Variables in the Equation
  • Provides a measure of the contribution of each predictor variable to the model (like the "Coefficients" output for a linear regression)
  • Wald test - used to determine the significance (Sig.) of each predictor variable
  • Exp(B) - the odds ratio: the factor by which the odds of the event change for a one-unit increase in the predictor, holding all other predictors constant
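The classification-table measures above are all simple ratios of the four cells of a 2×2 table of predicted versus observed outcomes. A small sketch, with cell counts made up for illustration:

```python
def classification_measures(tp, fn, tn, fp):
    """Derive the classification-table measures from the four cell counts:
    tp/fn = cases with the characteristic predicted correctly/incorrectly,
    tn/fp = cases without it predicted correctly/incorrectly."""
    total = tp + fn + tn + fp
    return {
        "PAC":         (tp + tn) / total,  # overall percentage accuracy
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV":         tp / (tp + fp),     # positive predictive value
        "NPV":         tn / (tn + fn),     # negative predictive value
    }

# Hypothetical table: 40 true positives, 10 false negatives,
# 35 true negatives, 15 false positives.
m = classification_measures(tp=40, fn=10, tn=35, fp=15)
print(m["PAC"])          # 0.75
print(m["sensitivity"])  # 0.8
print(m["specificity"])  # 0.7
```

Multiply by 100 to match the percentages SPSS prints in its Classification Table.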

### Reporting Results in APA Style

A logistic regression was performed to assess the effects of age and gender on the likelihood of having cancer. The logistic regression model was statistically significant, χ²(4) = 17.313, p < .01. The model explained 42% (Nagelkerke R²) of the variance in cancer presence and correctly classified 73% of cases. Males were 7.02 times more likely to have cancer than females. Additionally, increasing age was associated with an increased likelihood of developing cancer.
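Note that the 7.02 in the example above is an Exp(B) value, so it multiplies the odds of the event, not the probability. A short sketch of how a coefficient B converts to an odds ratio and how that ratio acts on a baseline group (the baseline probability of 0.2 here is illustrative, not from the example):

```python
import math

B = math.log(7.02)           # the raw coefficient that corresponds to Exp(B) = 7.02
odds_ratio = math.exp(B)
print(round(odds_ratio, 2))  # 7.02

# If females had, say, a 0.2 probability of cancer (odds of 0.25),
# the model would put male odds at 0.25 * 7.02 and convert back to
# a probability of roughly 0.64 -- not 7.02 times the probability.
female_odds = 0.2 / (1 - 0.2)
male_odds = female_odds * odds_ratio
male_p = male_odds / (1 + male_odds)
print(round(male_p, 2))      # 0.64
```

This is why "7.02 times more likely" in an APA write-up should be read as a statement about odds.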