I am puzzling over an issue with glm where interaction terms don't show up in the estimation results unless they are entered as separate variables. I was not able to reproduce the problem using the mtcars data set, but there is nothing unusual about my data.
The first three models below give the same results, while the fourth gives the expected result: Code:
probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)
probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original*pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)
probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original**pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)
df <- df %>% mutate(squared = pop1_3km_original*pop1_3km_original, .after=pop1_3km_original)
probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original+ squared, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)
Output:
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)
Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), 
    data = df, weights = indiv_weight)
Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  
Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642
Number of Fisher Scoring iterations: 5
> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original*pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)
Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original * pop1_3km_original, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)
Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  
Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642
Number of Fisher Scoring iterations: 5
> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original**pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Error in terms.formula(formula, data = data) : invalid power in formula
> summary(probit_model2)
Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original * pop1_3km_original, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)
Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  
Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642
Number of Fisher Scoring iterations: 5
> 
> df <- df %>% mutate(squared = pop1_3km_original*pop1_3km_original, .after=pop1_3km_original)
> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original+ squared, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)
Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original + squared, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)
Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.839  -3.056  -2.288   2.755   8.401  
Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.31527    0.01689  18.669  < 2e-16 ***
wgrp1             -0.76865    0.01720 -44.702  < 2e-16 ***
pop1_3km_original -0.35047    0.05525  -6.344 2.24e-10 ***
squared           -0.07031    0.03878  -1.813   0.0698 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29471  on 2324  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29641
Number of Fisher Scoring iterations: 5
 
    