I am fitting a logistic regression model with generalized estimating equations (GEEs) and keep hitting the error below, despite trying multiple solutions posted here on SO and elsewhere. I cannot tell where the error comes from. I am using the gee package, but a similar error also occurs in geepack.
Does anyone know why this error occurs even though the dataset contains no NA, Inf, or character variables? My suspicion is that I am missing something very simple, but after two days I have to hand it to better coders than me.
Minimal data and code to reproduce the error, attempted solutions, and related SO questions are below.
Data
df <- structure(list(
  id = structure(c(7L, 1L, 20L, 15L, 14L, 6L, 8L, 24L, 21L, 19L, 5L, 4L, 18L,
                   13L, 23L, 16L, 25L, 12L, 10L, 9L, 22L, 17L, 11L, 3L, 2L, 2L),
                 levels = c("ALWA28M", "BOMA13M", "BOMA41M", "DAYA35M", "DEMB72M", "EDAB3WM", "EFCH52M",
                            "FASI6M", "FRRO35M", "GRAS35F", "GRKA48M", "JARA35M", "KABA27M", "KECH4WM",
                            "MAAD60M", "MACH33M", "MEBA29F", "MIGU42M", "MTSA10M", "NTMA22F", "RACA2M",
                            "STMA35M", "TOKE39M", "TRMA12M", "YOLU29M"),
                 class = "factor"),
  testres = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
                        1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L),
                      levels = c("POS", "NEG"),
                      class = "factor"),
  agegrp = structure(c(5L, 3L, 3L, 5L, 1L, 1L, 2L, 2L, 1L, 2L, 6L, 4L, 4L,
                       3L, 4L, 4L, 3L, 4L, 4L, 4L, 4L, 3L, 5L, 4L, 2L, 2L),
                     levels = c("0", "1", "2", "3", "4", "5"),
                     class = "factor")),
  row.names = c(NA, 26L),
  class = "data.frame")
Model
gee::gee(testres ~ agegrp,
         data = df,
         id = id,
         family = binomial,
         corstr = "exchangeable")
Error
Error in gee::gee(testres ~ agegrp, data = df, id = id, family = binomial, :
  NA/NaN/Inf in foreign function call (arg 2)
In addition: Warning message:
In gee::gee(testres ~ agegrp, data = df, id = id, family = binomial, :
  NAs introduced by coercion
Checking the data for NA, Inf, or character variables: all columns are factors with no missing data
# All factors
str(df)
# 'data.frame': 26 obs. of 3 variables:
# $ id : Factor w/ 25 levels "ALWA28M","BOMA13M",..: 7 1 20 15 14 6 8 24 21 19 ...
# $ testres: Factor w/ 2 levels "POS","NEG": 1 1 1 2 1 1 1 1 1 1 ...
# $ agegrp : Factor w/ 6 levels "0","1","2","3",..: 5 3 3 5 1 1 2 2 1 2 ...
# No NAs or Infinites
lapply(df, table, useNA = "always")
# 0 NAs
lapply(df, \(x) table(is.infinite(x)))
# All FALSE
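The checks above inspect the factor columns themselves. Assuming the "NAs introduced by coercion" warning refers to the response being converted to numeric inside the fitting routine (an assumption, not something verified against the gee source), a check along these lines shows what such a conversion does to a factor. The toy vector `testres` and the helper names `y_mat`, `y_dbl`, and `n_na` are illustrative only.

```r
# Toy response with the same levels as df$testres (values are illustrative).
testres <- factor(c("POS", "NEG", "POS"), levels = c("POS", "NEG"))

# A factor stores integer codes, but as.matrix() returns its labels as character.
y_mat <- as.matrix(testres)
storage_mode <- typeof(y_mat)

# Coercing the labels "POS"/"NEG" to double gives NA for every element,
# together with the warning "NAs introduced by coercion" (suppressed here).
y_dbl <- suppressWarnings(as.double(y_mat))
n_na <- sum(is.na(y_dbl))

print(storage_mode)
print(n_na)
```

If this is what happens to the response, table(..., useNA = "always") and is.infinite() on the original columns would not reveal it, because the NAs only appear after the coercion.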
Alternative approach using geepack
geepack::geeglm(testres ~ agegrp,
                data = df,
                id = id,
                corstr = "exchangeable",
                family = "binomial")
geepack error:
Error in lm.fit(zsca, qlf(pr2), offset = soffset) :
  NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
Changing the correlation structure yields the same error. A standard logistic regression converges:
summary(glm(testres ~ agegrp, data = df, family = binomial(link = "logit")))
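One difference that may be relevant: for a binomial family, glm() accepts a factor response and treats the first level as failure and all other levels as success, whereas the geepack warnings above suggest that a factor response is not handled there. Below is a minimal sketch, on illustrative toy data with a hypothetical column `testres_num`, of a 0/1 recoding that reproduces the coding glm() uses. Whether passing such a numeric response to the GEE functions resolves the error on the real data is not confirmed here.

```r
# Toy data (illustrative only, not the real dataset).
toy <- data.frame(
  testres = factor(c("POS", "NEG", "POS", "NEG", "POS", "POS", "NEG", "POS"),
                   levels = c("POS", "NEG")),
  x = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# glm() treats the first factor level ("POS") as failure and the rest as success,
# so this 0/1 recoding models the same event: P(testres == "NEG").
# `testres_num` is a hypothetical column name introduced for this sketch.
toy$testres_num <- as.integer(toy$testres == "NEG")

fit_factor  <- glm(testres ~ x, data = toy, family = binomial)
fit_numeric <- glm(testres_num ~ x, data = toy, family = binomial)

# The two fits should agree, since both describe the same 0/1 outcome.
same_coef <- isTRUE(all.equal(coef(fit_factor), coef(fit_numeric)))
print(same_coef)
```

If the event of interest is "POS" rather than "NEG", the comparison in the recoding step would need to be flipped, which also flips the sign of the coefficients relative to the glm() fit on the factor.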
SO questions that did not resolve the issue are listed below. Although this error is common on the site, in my view none of the existing answers covers this case, hence the decision to post.
- How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" running predict with randomForest
- R: NA/NaN/Inf in foreign function call (arg 1)
- Error in fitting a model with gee(): NA/NaN/Inf in foreign function call (arg 3)
- NA/NaN/Inf in foreign function call (arg 2)
- NA/NaN/Inf in foreign function call (arg 5)
- lme: NA/NaN/Inf in foreign function call (arg 3)
- NA/NaN/Inf in foreign function call (arg 1) when trying to run a PGLS (Pagel's lambda)
- How to eliminate “NA/NaN/Inf in foreign function call (arg 3)” in bigglm
- R error in glmnet: NA/NaN/Inf in foreign function call