Sequential ordered logits

Digging deeper in vglm()

In a previous tutorial I went over two packages for ordered logits – polr() and vglm() – and walked through some of the basics for running models in both. In those models I focused on more classical cumulative ratio models, but now I am going to introduce continuation ratio models, also called a sequential model. A good review can be found in Chapters 4 and 5 of O’Connell’s Logistic Regression Models for Ordinal Response Variables. In short though, the added value is this: Unlike the more common cumulative ordered logistic regression (i.e. ordered logit), where the model is estimating the likelihood of being at or above a certain category (e.g. P(Y>=2)), a sequential logit evaluates the likelihood of being in a particular level against the likelihood of being in the level just below it (e.g. P[Y>2|Y>=2]).

Let’s see how we can run this models using the VGAM package.

library(VGAM) 
Loading required package: stats4
Loading required package: splines
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

I’ll be using the same data set described in the previous tutorial, farmer interest in adopting a new technology.

head(df)
  interest acres income education econ.value envt.value info.count
1      Not  4.84     NA        NA       4.33        3.6          1
2 Somewhat  3.01      1         5       4.00        3.0          1
3 Somewhat  4.17      3         2         NA         NA          1
4     <NA>  4.40      3         3       5.00         NA          3
5      Not  2.44      4         7       4.67        3.6          9
6     Very  6.64      2         7       2.00        4.2          5

If we wanted to run a more classic, cumulative model, the model specifications would be as follows, namely specifying our family as “cumulative”. Note we are also specifying that parallel = T for now, indicating that we assume that the slopes between stages are parallel, and that we reverse the order of our stages because vglm() oddly defaults to descending order.

cumltv.model <- vglm(ordered(interest) ~ acres + income + education + 
                         econ.value + envt.value + info.count, data = df, 
                       family=cumulative(parallel = T, reverse = T))
summary(cumltv.model)

Call:
vglm(formula = ordered(interest) ~ acres + income + education + 
    econ.value + envt.value + info.count, family = cumulative(parallel = T, 
    reverse = T), data = df)

Coefficients: 
              Estimate Std. Error z value Pr(>|z|)    
(Intercept):1  0.88002    0.54529   1.614  0.10656    
(Intercept):2 -1.01208    0.54711  -1.850  0.06433 .  
acres          0.06972    0.06660   1.047  0.29517    
income        -0.18702    0.07119  -2.627  0.00861 ** 
education     -0.02324    0.06035  -0.385  0.70024    
econ.value    -0.50212    0.10343  -4.855 1.20e-06 ***
envt.value     0.57938    0.09616   6.025 1.69e-09 ***
info.count     0.14728    0.03288   4.479 7.51e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: logitlink(P[Y>=2]), logitlink(P[Y>=3])

Residual deviance: 1002.928 on 1010 degrees of freedom

Log-likelihood: -501.4641 on 1010 degrees of freedom

Number of Fisher scoring iterations: 5 

No Hauck-Donner effect found in any of the estimates


Exponentiated coefficients:
     acres     income  education econ.value envt.value info.count 
 1.0722115  0.8294304  0.9770324  0.6052435  1.7849227  1.1586799 

The switch to the sequential model takes just a change in the family, now specifying it as “cratio”, and no longer reversing the order.

seqntl.model <- vglm(ordered(interest) ~ acres + income + education + 
                         econ.value + envt.value + info.count, data = df, 
                       family=cratio(parallel = T))
summary(seqntl.model)

Call:
vglm(formula = ordered(interest) ~ acres + income + education + 
    econ.value + envt.value + info.count, family = cratio(parallel = T), 
    data = df)

Coefficients: 
              Estimate Std. Error z value Pr(>|z|)    
(Intercept):1  1.11877    0.49400   2.265   0.0235 *  
(Intercept):2 -0.28988    0.49660  -0.584   0.5594    
acres          0.05510    0.05989   0.920   0.3576    
income        -0.16453    0.06407  -2.568   0.0102 *  
education     -0.03270    0.05432  -0.602   0.5472    
econ.value    -0.46247    0.09476  -4.881 1.06e-06 ***
envt.value     0.49286    0.08635   5.708 1.14e-08 ***
info.count     0.13283    0.03000   4.427 9.54e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: logitlink(P[Y>1|Y>=1]), logitlink(P[Y>2|Y>=2])

Residual deviance: 1004.007 on 1010 degrees of freedom

Log-likelihood: -502.0035 on 1010 degrees of freedom

Number of Fisher scoring iterations: 5 

No Hauck-Donner effect found in any of the estimates


Exponentiated coefficients:
     acres     income  education econ.value envt.value info.count 
 1.0566490  0.8482910  0.9678279  0.6297238  1.6369976  1.1420599 

Here we see that the results between the two models are quite similar, though the effect sizes are slightly smaller in the sequential model with less significance. The main difference is in the conditions of the linear predictors (as noted above). Deciding which family of ordered logits is right for your model may depend on model fit, or perhaps theoretically with the nature of the variable. If you believe that the stages you are evaluating are predicated on going through the previous stage, then a sequential model might be for you.

Also important to note in vglm() is that we cannot rely on the brant() function as we can with polr models. So, testing our proportional odds assumption (that is, the assumption that parallel = T), takes a bit of manual work. Let’s first specify the model without making the proportional odds assumption.

seqntl.model.nonprop = vglm(ordered(interest) ~ acres + income + education + 
                         econ.value + envt.value + info.count, data = df, 
                          family=cratio(parallel = F))
summary(seqntl.model.nonprop)

Call:
vglm(formula = ordered(interest) ~ acres + income + education + 
    econ.value + envt.value + info.count, family = cratio(parallel = F), 
    data = df)

Coefficients: 
               Estimate Std. Error z value Pr(>|z|)    
(Intercept):1 -0.883092   0.662736  -1.332 0.182698    
(Intercept):2  2.230679   0.792002   2.817 0.004855 ** 
acres:1        0.093624   0.085864   1.090 0.275547    
acres:2        0.006717   0.086027   0.078 0.937768    
income:1      -0.223318   0.093201  -2.396 0.016572 *  
income:2      -0.109974   0.091128  -1.207 0.227502    
education:1    0.029837   0.076562   0.390 0.696746    
education:2   -0.092454   0.079330  -1.165 0.243841    
econ.value:1  -0.154035   0.129175  -1.192 0.233081    
econ.value:2  -0.767367   0.144987  -5.293 1.21e-07 ***
envt.value:1   0.633108   0.114018   5.553 2.81e-08 ***
envt.value:2   0.248064   0.138626   1.789 0.073543 .  
info.count:1   0.110310   0.045887   2.404 0.016220 *  
info.count:2   0.153613   0.040627   3.781 0.000156 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: logitlink(P[Y>1|Y>=1]), logitlink(P[Y>2|Y>=2])

Residual deviance: 980.3759 on 1004 degrees of freedom

Log-likelihood: -490.188 on 1004 degrees of freedom

Number of Fisher scoring iterations: 5 

No Hauck-Donner effect found in any of the estimates


Exponentiated coefficients:
     acres:1      acres:2     income:1     income:2  education:1  education:2 
   1.0981470    1.0067392    0.7998604    0.8958571    1.0302870    0.9116913 
econ.value:1 econ.value:2 envt.value:1 envt.value:2 info.count:1 info.count:2 
   0.8572417    0.4642337    1.8834549    1.2815419    1.1166244    1.1660390 

A first good sign is that we did not encounter any errors. In some cases, vglm() cannot fit the non-proportional model if it really isn’t suited. If the model runs, however, we still want to check the assumption. Below is a method for testing whether or not the model overall meets the assumption (though it does not provide p-values for specific variables). We calculate the p-value against the hypothesis that the proportion odds assumption is correct.

dev <- deviance(seqntl.model) - deviance(seqntl.model.nonprop)
degf <- seqntl.model@df.residual - seqntl.model.nonprop@df.residual
p.value <- 1 - pchisq(q = dev , df = degf)
p.value
[1] 0.0006104421