
Tests whether a higher-order antedependence (AD) model fits categorical longitudinal data significantly better than a lower-order AD model.

Usage

test_order_cat(
  y = NULL,
  order_null = 0,
  order_alt = 1,
  blocks = NULL,
  homogeneous = TRUE,
  n_categories = NULL,
  fit_null = NULL,
  fit_alt = NULL,
  test = c("lrt", "score", "mlrt", "wald")
)

Arguments

y

Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c, where c is the number of categories. Can be NULL if both fit_null and fit_alt are provided.

order_null

Order under the null hypothesis (default 0).

order_alt

Order under the alternative hypothesis (default 1). Must be greater than order_null.

blocks

Optional integer vector of length n_subjects specifying group membership.

homogeneous

Logical. If TRUE (default), parameters are shared across all groups.

n_categories

Number of categories. If NULL, inferred from data.

fit_null

Optional pre-fitted model under null hypothesis (class "cat_fit"). If provided, y is not required for fitting under H0.

fit_alt

Optional pre-fitted model under alternative hypothesis. If provided, y is not required for fitting under H1.

test

Type of test statistic. One of "lrt" (default), "score", "mlrt", or "wald".

Value

A list of class "cat_lrt" containing:

method

Inference method used: one of "lrt", "score", "mlrt", or "wald".

lrt_stat

Likelihood ratio test statistic

df

Degrees of freedom

p_value

P-value from chi-square distribution

fit_null

Fitted model under H0

fit_alt

Fitted model under H1

order_null

Order under null

order_alt

Order under alternative

table

Summary data frame

Details

The likelihood ratio test statistic is: $$\lambda = -2[\ell_0 - \ell_1]$$ where \(\ell_0\) and \(\ell_1\) are the maximized log-likelihoods under the null and alternative hypotheses, respectively. Because the models are nested, \(\ell_1 \ge \ell_0\) and so \(\lambda \ge 0\).

Under H0, \(\lambda\) asymptotically follows a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters between the two models.
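
As a rough illustration of how the statistic and its p-value relate, the sketch below uses base R only. The log-likelihood values and the degrees of freedom are made-up numbers chosen for illustration, not output from this package, and the variable names are hypothetical.

```r
# Hypothetical maximized log-likelihoods (illustrative values, not real output)
loglik_null <- -812.4  # ell_0, log-likelihood under H0
loglik_alt  <- -790.1  # ell_1, log-likelihood under H1
df          <- 5       # assumed difference in the number of free parameters

# Likelihood ratio statistic: lambda = -2 * (ell_0 - ell_1)
lambda <- -2 * (loglik_null - loglik_alt)

# Upper-tail probability of the chi-square reference distribution
p_value <- pchisq(lambda, df = df, lower.tail = FALSE)

lambda   # 44.6
p_value
```

A small p-value here would suggest that the higher-order model fits meaningfully better than the lower-order one.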

For testing AD(p) vs AD(p+1), the degrees of freedom are: $$df = (c-1)^2 \times c^p \times (n - p - 1)$$ where c is the number of categories and n is the number of time points.
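
The degrees-of-freedom formula can be checked by hand with a small helper. This is a minimal sketch: `ad_order_df` is a hypothetical function written for this illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package):
# degrees of freedom for testing AD(p) against AD(p + 1)
ad_order_df <- function(n_categories, n_time, p) {
  stopifnot(n_categories >= 2, p >= 0, n_time > p + 1)
  (n_categories - 1)^2 * n_categories^p * (n_time - p - 1)
}

# Binary data, 6 time points
ad_order_df(n_categories = 2, n_time = 6, p = 0)  # 1 * 1 * 5 = 5
ad_order_df(n_categories = 2, n_time = 6, p = 1)  # 1 * 2 * 4 = 8

# Three categories, 6 time points
ad_order_df(n_categories = 3, n_time = 6, p = 1)  # 4 * 3 * 4 = 48
```

The count grows quickly with the number of categories and the order, so higher-order comparisons may need larger samples for the chi-square approximation to be reasonable.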

If y contains missing values and models are fit internally, this function defaults to na_action = "marginalize" for fitting. Score- and Wald-based variants currently require complete data.

References

Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.

Examples

if (FALSE) { # \dontrun{
# Simulate AD(1) data
set.seed(123)
y <- simulate_cat(200, 6, order = 1, n_categories = 2)

# Test AD(0) vs AD(1)
test_01 <- test_order_cat(y, order_null = 0, order_alt = 1)
print(test_01$table)

# Test AD(1) vs AD(2)
test_12 <- test_order_cat(y, order_null = 1, order_alt = 2)
print(test_12$table)

# Using pre-fitted models
fit0 <- fit_cat(y, order = 0)
fit1 <- fit_cat(y, order = 1)
test_prefitted <- test_order_cat(fit_null = fit0, fit_alt = fit1)
print(test_prefitted$table)
} # }