R2MLwiN - Known Bugs
Current known bugs
Interaction terms are not recognised when choosing informative priors
Informative priors for interactions terms with the form variable1:variable2 are not correctly recognised, instead giving an "Expected '='" error message. To work around this these parameter names need to be enclosed in quotes.
The Intercept is ignored when specifying priors for the fixed part
If you attempt to set a prior for the Intercept as follows:
(mymodel <- runMLwiN(normexam ~ 1 + standlrt + (1 | school) + (1 | student), estoptions = list(EstM = 1, mcmcMeth = list(priorParam = list(fixe = list(Intercept = c(mean=2,sd=10))))), data = tutorial))
the prior is not applied. The workaround is to instead specify it as follows:
(mymodel <- runMLwiN(normexam ~ 1 + standlrt + (1 | school) + (1 | student), estoptions = list(EstM = 1, mcmcMeth = list(priorParam = list(fixe = list("1" = c(mean=2,sd=10))))), data = tutorial))
Incorrect documentation for random part informative priors
The documentation states the the random part parameters required for priorParam are:
- estimate – an estimate for the true value of the inverse of the covariance matrix
- size – the number of rows in the covariance matrix
This should be:
- estimate - a prior guess for the true value of the covariance matrix
- size - sample size for guess
"." character is not preserved in variables names
When fitting a model R2MLwiN will convert any "." characters to "_" in the parameter names, e.g.
> library(R2MLwiN) > data(tutorial) > tutorial$stand.lrt <- tutorial$standlrt > runMLwiN(normexam ~ 1 + stand.lrt + (1 | student), data=tutorial)
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 2.36) multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.13s Number of obs: 4059 (from total 4059) The model converged after 3 iterations. Log likelihood: -4880.3 Deviance statistic: 9760.5 --------------------------------------------------------------------------------------------------- The model formula: normexam ~ 1 + stand.lrt + (1 | student) Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept -0.00119 0.01264 -0.09 0.9249 -0.02596 0.02358 stand_lrt 0.59506 0.01273 46.76 0 *** 0.57011 0.62000 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. var_Intercept 0.64842 0.01439 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
This causes an error when matching the returned chains with their corresponding parameters after running MCMC, e.g.
> runMLwiN(normexam ~ 1 + stand.lrt + (1 | student), estoptions=list(EstM=1), data=tutorial) Error in combchains[, FP.names, drop = FALSE] : subscript out of bounds
Inadvertent removal of intercept when updating formula via update
R2MLwiN has a non-standard (for R) requirement that the intercept is explicitly specified in its formulae (in keeping with the manner in which users typically specify models in MLwiN). However, this does mean that the intercept is removed from the formula if the user employs the generic update function to change the formula's specification. E.g.:
># Intercept specified via inclusion of 1 in fixed effects in formula
> a <- formula(normexam ~ 1 + standlrt + (1|student))
> a
normexam ~ 1 + standlrt + (1 | student)
> a <- update(a, .~. + sex)
> # Intercept removed from fixed effects following update
> a
normexam ~ standlrt + (1 | student) + sex
Formula is incorrectly parsed if there is a "." character in the response
The following will work:
> library(R2MLwiN) > data(bang, package = "R2MLwiN") > (mymodel1 <- runMLwiN(logit(use) ~ 1 + lc, D = "Binomial", data = bang)) -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 3.02) multilevel model (Binomial) Estimation algorithm: IGLS MQL1 Elapsed time : 0.6s Number of obs: 2867 (from total 2867) The model converged after 4 iterations. Log likelihood: NA Deviance statistic: NA --------------------------------------------------------------------------------------------------- The model formula: logit(use) ~ 1 + lc Level 1: l1id --------------------------------------------------------------------------------------------------- Intercept -1.12288 0.08348 -13.45 3.05e-41 *** -1.28650 -0.95926 lcOne_child 0.93275 0.12156 7.67 1.676e-14 *** 0.69450 1.17100 lcTwo_children 1.09250 0.12509 8.73 2.466e-18 *** 0.84733 1.33768 lcThree_plus 0.87224 0.10302 8.47 2.523e-17 *** 0.67033 1.07416 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the l1id level: Coef. Std. Err. var_bcons_1 1.00000 0.00000 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
If there is a "." character in the response name an error will be produced:
> bang$use.test <- bang$use > (mymodel1 <- runMLwiN(logit(use.test) ~ 1 + lc, D = "Binomial", data = bang)) Invalid link function specified: NA Error in runMLwiN(logit(use.test) ~ 1 + lc, D = "Binomial", data = bang) :
If you encounter any other bugs please report them here.
Bugs fixed
Attempting to fit models with interaction terms can return error depending on order of terms in formulae (fixed in R2MLwiN-0.8-6)
The error:
Error in `[.data.frame`(indata, , outvars) : undefined columns selected
is returned when interactions are fitted in certain model formulae (see examples below). We are current working to resolves this, and will submit an updated version of the package to CRAN as soon as it is resolved.
library(R2MLwiN) data("tutorial") df <- tutorial # create dummy variables for schav (schav_low, schav_mid, schav_high): df <- cbind(df[1:10], sapply(levels(df$schav), function(x) as.integer(x == df$schav))) colnames(df)[11:13] <- c('schav_low', "schav_mid", "schav_high") ## following models are the same, yet error returned depending on order within formulae: # the following execute correctly: formula_1 <- normexam ~ 1 + standlrt + schav_low + schav_mid + schav_low:standlrt + standlrt:schav_mid + (1 | school) runMLwiN(formula_1, data = df) formula_2 <- normexam ~ 1 + standlrt + schav_low + schav_mid + standlrt:schav_low + standlrt:schav_mid + (1 | school) runMLwiN(formula_2, data = df) formula_3 <- normexam ~ 1 + standlrt + schav_mid + standlrt:schav_low + standlrt:schav_mid + schav_low + (1 | school) runMLwiN(formula_3, data = df) # the following return error: formula_4 <- normexam ~ 1 + schav_low + schav_mid + standlrt:schav_low + schav_mid:standlrt + standlrt + (1 | school) runMLwiN(formula_4, data = df) formula_5 <- normexam ~ 1 + schav_low + schav_mid + schav_low:standlrt + schav_mid:standlrt + standlrt + (1 | school) runMLwiN(formula_5, data = df) formula_6 <- normexam ~ 1 + schav_low + schav_mid + schav_low:standlrt + standlrt:schav_mid + standlrt + (1 | school) runMLwiN(formula_6, data = df) formula_7 <- normexam ~ 1 + schav_low + schav_mid + standlrt:schav_low + standlrt:schav_mid + standlrt + (1 | school) runMLwiN(formula_7, data = df)
Parameter can sometimes be incorrectly labelled in model output (fixed in R2MLwiN-0.8-5)
If the order of parameters differs between the fixed part and the random part of the model formula (and/or between different levels of the random part), or if some parameters are included in the random part but not in the fixed part, then the labels in the output table can be incorrect. This incorrect ordering will also be reflected in the names of any saved residuals. For example the output of the following model (in which the order of sex and standlrt differs between the random and fixed part):
library(R2MLwiN) data(tutorial, package = "R2MLwiN") (mymodel <- runMLwiN(normexam ~ 1 + sex + standlrt + (1 + standlrt + sex | student), data = tutorial))
will be (incorrect labelling):
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 3.02) multilevel model (Binomial) Estimation algorithm: IGLS MQL1 Elapsed time : 0.6s Number of obs: 2867 (from total 2867) The model converged after 4 iterations. Log likelihood: NA Deviance statistic: NA --------------------------------------------------------------------------------------------------- The model formula: normexam ~ 1 + sex + standlrt + (1 + standlrt + sex | student) Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept -0.10367 0.02019 -5.13 2.826e-07 *** -0.14324 -0.06410 sexgirl 0.16990 0.02577 6.59 4.279e-11 *** 0.11940 0.22040 standlrt 0.59123 0.01286 45.97 0 *** 0.56602 0.61643 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. var_Intercept 0.65016 0.02569 cov_Intercept_standlrt -0.01764 0.01469 var_standlrt 0.00000 0.00000 cov_Intercept_sexgirl -0.01987 0.01145 cov_standlrt_sexgirl 0.03822 0.01499 var_sexgirl 0.01083 0.01068 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
instead of (correct labelling):
--------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept -0.10367 0.02019 -5.13 2.826e-07 *** -0.14324 -0.06410 sexgirl 0.16990 0.02577 6.59 4.279e-11 *** 0.11940 0.22040 standlrt 0.59123 0.01286 45.97 0 *** 0.56602 0.61643 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. var_Intercept 0.65016 0.02569 cov_Intercept_sexgirl -0.01764 0.01469 var_sexgirl 0.00000 0.00000 cov_Intercept_standlrt -0.01987 0.01145 cov_sexgirl_standlrt 0.03822 0.01499 var_standlrt 0.01083 0.01068 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
likewise the results for the model (common intercept in random part, but not in fixed part):
library(R2MLwiN) data(alevchem, package = "R2MLwiN") alevchem$school <- as.numeric(factor(paste0(alevchem$lea, alevchem$estab))) alevchem$gcseav <- double2singlePrecision(alevchem$gcse_tot/alevchem$gcse_no - 6) (mymodel <- runMLwiN(logit(a_point, cons, 6) ~ 1 + gcseav[1:5] + I(gcseav^2)[1:5] + gender[1:5] + (1[1:5] + gcseav[1:5] | school), D = "Ordered Multinomial", estoptions = list(EstM = 1), data = alevchem))
would be displayed as (incorrect labelling):
--------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Cred. Interval] ESS Intercept_F -1.96766 0.10483 -18.77 1.349e-78 *** -2.17625 -1.75728 115 Intercept_E -0.79025 0.09584 -8.25 1.64e-16 *** -0.97425 -0.59735 83 Intercept_D 0.28683 0.09847 2.91 0.003582 ** 0.09748 0.48787 83 Intercept_C 1.55977 0.10496 14.86 5.968e-50 *** 1.35020 1.75521 82 Intercept_B 3.37472 0.12591 26.80 2.974e-158 *** 3.11744 3.62296 126 gcseav_12345 -2.30974 0.08094 -28.54 4.147e-179 *** -2.47066 -2.15060 237 I(gcseav^2)_12345 -0.44151 0.05000 -8.83 1.042e-18 *** -0.53661 -0.34317 438 genderfemale_12345 0.75232 0.10050 7.49 7.096e-14 *** 0.55900 0.94373 252 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Cred. Interval] ESS var_Intercept_12345 0.13480 0.06346 0.04160 0.27692 30 cov_Intercept_12345_gcseav_12345 -0.02778 0.06300 -0.14639 0.10106 96 var_gcseav_12345 0.61193 0.12805 0.40400 0.90175 223 ---------------------------------------------------------------------------------------------------
instead of (correct labelling):
--------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Cred. Interval] ESS Intercept_F -1.96766 0.10483 -18.77 1.349e-78 *** -2.17625 -1.75728 115 Intercept_E -0.79025 0.09584 -8.25 1.64e-16 *** -0.97425 -0.59735 83 Intercept_D 0.28683 0.09847 2.91 0.003582 ** 0.09748 0.48787 83 Intercept_C 1.55977 0.10496 14.86 5.968e-50 *** 1.35020 1.75521 82 Intercept_B 3.37472 0.12591 26.80 2.974e-158 *** 3.11744 3.62296 126 gcseav_12345 -2.30974 0.08094 -28.54 4.147e-179 *** -2.47066 -2.15060 237 I(gcseav^2)_12345 -0.44151 0.05000 -8.83 1.042e-18 *** -0.53661 -0.34317 438 genderfemale_12345 0.75232 0.10050 7.49 7.096e-14 *** 0.55900 0.94373 252 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the school level: Coef. Std. Err. [95% Cred. Interval] ESS var_gcseav_12345 0.13480 0.06346 0.04160 0.27692 30 cov_gcseav_12345_Intercept_12345 -0.02778 0.06300 -0.14639 0.10106 96 var_Intercept_12345 0.61193 0.12805 0.40400 0.90175 223 ---------------------------------------------------------------------------------------------------
We apologise for any inconvenience this may have caused; if you are using a version prior to 0.8-5 please update your version as soon as it is feasible to do so.
Residual point estimates are not correct for models fitted via MCMC (fixed in R2MLwiN-0.8-5)
Due to incorrect matching the MCMC residuals returned via the resi.store option will be double the correct value and requests for standardised residuals will be ignored. Note that if residual chains are requested then these will be correct (i.e. residuals requested via the resi.chains option).
Cannot fit multiple membership models with a multivariate response (fixed in R2MLwiN-0.8-5)
If multiple membership is specified for multivariate model the generated MLwiN code will not be valid, failing with an error similar to the following:
runMLwiN fails when used with "with" function (fixed in R2MLwiN-0.8-4)
If you try to pass the runMLwiN function to "with" an error occurs. This does not happen with other estimation functions e.g.
> with(tutorial, lm(normexam ~ 1)) Call: lm(formula = normexam ~ 1) Coefficients: (Intercept) -0.0001139 > with(tutorial, runMLwiN(normexam ~ 1 + (1 | student))) Error in eval(expr, envir, enclos) : object 'normexam' not found
Formula fails to be parsed if variables contain a "." character (fixed in R2MLwiN 0.8-3)
If the formula refers to a variable that contains a "." character, for example:
> runMLwiN(Median.yr ~ 1 + (1|ID), data = dataset)
The follow error will occur:
Error in eval(expr, envir, enclos) : object 'Median' not found
Factor variables appear at the end of the output regardless of their position in the formula (fixed in R2MLwiN 0.8-3)
For example in the following formula schgend appears before standlrt:
(mymodel <- runMLwiN(normexam ~ 1 + schgend + standlrt + (1|student), data=tutorial))
however in the results it appears at the end:
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 2.36) multilevel model (Normal) Estimation algorithm: IGLS Elapsed time : 0.24s Number of obs: 4059 (from total 4059) The model converged after 3 iterations. Log likelihood: -4843.6 Deviance statistic: 9687.1 --------------------------------------------------------------------------------------------------- The model formula: normexam ~ 1 + schgend + standlrt + (1 | student) Level 1: student --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept -0.09608 0.01713 -5.61 2.052e-08 *** -0.12967 -0.06250 standlrt 0.59433 0.01261 47.12 0 *** 0.56961 0.61905 schgendboysch 0.11777 0.03918 3.01 0.002647 ** 0.04098 0.19456 schgendgirlsch 0.23584 0.02750 8.58 9.732e-18 *** 0.18195 0.28974 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the student level: Coef. Std. Err. var_Intercept 0.63680 0.01414 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Fit fails when requesting residuals with MCMC and complex level-1 (fixed in R2MLwiN 0.8-2)
When requesting residuals for a model with complex level-1 variation estimated with MCMC MLwiN will stop with the error:
Wrong number of output columns
Incorrect output when data has unused factor levels (fixed in R2MLwiN 0.8-2)
If the data being used has unused factor levels (for example if it is a subset) then these may still be taken into account when displaying the number of units.
Model is incorrectly fit when there is an interaction involving a common coefficient (fixed in R2MLwiN 0.8-2)
If a multinomial model is set up with common coefficients and these in turn are used within an interaction then the interaction terms are ignored in the model. The following warning message is also displayed:
NAs introduced by coercion
Unordered multinomial models fail to run in MCMC when starting values are specified (fixed in R2MLwiN 0.8-2)
When attempting to run an unordered multinomial model with MCMC and user-provided starting values you may get the error:
model specified does not match last model run
Mixed/Poisson multivariate response models: offset specification (fixed in R2MLwiN 0.8-2)
When running a mixed/poisson multivariate response model the offset can only be applied to the first response, otherwise an MLwiN error will occur.
The most recent version of linearHypothesis fails to work with mlwinfitIGLS objects (fixed in R2MLwiN 0.8-2)
When running the linearHypothesis function from the car package on objects returned from an IGLS fit you will get the error:
$ operator not defined for this S4 class
This is due to a change in the NAMESPACE in recent versions of the car package preventing the correct method being found and will be fixed in the next release of R2MLwiN.
Error returned when fitting negative binomial model (fixed in R2MLwiN 0.8-2)
When attempting to run a negative binomial model the following error will occur:
Error in if (as.logical(D[2])) { : missing value where TRUE/FALSE needed
The parameters for the default gamma Priors cannot be modified (fixed in R2MLwiN 0.8-2)
There is currently no way to change the default uninformative gamma Prior to something other than Gamma(0.001, 0.001).
Bug in MCMC starting values (fixed in R2MLwiN 0.8-1)
When using starting values for MCMC estimation the fixed part covariance matrix is set to the values chosen for the random part, which can cause MLwiN to crash.
Bug in ordered multinomial specification (fixed in R2MLwiN 0.8-1)
When checking the base category for ordered multinomial the chosen category is incorrectly compared again the first and last factor alphabetically rather than the underlying numeric codes.
Bug in number of reported observation for mixed response models (fixed in R2MLwiN 0.8-1)
The output for mixed response models contains the number of observations in the expanded data, rather than those in the original data (fixed in R2MLwiN 0.8-1).
Bug in model setup when using interactions and common coefficients (fixed in R2MLwiN 0.8-1)
If interaction terms with common coefficients cause the model setup to fail.
Inconsistency in level when specifiying some options in multivariate models (fixed in R2MLwiN 0.8-1)
The MCMC options "hcen", "paex" and "resi.store.levs", as well as the IGLS "reset" option use level numbering as defined by MLwiN, rather than the R2MLwiN command. This can result in the option being applied to the wrong level in multivariate models (fixed in R2MLwiN 0.8-1).
Incorrect parameter numbering for MCMC residual chains (fixed in R2MLwiN 0.8-1)
When bringing back residual chains from MLwiN the parameters are labelling in the incorrect order (u_0_1, u_0_2...u1_1, u1_2... instead of u_0_1, u_1_1, u_0_2, u_1_2...). This will cause functions such as predLines which use the labelling to work incorrectly (fixed in R2MLwiN 0.8-1).
Bug in debugmode when re-fitting non-Normal models with missing values (fixed in R2MLwiN version 0.8-0)
Missing values are erroneously marked as 'infinite' when re-fitting non-Normal models in debugmode, thus resulting in errors (although the initial model fit, in batch mode, is not affected).
We hope to resolve this bug in the next release of R2MLwiN, but in the meantime, if you wish to re-fit non-Normal models, with missing values, in debugmode via the MLwiN GUI, you can avoid errors by first replacing the current dataset in MLwiN via File > Import, selecting all the imported variables, and replacing them with the dataset outputted by R2MLwiN as part of the current call (the path to this files is specified in the workdir argument in R2MLwiN's runMLwiN function (by default, workdir = tempdir() ).
Bug when specifying interactions in R2MLwiN Formula (fixed in R2MLwiN version 0.2-0)
When an interaction was specified in the R2MLwiN formula (as variable1:variable2), R2MLwiN placed this at the end of the list of predictors as modelled in MLwiN, regardless of where it was specified in the original formula, but then didn't take this into account when importing the results back in; i.e. the coefficient estimates risked being mislabelled. This was fixed in R2MLwiN version 0.2-0.
Bug in display when >1 non-Normal response in multilevel mixed response models (fixed in R2MLwiN version 0.2-0)
Errors were returned when modelling >1 non-Normal response in a multilevel mixed response model; fixed in R2MLwiN version 0.2-0.
Bug in Untoggle (fixed in R2MLwiN version 0.1-8)
Note, there was a bug in the manner in which the R2MLwiN function Untoggle assigned variable names for factors, fixed in R2MLwiN version 0.1-8.