Inconsistent results with fitlme

Question

Maria Garcia Garrido 2022년 7월 27일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1769185-inconsistent-results-with-fitlme

편집: Darshak 2025년 4월 29일

I am trying to analyse some data with linear mixed-effects models using the function 'fitlme', but I noticed that I obtain different results depending on the alphabetical order of a categorical grouping variable. Could you let me know if I am doing something wrong?

I first create a table with the dataset:

grouping_variable1 = categorical(grouping_variable1);
grouping_variable2 = categorical(grouping_variable2);
ID = categorical(ID);
    
tbl1 = table(ID,grouping_variable1,grouping_variable2,dependent_variable);

I then create the linear mixed-effects model with the following line:

lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)');

These are the two results I get deppending on the alphabetical order of grouping_variable2/4:

Option 1

grouping_variable2 = string(zeros(23*2,1));
grouping_variable2(1:23)  = 'Training';
grouping_variable2(23+1:23*2)  = 'Test';
grouping_variable2 = categorical(grouping_variable2);
tbl1 = table(animal_ID,grouping_variable1,grouping_variable2,dependent_variable);
lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)')
lme = 
Linear mixed-effects model fit by ML
Model information:
    Number of observations              46
    Fixed effects coefficients           8
    Random effects coefficients         23
    Covariance parameters                2
Formula:
    dependent_variable ~ 1 + grouping_variable1*grouping_variable2 + (1 | ID)
Model fit statistics:
    AIC       BIC        LogLikelihood    Deviance
    -28.98    -10.693    24.49            -48.98  
Fixed effects coefficients (95% CIs):
    Name                                                           Estimate     SE          tStat       DF    pValue       Lower    
    {'(Intercept)'                                        }          0.46417    0.058778      7.8969    38    1.557e-09      0.34518
    {'grouping_variable1_2'                            }          0.16375    0.092937      1.7619    38     0.086122    -0.024391
    {'grouping_variable1_3'                             }         0.023056    0.083125     0.27736    38      0.78301     -0.14522
    {'grouping_variable1_4'                            }          0.20131    0.080102      2.5132    38     0.016325     0.039152
    {'grouping_variable2_Training'                        }         0.020833    0.073085     0.28506    38      0.77715     -0.12712
    {'grouping_variable1_2:grouping_variable2_Training'}         -0.19375     0.11556     -1.6767    38      0.10182     -0.42768
    {'grouping_variable1_3:grouping_variable2_Training' }        -0.060833     0.10336    -0.58857    38      0.55963     -0.27007
    {'grouping_variable1_4:grouping_variable2_Training'}         -0.20012    0.099597     -2.0093    38     0.051649     -0.40174
    Upper    
      0.58316
      0.35189
      0.19133
      0.36347
      0.16879
     0.040182
       0.1484
    0.0015053
Random effects covariance parameters (95% CIs):
Group: ID (23 Levels)
    Name1                  Name2                  Type           Estimate    Lower       Upper  
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        0.068596    0.027252    0.17266
Group: Error
    Name               Estimate    Lower       Upper
    {'Res Std'}        0.12659     0.094816    0.169
    
    
results = anova(lme)
results = 
    ANOVA marginal tests: DFMethod = 'Residual'
    Term                                             FStat       DF1    DF2    pValue   
    {'(Intercept)'                          }          62.361    1      38     1.557e-09
    {'grouping_variable1'                   }          2.9461    3      38      0.045032
    {'grouping_variable2'                   }        0.081258    1      38       0.77715
    {'grouping_variable1:grouping_variable2'}           1.789    3      38       0.16568

Option 2

grouping_variable4 = string(zeros(23*2,1));
grouping_variable4(1:23)  = 'ATraining';
grouping_variable4(23+1:23*2)  = 'BTest';
grouping_variable4 = categorical(grouping_variable4);
tbl1 = table(animal_ID,grouping_variable1,grouping_variable4,dependent_variable);
lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable4 + (1|ID)')
lme = 
Linear mixed-effects model fit by ML
Model information:
    Number of observations              46
    Fixed effects coefficients           8
    Random effects coefficients         23
    Covariance parameters                2
Formula:
    dependent_variable ~ 1 + grouping_variable1*grouping_variable4 + (1 | ID)
Model fit statistics:
    AIC       BIC        LogLikelihood    Deviance
    -28.98    -10.693    24.49            -48.98  
Fixed effects coefficients (95% CIs):
    Name                                                        Estimate     SE          tStat       DF    pValue        Lower         Upper  
    {'(Intercept)'                                     }            0.485    0.058778      8.2513    38    5.3474e-10       0.36601    0.60399
    {'grouping_variable1_2'                         }            -0.03    0.092937     -0.3228    38       0.74862      -0.21814    0.15814
    {'grouping_variable1_3'                          }        -0.037778    0.083125    -0.45447    38       0.65208      -0.20606     0.1305
    {'grouping_variable1_4'                         }        0.0011905    0.080102    0.014862    38       0.98822      -0.16097    0.16335
    {'grouping_variable4_BTest'                        }        -0.020833    0.073085    -0.28506    38       0.77715      -0.16879    0.12712
    {'grouping_variable1_2:grouping_variable4_BTest'}          0.19375     0.11556      1.6767    38       0.10182     -0.040182    0.42768
    {'grouping_variable1_3:grouping_variable4_BTest' }         0.060833     0.10336     0.58857    38       0.55963       -0.1484    0.27007
    {'grouping_variable1_4:grouping_variable4_BTest'}          0.20012    0.099597      2.0093    38      0.051649    -0.0015053    0.40174
Random effects covariance parameters (95% CIs):
Group: ID (23 Levels)
    Name1                  Name2                  Type           Estimate    Lower       Upper  
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        0.068596    0.027252    0.17266
Group: Error
    Name               Estimate    Lower       Upper
    {'Res Std'}        0.12659     0.094816    0.169
    
results = anova(lme)
results = 
    ANOVA marginal tests: DFMethod = 'Residual'
    Term                                             FStat       DF1    DF2    pValue    
    {'(Intercept)'                          }          68.084    1      38     5.3474e-10
    {'grouping_variable1'                   }         0.11571    3      38        0.95036
    {'grouping_variable4'                   }        0.081258    1      38        0.77715
    {'grouping_variable1:grouping_variable4'}           1.789    3      38        0.16568

The p value of grouping_variable1 is different in the ANOVA results, depending of the other variable. I am not sure of what I may be doing wrong, or why the results would be different because of this, so any help or guidance would be greatly appreciated.

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Maria Garcia Garrido 2022년 7월 29일

The problem occurred when interactions were included in the model. Setting 'DummyVasCoding' to 'effects' to create dummy variables using effects coding solved the problem.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Darshak 2025년 4월 28일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1769185-inconsistent-results-with-fitlme#answer_1564348

편집: Darshak 2025년 4월 29일

MATLAB Online에서 열기

Hello Maria,

This behavior with “fitlme” is a known thing when working with categorical variables in MATLAB. It happens because of how MATLAB internally handles categorical predictors. When you fit a linear mixed-effects model like:

lme = fitlme(tbl1, 'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)');

and “grouping_variable2 ” is a categorical variable, MATLAB automatically picks one of the categories as a reference level. By default, it picks the one that comes first alphabetically.

The important part is:

The reference level affects how the model interprets both main effects and interaction terms.
If the order of categories is changed (like renaming “Training” and 'Test' to “ATraining” and “BTest”), you're effectively changing the reference category.
That means the model's coefficients, p-values, and ANOVA results can shift, even though the overall fit (AIC, BIC, LogLikelihood) stays identical.

It’s just the way one-hot encoding and contrast settings work.

To avoid getting different results depending on naming or alphabetical order, it's a good idea to explicitly set the reference category for categorical variables.

In MATLAB, this can be done with “reordercats”:

grouping_variable2 = reordercats(grouping_variable2, {'Test', 'Training'});

Or

“setcats”:

grouping_variable2 = setcats(grouping_variable2, {'Test', 'Training'});

The following documentations can be referred for these functions:

https://www.mathworks.com/help/releases/R2022a/matlab/ref/categorical.reordercats.html

https://www.mathworks.com/help/releases/R2022a/matlab/ref/categorical.setcats.html

The following documentation can be used to refer to ‘fitlme’:

https://www.mathworks.com/help/releases/R2022a/stats/fitlme.html

Hope this helps with the issue!

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Inconsistent results with fitlme

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

답변 (1개)

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

Inconsistent results with fitlme

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

답변 (1개)

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기