REPEATED MEASURES ANOVA MATLAB
이 질문을 팔로우합니다.
- 팔로우하는 게시물 피드에서 업데이트를 확인할 수 있습니다.
- 정보 수신 기본 설정에 따라 이메일을 받을 수 있습니다.
오류 발생
페이지가 변경되었기 때문에 동작을 완료할 수 없습니다. 업데이트된 상태를 보려면 페이지를 다시 불러오십시오.
이전 댓글 표시
1 개 추천
Hi everyone!
I am having troubles understanding how to perform the correct anova test on my data. I have 16 subjects who are being treated with 2 different treatments. For each subject and each treatment measures are taken at 6 different time points (attached is the excel file). Is repeated measures anova the best way to go? If so, what is the most apporpriate command to use?
Here is the code I have so far. The results are kind of strange so I am not sure if it is wrong or I am just interpreting it the wrong way.
Thanks,
Giulia
Q2 = xlsread('Q2.xls');
Treatment = categorical(Q2(:,2));
ID = Q2(:,1);
t = table(Treatment, Q2(:,3),Q2(:,4),Q2(:,5),Q2(:,6),Q2(:,7), Q2(:,8),'VariableNames', {'Treatment', 't1', 't2', 't3', 't4', 't5', 't6'});
Time = [1 2 3 4 5 6]';
rm = fitrm( t, 't1-t6~Treatment', 'WithinDesign', Time);
ranovatbl = ranova(rm);
disp(ranovatbl);
multcompare(rm, 'Time', 'By', 'Treatment')
댓글 수: 1
Scott MacKenzie
2021년 4월 27일
편집: Scott MacKenzie
2021년 4월 27일
Below is an Anova I just did on your data set (not using MATLAB). What you have is a 2 x 6 within-subjects design. You have two independent variables: Treatment (2 levels) and Time Point (6 levels). What was the dependent variable? In your question, you only say "measures are taken". Measures of what? Here, I'm assuming you measured something like reaction time or task completion time with milliseconds as the units.
Here is a summary of the key results: The effect of treatment on task completion time was not statistically significant, F(1,10) = 1.131, p > .05. The effect of time point on task completion time was statistically significant, F(5,50) = 5.127, p < .001. The Treatment x Time Point interaction effect was also statistically significant, F(5,50) = 6.478, p = .0001. See below for an Anova table for your data and experiment design.

I haven't figure out how to generate a table like this in MATLAB, so I'm using a different tool.
BTW, the attached file has data for 11 participants, not 16 as stated in your question.
채택된 답변
Is repeated measures anova the best way to go?
Your design is certainly applicable to a repeated measures design but the answer to that question depends on what you're testing.
It looks like you want to perform a repeated measures ANOVA with a covariate. In addition to patient number, the treatment type is the covariate, and the 6 time points is the within-subject repeated measure.
There are 4 null hypotheses to look at:
- Patient number has no effect on the population mean
- Treatment type has no effect on the population mean
- There is no interaction between patient number and treatment type (usually the most important question)
- There is no 3-way interaction between patient number, treatment type, and treatment time.
Setting up a RM Anova with a convariate in Matlab
Your data are attached in a file named Q2unlocked.xlsx. The following lines set up a RM Anova with a covariate (written in r2019b).
% Read in the table
Q2 = readtable('Q2unlocked.xlsx');
% Replace the 6 measurment headers
Q2.Properties.VariableNames(3:8) = {'t1', 't2', 't3', 't4', 't5', 't6'};
% The within-subjects design
withinDsgn = table((1:6)','VariableNames',{'Time'});
% Run the RM Anova (note the interaction between PATIENT and TREAT!)
rm = fitrm(Q2, 't1-t6~PATIENT*TREAT', 'WithinDesign', withinDsgn);
Testing for assumptions
Before we look at the results, you must make sure your data are appropriate for a RM Anova by confirming the following assumptions.
- Independent observations: You collected the data randomly and without bias
- Normality: The measurments have an approximately normal distribution
- Sphericity: we'll use the Mauchly test.
% Are the data approximately normal? This should ideally be done for each factor
histogram(reshape(Q2{:,3:end},[],1))

A small rightward tail but not too bad.
% Does the data pass the sphericity test?
rm.mauchly
ans = 1×4 table
W ChiStat DF pValue
_______ _______ __ _______
0.48842 11.537 14 0.64344
Yes: a pValue < 0.05 would fail
RM Anova results
ranova(rm) or rm.ranova produce the results table showing the p-value and then 3 adjusted p-values depending on whether the the response variables have different variances (symmetry assumption).
% Produce ranova table
rm.ranova
ans = 5×8 table
SumSq DF MeanSq F pValue pValueGG pValueHF pValueLB
__________ __ __________ ______ ________ ________ ________ ________
(Intercept):Time 3.9654e+06 5 7.9308e+05 2.2798 0.053221 0.071668 0.053221 0.14843
PATIENT:Time 4.0894e+06 5 8.1788e+05 2.3511 0.047012 0.064753 0.047012 0.14259
TREAT:Time 3.2657e+06 5 6.5314e+05 1.8775 0.10606 0.12652 0.10606 0.18747
PATIENT:TREAT:Time 3.9524e+06 5 7.9048e+05 2.2723 0.053916 0.072433 0.053916 0.14905
Error(Time) 3.1309e+07 90 3.4788e+05
Row 1 representing all differences across the within-subjects factors (treatment times). It has a p value of 0.053 which is just beyond the socially accepted threshold of 0.05. A boxplot will show us if this value seems reasonable.
figure()
boxplot(Q2{:,3:end})
xlabel('Treatment times')
ylabel('measurement value')
title('Data pooled between all Patients and treatment factors')

It doesn't look like there's much of an effect of treatment time when factors are combined
Row 2 shows the interaction between Patient and treatment times and is p=0.047
bpdata = [];
for i = 1:max(Q2.PATIENT) %assuming patient numbers are 1:max
bpdata = [bpdata, Q2{Q2.PATIENT==i,3:8},nan(size(unique(Q2.TREAT)))];
end
figure()
boxplot(bpdata)
arrayfun(@xline,7:7:size(bpdata,2))
xlabel('6 treatment times across 11 patients')
ylabel('measurement value')
title('Data pooled between treatment factors')
set(gca,'XTick', [])

Clearly the difference between the 6 treatment times differs across subjects (Treatment type combined)
Row 3 shows the interaction between treatment type and treatment times and is p=0.106.
bpdata = [];
for i = 1:max(Q2.TREAT) %assuming TREAT numbers are 1:max
bpdata = [bpdata, Q2{Q2.TREAT==i,3:8},nan(size(unique(Q2.PATIENT)))];
end
figure()
boxplot(bpdata)
xline(7)
xlabel('6 treatment times across 2 TREAT factors')
ylabel('measurement value')
title('Data pooled between patients')
set(gca,'XTick', [])

Other than the 2nd treatment factor there is no difference between the 2 Treatment types.
Row 4 shows the 3-way interaction between patient, treatment type, and treatment times.
댓글 수: 14
Thanks a lot!
I tried running the code and it gives me this error:
Unable to use a value of type cell as an index.
Error in HW5_Q2 (line 3)
withinDsgn = table((1:6)','VariableNames',{'Time'});
Any suggestions?
No worries,
it worked, I had another variable called table that messed up.
Thanks a lot again
Glad I could help.
I just updated the answer to clean up some of my terminology. Your "TREAT" variable is a covariate so my previous terminology (2-way) was misleading. None of the code was changed.
BTW, never name a variable "table" since that overrides the table() function which is a commonly used function, especially in statistics tests.
I have one question though, what factors do you mean with this ? thanks!
% Are the data approximately normal? This should ideally be done for each factor
histogram(reshape(Q2{:,3:end},[],1))
Adam Danz
2021년 4월 16일
deejt
2021년 4월 27일
I have one more question: if your data does not pass the test of sphericity, do you then have to always the corrected p-values from the rm table? I also calculated the epsilon, but how exactly do I have to combine the results from the epsilon correction to the rm table and the reporting of p- values?
% Produce ranova table: rm.ranova table
rm.mauchly
Scott MacKenzie
2021년 4월 27일
편집: Scott MacKenzie
2021년 4월 27일
The Anova table in Adam's reponse can't be correct. There were 2 levels of the Treatment independent variable, so the first df with the F-statistic for the Treatment main effect is 1 (not 5 as shown in Adam's table).
@Scott MacKenzie the repeated measures design in my answer is
rm = fitrm(Q2, 't1-t6~PATIENT*TREAT', 'WithinDesign', withinDsgn);
which contains 6 repeated responses (t1-t6) and two predictors with an interaction term. That's why there are 5 df.
Adam Danz
2021년 4월 27일
@deejt, to be honest, I avoid parametric statistics largely for this exact reason and it's becoming a growing trend to use non-parametric tests whenever possible.
For a short list of academic journal articles on why we should show using statistical significance testing, see the end of this answer.
Scott MacKenzie
2021년 4월 27일
편집: Scott MacKenzie
2021년 4월 28일
Adam, sorry, but concerning your comment on the degrees of freedom, this is simply wrong. Remember, we are talking about the main effect of Treatment in Giulia's data set. Treatment is the primary independent variable. Determining whether or not it has a signficant effect on the dependent variable is the main reason for doing the analysis of variance in the first place. But, your Anova table provides no insight on this issue.
It is a simple fact that the first degree of freedom in the F-statistic for a main effect is n-1 where n is the number of conditions or levels in the independent variable. Just look in the Anova table I provide above or in any statistics textbook. Treatment is an independent variable with two levels, so its df is 1. But, your Anova table has no line entry with df = 1. The only reasonable conclusion is that there is either no main effect for Treatment given in your table, or, if there is, it is wrong.
This is probably the wrong place for dialog like this. Sorry. I think I'll going to put together a separate question on this point, as I, like Giulia, am interested in using MATLAB for an analysis of variance on data obtained in a factorial experiment.
Hi Scott I'm not a statitics expert and have met very few people who are. Not many people here answer stats questions so your commentary is welcomed.
The repeated measures ANOVA table that Matlab generates (not me) is a produced by ranova and is similar to the examples from the documentation here and here. You can review the Wilkinson Notation here. The model includes a patient term, a treatment group term, and an interaction term between patient and treatment group. The table it produces includes a term representing all differences across the within-subjects factors (time) and terms for all interactions between the within-subject terms and all between-subject model terms, as explained in the ranova documentation.
The df for the Time term should be 5 since there are 6 measurements. The df for treatment*time interaction should also be 5 since there are 2 treatment levels and 5*1=5. But I don't know why the patient*time df is 5 in the table. I'd have to go through the code to figure it out since the documentation on statistical functions in Matlab tends to be scant.
These data could also be reshaped to perform an n-way analysis of variance using anovan which contains the df you were expecting.
% Reshape data
data = readmatrix('Q2unlocked.xlsx');
M = repmat(data(:,1:2),6,1);
M(:,end+1) = repelem((1:6)',size(data,1),1);
rmeas = data(:,3:end);
M(:,end+1) = rmeas(:);
[p, tbl] = anovan(M(:,end),{M(:,1), M(:,2), M(:,3)},'model','interaction',...
'varnames',{'Participants','Treatment','Time'})

Although, I'm sure the model used to generate the table in your comment differed. Several people in the past have asked why anova tables differ between software and those questions rarely get solved. I don't advise anyone to use statistical significance testing to begin with so I only dive in when needed.
Lastly, you bring up a good point that this model may not be the one that answers the OP's question. Perhaps specifying the within-groups model when calling ranova is the better choice. Instead of rm.ranova which is the same as ranova(rm),
ranova(rm,'WithinModel','Time')
which results in
10×8 table
SumSq DF MeanSq F pValue pValueGG pValueHF pValueLB
__________ __ __________ ________ ________ ________ ________ ________
(Intercept) 2.9816e+07 1 2.9816e+07 6.914 0.017011 0.017011 0.017011 0.017011
PATIENT 2.4972e+06 1 2.4972e+06 0.57908 0.45653 0.45653 0.45653 0.45653
TREAT 1.6441e+05 1 1.6441e+05 0.038124 0.84738 0.84738 0.84738 0.84738
PATIENT:TREAT 9.5387e+05 1 9.5387e+05 0.22119 0.64378 0.64378 0.64378 0.64378
Error 7.7623e+07 18 4.3124e+06
(Intercept):Time 2.1085e+07 1 2.1085e+07 5.7044 0.028088 0.028088 0.028088 0.028088
PATIENT:Time 1.8057e+06 1 1.8057e+06 0.48853 0.49352 0.49352 0.49352 0.49352
TREAT:Time 5.4909e+05 1 5.4909e+05 0.14856 0.70443 0.70443 0.70443 0.70443
PATIENT:TREAT:Time 8.4605e+05 1 8.4605e+05 0.2289 0.6381 0.6381 0.6381 0.6381
Error(Time) 6.6531e+07 18 3.6962e+06
Adam. Thanks for your thoughful comments. Yes, it is sometimes perplexing that different stats packages give different results for the same data. The table you provide using anovan comes very close to matching the Anova table I provided. In fact all the mean squares are identical, which is certainly good. Oddly enough, the F-statistics differ -- not good. In my table, the Treatment effect is not significant F(1,10) = 1.131, p > .05). The F-statistic (1.131) is the ratio MS-treatment / MS-treatment_by_participant. It seems anovan uses a different ratio: MS-treatment / MS-error. I don't believe that's correct, considering that the factor is within-subjects not between-subjects. This also means that the second df would be 50 (from the error MS line), not 10. Again, I don't think that's right. I ran these data through JMP (from SAS) and got

Here, we see the F-statistics agree with my table but not with MATLAB's anovan table. Hmmm, interesting.
Scott MacKenzie
2021년 4월 29일
편집: Scott MacKenzie
2021년 4월 29일
Problem solved!
T = readtable('q2_data.xlsx');
T.Properties.VariableNames = {'T1_TP1', 'T1_TP2', 'T1_TP3', 'T1_TP4', 'T1_TP5', 'T1_TP6', ...
'T2_TP1', 'T2_TP2', 'T2_TP3', 'T2_TP4', 'T2_TP5', 'T2_TP6' };
withinDesign = table([1 1 1 1 1 1 2 2 2 2 2 2]',[1:6 1:6]','VariableNames',{'Treatment','TimePeriod'});
withinDesign.Treatment = categorical(withinDesign.Treatment);
withinDesign.TimePeriod = categorical(withinDesign.TimePeriod);
rm = fitrm(T, 'T1_TP1-T2_TP6~1', 'WithinDesign', withinDesign);
ranova(rm, 'WithinModel', 'Treatment*TimePeriod')
This script will generate an Anova table with the exact same SS, MS, df, F-statistics, and p-values as in the table above and in the table in my first comment on this question. For this, I rearranged the data originally posted by Giulia into an 11x12 matrix. See the attached spreadsheet for the rearranged data and some explanatory comments.
Full disclosure. The script above is not mine. It's a slightly modified verion of a script provided by Jeff Miller in response to a question I posted yesterday.
Adam Danz
2021년 4월 29일
Thanks for the update, Scott. I saw Jeff's solution earlier today and also noticed the difference in withinDesign. I look forward to examining it closer and perhaps updating my answer when I have a chance to do so, shortly.
추가 답변 (0개)
카테고리
도움말 센터 및 File Exchange에서 Repeated Measures and MANOVA에 대해 자세히 알아보기
제품
참고 항목
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!웹사이트 선택
번역된 콘텐츠를 보고 지역별 이벤트와 혜택을 살펴보려면 웹사이트를 선택하십시오. 현재 계신 지역에 따라 다음 웹사이트를 권장합니다:
또한 다음 목록에서 웹사이트를 선택하실 수도 있습니다.
사이트 성능 최적화 방법
최고의 사이트 성능을 위해 중국 사이트(중국어 또는 영어)를 선택하십시오. 현재 계신 지역에서는 다른 국가의 MathWorks 사이트 방문이 최적화되지 않았습니다.
미주
- América Latina (Español)
- Canada (English)
- United States (English)
유럽
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
