Bootstrap Confidence Interval 90%
We were asked to calculate the 90% confidence interval for a given dataset using the bootci function. This was my line in MATLAB:
Pbci = bootci(2000,{@mean,Pb},'alpha',.1)%90 confidence interval
Is this the correct way?
Next we were asked to use the bootstrap technique to estimate the 90% confidence interval for the probability that the mean of Pb exceeds the MCL (i.e., 50ppm). Would I use Pbci = bootci(2000,{@mean,Pb},'weights',) for this? If so, what do I put as my constraints for the weights part?
Accepted Answer
Adam Danz
26 Oct 2019
"We were asked to calculate the 90% confidence interval [using bootci()]... Is this the correct way?"
To determine whether it's the correct way, compare the results with a lower-level computation of the confidence intervals. Not only will this confirm that you're using the bootci() function correctly, but you'll also get a better understanding of how those intervals are computed.
Let's say your data contain 1000 samples and you're bootstrapping the mean of your data 2000 times. bootci() resamples the data with replacement 2000 times and computes the mean on each iteration. That means on each of the 2000 iterations, it randomly chooses 1000 samples, many of which will be duplicates (it uses the randi() function), and computes the mean. It then uses the 2000 computed means to determine the CI. There are several methods of doing this (explained under the 'type' option in the documentation), but with a large enough sample the distribution of bootstrap means should be approximately normal (thanks to the Central Limit Theorem) and the CI type shouldn't make too much of a difference (though I typically recommend the percentile method, which does not depend on the shape of the distribution).
Follow the brief tutorial below, where the CIs are computed for a random dataset using bootci() and using a lower-level, direct computation of the CIs. As you can see from the figure it produces, the results are nearly the same. The only differences are due to a different randomized resampling of the data between methods.
% Create random data from a normal distribution
% with mean 28.25 and sd 8.5.
data = (randn(1,100000)*8.5 + 28.25)';
% Run bootci (the percentile method is chosen since that's how we
% compute the CI in the lower-level method below).
nBoot = 2000; % number of bootstraps
[bci,bmeans] = bootci(nBoot,{@mean,data},'alpha',.1,'type','per'); % 90% confidence interval
% Compute bootstrap sample mean
bmu = mean(bmeans);
% Now repeat that process with lower-level bootstrapping
% using the same sampling procedure and the same data.
bootMeans = nan(1,nBoot);
for i = 1:nBoot
    bootMeans(i) = mean(data(randi(numel(data),size(data))));
end
CI = prctile(bootMeans,[5,95]);
mu = mean(bootMeans);
% Plot the bootci() results
figure()
ax1 = subplot(2,1,1);
histogram(bmeans);
hold on
xline(bmu, 'k-', sprintf('mu = %.2f',bmu),'LineWidth',2,'FontSize',12)
xline(bci(1),'k-',sprintf('%.1f',bci(1)),'LineWidth',2,'FontSize',12)
xline(bci(2),'k-',sprintf('%.1f',bci(2)),'LineWidth',2,'FontSize',12)
title('bootci()')
% plot the lower-level, direct computation results
ax2 = subplot(2,1,2);
histogram(bootMeans);
hold on
xline(mu, 'k-', sprintf('mu = %.2f',mu),'LineWidth',2,'FontSize',12)
xline(CI(1),'k-',sprintf('%.1f',CI(1)),'LineWidth',2,'FontSize',12)
xline(CI(2),'k-',sprintf('%.1f',CI(2)),'LineWidth',2,'FontSize',12)
title('Lower level')
linkaxes([ax1,ax2], 'xy')
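
The point above about the CI type making little difference for the mean can be checked directly. Here is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available; the synthetic data, the variable names (types, ciByType) and the fixed seed are my own choices for illustration and are not part of the original answer.

```matlab
% Sketch: compare a few bootci interval types on synthetic data
% (an assumption for illustration, not the poster's Pb data).
rng(1);                                % fixed seed so the run is repeatable
data  = randn(1000,1)*8.5 + 28.25;     % synthetic sample, mean 28.25, sd 8.5
nBoot = 2000;                          % number of bootstraps
types = {'norm','per','bca'};          % three of the documented 'type' values
ciByType = nan(2,numel(types));        % row 1 = lower bound, row 2 = upper bound
for k = 1:numel(types)
    ciByType(:,k) = bootci(nBoot,{@mean,data},'alpha',0.1,'type',types{k});
end
% Show the 90% intervals side by side
disp(array2table(ciByType,'VariableNames',types,'RowNames',{'lower','upper'}))
```

For a statistic like the mean of a large sample, the three columns should be close to each other; for skewed statistics or small samples they can differ more noticeably.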

"Next we were asked to use the bootstrap technique to estimate the 90% confidence interval for the probability that the mean of Pb exceeds the MCL (i.e., 50ppm)."
I'm not sure I follow this part. What is MCL? Is it a scalar value (like 50) or is it the mean of a 2nd distribution?
5 Comments
Samuel Wray
26 Oct 2019
We're supposed to find the probability that the bootstrap means from the first question exceed a number such as 50. It is concentrations of Pb in soil so it is 50ppm.
That description differs from what's in your question. The phrasing used in your question is:
we were asked to use the bootstrap technique to estimate the 90% confidence interval for the probability that the mean of Pb exceeds the MCL (i.e., 50ppm).
Pb is your raw data, so in that phrasing of the question you're comparing some value, the MCL, to the mean of your data, mean(Pb).
But the phrasing used in your comment above is:
We're supposed to find the probability that the bootstrap means from the first question exceed a number such as 50.
The bootstrap mean is the mean of all 2000 bootstrap means (which probably nearly matches your sample mean anyway).
So I'm wondering whether you should be using the distribution of your raw data (which could take any form) or the bootstrapped distribution of means (which is very close to a normal distribution). Since this sounds like homework, I'm sure there is text or lecture material that would clear this up.
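
The two readings discussed in this thread lead to different computations. The sketch below is only an illustration under stated assumptions: the Pb vector is a synthetic stand-in generated with lognrnd (the real data are not shown in the thread), and the names MCL, pMeanExceeds, exceedFrac and pCI are hypothetical. Which reading the assignment intends is for the course material to settle.

```matlab
% Hypothetical stand-in for the poster's Pb vector; replace with the real data.
rng(1);                                 % fixed seed so the run is repeatable
Pb    = lognrnd(log(45),0.5,200,1);     % synthetic concentrations in ppm (assumption)
MCL   = 50;                             % threshold from the question, in ppm
nBoot = 2000;                           % number of bootstraps

% Reading 1: fraction of bootstrap means that exceed the MCL.
% This yields a single estimated probability, not an interval.
bootMeans    = bootstrp(nBoot,@mean,Pb);
pMeanExceeds = mean(bootMeans > MCL);

% Reading 2: 90% CI for the proportion of individual measurements
% that exceed the MCL, using the percentile method.
exceedFrac = @(x) mean(x > MCL);
pCI = bootci(nBoot,{exceedFrac,Pb},'alpha',0.1,'type','per');

fprintf('P(bootstrap mean > MCL) = %.3f\n',pMeanExceeds);
fprintf('90%% CI for fraction of samples > MCL: [%.3f, %.3f]\n',pCI(1),pCI(2));
```

Note that neither reading needs the 'weights' option; the statistic passed to bootstrp() or bootci() is simply changed from the mean to an exceedance fraction.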
This may be informative:
Dear Adam Danz, I would like to ask for your help. I have fitted the parameters of my deterministic disease transmission model using a small data sample. I want to determine the 95% confidence interval of my model parameters. For that, I want to apply bootstrapping and generate some additional samples. But my data look like this, for example: (t1, 1418), (t2, 741), (t3, 1059), (t4, 1160) and (t5, 1129). Is it possible to apply bootstrapping at each time and determine the maximum and minimum bootstrapped mean? I am looking for sample MATLAB code for this.
Adam Danz
16 Feb 2024
Could you provide a clearer description of your data? I didn't understand the example.
Temesgen
16 Feb 2024
OK, thank you for your reply. I have these data at different times. Let's say I have yearly data for five years. Let's say at year 1 my data is 1418, at year 2 it is 741, ...
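
A rough sketch of what bootstrapping the mean of these five values could look like, assuming they are treated as one exchangeable sample (which ignores their time ordering). With a single value per time point there is nothing to resample within each time, so this may not be what the model-fitting problem needs, and with only five observations the interval is very coarse. The variable names are my own.

```matlab
% Five yearly values quoted in the comment above
y = [1418 741 1059 1160 1129]';
rng(1);                                   % fixed seed so the run is repeatable
nBoot = 2000;                             % number of bootstraps

% Resample the five values with replacement and take the mean each time
bootMeans = bootstrp(nBoot,@mean,y);

% Percentile 95% CI, plus the smallest and largest bootstrapped mean
ci95         = prctile(bootMeans,[2.5 97.5]);
rangeOfMeans = [min(bootMeans) max(bootMeans)];

fprintf('95%% CI of the mean: [%.1f, %.1f]\n',ci95(1),ci95(2));
fprintf('Min/max bootstrapped mean: [%.1f, %.1f]\n',rangeOfMeans(1),rangeOfMeans(2));
```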
