TreeBagger using obscene amount of memory when run in parallel

Nicholas, 12 October 2011
Commented: amanita, 28 February 2014
Hi,
I'm experiencing issues when running TreeBagger on a cluster. I run this code on a large cluster with 64 processors and 128 GB of memory. However, when I try to use TreeBagger on my dataset (~200 MB in size) with 5000 trees, MATLAB fails after a few hours with out-of-memory errors.
Here are my steps:
1. Send a batch job to the cluster via the Distributed Computing Toolbox and open a matlabpool with 32 workers.
2. options = statset('UseParallel', 'Always');
3. B = TreeBagger(ntrees, tsp, tsp_label, 'Fboot', fboot, 'Options', options); where ntrees = 5000 and fboot = 0.5.
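For readers on a newer release, the steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `matlabpool` was later replaced by `parpool`, `'FBoot'` is spelled `'InBagFraction'` in newer releases, the random data merely stands in for `tsp` and `tsp_label`, and the pool size and tree count are scaled down so the sketch finishes quickly.

```matlab
% Placeholder data standing in for the real tsp / tsp_label (assumption).
rng(0);
tsp       = rand(1000, 20);
tsp_label = double(tsp(:, 1) + tsp(:, 2) > 1);

ntrees = 50;    % the question used 5000
fboot  = 0.5;   % fraction of the data sampled for each tree

% Step 1: open a pool (the question used 32 workers; 2 here).
delete(gcp('nocreate'));          % close any pool that is already open
pool = parpool(2);

% Step 2: ask Statistics Toolbox functions to use the pool.
options = statset('UseParallel', true);

% Step 3: grow the ensemble in parallel.
B = TreeBagger(ntrees, tsp, tsp_label, ...
    'Method', 'classification', ...
    'InBagFraction', fboot, ...
    'Options', options);

delete(pool);
```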
I don't understand why TreeBagger is using so much memory (>128 GB). When I run this same job locally on my 16 GB computer, the memory use does not exceed 16 GB. Am I doing something wrong?
Thanks for your help!

Accepted Answer

Steve, 12 October 2011
Nicholas,
Each worker in the matlabpool is a separate MATLAB process with its own working memory. With TreeBagger, each worker holds its own copy of the TreeBagger data, which includes your full dataset and, eventually, all or most of the trees, plus any additional object contents. Total memory consumption therefore tends to grow quasi-linearly with the size of the matlabpool.
If you run in serial mode on your own computer, there is only one copy of this memory. (Though if you run in parallel on K cores locally, there will be K copies of the data.)
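As a rough back-of-the-envelope check of that scaling, the sketch below treats total memory as pool size times per-worker peak. The 128 GB and 32 workers come from the question (assuming the 128 GB is shared by all workers); the per-worker peak of 5 GB is a purely hypothetical figure chosen for illustration, not a measurement.

```matlab
clusterMemGB = 128;   % from the question
poolSize     = 32;    % from the question

% Average memory each worker can use if the pool shares the memory evenly.
budgetPerWorkerGB = clusterMemGB / poolSize;

% Linear model: total footprint = K workers * per-worker peak.
perWorkerPeakGB = 5;                       % hypothetical, not measured
K               = [1 8 16 32];
totalGB         = K * perWorkerPeakGB;
fitsInCluster   = totalGB <= clusterMemGB;

fprintf('Budget per worker with %d workers: %.1f GB\n', poolSize, budgetPerWorkerGB);
for i = 1:numel(K)
    fprintf('K = %2d workers -> about %3d GB in total (fits: %d)\n', ...
        K(i), totalGB(i), double(fitsInCluster(i)));
end
```

Under these assumptions each of the 32 workers has only 4 GB on average, so any per-worker peak above that exhausts the cluster even though a single copy fits comfortably on a 16 GB machine.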
You might try to run with a smaller matlabpool if total memory consumption across the matlabpool is a limiting factor.
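One way to pick a smaller pool is to size it from a memory budget, as in the sketch below. It assumes a release that provides `parpool` and `parcluster` (successors to `matlabpool`); the per-worker peak and the safety factor are hypothetical values that you would replace with figures observed on your own runs.

```matlab
memBudgetGB     = 128;   % total memory, from the question
perWorkerPeakGB = 5;     % hypothetical per-worker peak, not measured
safetyFactor    = 0.8;   % assumption: leave 20% headroom

% Largest pool whose combined footprint stays inside the budget.
nWorkers = max(1, floor(safetyFactor * memBudgetGB / perWorkerPeakGB));

% Never ask for more workers than the default cluster profile offers.
c        = parcluster;
nWorkers = min(nWorkers, c.NumWorkers);

delete(gcp('nocreate'));          % close any pool that is already open
pool = parpool(c, nWorkers);
fprintf('Opened a pool with %d workers\n', pool.NumWorkers);

% ... call TreeBagger with statset('UseParallel', true) here ...

delete(pool);
```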
Best,
Steve
2 Comments
Nicholas, 13 October 2011
Thanks Steve! I didn't realize it would replicate the data across each worker.
amanita, 28 February 2014
I had the same problem, with TreeBagger calls inside parfor loops. Running the outer loops with parfor and the TreeBagger calls in serial was faster than running serial loops with TreeBagger in parallel. Nice to know why!
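The pattern described in this comment might look like the sketch below: the outer loop is a parfor, and each TreeBagger call runs serially because no `'Options'` argument is passed. The data, the number of runs, and the use of out-of-bag error as the per-run result are all assumptions made for illustration.

```matlab
% Placeholder data (assumption, for illustration only).
rng(0);
X = rand(500, 10);
Y = double(X(:, 1) + X(:, 2) > 1);

nRuns  = 4;                  % hypothetical number of outer iterations
oobErr = zeros(nRuns, 1);

% Parallelism lives in the outer loop only.
parfor r = 1:nRuns
    % No 'Options' argument, so each ensemble is grown serially.
    B = TreeBagger(50, X, Y, ...
        'Method', 'classification', ...
        'OOBPrediction', 'on');
    err       = oobError(B);   % cumulative out-of-bag error per tree
    oobErr(r) = err(end);      % error of the full ensemble
end

disp(oobErr);
```

Each worker still receives its own copy of `X` and `Y`, but it only ever holds the ensemble it is currently growing.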
