How to run the example 'Run mapreduce on a Hadoop Cluster'?

Jingyu Ru, 22 Jul 2015
Commented: lov kumar, 4 Jun 2019
Hadoop version: 1.2.1. MATLAB version: R2015a.
OS: Linux, Ubuntu 14.
I installed Hadoop in a pseudo-distributed configuration (all of the Hadoop daemons run on a single host).
The 'wordcount' example runs successfully in Hadoop.
Reading data from HDFS through MATLAB also works.
But when I try to run the MATLAB example 'Run mapreduce on a Hadoop Cluster', it fails.
The progress stays at Map 0% and Reduce 0%.
Here is my MATLAB code:
setenv('HADOOP_HOME','/home/rjy/soft/hadoop-1.2.1');
cluster = parallel.cluster.Hadoop;
cluster.HadoopProperties('mapred.job.tracker') = 'localhost:50031';
cluster.HadoopProperties('fs.default.name') = 'hdfs://localhost:8020';
% ds = datastore('hdfs:/user/root/rrr')
outputFolder = '/home/rjy/logs/hadooplog';
mr = mapreducer(cluster);
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
'SelectedVariableNames','ArrDelay','ReadSize',1000);
preview(ds)
meanDelay = mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,mr,...
'OutputFolder',outputFolder)
The error log is as follows:
Error using mapreduce (line 100)
The HADOOP job failed to complete.
Error in run_mapreduce_on_a_hadoop (line 12)
meanDelay = mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,mr,...
Caused by:
The HADOOP job was not able to start MATLAB for attempt 1 of 'MAP' task 0. The user
home directory '/homes/' for the cluster either did not exist or was not writable by
the HADOOP user. Check the documentation on how to set the user home directory for the
cluster.
What does '/homes/' mean? I have never seen it before. My Hadoop directory is './home/rjy/soft/hadoop-1.2.1'.
I would like to know how to solve this problem. I have tried many approaches without success.
Any suggestions would be appreciated. Thanks.

Accepted Answer

Esther, 26 Jul 2015
Hi Jingyu,
Try editing mapred-site.xml to set the property mapreduce.admin.user.home.dir
to the path of your Hadoop directory: './home/rjy/soft/hadoop-1.2.1'
This may force it to look in your Hadoop installation instead of /homes.
I hope this solves the problem.
Cheers, Esther
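For readers following along, the suggested edit might look roughly like the fragment below. This is only a sketch: the thread does not record the exact value that ended up working, so the path here is an assumption taken from the HADOOP_HOME setting in the question (written as an absolute path), and any other properties already present in the file are omitted.

```xml
<!-- mapred-site.xml (sketch): only the property discussed in this thread is shown -->
<configuration>
  <property>
    <name>mapreduce.admin.user.home.dir</name>
    <!-- Assumed value: must be a directory that exists and is writable by the Hadoop user -->
    <value>/home/rjy/soft/hadoop-1.2.1</value>
  </property>
</configuration>
```

The error message in the question states that the configured directory must exist and be writable by the Hadoop user, so that is worth checking for whichever path is used. The Hadoop daemons probably need to be restarted before a change to this file takes effect.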
2 Comments
Jingyu Ru, 28 Jul 2015
Thanks a lot, I have solved the problem by editing mapred-site.xml.
Thank you very much!!! I'm so excited!!!
Esther, 31 Jul 2015
You're welcome!


More Answers (1)

lov kumar, 28 May 2019
How can I fix this error?
Error using mapreduce (line 124)
The HADOOP job failed to submit. It is possible that there is some issue with the HADOOP configuration.
Error in bg (line 14)
meanDelay = mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,mr,...
4 Comments
lov kumar, 4 Jun 2019
I am using this code:
setenv('HADOOP_PREFIX','C:/hadoop-2.8.0');
getenv('HADOOP_PREFIX');
ds = datastore('hdfs://localhost:9000/lov/airlinesmall.csv',...
'TreatAsMissing','NA',...
'SelectedVariableNames',{'UniqueCarrier','ArrDelay'});
cluster = parallel.cluster.Hadoop;
cluster.HadoopProperties('mapred.job.tracker') = 'localhost:9000';
cluster.HadoopProperties('fs.default.name') = 'hdfs://localhost:8088';
disp(cluster);
mapreducer(cluster);
result = mapreduce(ds,@maxArrivalDelayMapper,@maxArrivalDelayReducer,'OutputFolder','hdfs://localhost:9000/lov');
lov kumar, 4 Jun 2019
I am using a single machine.

