When I ran the `mapreduce` function in rmr2, I encountered the error `pipeMapRed.waitOutputThreads(): subprocess failed with code 127`. My environment: Linux Mint 17.1 Rebecca, Hadoop 2.6.0 in a localhost (single-node) setup, R 3.1.3 compiled by myself with Intel MKL and the Intel C/C++ compiler, and Oracle Java 1.8.0_40.
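For context, a minimal sketch of what exit code 127 means (no rmr2 or Hadoop involved; the failing command name below is made up):

```shell
# Exit status 127 is the POSIX shell code for "command not found /
# cannot be executed". A binary whose shared libraries fail to resolve
# at load time (e.g. Rscript linked against a libiomp5.so that is not
# on the loader path of the task JVM) surfaces to Hadoop streaming
# with the same code.
/bin/sh -c 'no_such_command_xyz' 2>/dev/null
echo "exit code: $?"
# → exit code: 127
```

On a worker node, `ldd $(which Rscript) | grep "not found"` is one way to check whether the interpreter itself can resolve all of its shared libraries.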
I dug into this error and discovered that a shared library was not being loaded correctly on the task side. Running the equivalent R code with my own streaming files and the plain hadoop command succeeded:

```bash
hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \
  -files mapper.R,reducer.R,/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64/libiomp5.so,/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so \
  -mapper "mapper.R -m" -reducer "reducer.R -r" \
  -input /user/hadoop/testData/* -output /user/hadoop/testData2-output
```
I tried adding `backend.parameters = list(hadoop = list(files = "/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64/libiomp5.so", files = "/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so"))` to the `mapreduce` call, but that produced a different error. I suspect the cause is that Hadoop streaming does not accept two or more `-files` options.
Therefore, I modified the original file, R/streaming.R, in the package before building it, changing the `files` parameter in `final.command` to:

```R
files = paste(collapse = ",",
              c(image.files, map.file, reduce.file, combine.file,
                "/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64/libiomp5.so",
                "/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so"))
```
That fixed the `pipeMapRed.waitOutputThreads(): subprocess failed with code 127` error. I wonder whether it would be possible to add a parameter to rmr2 for extending this list of shipped files. Or is there another way to solve the problem by editing Hadoop's environment?
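One environment-side alternative (an untested sketch; the library paths are the ones from the report above, and it assumes root access on the single node) is to make the task JVMs resolve libiomp5.so themselves instead of shipping it with every job:

```shell
# Option 1: register the Intel library directory with the dynamic
# linker cache, so every process on the node can resolve libiomp5.so:
#   echo "/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64" \
#     | sudo tee /etc/ld.so.conf.d/intel-compiler.conf
#   sudo ldconfig

# Option 2: export LD_LIBRARY_PATH to the streaming tasks through the
# generic -D options (Hadoop 2.x property names shown):
#   -D mapreduce.map.env="LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64"
#   -D mapreduce.reduce.env="LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1/compiler/lib/intel64"
```

Either route avoids touching `-files` at all, which sidesteps the option-ordering issue entirely.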
There are some limitations related to the specific order of options that may be a problem here. In short, backend.parameters is safe for generic options such as -D, which is the one used most often. -files is not generic, so it needs to appear in a certain order with respect to the generic options, and there's only so much rmr2 can do to order them correctly without embedding full knowledge of which options are generic, plus a complete refactor of how the command line is assembled right now (one would have to delay conversion to a string until the command line is fully specified). That's quite a bit of development and added, permanent complexity for a very specialized use case. The other issue is that -files is already used and it does accept a list of files, which suggests that specifying it twice may not be acceptable, though I am not 100% sure. If that's the case, allowing the user to specify additional -files arguments would require an even deeper refactor.
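To illustrate the ordering constraint above, here is a hedged sketch of a streaming invocation (the `$STREAMING_JAR` variable and the job name are placeholders, not anything rmr2 generates): options handled before the tool runs, such as -D and -files, must precede the streaming-specific ones like -mapper and -input, which is why rmr2 can splice user-supplied -D options in safely but cannot freely reposition -files.

```shell
# Hypothetical hand-written streaming command showing the option order
# rmr2 has to preserve; $STREAMING_JAR is assumed to point at the
# hadoop-streaming jar for the installed version.
hadoop jar "$STREAMING_JAR" \
  -D mapreduce.job.name=rmr2-example \
  -files mapper.R,reducer.R \
  -mapper "mapper.R -m" -reducer "reducer.R -r" \
  -input /user/hadoop/testData -output /user/hadoop/testData-out
```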