emr: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory

larry ogrodnek - 29 Sep 2010

Moving one of our jobs from Hive 0.4 / Hadoop 0.18 to Hive 0.5 / Hadoop 0.20 on Amazon EMR, I ran into a weird error in the reduce stage, something like:

java.io.IOException: Task: attempt_201007141555_0001_r_000009_0 - The reduce copier failed 
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:384) 
  at org.apache.hadoop.mapred.Child.main(Child.java:170) 
Caused by: java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory 
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:459) 
  at org.apache.hadoop.util.Shell.runCommand(Shell.java:149) 
  at org.apache.hadoop.util.Shell.run(Shell.java:134) 
  at org.apache.hadoop.fs.DF.getAvailable(DF.java:73) 
  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329) 
  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124) 
  at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:160) 
  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2622) 
  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2586) 
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory 
  at java.lang.UNIXProcess.<init>(UNIXProcess.java:148) 
  at java.lang.ProcessImpl.start(ProcessImpl.java:65) 
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:452) 
  ... 8 more

There's some discussion of this in a thread on the EMR forums.

From Andrew's response to the thread:

The issue here is that when Java tries to fork a process (in this case bash), Linux allocates as much memory as the current Java process, even though the command you are running might use very little memory. When you have a large process on a machine that is low on memory this fork can fail because it is unable to allocate that memory.

The workaround here is to either use an instance with more memory (m2 class), or reduce the number of mappers or reducers you are running on each machine to free up some memory.
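You can see the behavior Andrew describes on any Linux box: when the JVM forks a child (here, bash for the `df` check in the stack trace above), the kernel must be willing to reserve as much virtual memory as the parent holds, even though the child will immediately exec a tiny program. Whether that reservation fails depends on the kernel's overcommit policy. A quick hedged check (assuming a Linux host; the paths and sysctl names are standard Linux, not EMR-specific):

```shell
# Overcommit policy: 0 = heuristic, 1 = always allow, 2 = strict accounting.
# Under strict accounting, a large JVM heap makes the fork of "bash" likely
# to fail with ENOMEM (errno 12), exactly the error in the trace above.
cat /proc/sys/vm/overcommit_memory

# Compare free memory against the size of your task JVMs: if a task JVM's
# footprint exceeds what the kernel can reserve, its forks can fail.
free -m
```

This is only diagnostic; on EMR you don't control the AMI's kernel settings, which is why the practical fix is a bigger instance or fewer task slots per machine.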

Since the job I was running was reduce-heavy, I chose to just drop the number of mappers from 4 to 2. You can do this pretty easily with the EMR bootstrap actions.

My job ended up looking something like this:

elastic-mapreduce --create --name "awesome script" \
  --num-instances 8 --instance-type m1.large \
  --hadoop-version 0.20 \
  --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
  --args "-s,mapred.tasktracker.map.tasks.maximum=2" \
  --hive-script --arg s3://....../script