How to limit the number of containers requested by a pig script?

1 Answer

As far as I understand, you cannot directly control the number of mappers. You can control the number of reducers via the PARALLEL keyword. The number of containers on a node is determined by a combination of settings. One of them, yarn.nodemanager.resource.memory-mb, is set on the cluster.

The following properties can be changed from your script by setting them to a different value:
mapreduce.map.memory.mb and
mapreduce.reduce.memory.mb
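
To make the relationship between these settings concrete, here is a rough sketch of the arithmetic in Python. It assumes memory is the only limiting resource and ignores vcores, the scheduler's minimum/maximum allocation rounding, and the application master's own container, so treat the result as an upper bound rather than what YARN will actually grant. The function name and the numbers are made up for illustration.

```python
def max_containers_per_node(node_memory_mb: int, container_memory_mb: int) -> int:
    """Upper bound on same-sized containers that fit in one node's YARN memory.

    node_memory_mb      -> stands in for yarn.nodemanager.resource.memory-mb
    container_memory_mb -> stands in for mapreduce.map.memory.mb
                           (or mapreduce.reduce.memory.mb)
    """
    if node_memory_mb < 0 or container_memory_mb <= 0:
        raise ValueError("memory values must be positive")
    # Integer division: a partially fitting container cannot be scheduled.
    return node_memory_mb // container_memory_mb


# Hypothetical cluster value, not taken from any real configuration.
node_memory_mb = 8192

for container_mb in (1024, 2048, 3072):
    limit = max_containers_per_node(node_memory_mb, container_mb)
    print(f"{container_mb} MB per task -> at most {limit} containers per node")
```

In other words, under these assumptions, raising the per-task memory from the script lowers how many containers can run on a node at once.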

answer Oct 21, 2014 by Vijay Shukla

Are you saying that we can't change the number of mappers per job through the script? Otherwise, when invoking through the command line or from code, I think we can. We do have the property mapreduce.job.maps.

commented Oct 21, 2014 by anonymous

What I understand so far is that in Pig you cannot decide how many mappers will run. That is determined by an optimization based on the number of files, the block sizes, etc. What you can control is the number of reducers, via the PARALLEL clause. You can certainly SET mapreduce.job.maps, but I am not sure what the effect will be. That is what I remember from the docs.
Hope this helps.
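
As a rough illustration of why the mapper count follows from the input rather than from a setting, here is a simplified estimate in Python. It assumes one map task per input split and that a file is never merged with another file; Pig can also combine small splits, which this sketch ignores, so the real count may be lower. The function name and the file sizes are hypothetical.

```python
def estimate_map_tasks(file_sizes_mb: list[int], split_size_mb: int) -> int:
    """Estimate map tasks as one task per split, splitting each file separately."""
    if split_size_mb <= 0:
        raise ValueError("split_size_mb must be positive")
    total = 0
    for size in file_sizes_mb:
        if size <= 0:
            continue  # skip empty files in this simplified model
        # Ceiling division: a trailing partial split still needs its own task.
        total += (size + split_size_mb - 1) // split_size_mb
    return total


# Hypothetical input: three files with a 128 MB split size.
print(estimate_map_tasks([300, 50, 128], 128))
```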

commented Oct 21, 2014 by anonymous
