It can be controlled by setting the mapred.min.split.size and mapred.max.split.size parameters when configuring the MapReduce job. Both values are given in bytes. For example, to process a 20 GB file with 40 mappers, each split has to be 20480 MB / 40 = 512 MB, which is 536870912 bytes. The code for that would be:
conf.set("mapred.min.split.size", "536870912");
conf.set("mapred.max.split.size", "536870912");
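
If you would rather not hard-code the byte count, the same arithmetic can be done in code. The following is only a minimal sketch: the class name SplitSizeCalc and the method splitSizeBytes are made up for illustration and are not part of Hadoop. It assumes the input is a single splittable file and rounds up so that the file is never cut into more splits than the requested number of mappers.

```java
public class SplitSizeCalc {

    // Hypothetical helper (not a Hadoop API): returns the split size in bytes
    // so that a file of fileSizeBytes yields at most `mappers` splits.
    public static long splitSizeBytes(long fileSizeBytes, int mappers) {
        if (fileSizeBytes <= 0 || mappers <= 0) {
            throw new IllegalArgumentException("file size and mapper count must be positive");
        }
        // Ceiling division: round up so a remainder does not create an extra split.
        return (fileSizeBytes + mappers - 1) / mappers;
    }

    public static void main(String[] args) {
        long fileSize = 20L * 1024 * 1024 * 1024; // 20 GB expressed in bytes
        long split = splitSizeBytes(fileSize, 40);
        System.out.println(split); // prints 536870912

        // In the job setup the value would then be passed as a string, e.g.
        // conf.set("mapred.min.split.size", Long.toString(split));
        // conf.set("mapred.max.split.size", Long.toString(split));
    }
}
```

Keep in mind that the mapper count you actually get also depends on the input format and on whether the file is splittable at all (a gzip-compressed file, for instance, is not), so treat the computed value as a target rather than a guarantee.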