This is an architectural choice. IoT applications are possible without MQTT but there are some advantages when using MQTT. If you are completely new to MQTT, take a look at this in-depth MQTT series: http://forkbomb-blog.de/2015/all-you-need-to-know-about-mqtt
Basically the main architectural advantage is publish / subscribe designed for low-latency, high throughput (mobile) communication with minimal protocol overhead (which is important if bandwidth is at a premium). You can completely decouple consumers and producers.
HDFS is the (distributed) Hadoop file system and is the foundation for Map / Reduce processing. It is not comparable to a MQTT broker. The MQTT broker could write to the HDFS, though (in case of HiveMQ with a custom plugin).
Basically MQTT is a protocol while the products you are mentioning are, well, products which solve completely different problems:
Flume is basically used for log aggregation at scale. You won't use MQTT for that, at least there is not too much advantage because this is typically done in backend applications.
Spark and Hadoop shine at Big Data crunching. They are a framework and not a ready to use solution. They are not really comparable to MQTT. Often MQTT brokers like HiveMQ are used in conjunction with these, Spark / Hadoop for data processing and HiveMQ for communication.