The output will be discrete, but what matters is what kind of prediction the question is actually asking for.
The main thing that makes this a regression problem rather than a classification problem is that the output of a classifier is limited to a set of (usually a few) predefined values or levels. Here the sales quantity can in principle be any integer value, so it makes no sense to treat values such as 14 and 15 as different classes.
It is also helpful to think of a classification output as a category, which is generally unordered. From that point of view, the difference between 14 and 15 is the same as the difference between 14 and 764: they are all just different categories. Intuitively, because we are interested in the exact quantity sold, these two errors are very different: predicting "14" when the true value is "15" is a much smaller mistake than predicting "14" when the true value is "764".
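To make the intuition concrete, here is a minimal sketch comparing how a classification-style loss and a regression-style loss score the two mistakes. The function names `zero_one_loss` and `absolute_error` are illustrative choices of mine, not something from the question, and the numbers are just the 14 / 15 / 764 example from above.

```python
def zero_one_loss(y_true, y_pred):
    """Classification view: any wrong category costs the same."""
    return 0 if y_true == y_pred else 1


def absolute_error(y_true, y_pred):
    """Regression view: the cost grows with the distance from the truth."""
    return abs(y_true - y_pred)


prediction = 14  # the model predicted 14 units sold

for actual in (15, 764):
    print(
        f"actual={actual}: "
        f"zero-one loss={zero_one_loss(actual, prediction)}, "
        f"absolute error={absolute_error(actual, prediction)}"
    )
# actual=15: zero-one loss=1, absolute error=1
# actual=764: zero-one loss=1, absolute error=750
```

The zero-one loss cannot tell the two mistakes apart, while the absolute error reflects that being off by 1 unit is far better than being off by 750.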