This should be self-evident.
You can do the preprocessing after splitting only if you obtain the same result regardless of the split you used. But what would be the advantage of doing so? In that case, just do the preprocessing first.
You should be fine if you discretize by rounding, for example float to integer, because rounding each value is unaffected by the split. If you discretize using quantiles, however, you can go wrong, because the bin boundaries are estimated from the data and the different parts will therefore be discretized differently!
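As a minimal sketch of that difference, the snippet below compares rounding with a two-bin quantile (median) discretization, applied once to the whole data and once to each part separately. The helper names `round_discretize` and `median_discretize` and the toy values are assumptions made for illustration only.

```python
from statistics import median

def round_discretize(values):
    """Rounding looks at one value at a time, never at the rest of the data."""
    return [round(v) for v in values]

def median_discretize(values):
    """Two-bin quantile discretization: 0 below the median, 1 at or above it.
    The threshold is estimated from `values`, so it depends on the subset."""
    cut = median(values)
    return [0 if v < cut else 1 for v in values]

# Hypothetical toy data, split into two parts (for example by class label).
part_a = [0.9, 1.0, 1.1, 1.2, 2.1, 2.3, 2.2]
part_b = [1.1, 1.2, 1.3, 1.9, 2.0, 2.1]
whole = part_a + part_b

# Rounding: same result whether applied before or after the split.
rounding_consistent = (
    round_discretize(whole)
    == round_discretize(part_a) + round_discretize(part_b)
)

# Median binning: each part gets its own threshold, so results differ.
quantile_consistent = (
    median_discretize(whole)
    == median_discretize(part_a) + median_discretize(part_b)
)

print("rounding consistent:", rounding_consistent)
print("quantile consistent:", quantile_consistent)
```

With these values, rounding gives the same codes either way, while the median threshold is 1.3 on the whole data but 1.2 and 1.6 on the two parts, so values such as 1.2 and 1.3 land in different bins depending on where the discretization was done.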
Let's imagine you want to discretize the data into two values, and you do so separately on each side of the split:
Input data    Type    Output value
0.9           good    1.05
1.0           good    1.05
1.1           good    1.05
1.2           good    1.05
---
2.1           good    2.20
2.3           good    2.20
2.2           good    2.20
--- SPLIT HERE ---
1.1           bad     1.20
1.2           bad     1.20
1.3           bad     1.20
---
1.9           bad     2.00
2.0           bad     2.00
2.1           bad     2.00
Because each value was replaced by the average of its cluster, both "good" and "bad" were discretized into two values. The resulting feature, however, plainly reveals the true class membership, because the averages for "good" (1.05 and 2.20) and "bad" (1.20 and 2.00) differ. The task of detecting "bad" has become a lot easier than it should be.
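The leak can be reproduced with a short sketch. It assumes a very simple clustering rule that the example does not spell out: split the sorted values at the largest gap and replace each value by the mean of its cluster, rounded to two decimals. The function name `two_cluster_means` is hypothetical.

```python
from statistics import mean

def two_cluster_means(values):
    """Split the values at the largest gap and map each one to its cluster mean.
    Assumed rule for illustration; any data-dependent discretizer behaves similarly."""
    ordered = sorted(values)
    # Find the index of the largest gap between neighbouring sorted values.
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, split = max(gaps)
    cut = (ordered[split] + ordered[split + 1]) / 2
    low_mean = round(mean(v for v in values if v < cut), 2)
    high_mean = round(mean(v for v in values if v >= cut), 2)
    return [low_mean if v < cut else high_mean for v in values]

good = [0.9, 1.0, 1.1, 1.2, 2.1, 2.3, 2.2]
bad = [1.1, 1.2, 1.3, 1.9, 2.0, 2.1]

# Discretizing each part separately: the output values identify the class.
good_sep = two_cluster_means(good)
bad_sep = two_cluster_means(bad)

# Discretizing first, then splitting: both classes share the same output values.
joint = two_cluster_means(good + bad)
good_joint, bad_joint = joint[:len(good)], joint[len(good):]

print(sorted(set(good_sep)), sorted(set(bad_sep)))
print(sorted(set(good_joint)), sorted(set(bad_joint)))
```

When each part is discretized on its own, the sets of output values for "good" and "bad" do not overlap, so a classifier only has to memorize them. When the whole data is discretized first, both classes map onto the same two values and nothing about the label leaks into the feature.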
You do not need to preprocess each part separately, and you should not want to either.