AlphaZero and Leela Chess Zero are both based on the idea that a given chess position s can be assigned both an evaluation number and a probability vector over the next move. This number and vector are read from the output neurons of a CNN. My question is: given that different chess positions s and s' have different numbers of legal moves, how is this probability vector structured if it doesn't have a fixed size?
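
To make the size mismatch concrete, here is a minimal sketch of one arrangement that would reconcile a fixed set of output neurons with a varying number of legal moves: the network emits one logit per slot in a fixed move-index space, and the slots for illegal moves are masked out before normalizing. The slot count of 8 * 8 * 73 = 4672 follows the move encoding described in the AlphaZero paper. The function name `masked_policy`, the all-zero logits, and the legal-move indices are hypothetical and only for illustration; this is not the actual code of either engine.

```python
import math

# Assumed size of the fixed move-index space (8 x 8 x 73 in the AlphaZero paper).
NUM_MOVE_SLOTS = 8 * 8 * 73


def masked_policy(logits, legal_indices):
    """Softmax restricted to legal move slots; illegal slots get probability 0."""
    legal = sorted(set(legal_indices))
    if not legal:
        raise ValueError("position has no legal moves")
    # Subtract the largest legal logit for numerical stability.
    peak = max(logits[i] for i in legal)
    weights = {i: math.exp(logits[i] - peak) for i in legal}
    total = sum(weights.values())
    probs = [0.0] * len(logits)
    for i, w in weights.items():
        probs[i] = w / total
    return probs


# Hypothetical raw network output: the same length for every position.
logits = [0.0] * NUM_MOVE_SLOTS

# Position s with 20 legal moves and position s' with 3 (indices are made up).
probs_s = masked_policy(logits, range(20))
probs_s_prime = masked_policy(logits, [5, 17, 4000])
```

Under this assumption both positions yield a vector of the same length, and only the number of non-zero entries differs between s and s'.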