Batch normalization is a technique commonly employed with very deep neural networks. As training proceeds, the parameters of earlier layers keep changing, so the distribution of inputs arriving at each later layer shifts from one mini-batch to the next, and that layer must keep chasing a moving target.
In a shallow network, however, there are few layers for these shifts to compound through, so the fluctuations stay within a narrow range and the moving-target problem is mild. For shallow neural networks you can therefore usually train without batch normalization and still get good results.
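To make the mechanics concrete, here is a minimal sketch of a batch-normalization forward pass in NumPy. The function name, the `eps` constant, and the `gamma`/`beta` parameters here are illustrative assumptions rather than the API of any particular library: each feature is normalized by its mini-batch mean and variance and then rescaled by learnable parameters, which keeps the layer's input distribution roughly stable from one mini-batch to the next.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch x of shape (batch_size, num_features).

    gamma and beta are learnable per-feature scale and shift parameters;
    eps guards against division by zero.
    """
    mu = x.mean(axis=0)                    # per-feature mean over the mini-batch
    var = x.var(axis=0)                    # per-feature variance over the mini-batch
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero-mean, unit-variance activations
    return gamma * x_hat + beta            # learned rescale restores capacity

# Illustrative usage: activations from a hidden layer with 4 features
x = np.random.randn(32, 4) * 3.0 + 5.0     # mini-batch with shifted, scaled statistics
gamma = np.ones(4)
beta = np.zeros(4)
out = batch_norm_forward(x, gamma, beta)
print(out.mean(axis=0), out.std(axis=0))   # roughly 0 means and unit stds per feature
```

At inference time, implementations typically replace the mini-batch statistics with running averages accumulated during training; that detail is omitted here to keep the sketch focused on why normalizing per mini-batch stabilizes a layer's inputs.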