Here is a concise example of generating mel-spectrograms from audio using Librosa:
The code above uses Librosa, which handles both audio loading and spectrogram generation. The mel-spectrogram is generated with librosa.feature.melspectrogram and converted to decibels for better visual representation. The n_mels parameter controls the number of mel bands, and fmax limits the maximum frequency.
The resulting spectrogram can then serve as training data for generative models such as WaveNet or Tacotron.