The three efficient techniques are as follows:
- 1.Subword Tokenization(Byte pair encoding)
- 2.Character-Level Tokenization
- 3.Use of a special Token
These methods helps in mitigating issues related to out-of-vocabulary words improving functioning of the model during inference.