Small language model
Type of artificial intelligence model
Small language models or compact language models are artificial intelligence language models designed for natural language processing tasks, including language understanding and text generation. Unlike large language models, small language models are much smaller in scale and scope.
Typically, a large language model has hundreds of billions of training parameters, with some models even exceeding a trillion. Such scale gives the model the capacity to store a large amount of information, which allows it to generate better content, but it also requires enormous computational power, making it impractical for an individual to train a large language model using just a single computer and graphics processing unit.
Small language models, on the other hand, use far fewer parameters, typically ranging from a few thousand to a few hundred million. This makes them more feasible to train and host in resource-constrained environments such as a single computer or even a mobile device.[1][2][3][4][5]
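To illustrate why parameter count matters for deployment, the following sketch estimates the memory needed to store model weights at different numeric precisions. The figures shown (1.5 billion and 175 billion parameters) are illustrative examples, and the arithmetic is a standard rough estimate (parameter count times bytes per parameter) that ignores activations, key-value caches, and runtime overhead.

```python
# Rough memory-footprint estimate for model weights alone
# (ignores activations, KV cache, and runtime overhead).

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Illustrative comparison: a 1.5B-parameter small model vs. a 175B-parameter large model.
for name, params in [("small model (1.5B)", 1.5e9), ("large model (175B)", 175e9)]:
    for prec in ("fp16", "int8", "int4"):
        print(f"{name} @ {prec}: ~{weight_memory_gb(params, prec):.1f} GB")
```

Under this estimate, a 1.5-billion-parameter model at int8 precision occupies roughly 1.5 GB, which fits comfortably on a consumer device, while a 175-billion-parameter model requires hundreds of gigabytes even at reduced precision.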
Most contemporary (2020s) small language models use the same architecture as large language models, but with a smaller parameter count and sometimes lower arithmetic precision. Parameter count is reduced by a combination of knowledge distillation and pruning, while precision can be reduced by quantization (a simplified quantization sketch follows the list below). Work on large language models mostly translates to small language models: pruning and quantization are also widely used to speed up large language models. Some notable models are:[2]
- Below 1B parameters: Llama-Prompt-Guard-2-22M (detects prompt injection and jailbreaking, based on DeBERTa-xsmall), SmolLM2-135M, SmolLM2-360M
- 1–4B parameters: Llama3.2-1B, Qwen2.5-1.5B, DeepSeek-R1-1.5B, SmolLM2-1.7B, SmolVLM-2.25B, Phi-3.5-Mini-3.8B, Phi-4-Mini-3.8B, Gemma3-4B; closed-weights ones include Gemini Nano
- 4–14B parameters: Mistral 7B, Gemma 9B, Phi-4 14B.
(Phi-4 14B is marginally "small" at best, but Microsoft does market it as a small model.)[6]
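As an illustration of the quantization step mentioned above, the following minimal sketch applies symmetric per-tensor int8 quantization to a weight matrix. It is a simplified NumPy example of the general technique, not the procedure used by any particular model; production systems typically use per-channel or per-group scales and calibration data.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: store weights as int8 plus one float scale."""
    scale = np.abs(weights).max() / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for computation."""
    return q.astype(np.float32) * scale

# Example: a small random weight matrix loses little accuracy
# but needs a quarter of the memory of float32 storage.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print("max absolute error:", np.abs(w - dequantize(q, s)).max())
```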
Language model with small pre-training dataset
Traditional AI language systems require enormous computing resources and vast amounts of data. Pre-training matters even for tiny models: their performance improves significantly when they are pre-trained, and improves further as the pre-training dataset grows. Classification accuracy also improves when the pre-training and test datasets share similar tokens. In addition, shallow architectures can approach the performance of deep models through collaborative learning.[7]
See also
References