byte pair encoding
From Wiktionary, the free dictionary
English
Alternative forms
Noun
byte pair encoding (countable and uncountable, plural byte pair encodings)
- (computing) A lossless data compression algorithm that iteratively replaces the most frequent pair of adjacent bytes in a sequence with a new byte not already present in the data.
- (natural language processing) A subword tokenization method that iteratively merges the most frequent pair of adjacent symbols (initially single characters or bytes) in a corpus into a new, longer token, typically until a predefined vocabulary size is reached.
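
Both senses rest on the same loop: count adjacent pairs, pick the most frequent one, and replace it with a single new symbol. The following Python is a minimal sketch of that loop, not a reference implementation. The function names (`most_frequent_pair`, `merge_pair`, `bpe_merges`) and the stopping rule (stop when no pair occurs more than once) are assumptions made for illustration. It works on the characters of one string rather than on raw bytes or a word-segmented corpus.

```python
from collections import Counter


def most_frequent_pair(symbols):
    """Return the most frequent adjacent pair, or None if no pair repeats."""
    counts = Counter(zip(symbols, symbols[1:]))
    if not counts:
        return None
    pair, freq = counts.most_common(1)[0]
    return pair if freq > 1 else None


def merge_pair(symbols, pair, new_symbol):
    """Replace non-overlapping occurrences of pair, scanning left to right."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2  # skip both halves of the merged pair
        else:
            out.append(symbols[i])
            i += 1
    return out


def bpe_merges(text, num_merges):
    """Apply up to num_merges merges; return the final symbols and merge list."""
    symbols = list(text)  # assumption: start from single characters
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(symbols)
        if pair is None:
            break  # assumed stopping rule: nothing repeats any more
        symbols = merge_pair(symbols, pair, pair[0] + pair[1])
        merges.append(pair)
    return symbols, merges


symbols, merges = bpe_merges("aaabdaaabac", num_merges=10)
print(symbols)
print(merges)
```

In the compression sense, each merged pair would be written out as one unused byte value together with a table recording the substitution. In the tokenization sense, the ordered merge list is what a tokenizer stores and replays on new text.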