Self-supervised learning
Machine learning paradigm / From Wikipedia, the free encyclopedia
Self-supervised learning (SSL) is a machine learning paradigm, and a family of corresponding methods, for processing unlabelled data to obtain useful representations that can help with downstream learning tasks. The defining property of SSL methods is that they require no human-annotated labels: they are designed to take in datasets consisting entirely of unlabelled samples. A typical SSL pipeline first learns supervisory signals (labels generated automatically from the data itself), which are then used for a supervised learning task in the second and later stages. SSL can therefore be described as an intermediate form between unsupervised and supervised learning.
The typical SSL method is based on an artificial neural network or another model such as a decision list.[1] The model learns in two steps. First, an auxiliary or pretext classification task is solved using pseudo-labels, which helps to initialize the model parameters.[2][3] Second, the actual task is performed with supervised or unsupervised learning.[4][5][6] Other auxiliary tasks involve completing patterns from masked inputs, such as silent pauses in speech or image regions masked in black.
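The pretext-task idea above can be sketched in a few lines. The function below is a hypothetical illustration (not from any particular library): it turns unlabelled sequences into (masked input, pseudo-label) pairs by hiding one position per sequence, so that the hidden value itself serves as the automatically generated training target.

```python
import random

def make_pretext_examples(samples, mask_token=None, seed=0):
    """Turn unlabelled sequences into pretext-task training examples.

    For each sequence, one position is masked out; the hidden value
    becomes the pseudo-label the model is trained to predict.
    Returns a list of (masked_sequence, masked_index, pseudo_label).
    """
    rng = random.Random(seed)
    pairs = []
    for seq in samples:
        i = rng.randrange(len(seq))
        masked = list(seq)
        label = masked[i]        # pseudo-label: the value to recover
        masked[i] = mask_token   # the model must fill this gap
        pairs.append((masked, i, label))
    return pairs

# Entirely unlabelled data; the labels come from the data itself.
unlabelled = [[3, 1, 4, 1, 5], [2, 7, 1, 8, 2]]
pairs = make_pretext_examples(unlabelled)
```

A model trained to predict the pseudo-label from the masked input learns internal representations that can then be reused for the actual downstream task.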
Self-supervised learning was referred to as "self-labeling" in 2013. Self-labeling generates labels from the values of the input variables themselves, for example to allow supervised learning methods to be applied to unlabelled time series.[7][8]
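As a hypothetical sketch of self-labeling on a time series (the function name and windowing scheme are illustrative assumptions, not from the cited work): each sliding window of past values is paired with a label derived from the series itself, here whether the next value rises, yielding a supervised dataset from unlabelled data.

```python
def self_label_timeseries(series, window=3):
    """Generate (window, label) pairs from an unlabelled time series.

    The label for each window is derived from the series itself:
    1 if the value immediately after the window rises, else 0.
    """
    X, y = [], []
    for t in range(len(series) - window):
        X.append(series[t:t + window])
        y.append(1 if series[t + window] > series[t + window - 1] else 0)
    return X, y

series = [1.0, 1.2, 1.1, 1.4, 1.3, 1.5]
X, y = self_label_timeseries(series)
# X holds sliding windows; y holds the automatically generated labels.
```

Any ordinary supervised classifier can then be trained on (X, y), even though no human ever labelled the series.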
Self-supervised learning has produced promising results in recent years and has found practical application in audio processing; it is used by Facebook and others for speech recognition.[9] The primary appeal of SSL is that it allows training to proceed on data of lower quality, rather than improving ultimate outcomes. Self-supervised learning also more closely imitates the way humans learn to classify objects.[10]