Supervised learning is a machine learning technique for creating a function from training data. The training data consist of pairs of input objects (typically vectors) and desired outputs. The output of the function can be a continuous value (in which case the task is called regression) or a class label for the input object (called classification). The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this, the learner has to generalize from the presented data to unseen situations in a "reasonable" way (see inductive bias). (Compare with unsupervised learning.)
Supervised learning can generate models of two types. Most commonly, supervised learning generates a global model that maps input objects to desired outputs. In some cases, however, the map is implemented as a set of local models (such as in case-based reasoning or the nearest neighbor algorithm).
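The difference between a global and a local model can be illustrated with a minimal Python sketch. The toy training set and the helper names fit_global_line and nearest_neighbor are assumptions made for this illustration only; they are not taken from any particular library.

```python
# Toy training set (hypothetical): pairs (input, desired output) from y = 2x + 1.
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def fit_global_line(pairs):
    """Global model: a single least-squares line fitted to all training pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def nearest_neighbor(pairs):
    """Local model: answer each query from the closest stored training example."""
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

global_model = fit_global_line(train)
local_model = nearest_neighbor(train)

# Far outside the training range the two kinds of model disagree:
# the line extrapolates, the nearest-neighbor rule repeats the closest output.
print(global_model(10.0), local_model(10.0))
```

The global model summarizes the whole training set in two parameters, whereas the nearest-neighbor rule keeps every example and consults only the one closest to the query.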
In order to solve a given problem of supervised learning (e.g. learning to recognize handwriting) one has to consider various steps:
The goal of supervised learning of a global model is to find a function g, given a set of points of the form (x, g(x)).
It is assumed that the set of points for which the behavior of g is known is an i.i.d. sample drawn according to an unknown probability distribution p of a larger, possibly infinite, population. Furthermore, one assumes the existence of a task-specific loss function L of type
$$L : Y \times Y \to \mathbb{R}^{\geq 0}$$
where Y is the codomain of g and L maps into the nonnegative real numbers (further restrictions may be placed on L). The quantity L(z, y) is the loss incurred by predicting z as the value of g at a given point when the true value is y.
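Two commonly used loss functions can be written as a short Python sketch. The function names squared_loss and zero_one_loss are chosen here for illustration; the text above does not prescribe a particular loss.

```python
def squared_loss(z, y):
    """Loss for regression: squared difference between prediction z and true value y."""
    return (z - y) ** 2

def zero_one_loss(z, y):
    """Loss for classification: 0 for a correct label, 1 for a wrong one."""
    return 0 if z == y else 1

# Both map a (prediction, true value) pair to a nonnegative number.
print(squared_loss(2.5, 3.0))
print(zero_one_loss("cat", "dog"))
```

Which loss is appropriate depends on the task: the squared loss grows with the size of the error, while the zero-one loss only records whether an error occurred.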
The risk associated with a function f is then defined as the expectation of the loss function, as follows:
$$R(f) = \sum_i L(f(x_i), g(x_i)) \, p(x_i)$$
if the probability distribution p is discrete (the analogous continuous case employs a definite integral and a probability density function).
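For a small discrete distribution the risk can be computed directly. The following Python sketch assumes a made-up distribution p over three inputs, a made-up target g and a zero-one loss; all of these are illustrative choices, not part of the definition.

```python
def zero_one_loss(z, y):
    """0 for a correct prediction, 1 for a wrong one."""
    return 0 if z == y else 1

def risk(f, g, p, loss):
    """Expected loss of f against the target g under a discrete distribution p.

    p is a dict mapping each input x to its probability.
    """
    return sum(loss(f(x), g(x)) * prob for x, prob in p.items())

# Hypothetical setup: three possible inputs with known probabilities.
p = {0: 0.5, 1: 0.25, 2: 0.25}
g = lambda x: x % 2   # true function: 0, 1, 0
f = lambda x: 0       # candidate that always predicts 0

# f is wrong only on x = 1, which has probability 0.25.
print(risk(f, g, p, zero_one_loss))
```

In practice p and g are unknown, so this quantity cannot be evaluated exactly; the sketch only shows what the definition computes when they are given.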
The goal is now to find a function f* among a fixed subclass of functions for which the risk R(f*) is minimal.
However, since the behavior of g is generally only known for a finite sequence of points (x1, y1), ..., (xn, yn), one can only approximate the true risk, for example with the empirical risk:
$$\tilde{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} L(f(x_i), y_i)$$
Selecting the function f* that minimizes the empirical risk is known as the principle of empirical risk minimization. Statistical learning theory investigates under what conditions empirical risk minimization is admissible and how good the approximations can be expected to be.
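Empirical risk minimization over a small, fixed class of functions can be sketched in Python as follows. The sample, the class of threshold classifiers and the helper names are assumptions made for this illustration.

```python
def zero_one_loss(z, y):
    """0 for a correct prediction, 1 for a wrong one."""
    return 0 if z == y else 1

def empirical_risk(f, sample, loss):
    """Average loss of f over the observed (x, y) pairs."""
    return sum(loss(f(x), y) for x, y in sample) / len(sample)

def make_threshold(t):
    """Candidate function: predict class 1 when x >= t, else class 0."""
    return lambda x: 1 if x >= t else 0

# Hypothetical finite sample (x1, y1), ..., (xn, yn).
sample = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1)]

# Fixed subclass of functions: one threshold classifier per candidate t.
thresholds = [0, 1, 2, 3, 4]

# Empirical risk minimization: keep the candidate with the lowest average loss.
best_t = min(
    thresholds,
    key=lambda t: empirical_risk(make_threshold(t), sample, zero_one_loss),
)
print(best_t, empirical_risk(make_threshold(best_t), sample, zero_one_loss))
```

A low empirical risk on four points says little about the true risk; how far the two can differ is the kind of question statistical learning theory addresses.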
This article is licensed under the GNU Free Documentation License.
It uses material from the article "Supervised learning".