
Winnow (algorithm)

The winnow algorithm[1] is a technique from machine learning for learning a linear classifier from labeled examples. It is very similar to the perceptron algorithm. However, the perceptron algorithm uses an additive weight-update scheme, while Winnow uses a multiplicative scheme that allows it to perform much better when many dimensions are irrelevant (hence its name winnow). It is a simple algorithm that scales well to high-dimensional data. During training, Winnow is shown a sequence of positive and negative examples. From these it learns a decision hyperplane that can then be used to label novel examples as positive or negative. The algorithm can also be used in the online learning setting, where the learning and classification phases are not clearly separated.

Algorithm

The basic algorithm, Winnow1, is as follows. The instance space is $X = \{0,1\}^n$, that is, each instance is described as a set of Boolean-valued features. The algorithm maintains non-negative weights $w_i$ for $i \in \{1, \ldots, n\}$, which are initially set to 1, one weight for each feature. When the learner is given an example $(x_1, \ldots, x_n)$, it applies the typical prediction rule for linear classifiers:

  • If $\sum_{i=1}^{n} w_i x_i > \Theta$, then predict 1
  • Otherwise, predict 0

Here $\Theta$ is a real number that is called the threshold. Together with the weights, the threshold defines a dividing hyperplane in the instance space. Good bounds are obtained if $\Theta = n/2$ (see below).

For each example with which it is presented, the learner applies the following update rule:

  • If an example is correctly classified, do nothing.
  • If an example is predicted incorrectly and the correct result was 0, then for each feature $x_i = 1$, the corresponding weight $w_i$ is set to 0 (demotion step): $\forall x_i = 1,\; w_i = 0$.
  • If an example is predicted incorrectly and the correct result was 1, then for each feature $x_i = 1$, the corresponding weight $w_i$ is multiplied by α (promotion step): $\forall x_i = 1,\; w_i = \alpha w_i$.

A typical value for α is 2.
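To make the procedure concrete, the following is a minimal Python sketch of Winnow1 as described above. The function and variable names (winnow1_predict, winnow1_update) are illustrative choices, not part of any standard library.

def winnow1_predict(weights, x, threshold):
    # Predict 1 if the weighted sum of active features exceeds the threshold.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > threshold else 0

def winnow1_update(weights, x, y, threshold, alpha=2.0):
    # Apply the Winnow1 update rule for one labeled example (x, y).
    if winnow1_predict(weights, x, threshold) == y:
        return weights  # correctly classified: do nothing
    if y == 0:
        # Demotion step: set the weights of active features to 0.
        return [0.0 if xi == 1 else w for w, xi in zip(weights, x)]
    # Promotion step: multiply the weights of active features by alpha.
    return [alpha * w if xi == 1 else w for w, xi in zip(weights, x)]

# Example run: n = 4 Boolean features, threshold n/2, weights initialized to 1.
n = 4
weights = [1.0] * n
threshold = n / 2
for x, y in [([1, 0, 1, 0], 1), ([0, 1, 0, 0], 0), ([1, 0, 0, 0], 1)]:
    weights = winnow1_update(weights, x, y, threshold)
print(weights)  # [4.0, 1.0, 2.0, 1.0]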

There are many variations on this basic approach. Winnow2[1] is similar except that in the demotion step the weights are divided by α instead of being set to 0. Balanced Winnow maintains two sets of weights, and thus two hyperplanes. This can then be generalized for multi-label classification.
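As an illustration of the first variation, a sketch under the same assumptions as the code above: Winnow2 differs from winnow1_update only in its demotion branch.

def winnow2_update(weights, x, y, threshold, alpha=2.0):
    # Identical to winnow1_update except for the demotion step.
    prediction = 1 if sum(w * xi for w, xi in zip(weights, x)) > threshold else 0
    if prediction == y:
        return weights
    if y == 0:
        # Winnow2 demotion: divide active-feature weights by alpha rather
        # than zeroing them, so a demoted feature can later be promoted back.
        return [w / alpha if xi == 1 else w for w, xi in zip(weights, x)]
    return [alpha * w if xi == 1 else w for w, xi in zip(weights, x)]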

Mistake bounds

In certain circumstances, it can be shown that the number of mistakes Winnow makes as it learns has an upper bound that is independent of the number of instances with which it is presented. If the Winnow1 algorithm uses $\alpha > 1$ and $\Theta \geq 1/\alpha$ on a target function that is a $k$-literal monotone disjunction given by $f(x_1, \ldots, x_n) = x_{i_1} \cup \cdots \cup x_{i_k}$, then for any sequence of instances the total number of mistakes is bounded by $\alpha k (\log_\alpha \Theta + 1) + \frac{n}{\Theta}$.[2]
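As a worked instance of this bound (an illustration, not taken from the cited source): with the typical choices $\alpha = 2$ and $\Theta = n/2$, the conditions $\alpha > 1$ and $\Theta \geq 1/\alpha$ hold for every $n \geq 1$, and the bound evaluates to $2k(\log_2(n/2) + 1) + \frac{n}{n/2} = 2k \log_2 n + 2$. The number of mistakes thus grows linearly in the number of relevant literals $k$ but only logarithmically in the total number of attributes $n$, which is the sense in which Winnow copes well with many irrelevant dimensions.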

References

  1. ^ a b Littlestone, Nick (1988). "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm". Machine Learning. 2 (4): 285–318.
  2. ^ Littlestone, Nick (1989). "Mistake bounds and logarithmic linear-threshold learning algorithms". Technical report UCSC-CRL-89-11, University of California, Santa Cruz.
