
Connectionist temporal classification

Connectionist temporal classification (CTC) is a type of neural network output and associated scoring function for training recurrent neural networks (RNNs), such as LSTM networks, to tackle sequence problems where the timing is variable. It can be used for tasks like on-line handwriting recognition[1] or recognizing phonemes in speech audio. CTC refers to the outputs and scoring, and is independent of the underlying neural network structure. It was introduced in 2006.[2]

The input is a sequence of observations, and the output is a sequence of labels, which can include blank outputs. Training is difficult because there are typically many more observations than labels; in speech audio, for example, multiple time slices can correspond to a single phoneme. Since the alignment of the observed sequence with the target labels is unknown, the network predicts a probability distribution at each time step.[3] A CTC network has a continuous output (e.g. softmax), which is fitted through training to model the probability of a label. CTC does not attempt to learn boundaries and timings: label sequences are considered equivalent if they differ only in alignment, ignoring blanks. An equivalent label sequence can arise from many different alignments, which makes scoring non-trivial, but there is an efficient forward–backward algorithm for computing it.
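The alignment equivalence above can be made concrete with a small sketch of the collapsing function (written B in the CTC literature): merge repeated symbols, then drop blanks. The function name and the use of "-" as the blank symbol here are illustrative choices, not part of any particular library.

```python
def collapse(alignment, blank="-"):
    """Map a frame-level alignment to its label sequence:
    first merge consecutive repeats, then remove blanks."""
    out = []
    prev = None
    for sym in alignment:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)

# Many alignments collapse to the same label sequence "cat":
assert collapse("cc-aat") == "cat"
assert collapse("c-a-t-") == "cat"
assert collapse("-caatt") == "cat"

# The blank also lets CTC emit genuinely doubled labels:
assert collapse("hel-lo") == "hello"   # blank separates the two l's
assert collapse("helllo") == "helo"    # without a blank, repeats merge
```

The last two lines show why the blank symbol is needed at all: without it, consecutive identical labels could never be distinguished from a single long label.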

CTC scores can then be used with the back-propagation algorithm to update the neural network weights.

Alternative approaches to a CTC-fitted neural network include hidden Markov models (HMMs).

References

  1. ^ Liwicki, Marcus; Graves, Alex; Bunke, Horst; Schmidhuber, Jürgen (2007). "A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks". In Proceedings of the 9th International Conference on Document Analysis and Recognition, ICDAR 2007. CiteSeerX 10.1.1.139.5852.
  2. ^ Graves, Alex; Fernández, Santiago; Gomez, Faustino; Schmidhuber, Juergen (2006). "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks". Proceedings of the International Conference on Machine Learning, ICML 2006: 369–376. CiteSeerX 10.1.1.75.6306.
  3. ^ Hannun, Awni (27 November 2017). "Sequence Modeling with CTC". Distill. 2 (11). arXiv:1508.01211. doi:10.23915/distill.00008. ISSN 2476-0757.

External links

  • Section 16.4, "CTC" in Jurafsky and Martin's Speech and Language Processing, 3rd edition
