
Jürgen Schmidhuber

Jürgen Schmidhuber (born 17 January 1963)[1] is a German computer scientist noted for his work in the field of artificial intelligence, specifically artificial neural networks. He is a scientific director of the Dalle Molle Institute for Artificial Intelligence Research in Switzerland.[2] He is also director of the Artificial Intelligence Initiative and professor of the Computer Science program in the Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE) division at the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.[3]

Jürgen Schmidhuber
Schmidhuber speaking at the AI for GOOD Global Summit in 2017
Born: 17 January 1963,[1] Munich, West Germany
Alma mater: Technical University of Munich
Known for: Long short-term memory, Gödel machine, artificial curiosity, meta-learning
Scientific career
Fields: Artificial intelligence
Institutions: Dalle Molle Institute for Artificial Intelligence Research
Website: people.idsia.ch/~juergen

He is best known for his foundational and highly cited[4] work on long short-term memory (LSTM), a neural network architecture that became the dominant technique for various natural language processing tasks in research and commercial applications in the 2010s.

Career

Schmidhuber completed his undergraduate (1987) and PhD (1991) studies at the Technical University of Munich in Munich, Germany.[1] His PhD advisors were Wilfried Brauer and Klaus Schulten.[5] He taught there from 2004 until 2009. From 2009[6] until 2021, he was a professor of artificial intelligence at the Università della Svizzera Italiana in Lugano, Switzerland.[1]

He has served as the director of the Dalle Molle Institute for Artificial Intelligence Research (IDSIA), a Swiss AI lab, since 1995.[1]

In 2014, Schmidhuber founded Nnaisense, a company working on commercial applications of artificial intelligence in fields such as finance, heavy industry and self-driving cars. Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company.[2] Sales were under US$11 million in 2016; however, Schmidhuber said the emphasis was on research rather than revenue. Nnaisense raised its first round of capital funding in January 2017. Schmidhuber's overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks.[7]

Research

In the 1980s, backpropagation did not work well for deep learning with long credit assignment paths in artificial neural networks. To overcome this problem, Schmidhuber (1991) proposed a hierarchy of recurrent neural networks (RNNs) pre-trained one level at a time by self-supervised learning.[8] The hierarchy uses predictive coding to learn internal representations at multiple self-organizing time scales, which can substantially facilitate downstream deep learning. It can be collapsed into a single RNN by distilling a higher-level chunker network into a lower-level automatizer network.[8][9] In 1993, a chunker solved a deep learning task whose depth exceeded 1000.[10]
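The gating idea behind the chunker can be shown with a toy sketch: a lower-level network tries to predict every input, and only inputs it fails to predict are passed up, so the higher level runs on a compressed, slower time scale. The following minimal Python sketch uses tiny untrained Elman-style RNN cells; all names, sizes, and the "surprise" test are illustrative assumptions, not the original 1991 system.

```python
# Minimal sketch of the chunker/automatizer gating idea; the tiny
# untrained Elman-style cells and all sizes are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

class TinyRNN:
    """Elman-style RNN cell that emits a prediction of its next input."""
    def __init__(self, n_in, n_hidden):
        self.Wx = rng.normal(0, 0.1, (n_hidden, n_in))
        self.Wh = rng.normal(0, 0.1, (n_hidden, n_hidden))
        self.Wo = rng.normal(0, 0.1, (n_in, n_hidden))
        self.h = np.zeros(n_hidden)

    def step(self, x):
        self.h = np.tanh(self.Wx @ x + self.Wh @ self.h)
        return self.Wo @ self.h  # scores for the predicted next symbol

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

n_symbols = 8
lower = TinyRNN(n_symbols, 16)  # automatizer: sees every input
upper = TinyRNN(n_symbols, 16)  # chunker: sees only unpredicted inputs

pred = np.zeros(n_symbols)
for t in rng.integers(0, n_symbols, size=50):
    x = one_hot(t, n_symbols)
    if np.argmax(pred) != t:  # lower level failed to predict this input,
        upper.step(x)         # so it is passed up: the chunker operates on
                              # a compressed, self-organizing time scale
    pred = lower.step(x)
```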

In 1991, Schmidhuber published adversarial neural networks that contest with each other in the form of a zero-sum game, where one network's gain is the other network's loss.[11][12][13] The first network is a generative model that models a probability distribution over output patterns. The second network learns by gradient descent to predict the reactions of the environment to these patterns. This was called "artificial curiosity." In 2014, this principle was used in a generative adversarial network where the environmental reaction is 1 or 0 depending on whether the first network's output is in a given set. This can be used to create realistic deepfakes.
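In the notation later standardized for generative adversarial networks (stated here as general GAN background rather than taken from the cited papers), the zero-sum game can be written as

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big],$$

where the generator G's gain is exactly the discriminator D's loss.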

Schmidhuber supervised the 1991 diploma thesis of his student Sepp Hochreiter[14] and called it "one of the most important documents in the history of machine learning".[9] It not only tested the neural history compressor[8] but also analyzed and overcame the vanishing gradient problem. This led to the deep learning method called long short-term memory (LSTM), a type of recurrent neural network. The name LSTM was introduced in a technical report (1995) that led to the most cited LSTM publication (1997), co-authored by Hochreiter and Schmidhuber.[15] The standard LSTM architecture, used in almost all current applications, was introduced in 2000 by Felix Gers, Schmidhuber, and Fred Cummins.[16] Today's "vanilla LSTM" using backpropagation through time was published with his student Alex Graves in 2005,[17][18] and its connectionist temporal classification (CTC) training algorithm[19] in 2006. CTC enabled end-to-end speech recognition with LSTM. By the 2010s, the LSTM had become the dominant technique for a variety of natural language processing tasks, including speech recognition and machine translation, and was widely implemented in commercial technologies such as Google Translate and Siri.[20] LSTM has become the most cited neural network of the 20th century.[9]
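A single step of the standard LSTM cell with a forget gate can be sketched in a few lines of Python; the random weights and toy sizes below are placeholders for illustration, not a trained model.

```python
# Minimal numpy sketch of one step of the standard LSTM cell with a
# forget gate; weights and sizes are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix and bias per gate, acting on the concatenation [h, x].
W = {g: rng.normal(0, 0.1, (n_hid, n_hid + n_in)) for g in "fiog"}
b = {g: np.zeros(n_hid) for g in "fiog"}

def lstm_step(x, h, c):
    z = np.concatenate([h, x])
    f = sigmoid(W["f"] @ z + b["f"])  # forget gate: what to erase from c
    i = sigmoid(W["i"] @ z + b["i"])  # input gate: what to write to c
    o = sigmoid(W["o"] @ z + b["o"])  # output gate: what to expose as h
    g = np.tanh(W["g"] @ z + b["g"])  # candidate cell update
    c = f * c + i * g                 # additive cell update
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # run the cell over a length-5 sequence
    h, c = lstm_step(x, h, c)
```

The additive update c = f*c + i*g is what lets error signals flow across long time lags, addressing the vanishing gradient problem analyzed in Hochreiter's thesis.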

In 2015, Rupesh Kumar Srivastava, Klaus Greff, and Schmidhuber used LSTM principles to create the Highway network, a feedforward neural network with hundreds of layers, much deeper than previous networks.[21][22] Seven months later, the ImageNet 2015 competition was won with an open-gated or gateless Highway network variant called the residual neural network,[23] which has become the most cited neural network of the 21st century.[9]
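A single Highway layer mixes a transformed signal with the unchanged input through a learned gate; a minimal sketch, assuming random placeholder weights:

```python
# Sketch of one Highway layer; the negative gate bias (favoring "carry"
# at initialization) follows common practice but is illustrative here.
import numpy as np

rng = np.random.default_rng(0)
n = 16
Wh, bh = rng.normal(0, 0.1, (n, n)), np.zeros(n)
Wt, bt = rng.normal(0, 0.1, (n, n)), np.full(n, -2.0)

def highway_layer(x):
    H = np.tanh(Wh @ x + bh)                  # the usual nonlinear transform
    T = 1.0 / (1.0 + np.exp(-(Wt @ x + bt)))  # transform gate in (0, 1)
    return T * H + (1.0 - T) * x              # carry gate is 1 - T

y = highway_layer(rng.normal(size=n))
```

The gateless special case y = H(x) + x is the residual connection used by the residual network mentioned above.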

Since 2018, transformers have overtaken the LSTM as the dominant neural network architecture in natural language processing[24] through large language models such as ChatGPT. As early as 1992, Schmidhuber published an alternative to recurrent neural networks,[25] now called a Transformer with linearized self-attention[26][27][9] (save for a normalization operator). It learns internal spotlights of attention:[28] a slow feedforward neural network learns by gradient descent to control the fast weights of another neural network through outer products of self-generated activation patterns called FROM and TO (now called key and value in self-attention).[26] This fast weight attention mapping is applied to a query pattern.
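The fast-weight mapping described above can be sketched directly in key/value/query notation; the random projection matrices below stand in for a trained slow network and are purely illustrative.

```python
# Sketch of the fast-weight attention mapping; the random projections
# are placeholders for a trained slow network.
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wk, Wv, Wq = (rng.normal(0, 0.1, (d, d)) for _ in range(3))

W_fast = np.zeros((d, d))             # fast weights start empty
for x in rng.normal(size=(10, d)):    # stream of input patterns
    k, v, q = Wk @ x, Wv @ x, Wq @ x  # slow net emits FROM (key), TO (value)
    W_fast += np.outer(v, k)          # outer-product write into fast weights
    y = W_fast @ q                    # read: apply fast weights to the query
```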

In 2011, Schmidhuber's team at IDSIA, with his postdoc Dan Ciresan, also achieved dramatic speedups of convolutional neural networks (CNNs) on graphics processing units (GPUs). An earlier GPU implementation of a CNN by Chellapilla et al. (2006) was 4 times faster than an equivalent CPU implementation.[29] The deep CNN of Ciresan et al. (2011) at IDSIA was already 60 times faster[30] and achieved the first superhuman performance in a computer vision contest in August 2011.[31] Between 15 May 2011 and 10 September 2012, their fast and deep CNNs won no fewer than four image competitions[32][33] and significantly improved on the best performance in the literature for multiple image databases.[34] The approach has become central to the field of computer vision.[33] It builds on CNN designs introduced much earlier by Yann LeCun et al. (1989),[35] who applied the backpropagation algorithm to a variant of Kunihiko Fukushima's original CNN architecture, the neocognitron,[36] later combined with J. Weng's max-pooling method.[37][33]
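For illustration, 2×2 max-pooling, the downsampling operation named above, replaces each 2×2 patch with its maximum; a minimal numpy sketch with arbitrary input values:

```python
# Toy illustration of 2x2 max-pooling; the input values are arbitrary.
import numpy as np

img = np.arange(16.0).reshape(4, 4)
pooled = img.reshape(2, 2, 2, 2).max(axis=(1, 3))  # each output entry is the
                                                   # maximum of one 2x2 patch
```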

Credit disputes

Schmidhuber has controversially argued that he and other researchers have been denied adequate recognition for their contributions to the field of deep learning, in favour of Geoffrey Hinton, Yoshua Bengio and Yann LeCun, who shared the 2018 Turing Award for their work in deep learning.[2][20][38] He wrote a "scathing" 2015 article arguing that Hinton, Bengio and LeCun "heavily cite each other" but "fail to credit the pioneers of the field".[38] In a statement to the New York Times, LeCun wrote that "Jürgen is manically obsessed with recognition and keeps claiming credit he doesn't deserve for many, many things... It causes him to systematically stand up at the end of every talk and claim credit for what was just presented, generally not in a justified manner."[2] Schmidhuber replied that LeCun did this "without any justification, without providing a single example,"[39] and published details of numerous priority disputes with Hinton, Bengio and LeCun.[40] Kory Mathewson suggested that Schmidhuber's accomplishments have been downplayed because of his personality.[20]

Recognition

Schmidhuber received the Helmholtz Award of the International Neural Network Society in 2013,[41] and the Neural Networks Pioneer Award of the IEEE Computational Intelligence Society in 2016[42] for "pioneering contributions to deep learning and neural networks."[1] He is a member of the European Academy of Sciences and Arts.[43][6]

He has been referred to as the "father of (modern) AI" or similar,[44][2][45][46][47][48][49][50][51][52][53][20] and also the "father of deep learning."[54][47] Schmidhuber himself, however, has called Alexey Grigorevich Ivakhnenko the "father of deep learning,"[55] and gives credit to many even earlier AI pioneers.[9]

Views

Schmidhuber states that "in 95% of all cases, AI research is really about our old motto, which is make human lives longer and healthier and easier."[51] He admits that "the same tools that are now being used to improve lives can be used by bad actors," but emphasizes that "they can also be used against the bad actors."[50]

He does not believe AI poses a "new quality of existential threat," and is more worried about the old nuclear warheads which can "wipe out human civilization within two hours, without any AI."[44] "A large nuclear warhead doesn’t need fancy face recognition to kill an individual. No, it simply wipes out an entire city with 10 million inhabitants."[44]

Since the 1970s, Schmidhuber has wanted to create "intelligent machines that could learn and improve on their own and become smarter than him within his lifetime."[44] He differentiates between two types of AIs: AI tools directed by humans, in particular for improving healthcare, and more interesting AIs that "are setting their own goals," inventing their own experiments and learning from them, like curious scientists. He has worked on both types for decades,[44] and has predicted that scaled-up versions of AI scientists will eventually "go where most of the physical resources are, to build more and bigger AIs." Within "a few tens of billions of years, curious self-improving AIs will colonize the visible cosmos in a way that’s infeasible for humans. Those who don’t won’t have an impact."[44] He said: "don’t think of humans as the crown of creation. Instead, view human civilization as part of a much grander scheme, an important step (but not the last one) on the path of the universe from very simple initial conditions toward more and more unfathomable complexity. Now it seems ready to take its next step, a step comparable to the invention of life itself over 3.5 billion years ago."[44]

He strongly supports the open-source movement, and thinks it is going to "challenge whatever big-tech dominance there might be at the moment," also because AI keeps getting 100 times cheaper per decade.[44]

References

  1. ^ a b c d e f g Schmidhuber, Jürgen. "Curriculum Vitae".
  2. ^ a b c d e John Markoff (27 November 2016). When A.I. Matures, It May Call Jürgen Schmidhuber ‘Dad’. The New York Times. Accessed April 2017.
  3. ^ "Jürgen Schmidhuber". cemse.kaust.edu.sa. Archived from the original on 13 March 2023. Retrieved 9 May 2023.
  4. ^ "Juergen Schmidhuber". scholar.google.com. Retrieved 20 October 2021.
  5. ^ "Jürgen H. Schmidhuber". The Mathematics Genealogy Project. Retrieved 5 July 2022.
  6. ^ a b Dave O'Leary (3 October 2016). The Present and Future of AI and Deep Learning Featuring Professor Jürgen Schmidhuber. IT World Canada. Accessed April 2017.
  7. ^ "AI Pioneer Wants to Build the Renaissance Machine of the Future". Bloomberg.com. 16 January 2017. Retrieved 23 February 2018.
  8. ^ a b c Schmidhuber, Jürgen (1992). "Learning complex, extended sequences using the principle of history compression (based on TR FKI-148, 1991)" (PDF). Neural Computation. 4 (2): 234–242. doi:10.1162/neco.1992.4.2.234. S2CID 18271205.
  9. ^ a b c d e f Schmidhuber, Juergen (2022). "Annotated History of Modern AI and Deep Learning". arXiv:2212.11279 [cs.NE].
  10. ^ Schmidhuber, Jürgen (1993). Habilitation Thesis (PDF).
  11. ^ Schmidhuber, Jürgen (1991). "A possibility for implementing curiosity and boredom in model-building neural controllers". Proc. SAB'1991. MIT Press/Bradford Books. pp. 222–227.
  12. ^ Schmidhuber, Jürgen (2010). "Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010)". IEEE Transactions on Autonomous Mental Development. 2 (3): 230–247. doi:10.1109/TAMD.2010.2056368. S2CID 234198.
  13. ^ Schmidhuber, Jürgen (2020). "Generative Adversarial Networks are Special Cases of Artificial Curiosity (1990) and also Closely Related to Predictability Minimization (1991)". Neural Networks. 127: 58–66. arXiv:1906.04493. doi:10.1016/j.neunet.2020.04.008. PMID 32334341. S2CID 216056336.
  14. ^ S. Hochreiter, "Untersuchungen zu dynamischen neuronalen Netzen" Archived 2015-03-06 at the Wayback Machine. Diploma thesis, Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber, 1991.
  15. ^ Sepp Hochreiter; Jürgen Schmidhuber (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276. S2CID 1915014.
  16. ^ Felix A. Gers; Jürgen Schmidhuber; Fred Cummins (2000). "Learning to Forget: Continual Prediction with LSTM". Neural Computation. 12 (10): 2451–2471. CiteSeerX 10.1.1.55.5709. doi:10.1162/089976600300015015. PMID 11032042. S2CID 11598600.
  17. ^ Graves, A.; Schmidhuber, J. (2005). "Framewise phoneme classification with bidirectional LSTM and other neural network architectures". Neural Networks. 18 (5–6): 602–610. CiteSeerX 10.1.1.331.5800. doi:10.1016/j.neunet.2005.06.042. PMID 16112549. S2CID 1856462.
  18. ^ Klaus Greff; Rupesh Kumar Srivastava; Jan Koutník; Bas R. Steunebrink; Jürgen Schmidhuber (2015). "LSTM: A Search Space Odyssey". IEEE Transactions on Neural Networks and Learning Systems. 28 (10): 2222–2232. arXiv:1503.04069. Bibcode:2015arXiv150304069G. doi:10.1109/TNNLS.2016.2582924. PMID 27411231. S2CID 3356463.
  19. ^ Graves, Alex; Fernández, Santiago; Gomez, Faustino; Schmidhuber, Juergen (2006). "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks". In Proceedings of the International Conference on Machine Learning, ICML 2006: 369–376. CiteSeerX 10.1.1.75.6306.
  20. ^ a b c d Vance, Ashlee (15 May 2018). "This Man Is the Godfather the AI Community Wants to Forget". Bloomberg Business Week. Retrieved 16 January 2019.
  21. ^ Srivastava, Rupesh Kumar; Greff, Klaus; Schmidhuber, Jürgen (2 May 2015). "Highway Networks". arXiv:1505.00387 [cs.LG].
  22. ^ Srivastava, Rupesh K; Greff, Klaus; Schmidhuber, Juergen (2015). "Training Very Deep Networks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 28: 2377–2385.
  23. ^ He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE. pp. 770–778. arXiv:1512.03385. doi:10.1109/CVPR.2016.90. ISBN 978-1-4673-8851-1.
  24. ^ Manning, Christopher D. (2022). "Human Language Understanding & Reasoning". Daedalus. 151 (2): 127–138. doi:10.1162/daed_a_01905. S2CID 248377870.
  25. ^ Schmidhuber, Jürgen (1 November 1992). "Learning to control fast-weight memories: an alternative to recurrent nets". Neural Computation. 4 (1): 131–139. doi:10.1162/neco.1992.4.1.131. S2CID 16683347.
  26. ^ a b Schlag, Imanol; Irie, Kazuki; Schmidhuber, Jürgen (2021). "Linear Transformers Are Secretly Fast Weight Programmers". ICML 2021. Springer. pp. 9355–9366.
  27. ^ Choromanski, Krzysztof; Likhosherstov, Valerii; Dohan, David; Song, Xingyou; Gane, Andreea; Sarlos, Tamas; Hawkins, Peter; Davis, Jared; Mohiuddin, Afroz; Kaiser, Lukasz; Belanger, David; Colwell, Lucy; Weller, Adrian (2020). "Rethinking Attention with Performers". arXiv:2009.14794 [cs.CL].
  28. ^ Schmidhuber, Jürgen (1993). "Reducing the ratio between learning complexity and number of time-varying variables in fully recurrent nets". ICANN 1993. Springer. pp. 460–463.
  29. ^ Kumar Chellapilla; Sid Puri; Patrice Simard (2006). "High Performance Convolutional Neural Networks for Document Processing". In Lorette, Guy (ed.). Tenth International Workshop on Frontiers in Handwriting Recognition. Suvisoft.
  30. ^ Ciresan, Dan; Ueli Meier; Jonathan Masci; Luca M. Gambardella; Jurgen Schmidhuber (2011). "Flexible, High Performance Convolutional Neural Networks for Image Classification" (PDF). Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence-Volume Volume Two. 2: 1237–1242. Retrieved 17 November 2013.
  31. ^ "IJCNN 2011 Competition result table". OFFICIAL IJCNN2011 COMPETITION. 2010. Retrieved 14 January 2019.
  32. ^ Schmidhuber, Jürgen (17 March 2017). "History of computer vision contests won by deep CNNs on GPU". Retrieved 14 January 2019.
  33. ^ a b c Schmidhuber, Jürgen (2015). "Deep Learning". Scholarpedia. 10 (11): 1527–54. CiteSeerX 10.1.1.76.1541. doi:10.1162/neco.2006.18.7.1527. PMID 16764513. S2CID 2309950.
  34. ^ Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen (June 2012). "Multi-column deep neural networks for image classification". 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers (IEEE). pp. 3642–3649. arXiv:1202.2745. CiteSeerX 10.1.1.300.3283. doi:10.1109/CVPR.2012.6248110. ISBN 978-1-4673-1226-4. OCLC 812295155. S2CID 2161592.
  35. ^ Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation Applied to Handwritten Zip Code Recognition; AT&T Bell Laboratories
  36. ^ Fukushima, Neocognitron (1980). "A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". Biological Cybernetics. 36 (4): 193–202. doi:10.1007/bf00344251. PMID 7370364. S2CID 206775608.
  37. ^ Weng, J; Ahuja, N; Huang, TS (1993). "Learning recognition and segmentation of 3-D objects from 2-D images". Proc. 4th International Conf. Computer Vision: 121–128.
  38. ^ a b Oltermann, Philip (18 April 2017). "Jürgen Schmidhuber on the robot future: 'They will pay as much attention to us as we do to ants'". The Guardian. Retrieved 23 February 2018.
  39. ^ Schmidhuber, Juergen (7 July 2022). "LeCun's 2022 paper on autonomous machine intelligence rehashes but does not cite essential work of 1990-2015". IDSIA, Switzerland. Archived from the original on 9 February 2023. Retrieved 3 May 2023.
  40. ^ Schmidhuber, Juergen (30 December 2022). "Scientific Integrity and the History of Deep Learning: The 2021 Turing Lecture, and the 2018 Turing Award". Technical Report IDSIA-77-21. IDSIA, Switzerland. Archived from the original on 7 April 2023. Retrieved 3 May 2023.
  41. ^ INNS Awards Recipients. International Neural Network Society. Accessed December 2016.
  42. ^ Recipients: Neural Networks Pioneer Award. Piscataway, NJ: IEEE Computational Intelligence Society. Accessed January 2019.
  43. ^ Members. European Academy of Sciences and Arts. Accessed December 2016.
  44. ^ a b c d e f g h Jones, Hessie (23 May 2023). "Juergen Schmidhuber, Renowned 'Father Of Modern AI,' Says His Life's Work Won't Lead To Dystopia". Forbes. Retrieved 26 May 2023.
  45. ^ Heaven, Will Douglas (15 October 2020). "Artificial general intelligence: Are we close, and does it even make sense to try? Quote: Jürgen Schmidhuber—sometimes called "the father of modern AI..." MIT Technology Review. Retrieved 20 August 2021.
  46. ^ Choul-woong, Yeon (22 February 2023). "User Centric AI Creates a New Order for Users". Korea IT Times. Retrieved 26 May 2023.
  47. ^ a b Dunker, Anders (2020). "Letting loose the AI demon. Quote: But this man is no crackpot: He is the father of modern AI and deep learning – foremost in his field". Modern Times Review. Retrieved 20 August 2021.
  48. ^ Enrique Alpanes (25 April 2021). Jürgen Schmidhuber, el hombre al que Alexa y Siri llamarían ‘papá’ si él quisiera hablar con ellas. El Pais. Accessed August 2021.
  49. ^ Razavi, Hooman (5 May 2020). "iHuman- AI & Ethics of Cinema (2020 Hot Docs Film Festival). Quote: The documentary interviews range AI top researchers and thinkers as Jürgen Schmidhuber - Father of Modern AI..." Universal Cinema. Retrieved 20 August 2021.
  50. ^ a b Colton, Emma (7 May 2023). "'Father of AI' says tech fears misplaced: 'You cannot stop it'". Fox News. Retrieved 26 May 2023.
  51. ^ a b Taylor, Josh (7 May 2023). "Rise of artificial intelligence is inevitable but should not be feared, 'father of AI' says". The Guardian. Retrieved 26 May 2023.
  52. ^ Wong, Andrew (16 May 2018). "The 'father of A.I' urges humans not to fear the technology". CNBC. Retrieved 27 February 2019.
  53. ^ Ruth Fulterer (21 February 2021). Der unbequeme Vater der künstlichen Intelligenz lebt in der Schweiz (The inconvenient father of AI lives in Switzerland). NZZ. Accessed August 2021.
  54. ^ Wang, Brian (14 June 2017). "Father of deep learning AI on General purpose AI and AI to conquer space in the 2050s". Next Big Future. Retrieved 27 February 2019.
  55. ^ Schmidhuber, Jurgen. "Critique of Paper by "Deep Learning Conspiracy". (Nature 521 p 436)". Retrieved 26 December 2019.
