Foundation models

A foundation model is a large artificial intelligence model trained on a vast quantity of unlabeled data at scale (usually by self-supervised learning), so that it can be adapted to a wide range of downstream tasks.[1][2] Foundation models have helped bring about a major transformation in how AI systems are built since their introduction in 2018. Early examples of foundation models were large pre-trained language models, including BERT[3] and GPT-3. Using the same ideas, domain-specific models built on sequences of other kinds of tokens, such as medical codes, have been developed as well.[4] Subsequently, several multimodal foundation models have been produced, including DALL-E, Flamingo,[5] and Florence.[6] The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) popularized the term.[1]
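
The pre-train-then-adapt workflow described above can be illustrated in a few lines of code. The sketch below is illustrative only: it assumes the Hugging Face transformers and PyTorch libraries, and the checkpoint name, label count, and single training example are placeholders rather than part of any cited source.

    # Minimal sketch of adapting (fine-tuning) a pre-trained foundation
    # model to a downstream task. Assumes the Hugging Face "transformers"
    # library and PyTorch; the checkpoint and data are illustrative.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)  # adds a new, randomly initialized task head

    # One labeled downstream example; a real run would loop over a dataset.
    batch = tokenizer("The movie was great", return_tensors="pt")
    labels = torch.tensor([1])

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    loss = model(**batch, labels=labels).loss  # cross-entropy on the new head
    loss.backward()
    optimizer.step()  # updates both the head and the pre-trained weights

The point of the pattern is that the expensive pre-training step is done once, while adaptation to each new task reuses the same weights.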

Definitions

The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) coined the term foundation model to refer to "any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks".[7] The approach is not a new technique in itself, as it rests on deep neural networks and self-supervised learning, but, the Stanford group argues, the scale at which it has been developed in recent years and the potential for a single model to serve many different purposes warrant a new term.[7]
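
Because "self-supervision at scale" is central to this definition, a toy example may help: in masked-language-model pre-training, the labels are derived from the unlabeled text itself rather than from human annotation. The sketch below is a simplified illustration in plain Python; the token IDs and the 15% masking rate echo BERT-style training but are assumptions, not a specification.

    # Toy illustration of self-supervised label construction: hide some
    # tokens and ask the model to reconstruct them, so the raw text
    # supplies its own training signal. IDs and rate are illustrative.
    import random

    def mask_tokens(token_ids, mask_id, mask_prob=0.15):
        """Return (inputs, labels); labels keep the original ID at masked
        positions and use -100 (a common 'ignore' index) elsewhere."""
        inputs, labels = [], []
        for tok in token_ids:
            if random.random() < mask_prob:
                inputs.append(mask_id)   # the model never sees this token
                labels.append(tok)       # ...but must predict it
            else:
                inputs.append(tok)
                labels.append(-100)      # no loss at unmasked positions
        return inputs, labels

    inputs, labels = mask_tokens([101, 2023, 2003, 1037, 7953, 102], mask_id=103)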

Foundation models represent a "paradigm for building AI systems" in which a model trained on a large amount of unlabeled data can be adapted to many applications.[8][9] They are "designed to be adapted (e.g., finetuned) to various downstream cognitive tasks by pre-training on broad data at scale".[10]

Key characteristics of foundation models are emergence and homogenization.[7] Because the training data is not labeled by humans, the model's capabilities emerge rather than being explicitly encoded, so properties that were not anticipated can appear. For example, a model trained on a large language dataset might learn to generate stories of its own, or to do arithmetic, without being explicitly programmed to do so.[11] Homogenization means that the same method is used across many domains, which allows for powerful advances but also creates the possibility of "single points of failure".[7]

Opportunities and risks

A 2021 arXiv report listed foundation models' capabilities with regard to "language, vision, robotics, reasoning, and human interaction"; their technical principles, such as "model architectures, training procedures, data, systems, security, evaluation, and theory"; their applications, for example in law, healthcare, and education; and their potential impact on society, including "inequity, misuse, economic and environmental impact, legal and ethical considerations".[7]

An article about foundation models in The Economist notes that "some worry that the technology’s heedless spread will further concentrate economic and political power".[11]

References

  1. ^ a b "Introducing the Center for Research on Foundation Models (CRFM)". Stanford HAI. Retrieved 11 June 2022.
  2. ^ Goldman, Sharon (2022-09-13). "Foundation models: 2022's AI paradigm shift". VentureBeat. Retrieved 2022-10-24.
  3. ^ Rogers, Anna; Kovaleva, Olga; Rumshisky, Anna (2020). "A Primer in BERTology: What we know about how BERT works". arXiv:2002.12327 [cs.CL].
  4. ^ Steinberg, Ethan; Jung, Ken; Fries, Jason A.; Corbin, Conor K.; Pfohl, Stephen R.; Shah, Nigam H. (January 2021). "Language models are an effective representation learning technique for electronic health record data". Journal of Biomedical Informatics. 113: 103637. doi:10.1016/j.jbi.2020.103637. ISSN 1532-0480. PMC 7863633. PMID 33290879.
  5. ^ Tackling multiple tasks with a single visual language model, 28 April 2022, retrieved 13 June 2022
  6. ^ Yuan, Lu; Chen, Dongdong; Chen, Yi-Ling; Codella, Noel; Dai, Xiyang; Gao, Jianfeng; Hu, Houdong; Huang, Xuedong; Li, Boxin; Li, Chunyuan; Liu, Ce; Liu, Mengchen; Liu, Zicheng; Lu, Yumao; Shi, Yu; Wang, Lijuan; Wang, Jianfeng; Xiao, Bin; Xiao, Zhen; Yang, Jianwei; Zeng, Michael; Zhou, Luowei; Zhang, Pengchuan (2022). "Florence: A New Foundation Model for Computer Vision". arXiv:2111.11432 [cs.CV].
  7. ^ a b c d e Bommasani, Rishi; Hudson, Drew A.; Adeli, Ehsan; Altman, Russ; Arora, Simran; von Arx, Sydney; Bernstein, Michael S.; Bohg, Jeannette; Bosselut, Antoine; Brunskill, Emma; Brynjolfsson, Erik; Buch, Shyamal; Card, Dallas; Castellon, Rodrigo; Chatterji, Niladri; Chen, Annie; Creel, Kathleen; Davis, Jared Quincy; Demszky, Dora; Donahue, Chris; Doumbouya, Moussa; Durmus, Esin; Ermon, Stefano; Etchemendy, John; Ethayarajh, Kawin; Fei-Fei, Li; Finn, Chelsea; Gale, Trevor; Gillespie, Lauren; Goel, Karan; Goodman, Noah; Grossman, Shelby; Guha, Neel; Hashimoto, Tatsunori; Henderson, Peter; Hewitt, John; Ho, Daniel E.; Hong, Jenny; Hsu, Kyle; Huang, Jing; Icard, Thomas; Jain, Saahil; Jurafsky, Dan; Kalluri, Pratyusha; Karamcheti, Siddharth; Keeling, Geoff; Khani, Fereshte; Khattab, Omar; Koh, Pang Wei; Krass, Mark; Krishna, Ranjay; Kuditipudi, Rohith; Kumar, Ananya; Ladhak, Faisal; Lee, Mina; Lee, Tony; Leskovec, Jure; Levent, Isabelle; Li, Xiang Lisa; Li, Xuechen; Ma, Tengyu; Malik, Ali; Manning, Christopher D.; Mirchandani, Suvir; Mitchell, Eric; Munyikwa, Zanele; Nair, Suraj; Narayan, Avanika; Narayanan, Deepak; Newman, Ben; Nie, Allen; Niebles, Juan Carlos; Nilforoshan, Hamed; Nyarko, Julian; Ogut, Giray; Orr, Laurel; Papadimitriou, Isabel; Park, Joon Sung; Piech, Chris; Portelance, Eva; Potts, Christopher; Raghunathan, Aditi; Reich, Rob; Ren, Hongyu; Rong, Frieda; Roohani, Yusuf; Ruiz, Camilo; Ryan, Jack; Ré, Christopher; Sadigh, Dorsa; Sagawa, Shiori; Santhanam, Keshav; Shih, Andy; Srinivasan, Krishnan; Tamkin, Alex; Taori, Rohan; Thomas, Armin W.; Tramèr, Florian; Wang, Rose E.; Wang, William; Wu, Bohan; Wu, Jiajun; Wu, Yuhuai; Xie, Sang Michael; Yasunaga, Michihiro; You, Jiaxuan; Zaharia, Matei; Zhang, Michael; Zhang, Tianyi; Zhang, Xikun; Zhang, Yuhui; Zheng, Lucia; Zhou, Kaitlyn; Liang, Percy (18 August 2021). On the Opportunities and Risks of Foundation Models (Report). arXiv:2108.07258. Retrieved 10 June 2022.
  8. ^ "Stanford CRFM". Retrieved 10 June 2022.
  9. ^ "What are foundation models?". IBM Research Blog. 9 February 2021. Retrieved 10 June 2022.
  10. ^ Fei, Nanyi; Lu, Zhiwu; Gao, Yizhao; Yang, Guoxing; Huo, Yuqi; Wen, Jingyuan; Lu, Haoyu; Song, Ruihua; Gao, Xin; Xiang, Tao; Sun, Hao; Wen, Ji-Rong (December 2022). "Towards artificial general intelligence via a multimodal foundation model". Nature Communications. 13 (1): 3094. doi:10.1038/s41467-022-30761-2. ISSN 2041-1723. PMC 9163040. PMID 35655064.
  11. ^ a b "Huge "foundation models" are turbo-charging AI progress". The Economist. ISSN 0013-0613. Retrieved 2022-10-24.
