fbpx
Wikipedia

Topic model

In statistics and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovery of hidden semantic structures in a text body. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear in documents about cats, and "the" and "is" will appear approximately equally in both. A document typically concerns multiple topics in different proportions; thus, in a document that is 10% about cats and 90% about dogs, there would probably be about 9 times more dog words than cat words. The "topics" produced by topic modeling techniques are clusters of similar words. A topic model captures this intuition in a mathematical framework, which allows examining a set of documents and discovering, based on the statistics of the words in each, what the topics might be and what each document's balance of topics is.

Topic models are also referred to as probabilistic topic models, which refers to statistical algorithms for discovering the latent semantic structures of an extensive text body. In the age of information, the amount of the written material we encounter each day is simply beyond our processing capacity. Topic models can help to organize and offer insights for us to understand large collections of unstructured text bodies. Originally developed as a text-mining tool, topic models have been used to detect instructive structures in data such as genetic information, images, and networks. They also have applications in other fields such as bioinformatics[1] and computer vision.[2]

History Edit

An early topic model was described by Papadimitriou, Raghavan, Tamaki and Vempala in 1998.[3] Another one, called probabilistic latent semantic analysis (PLSA), was created by Thomas Hofmann in 1999.[4] Latent Dirichlet allocation (LDA), perhaps the most common topic model currently in use, is a generalization of PLSA. Developed by David Blei, Andrew Ng, and Michael I. Jordan in 2002, LDA introduces sparse Dirichlet prior distributions over document-topic and topic-word distributions, encoding the intuition that documents cover a small number of topics and that topics often use a small number of words.[5] Other topic models are generally extensions on LDA, such as Pachinko allocation, which improves on LDA by modeling correlations between topics in addition to the word correlations which constitute topics. Hierarchical latent tree analysis () is an alternative to LDA, which models word co-occurrence using a tree of latent variables and the states of the latent variables, which correspond to soft clusters of documents, are interpreted as topics.

Animation of the topic detection process in a document-word matrix through biclustering. Every column corresponds to a document, every row to a word. A cell stores the frequency of a word in a document, with dark cells indicating high word frequencies. This procedure groups documents, which use similar words, as it groups words occurring in a similar set of documents. Such groups of words are then called topics. More usual topic models, such as LDA, only group documents, based on a more sophisticated and probabilistic mechanism.[6]

Topic models for context information Edit

Approaches for temporal information include Block and Newman's determination of the temporal dynamics of topics in the Pennsylvania Gazette during 1728–1800. Griffiths & Steyvers used topic modeling on abstracts from the journal PNAS to identify topics that rose or fell in popularity from 1991 to 2001 whereas Lamba & Madhusushan [7] used topic modeling on full-text research articles retrieved from DJLIT journal from 1981 to 2018. In the field of library and information science, Lamba & Madhusudhan [7][8][9][10] applied topic modeling on different Indian resources like journal articles and electronic theses and resources (ETDs). Nelson [11] has been analyzing change in topics over time in the Richmond Times-Dispatch to understand social and political changes and continuities in Richmond during the American Civil War. Yang, Torget and Mihalcea applied topic modeling methods to newspapers from 1829 to 2008. Mimno used topic modelling with 24 journals on classical philology and archaeology spanning 150 years to look at how topics in the journals change over time and how the journals become more different or similar over time.

Yin et al.[12] introduced a topic model for geographically distributed documents, where document positions are explained by latent regions which are detected during inference.

Chang and Blei[13] included network information between linked documents in the relational topic model, to model the links between websites.

The author-topic model by Rosen-Zvi et al.[14] models the topics associated with authors of documents to improve the topic detection for documents with authorship information.

HLTA was applied to a collection of recent research papers published at major AI and Machine Learning venues. The resulting model is called The AI Tree. The resulting topics are used to index the papers at aipano.cse.ust.hk to help researchers track research trends and identify papers to read, and help conference organizers and journal editors identify reviewers for submissions.

To improve the qualitative aspects and coherency of generated topics, some researchers have explored the efficacy of "coherence scores", or otherwise how computer-extracted clusters (i.e. topics) align with a human benchmark.[15][16] Coherence scores are metrics for optimising the number of topics to extract from a document corpus.[17]

Algorithms Edit

In practice, researchers attempt to fit appropriate model parameters to the data corpus using one of several heuristics for maximum likelihood fit. A survey by D. Blei describes this suite of algorithms.[18] Several groups of researchers starting with Papadimitriou et al.[3] have attempted to design algorithms with provable guarantees. Assuming that the data were actually generated by the model in question, they try to design algorithms that probably find the model that was used to create the data. Techniques used here include singular value decomposition (SVD) and the method of moments. In 2012 an algorithm based upon non-negative matrix factorization (NMF) was introduced that also generalizes to topic models with correlations among topics.[19]

In 2018 a new approach to topic models was proposed: it is based on stochastic block model[20]

Applications of topic models Edit

To quantitative biomedicine Edit

Topic models are being used also in other contexts. For examples uses of topic models in biology and bioinformatics research emerged.[21] Recently topic models has been used to extract information from dataset of cancers' genomic samples.[22] In this case topics are biological latent variables to be inferred.

To analysis of music and creativity Edit

Topic models can be used for analysis of continuous signals like music. For instance, they were used to quantify how musical styles change in time, and identify the influence of specific artists on later music creation. [23]

See also Edit

References Edit

  1. ^ Blei, David (April 2012). "Probabilistic Topic Models". Communications of the ACM. 55 (4): 77–84. doi:10.1145/2133806.2133826. S2CID 753304.
  2. ^ Cao, Liangliang, and Li Fei-Fei. "Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes." 2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007.
  3. ^ a b Papadimitriou, Christos; Raghavan, Prabhakar; Tamaki, Hisao; Vempala, Santosh (1998). "Latent semantic indexing" (Postscript). Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '98. pp. 159–168. doi:10.1145/275487.275505. ISBN 978-0897919968. S2CID 1479546.{{cite book}}: CS1 maint: date and year (link)
  4. ^ Hofmann, Thomas (1999). (PDF). Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval. Archived from the original (PDF) on 2010-12-14.
  5. ^ Blei, David M.; Ng, Andrew Y.; Jordan, Michael I; Lafferty, John (January 2003). "Latent Dirichlet allocation". Journal of Machine Learning Research. 3: 993–1022. doi:10.1162/jmlr.2003.3.4-5.993.
  6. ^ http://topicmodels.west.uni-koblenz.de/ckling/tmt/svd_ap.html
  7. ^ a b Lamba, Manika jun (2019). "Mapping of topics in DESIDOC Journal of Library and Information Technology, India: a study". Scientometrics. 120 (2): 477–505. doi:10.1007/s11192-019-03137-5. ISSN 0138-9130. S2CID 174802673.
  8. ^ Lamba, Manika jun (2019). "Metadata Tagging and Prediction Modeling: Case Study of DESIDOC Journal of Library and Information Technology (2008-2017)". World Digital Libraries. 12: 33–89. doi:10.18329/09757597/2019/12103 (inactive 1 August 2023). ISSN 0975-7597.{{cite journal}}: CS1 maint: DOI inactive as of August 2023 (link)
  9. ^ Lamba, Manika may (2019). "Author-Topic Modeling of DESIDOC Journal of Library and Information Technology (2008-2017), India". Library Philosophy and Practice.
  10. ^ Lamba, Manika sep (2018). Metadata Tagging of Library and Information Science Theses: Shodhganga (2013-2017) (PDF). ETD2018:Beyond the boundaries of Rims and Oceans. Taiwan,Taipei.
  11. ^ Nelson, Rob. "Mining the Dispatch". Mining the Dispatch. Digital Scholarship Lab, University of Richmond. Retrieved 26 March 2021.
  12. ^ Yin, Zhijun (2011). "Geographical topic discovery and comparison". Proceedings of the 20th international conference on World wide web. pp. 247–256. doi:10.1145/1963405.1963443. ISBN 9781450306324. S2CID 17883132.{{cite book}}: CS1 maint: date and year (link)
  13. ^ Chang, Jonathan (2009). "Relational Topic Models for Document Networks" (PDF). Aistats. 9: 81–88.
  14. ^ Rosen-Zvi, Michal (2004). "The author-topic model for authors and documents". Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence: 487–494. arXiv:1207.4169.
  15. ^ Nikolenko, Sergey (2017). "Topic modelling for qualitative studies". Journal of Information Science. 43: 88–102. doi:10.1177/0165551515617393. S2CID 30657489.
  16. ^ Reverter-Rambaldi, Marcel (2022). Topic Modelling in Spontaneous Speech Data (Honours thesis). Australian National University. doi:10.25911/M1YF-ZF55.
  17. ^ Newman, David (2010). "Automatic evaluation of topic coherence". Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics: 100–108.
  18. ^ Blei, David M. (April 2012). "Introduction to Probabilistic Topic Models" (PDF). Comm. ACM. 55 (4): 77–84. doi:10.1145/2133806.2133826. S2CID 753304.
  19. ^ Sanjeev Arora; Rong Ge; Ankur Moitra (April 2012). "Learning Topic Models—Going beyond SVD". arXiv:1204.1956 [cs.LG].
  20. ^ Martin Gerlach; Tiago Pexioto; Eduardo Altmann (2018). "A network approach to topic models". Science Advances. 4 (7): eaaq1360. arXiv:1708.01677. Bibcode:2018SciA....4.1360G. doi:10.1126/sciadv.aaq1360. PMC 6051742. PMID 30035215.
  21. ^ Liu, L.; Tang, L.; et al. (2016). "An overview of topic modeling and its current applications in bioinformatics". SpringerPlus. 5 (1): 1608. doi:10.1186/s40064-016-3252-8. PMC 5028368. PMID 27652181. S2CID 16712827.
  22. ^ Valle, F.; Osella, M.; Caselle, M. (2020). "A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data". Cancers. 12 (12): 3799. doi:10.3390/cancers12123799. PMC 7766023. PMID 33339347. S2CID 229325007.
  23. ^ Shalit, Uri; Weinshall, Daphna; Chechik, Gal (2013-05-13). "Modeling Musical Influence with Topic Models". Proceedings of the 30th International Conference on Machine Learning. PMLR: 244–252.

Further reading Edit

  • Steyvers, Mark; Griffiths, Tom (2007). . In Landauer, T.; McNamara, D; Dennis, S.; et al. (eds.). Handbook of Latent Semantic Analysis (PDF). Psychology Press. ISBN 978-0-8058-5418-3. Archived from the original (PDF) on 2013-06-24.
  • Blei, D.M.; Lafferty, J.D. (2009). "Topic Models" (PDF).
  • Blei, D.; Lafferty, J. (2007). "A correlated topic model of Science". Annals of Applied Statistics. 1 (1): 17–35. arXiv:0708.3601. doi:10.1214/07-AOAS114. S2CID 8872108.
  • Mimno, D. (April 2012). "Computational Historiography: Data Mining in a Century of Classics Journals" (PDF). Journal on Computing and Cultural Heritage. 5 (1): 1–19. doi:10.1145/2160165.2160168. S2CID 12153151.
  • Marwick, Ben (2013). "Discovery of Emergent Issues and Controversies in Anthropology Using Text Mining, Topic Modeling, and Social Network Analysis of Microblog Content". In Yanchang, Zhao; Yonghua, Cen (eds.). Data Mining Applications with R. Elsevier. pp. 63–93.
  • Jockers, M. 2010 Who's your DH Blog Mate: Match-Making the Day of DH Bloggers with Topic Modeling Matthew L. Jockers, posted 19 March 2010
  • Drouin, J. 2011 Foray Into Topic Modeling Ecclesiastical Proust Archive. posted 17 March 2011
  • Templeton, C. 2011 Topic Modeling in the Humanities: An Overview Maryland Institute for Technology in the Humanities Blog. posted 1 August 2011
  • Griffiths, T.; Steyvers, M. (2004). "Finding scientific topics". Proceedings of the National Academy of Sciences. 101 (Suppl 1): 5228–35. Bibcode:2004PNAS..101.5228G. doi:10.1073/pnas.0307752101. PMC 387300. PMID 14872004.
  • Yang, T., A Torget and R. Mihalcea (2011) Topic Modeling on Historical Newspapers. Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities. The Association for Computational Linguistics, Madison, WI. pages 96–104.
  • Block, S. (January 2006). "Doing More with Digitization: An introduction to topic modeling of early American sources". Common-place the Interactive Journal of Early American Life. 6 (2).
  • Newman, D.; Block, S. (March 2006). "Probabilistic Topic Decomposition of an Eighteenth-Century Newspaper" (PDF). Journal of the American Society for Information Science and Technology. 57 (5): 753–767. doi:10.1002/asi.20342. S2CID 1484286.

External links Edit

  • Mimno, David. "Topic modeling bibliography".
  • Brett, Megan R. "Topic Modeling: A Basic Introduction". Journal of Digital Humanities.
  • Topic Models Applied to Online News and Reviews Video of a Google Tech Talk presentation by Alice Oh on topic modeling with LDA
  • Modeling Science: Dynamic Topic Models of Scholarly Research Video of a Google Tech Talk presentation by David M. Blei
  • Automated Topic Models in Political Science Video of a presentation by Brandon Stewart at the Tools for Text Workshop, 14 June 2010
  • Shawn Graham, Ian Milligan, and Scott Weingart . The Programming Historian. Archived from the original on 2014-08-28. Retrieved 2014-05-29.
  • Blei, David M.
  • code, demo - example of using LDA for topic modelling

topic, model, this, article, uses, bare, urls, which, uninformative, vulnerable, link, please, consider, converting, them, full, citations, ensure, article, remains, verifiable, maintains, consistent, citation, style, several, templates, tools, available, assi. This article uses bare URLs which are uninformative and vulnerable to link rot Please consider converting them to full citations to ensure the article remains verifiable and maintains a consistent citation style Several templates and tools are available to assist in formatting such as reFill documentation and Citation bot documentation September 2022 Learn how and when to remove this template message In statistics and natural language processing a topic model is a type of statistical model for discovering the abstract topics that occur in a collection of documents Topic modeling is a frequently used text mining tool for discovery of hidden semantic structures in a text body Intuitively given that a document is about a particular topic one would expect particular words to appear in the document more or less frequently dog and bone will appear more often in documents about dogs cat and meow will appear in documents about cats and the and is will appear approximately equally in both A document typically concerns multiple topics in different proportions thus in a document that is 10 about cats and 90 about dogs there would probably be about 9 times more dog words than cat words The topics produced by topic modeling techniques are clusters of similar words A topic model captures this intuition in a mathematical framework which allows examining a set of documents and discovering based on the statistics of the words in each what the topics might be and what each document s balance of topics is Topic models are also referred to as probabilistic topic models which refers to statistical algorithms for discovering the latent semantic structures of an extensive text body In the age of information the amount of the written material we encounter each day is simply beyond our processing capacity Topic models can help to organize and offer insights for us to understand large collections of unstructured text bodies Originally developed as a text mining tool topic models have been used to detect instructive structures in data such as genetic information images and networks They also have applications in other fields such as bioinformatics 1 and computer vision 2 Contents 1 History 2 Topic models for context information 3 Algorithms 4 Applications of topic models 4 1 To quantitative biomedicine 4 2 To analysis of music and creativity 5 See also 6 References 7 Further reading 8 External linksHistory EditAn early topic model was described by Papadimitriou Raghavan Tamaki and Vempala in 1998 3 Another one called probabilistic latent semantic analysis PLSA was created by Thomas Hofmann in 1999 4 Latent Dirichlet allocation LDA perhaps the most common topic model currently in use is a generalization of PLSA Developed by David Blei Andrew Ng and Michael I Jordan in 2002 LDA introduces sparse Dirichlet prior distributions over document topic and topic word distributions encoding the intuition that documents cover a small number of topics and that topics often use a small number of words 5 Other topic models are generally extensions on LDA such as Pachinko allocation which improves on LDA by modeling correlations between topics in addition to the word correlations which constitute topics Hierarchical latent tree analysis HLTA is an alternative to LDA which models word co occurrence using a tree of latent variables and the states of the latent variables which correspond to soft clusters of documents are interpreted as topics source source source source source Animation of the topic detection process in a document word matrix through biclustering Every column corresponds to a document every row to a word A cell stores the frequency of a word in a document with dark cells indicating high word frequencies This procedure groups documents which use similar words as it groups words occurring in a similar set of documents Such groups of words are then called topics More usual topic models such as LDA only group documents based on a more sophisticated and probabilistic mechanism 6 Topic models for context information EditApproaches for temporal information include Block and Newman s determination of the temporal dynamics of topics in the Pennsylvania Gazette during 1728 1800 Griffiths amp Steyvers used topic modeling on abstracts from the journal PNAS to identify topics that rose or fell in popularity from 1991 to 2001 whereas Lamba amp Madhusushan 7 used topic modeling on full text research articles retrieved from DJLIT journal from 1981 to 2018 In the field of library and information science Lamba amp Madhusudhan 7 8 9 10 applied topic modeling on different Indian resources like journal articles and electronic theses and resources ETDs Nelson 11 has been analyzing change in topics over time in the Richmond Times Dispatch to understand social and political changes and continuities in Richmond during the American Civil War Yang Torget and Mihalcea applied topic modeling methods to newspapers from 1829 to 2008 Mimno used topic modelling with 24 journals on classical philology and archaeology spanning 150 years to look at how topics in the journals change over time and how the journals become more different or similar over time Yin et al 12 introduced a topic model for geographically distributed documents where document positions are explained by latent regions which are detected during inference Chang and Blei 13 included network information between linked documents in the relational topic model to model the links between websites The author topic model by Rosen Zvi et al 14 models the topics associated with authors of documents to improve the topic detection for documents with authorship information HLTA was applied to a collection of recent research papers published at major AI and Machine Learning venues The resulting model is called The AI Tree The resulting topics are used to index the papers at aipano cse ust hk to help researchers track research trends and identify papers to read and help conference organizers and journal editors identify reviewers for submissions To improve the qualitative aspects and coherency of generated topics some researchers have explored the efficacy of coherence scores or otherwise how computer extracted clusters i e topics align with a human benchmark 15 16 Coherence scores are metrics for optimising the number of topics to extract from a document corpus 17 Algorithms EditIn practice researchers attempt to fit appropriate model parameters to the data corpus using one of several heuristics for maximum likelihood fit A survey by D Blei describes this suite of algorithms 18 Several groups of researchers starting with Papadimitriou et al 3 have attempted to design algorithms with provable guarantees Assuming that the data were actually generated by the model in question they try to design algorithms that probably find the model that was used to create the data Techniques used here include singular value decomposition SVD and the method of moments In 2012 an algorithm based upon non negative matrix factorization NMF was introduced that also generalizes to topic models with correlations among topics 19 In 2018 a new approach to topic models was proposed it is based on stochastic block model 20 Applications of topic models EditTo quantitative biomedicine Edit Topic models are being used also in other contexts For examples uses of topic models in biology and bioinformatics research emerged 21 Recently topic models has been used to extract information from dataset of cancers genomic samples 22 In this case topics are biological latent variables to be inferred To analysis of music and creativity Edit Topic models can be used for analysis of continuous signals like music For instance they were used to quantify how musical styles change in time and identify the influence of specific artists on later music creation 23 See also EditExplicit semantic analysis Latent semantic analysis Latent Dirichlet allocation Hierarchical Dirichlet process Non negative matrix factorization Statistical classification Unsupervised learning Mallet software project Gensim Sentence embeddingReferences Edit Blei David April 2012 Probabilistic Topic Models Communications of the ACM 55 4 77 84 doi 10 1145 2133806 2133826 S2CID 753304 Cao Liangliang and Li Fei Fei Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes 2007 IEEE 11th International Conference on Computer Vision IEEE 2007 a b Papadimitriou Christos Raghavan Prabhakar Tamaki Hisao Vempala Santosh 1998 Latent semantic indexing Postscript Proceedings of the seventeenth ACM SIGACT SIGMOD SIGART symposium on Principles of database systems PODS 98 pp 159 168 doi 10 1145 275487 275505 ISBN 978 0897919968 S2CID 1479546 a href Template Cite book html title Template Cite book cite book a CS1 maint date and year link Hofmann Thomas 1999 Probabilistic Latent Semantic Indexing PDF Proceedings of the Twenty Second Annual International SIGIR Conference on Research and Development in Information Retrieval Archived from the original PDF on 2010 12 14 Blei David M Ng Andrew Y Jordan Michael I Lafferty John January 2003 Latent Dirichlet allocation Journal of Machine Learning Research 3 993 1022 doi 10 1162 jmlr 2003 3 4 5 993 http topicmodels west uni koblenz de ckling tmt svd ap html a b Lamba Manika jun 2019 Mapping of topics in DESIDOC Journal of Library and Information Technology India a study Scientometrics 120 2 477 505 doi 10 1007 s11192 019 03137 5 ISSN 0138 9130 S2CID 174802673 Lamba Manika jun 2019 Metadata Tagging and Prediction Modeling Case Study of DESIDOC Journal of Library and Information Technology 2008 2017 World Digital Libraries 12 33 89 doi 10 18329 09757597 2019 12103 inactive 1 August 2023 ISSN 0975 7597 a href Template Cite journal html title Template Cite journal cite journal a CS1 maint DOI inactive as of August 2023 link Lamba Manika may 2019 Author Topic Modeling of DESIDOC Journal of Library and Information Technology 2008 2017 India Library Philosophy and Practice Lamba Manika sep 2018 Metadata Tagging of Library and Information Science Theses Shodhganga 2013 2017 PDF ETD2018 Beyond the boundaries of Rims and Oceans Taiwan Taipei Nelson Rob Mining the Dispatch Mining the Dispatch Digital Scholarship Lab University of Richmond Retrieved 26 March 2021 Yin Zhijun 2011 Geographical topic discovery and comparison Proceedings of the 20th international conference on World wide web pp 247 256 doi 10 1145 1963405 1963443 ISBN 9781450306324 S2CID 17883132 a href Template Cite book html title Template Cite book cite book a CS1 maint date and year link Chang Jonathan 2009 Relational Topic Models for Document Networks PDF Aistats 9 81 88 Rosen Zvi Michal 2004 The author topic model for authors and documents Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence 487 494 arXiv 1207 4169 Nikolenko Sergey 2017 Topic modelling for qualitative studies Journal of Information Science 43 88 102 doi 10 1177 0165551515617393 S2CID 30657489 Reverter Rambaldi Marcel 2022 Topic Modelling in Spontaneous Speech Data Honours thesis Australian National University doi 10 25911 M1YF ZF55 Newman David 2010 Automatic evaluation of topic coherence Human Language Technologies The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics 100 108 Blei David M April 2012 Introduction to Probabilistic Topic Models PDF Comm ACM 55 4 77 84 doi 10 1145 2133806 2133826 S2CID 753304 Sanjeev Arora Rong Ge Ankur Moitra April 2012 Learning Topic Models Going beyond SVD arXiv 1204 1956 cs LG Martin Gerlach Tiago Pexioto Eduardo Altmann 2018 A network approach to topic models Science Advances 4 7 eaaq1360 arXiv 1708 01677 Bibcode 2018SciA 4 1360G doi 10 1126 sciadv aaq1360 PMC 6051742 PMID 30035215 Liu L Tang L et al 2016 An overview of topic modeling and its current applications in bioinformatics SpringerPlus 5 1 1608 doi 10 1186 s40064 016 3252 8 PMC 5028368 PMID 27652181 S2CID 16712827 Valle F Osella M Caselle M 2020 A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data Cancers 12 12 3799 doi 10 3390 cancers12123799 PMC 7766023 PMID 33339347 S2CID 229325007 Shalit Uri Weinshall Daphna Chechik Gal 2013 05 13 Modeling Musical Influence with Topic Models Proceedings of the 30th International Conference on Machine Learning PMLR 244 252 Further reading EditSteyvers Mark Griffiths Tom 2007 Probabilistic Topic Models In Landauer T McNamara D Dennis S et al eds Handbook of Latent Semantic Analysis PDF Psychology Press ISBN 978 0 8058 5418 3 Archived from the original PDF on 2013 06 24 Blei D M Lafferty J D 2009 Topic Models PDF Blei D Lafferty J 2007 A correlated topic model of Science Annals of Applied Statistics 1 1 17 35 arXiv 0708 3601 doi 10 1214 07 AOAS114 S2CID 8872108 Mimno D April 2012 Computational Historiography Data Mining in a Century of Classics Journals PDF Journal on Computing and Cultural Heritage 5 1 1 19 doi 10 1145 2160165 2160168 S2CID 12153151 Marwick Ben 2013 Discovery of Emergent Issues and Controversies in Anthropology Using Text Mining Topic Modeling and Social Network Analysis of Microblog Content In Yanchang Zhao Yonghua Cen eds Data Mining Applications with R Elsevier pp 63 93 Jockers M 2010 Who s your DH Blog Mate Match Making the Day of DH Bloggers with Topic Modeling Matthew L Jockers posted 19 March 2010 Drouin J 2011 Foray Into Topic Modeling Ecclesiastical Proust Archive posted 17 March 2011 Templeton C 2011 Topic Modeling in the Humanities An Overview Maryland Institute for Technology in the Humanities Blog posted 1 August 2011 Griffiths T Steyvers M 2004 Finding scientific topics Proceedings of the National Academy of Sciences 101 Suppl 1 5228 35 Bibcode 2004PNAS 101 5228G doi 10 1073 pnas 0307752101 PMC 387300 PMID 14872004 Yang T A Torget and R Mihalcea 2011 Topic Modeling on Historical Newspapers Proceedings of the 5th ACL HLT Workshop on Language Technology for Cultural Heritage Social Sciences and Humanities The Association for Computational Linguistics Madison WI pages 96 104 Block S January 2006 Doing More with Digitization An introduction to topic modeling of early American sources Common place the Interactive Journal of Early American Life 6 2 Newman D Block S March 2006 Probabilistic Topic Decomposition of an Eighteenth Century Newspaper PDF Journal of the American Society for Information Science and Technology 57 5 753 767 doi 10 1002 asi 20342 S2CID 1484286 External links EditMimno David Topic modeling bibliography Brett Megan R Topic Modeling A Basic Introduction Journal of Digital Humanities Topic Models Applied to Online News and Reviews Video of a Google Tech Talk presentation by Alice Oh on topic modeling with LDA Modeling Science Dynamic Topic Models of Scholarly Research Video of a Google Tech Talk presentation by David M Blei Automated Topic Models in Political Science Video of a presentation by Brandon Stewart at the Tools for Text Workshop 14 June 2010 Shawn Graham Ian Milligan and Scott Weingart Getting Started with Topic Modeling and MALLET The Programming Historian Archived from the original on 2014 08 28 Retrieved 2014 05 29 Blei David M Introductory material and software code demo example of using LDA for topic modelling Retrieved from https en wikipedia org w index php title Topic model amp oldid 1175867233, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.