
Question answering

Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP) that is concerned with building systems that automatically answer questions that are posed by humans in a natural language.[1]

Overview

A question-answering implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or information, usually a knowledge base. More commonly, question-answering systems can pull answers from an unstructured collection of natural language documents.

Some examples of natural language document collections used for question answering systems include:

  • a local collection of reference texts,
  • internal organization documents and web pages,
  • compiled newswire reports,
  • a set of Wikipedia pages,[2]
  • a subset of World Wide Web pages.

Types of question answering

Question-answering research attempts to develop ways of answering a wide range of question types, including fact, list, definition, how, why, hypothetical, semantically constrained, and cross-lingual questions.

  • Answering questions related to an article in order to evaluate reading comprehension is one of the simpler forms of question answering, since a given article is relatively short compared to the domains of other types of question-answering problems. An example of such a question is "What did Albert Einstein win the Nobel Prize for?" after an article about this subject is given to the system.
  • Closed-book question answering is when a system has memorized some facts during training and can answer questions without explicitly being given a context. This is similar to humans taking closed-book exams.
  • Closed-domain question answering deals with questions under a specific domain (for example, medicine or automotive maintenance) and can exploit domain-specific knowledge frequently formalized in ontologies. Alternatively, "closed-domain" might refer to a situation where only a limited type of questions are accepted, such as questions asking for descriptive rather than procedural information. Question-answering systems for machine reading applications have also been constructed in the medical domain, for instance concerning Alzheimer's disease.[3]
  • Open-domain question answering deals with questions about nearly anything and can only rely on general ontologies and world knowledge. Systems designed for open-domain question answering usually have much more data available from which to extract the answer. An example of an open-domain question is "What did Albert Einstein win the Nobel Prize for?" while no article about this subject is given to the system.

Another way to categorize question-answering systems is by the technical approach used. There are a number of different types of QA systems, including rule-based systems, statistical systems, and hybrid systems.

Rule-based systems use a set of rules to determine the correct answer to a question. Statistical systems use statistical methods to find the most likely answer to a question. Hybrid systems use a combination of rule-based and statistical methods.
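
As a rough illustration of these three families, the sketch below pairs a hand-written regex rule with a stub word-overlap scorer standing in for a trained statistical model; running the rules first and falling back to the scorer is one simple form of hybrid. The rule and all names here are hypothetical, not taken from any real system.

```python
import re

# One hand-written rule: pattern -> answer (the rule-based approach in miniature).
RULES = [
    (re.compile(r"capital of France", re.IGNORECASE), "Paris"),
]

def statistical_answer(question, candidates):
    """Stub scorer: pick the candidate sharing the most words with the
    question (a crude stand-in for a trained statistical model)."""
    q_words = set(question.lower().split())
    return max(candidates, key=lambda c: len(q_words & set(c.lower().split())))

def hybrid_answer(question, candidates):
    for pattern, answer in RULES:
        if pattern.search(question):
            return answer  # a rule fired: rule-based path
    return statistical_answer(question, candidates)  # statistical fallback

print(hybrid_answer("What is the capital of France?", ["Berlin", "Paris"]))  # Paris
```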

History

Two early question answering systems were BASEBALL[4] and LUNAR.[5] BASEBALL answered questions about Major League Baseball games played over a single season. LUNAR answered questions about the geological analysis of rocks returned by the Apollo Moon missions. Both systems were very effective in their chosen domains. LUNAR was demonstrated at a lunar science convention in 1971, where it answered 90% of the in-domain questions posed by people untrained on the system. Further restricted-domain question answering systems were developed in the following years. The common feature of all these systems is that they had a core database or knowledge system that was hand-written by experts of the chosen domain. The language abilities of BASEBALL and LUNAR used techniques similar to ELIZA and DOCTOR, the first chatterbot programs.

SHRDLU was a successful question-answering program developed by Terry Winograd in the late 1960s and early 1970s. It simulated the operation of a robot in a toy world (the "blocks world"), and it offered the possibility of asking the robot questions about the state of the world. The strength of this system was the choice of a very specific domain and a very simple world with rules of physics that were easy to encode in a computer program.

In the 1970s, knowledge bases were developed that targeted narrower domains of knowledge. The question answering systems developed to interface with these expert systems produced more consistent and valid responses to questions within an area of knowledge. These expert systems closely resembled modern question answering systems except in their internal architecture: expert systems rely heavily on expert-constructed and organized knowledge bases, whereas many modern question answering systems rely on statistical processing of a large, unstructured, natural language text corpus.

The 1970s and 1980s saw the development of comprehensive theories in computational linguistics, which led to the development of ambitious projects in text comprehension and question answering. One example was the Unix Consultant (UC), developed by Robert Wilensky at U.C. Berkeley in the late 1980s. The system answered questions pertaining to the Unix operating system. It had a comprehensive, hand-crafted knowledge base of its domain, and it aimed at phrasing the answer to accommodate various types of users. Another project was LILOG, a text-understanding system that operated on the domain of tourism information in a German city. The systems developed in the UC and LILOG projects never went past the stage of simple demonstrations, but they helped the development of theories on computational linguistics and reasoning.

Specialized natural-language question answering systems have been developed, such as EAGLi for health and life scientists.[6]

Applications

QA systems are used in a variety of applications, including

  • fact-checking, by posing a question such as "Is fact X true or false?",
  • customer service,
  • technical support,
  • market research, and
  • generating reports or conducting research.

Architecture

As of 2001, question-answering systems typically included a question classifier module that determined the type of question and the type of answer.[7]

Different types of question-answering systems employ different architectures. For example, modern open-domain question answering systems may use a retriever-reader architecture: the retriever finds documents relevant to a given question, while the reader infers the answer from the retrieved documents. Systems such as GPT-3, T5,[8] and BART[9] instead use an end-to-end architecture, in which a transformer-based model stores large-scale textual data in its parameters. Such models can answer questions without accessing any external knowledge sources.
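
A minimal sketch of the retriever-reader pattern, assuming scikit-learn and Hugging Face transformers are installed; the tiny in-memory corpus and TF-IDF retriever stand in for the web-scale components real systems use, and the default extractive model downloaded by pipeline("question-answering") stands in for a purpose-built reader.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

# Toy document collection standing in for a large corpus.
corpus = [
    "Albert Einstein received the 1921 Nobel Prize in Physics for his "
    "discovery of the law of the photoelectric effect.",
    "The National Day of the People's Republic of China is celebrated on October 1.",
]

def retrieve(question, documents, k=1):
    """Retriever: rank documents by TF-IDF cosine similarity to the question."""
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(documents)
    question_vector = vectorizer.transform([question])
    scores = cosine_similarity(question_vector, doc_vectors).ravel()
    return [documents[i] for i in scores.argsort()[::-1][:k]]

# Reader: an extractive QA model that pulls the answer span from the context.
reader = pipeline("question-answering")

question = "What did Albert Einstein win the Nobel Prize for?"
context = " ".join(retrieve(question, corpus))
print(reader(question=question, context=context)["answer"])
```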

Question answering methods

Question answering is dependent on a good search corpus; without documents containing the answer, there is little any question answering system can do. Larger collections generally mean better question answering performance, unless the question domain is orthogonal to the collection. Data redundancy in massive collections, such as the web, means that nuggets of information are likely to be phrased in many different ways in differing contexts and documents,[10] leading to two benefits:

  1. If the right information appears in many forms, the question answering system needs fewer complex NLP techniques to understand the text.
  2. Correct answers can be filtered from false positives because the system can rely on versions of the correct answer appearing more times in the corpus than incorrect ones.
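
The second benefit amounts to frequency voting over candidate answers. A toy sketch, assuming candidate strings have already been extracted from several retrieved passages (the candidates below are hypothetical):

```python
from collections import Counter

# Hypothetical candidates extracted from different retrieved documents; the
# correct answer recurs across sources, while false positives are one-offs.
candidates = [
    "photoelectric effect", "photoelectric effect", "theory of relativity",
    "photoelectric effect", "Brownian motion",
]

answer, votes = Counter(candidates).most_common(1)[0]
print(answer, votes)  # photoelectric effect 3
```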

Some question answering systems rely heavily on automated reasoning.[11][12]

Open domain question answering

In information retrieval, an open-domain question answering system tries to return an answer in response to the user's question. The returned answer is in the form of short texts rather than a list of relevant documents.[13] The system finds answers by using a combination of techniques from computational linguistics, information retrieval, and knowledge representation.

The system takes a natural language question as an input rather than a set of keywords, for example: "When is the national day of China?" It then transforms this input sentence into a query in its logical form. Accepting natural language questions makes the system more user-friendly, but harder to implement, as there are a variety of question types and the system will have to identify the correct one in order to give a sensible answer. Assigning a question type to the question is a crucial task; the entire answer extraction process relies on finding the correct question type and hence the correct answer type.

Keyword extraction is the first step in identifying the input question type.[14] In some cases, words clearly indicate the question type, e.g., "Who", "Where", "When", or "How many"—these words might suggest to the system that the answers should be of type "Person", "Location", "Date", or "Number", respectively. POS (part-of-speech) tagging and syntactic parsing techniques can also determine the answer type. In the example above, the subject is "Chinese National Day", the predicate is "is" and the adverbial modifier is "when", therefore the answer type is "Date". Unfortunately, some interrogative words like "Which", "What", or "How" do not correspond to unambiguous answer types: Each can represent more than one type. In situations like this, other words in the question need to be considered. A lexical dictionary such as WordNet can be used for understanding the context.
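
A toy version of the interrogative-word heuristic described above; the mapping is a simplified assumption, and a real system would fall back to POS tags, parses, and a resource like WordNet for the ambiguous cases:

```python
# Naive question-type classifier using interrogative words. Ambiguous openers
# ("which", "what", "how") deliberately fall through to further analysis.
WH_TO_ANSWER_TYPE = {
    "who": "Person",
    "where": "Location",
    "when": "Date",
    "how many": "Number",
}

def classify_question(question):
    q = question.lower()
    for wh_word, answer_type in WH_TO_ANSWER_TYPE.items():
        if q.startswith(wh_word):
            return answer_type
    return "Ambiguous"  # needs POS tagging, parsing, or WordNet lookups

print(classify_question("When is the national day of China?"))  # Date
print(classify_question("What did Einstein win the Nobel Prize for?"))  # Ambiguous
```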

Once the system identifies the question type, it uses an information retrieval system to find a set of documents that contain the correct keywords. A tagger and NP/Verb Group chunker can verify whether the correct entities and relations are mentioned in the found documents. For questions such as "Who" or "Where", a named-entity recogniser finds relevant "Person" and "Location" names in the retrieved documents; only the paragraphs judged relevant are retained for answer ranking.
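
Entity-based filtering can be sketched with a named-entity recogniser such as spaCy (one possible toolkit; the article does not prescribe a library). Assumes the small English model has been downloaded with `python -m spacy download en_core_web_sm`:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def paragraphs_with_entity(paragraphs, wanted_label):
    """Keep paragraphs mentioning an entity of the wanted type, e.g. PERSON
    for "Who" questions or GPE/LOC for "Where" questions."""
    return [p for p in paragraphs
            if any(ent.label_ == wanted_label for ent in nlp(p).ents)]

paragraphs = [
    "Albert Einstein received the Nobel Prize in Physics in 1921.",
    "The award ceremony takes place every December.",
]
print(paragraphs_with_entity(paragraphs, "PERSON"))  # keeps the first paragraph
```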

A vector space model can classify the candidate answers. The system then checks whether each candidate is of the correct answer type, as determined in the question-type analysis stage, and an inference technique can validate the candidates. A score is then given to each candidate according to the number of question words it contains and how close those words are to the candidate: the more question words, and the closer they are, the better. The answer is then translated by parsing into a compact and meaningful representation. In the previous example, the expected output answer is "1st Oct."
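
The proximity scoring can be made concrete with a small sketch; the window size and the inverse-distance weighting below are arbitrary assumptions, not a published scheme:

```python
def proximity_score(tokens, candidate_index, question_words, window=10):
    """Score a candidate answer by how many question words appear near it,
    weighting closer words more heavily (more and closer is better)."""
    score = 0.0
    lo = max(0, candidate_index - window)
    hi = min(len(tokens), candidate_index + window + 1)
    for i in range(lo, hi):
        if i != candidate_index and tokens[i].lower() in question_words:
            score += 1.0 / (1 + abs(i - candidate_index))
    return score

tokens = "the national day of China is celebrated on October 1".split()
question_words = {"when", "national", "day", "china"}
print(proximity_score(tokens, tokens.index("October"), question_words))
```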

Mathematical question answering

An open-source, math-aware question answering system called MathQA, based on Ask Platypus and Wikidata, was published in 2018.[15] It takes an English or Hindi natural language question as input and returns a mathematical formula retrieved from Wikidata as a succinct answer, translated into a computable form that allows the user to insert values for the variables. Names and values of variables and common constants are retrieved from Wikidata where available. It is claimed that the system outperforms a commercial computational mathematical knowledge engine on a test set. MathQA is hosted by Wikimedia at https://mathqa.wmflabs.org/. In 2022, it was extended to answer 15 math question types.[16]
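
The "computable form" step can be illustrated with SymPy: parse a formula string (here a hypothetical retrieval result, not actual MathQA output) and substitute user-supplied values for the variables:

```python
import sympy

# Hypothetical formula string as it might be retrieved for the question
# "What is the formula for kinetic energy?" (illustrative, not MathQA's code).
formula = sympy.sympify("m*v**2/2")

# The user inserts values for the variables.
values = {sympy.Symbol("m"): 2.0, sympy.Symbol("v"): 3.0}
print(float(formula.subs(values)))  # 9.0
```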

MathQA methods need to combine natural and formula language. One possible approach is to perform supervised annotation via entity linking. The "ARQMath Task" at CLEF 2020[17] was launched to address the problem of linking newly posted questions from the platform Math Stack Exchange to existing ones that were already answered by the community.[18] The lab was motivated by the fact that 20% of mathematical queries in general-purpose search engines are expressed as well-formed questions.[19] It contained two separate sub-tasks: Task 1, "Answer retrieval", matched answers from old posts to newly posed questions, and Task 2, "Formula retrieval", matched formulae from old posts to new questions. Starting with the domain of mathematics, which involves formula language, the goal is to later extend the task to other domains (e.g., STEM disciplines such as chemistry and biology) that employ other types of special notation (e.g., chemical formulae).[17][18]

The inverse of mathematical question answering—mathematical question generation—has also been researched. The PhysWikiQuiz physics question generation and test engine retrieves mathematical formulae from Wikidata together with semantic information about their constituent identifiers (names and values of variables).[20] The formulae are then rearranged to generate a set of formula variants. Subsequently, the variables are substituted with random values to generate a large number of different questions suitable for individual student tests. PhysWikiQuiz is hosted by Wikimedia at https://physwikiquiz.wmflabs.org/.
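
A simplified sketch of the rearrangement and random-substitution steps using SymPy, in the spirit of PhysWikiQuiz but not its actual code; the kinetic-energy equation is an assumed example:

```python
import random
import sympy

E, m, v = sympy.symbols("E m v", positive=True)
kinetic_energy = sympy.Eq(E, m * v**2 / 2)

# Rearrange: solve the equation for each identifier to get formula variants.
variants = {sym: sympy.solve(kinetic_energy, sym)[0] for sym in (E, m, v)}

# Substitute random values into one variant to produce a concrete question.
given = {m: random.randint(1, 10), v: random.randint(1, 10)}
print(f"Given m = {given[m]} kg and v = {given[v]} m/s, "
      f"compute E = {variants[E].subs(given)} J")
```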

Progress

Question answering systems have been extended in recent years to encompass additional domains of knowledge.[21] For example, systems have been developed to automatically answer temporal and geospatial questions, questions of definition and terminology, biographical questions, multilingual questions, and questions about the content of audio, images,[22] and video.[23] Current question answering research topics include:

  • interactivity (clarification of questions or answers),[24]
  • answer reuse or caching,[25]
  • semantic parsing,[26]
  • answer presentation,[27]
  • knowledge representation and semantic entailment,[28]
  • social media analysis with question answering systems,
  • sentiment analysis,[29]
  • utilization of thematic roles,[30]
  • image captioning for visual question answering,[22] and
  • embodied question answering.[31]

In 2011, Watson, a question answering computer system developed by IBM, competed in two exhibition matches of Jeopardy! against Brad Rutter and Ken Jennings, winning by a significant margin.[32] Facebook Research has made its DrQA system[33] available under an open-source license; this system uses Wikipedia as its knowledge source.[2] The open-source framework Haystack by deepset combines open-domain question answering with generative question answering and supports domain adaptation of the underlying language models for industry use cases.[34][35]

References

  1. ^ Philipp Cimiano; Christina Unger; John McCrae (1 March 2014). Ontology-Based Interpretation of Natural Language. Morgan & Claypool Publishers. ISBN 978-1-60845-990-2.
  2. ^ a b Chen, Danqi; Fisch, Adam; Weston, Jason; Bordes, Antoine (2017). "Reading Wikipedia to Answer Open-Domain Questions". arXiv:1704.00051 [cs.CL].
  3. ^ Roser Morante, Martin Krallinger, Alfonso Valencia and Walter Daelemans. Machine Reading of Biomedical Texts about Alzheimer's Disease. CLEF 2012 Evaluation Labs and Workshop. September 17, 2012
  4. ^ Green Jr., Bert F.; et al. (1961). "Baseball: an automatic question-answerer" (PDF). Western Joint IRE-AIEE-ACM Computer Conference: 219–224.
  5. ^ Woods, William A.; Kaplan, R. (1977). "Lunar rocks in natural English: Explorations in natural language question answering". Linguistic Structures Processing. 5: 521–569.
  6. ^ "EAGLi platform - Question Answering in MEDLINE". candy.hesge.ch. Retrieved 2021-12-02.
  7. ^ Hirschman, L. & Gaizauskas, R. (2001) Natural Language Question Answering. The View from Here. Natural Language Engineering (2001), 7:4:275-300 Cambridge University Press.
  8. ^ Raffel, Colin; Shazeer, Noam; Roberts, Adam; Lee, Katherine; Narang, Sharan; Matena, Michael; Zhou, Yanqi; Li, Wei; Liu, Peter J. (2019). "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". arXiv:1910.10683 [cs.LG].
  9. ^ Lewis, Mike; Liu, Yinhan; Goyal, Naman; Ghazvininejad, Marjan; Mohamed, Abdelrahman; Levy, Omer; Stoyanov, Ves; Zettlemoyer, Luke (2019). "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension". arXiv:1910.13461 [cs.CL].
  10. ^ Lin, J. (2002). The Web as a Resource for Question Answering: Perspectives and Challenges. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002).
  11. ^ Moldovan, Dan, et al. "Cogex: A logic prover for question answering." Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1. Association for Computational Linguistics, 2003.
  12. ^ Furbach, Ulrich, Ingo Glöckner, and Björn Pelzer. "An application of automated reasoning in natural language question answering." AI Communications 23.2-3 (2010): 241–265.
  13. ^ Sun, Haitian; Dhingra, Bhuwan; Zaheer, Manzil; Mazaitis, Kathryn; Salakhutdinov, Ruslan; Cohen, William (2018). "Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text". Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. pp. 4231–4242. arXiv:1809.00782. doi:10.18653/v1/D18-1455. S2CID 52154304.
  14. ^ Harabagiu, Sanda; Hickl, Andrew (2006). "Methods for using textual entailment in open-domain question answering". Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06. pp. 905–912. doi:10.3115/1220175.1220289.
  15. ^ Moritz Schubotz; Philipp Scharpf; et al. (12 September 2018). "Introducing MathQA: a Math-Aware question answering system". Information Discovery and Delivery. Emerald Publishing Limited. 46 (4): 214–224. arXiv:1907.01642. doi:10.1108/IDD-06-2018-0022.
  16. ^ Scharpf, P.; Schubotz, M.; Gipp, B. "Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling". ACM/IEEE Joint Conference on Digital Libraries, 2022.
  17. ^ a b Zanibbi, Richard; Oard, Douglas W.; Agarwal, Anurag; Mansouri, Behrooz (2020), "Overview of ARQMath 2020: CLEF Lab on Answer Retrieval for Questions on Math", Experimental IR Meets Multilinguality, Multimodality, and Interaction, Lecture Notes in Computer Science, vol. 12260, Cham: Springer International Publishing, pp. 169–193, doi:10.1007/978-3-030-58219-7_15, ISBN 978-3-030-58218-0, S2CID 221351064, retrieved 2021-06-09
  18. ^ a b Scharpf; et al. (2020-12-04). ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?. OCLC 1228449497.
  19. ^ Mansouri, Behrooz; Zanibbi, Richard; Oard, Douglas W. (June 2019). "Characterizing Searches for Mathematical Concepts". 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL). IEEE. pp. 57–66. doi:10.1109/jcdl.2019.00019. ISBN 978-1-7281-1547-4. S2CID 198972305.
  20. ^ Scharpf, Philipp; Schubotz, Moritz; Spitz, Andreas; Greiner-Petter, Andre; Gipp, Bela (2022). "Collaborative and AI-aided Exam Question Generation using Wikidata in Education". arXiv:2211.08361. doi:10.13140/RG.2.2.30988.18568. S2CID 253270181.
  21. ^ Paşca, Marius (2005). "Book Review New Directions in Question Answering Mark T. Maybury (editor) (MITRE Corporation) Menlo Park, CA: AAAI Press and Cambridge, MA: The MIT Press, 2004, xi+336 pp; paperbound, ISBN 0-262-63304-3, $40.00, £25.95". Computational Linguistics. 31 (3): 413–417. doi:10.1162/089120105774321055. S2CID 12705839.
  22. ^ a b Anderson, Peter, et al. "Bottom-up and top-down attention for image captioning and visual question answering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
  23. ^ Zhu, Linchao; Xu, Zhongwen; Yang, Yi; Hauptmann, Alexander G. (2015). "Uncovering Temporal Context for Video Question and Answering". arXiv:1511.04670 [cs.CV].
  24. ^ Quarteroni, Silvia, and Suresh Manandhar. "Designing an interactive open-domain question answering system." Natural Language Engineering 15.1 (2009): 73–95.
  25. ^ Light, Marc, et al. "Reuse in Question Answering: A Preliminary Study." New Directions in Question Answering. 2003.
  26. ^ Yih, Wen-tau, Xiaodong He, and Christopher Meek. "Semantic parsing for single-relation question answering." Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2014.
  27. ^ Perera, R., Nand, P. and Naeem, A. 2017. Utilizing typed dependency subtree patterns for answer sentence generation in question answering systems.
  28. ^ de Salvo Braz, Rodrigo, et al. "An inference model for semantic entailment in natural language." Machine Learning Challenges Workshop. Springer, Berlin, Heidelberg, 2005.
  29. ^ "BitCrawl by Hobson Lane". Archived from the original on October 27, 2012. Retrieved 2012-05-29.
  30. ^ Perera, R. and Perera, U. 2012. Towards a thematic role based target identification model for question answering.
  31. ^ Das, Abhishek, et al. "Embodied question answering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
  32. ^ Markoff, John (2011-02-16). "On 'Jeopardy!' Watson Win is All but Trivial". The New York Times.
  33. ^ "DrQA".
  34. ^ Tunstall, Lewis (5 July 2022). Natural Language Processing with Transformers: Building Language Applications with Hugging Face (2nd ed.). O'Reilly UK Ltd. p. Chapter 7. ISBN 978-1098136796.
  35. ^ "Haystack documentation". deepset. Retrieved 4 November 2022.

Further reading

  • Dragomir R. Radev, John Prager, and Valerie Samn. Ranking suspected answers to natural language questions using predictive annotation. Archived 2011-08-26 at the Wayback Machine. In Proceedings of the 6th Conference on Applied Natural Language Processing, Seattle, WA, May 2000.
  • John Prager, Eric Brown, Anni Coden, and Dragomir Radev. Question-answering by predictive annotation. Archived 2011-08-23 at the Wayback Machine. In Proceedings, 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece, July 2000.
  • Hutchins, W. John; Harold L. Somers (1992). An Introduction to Machine Translation. London: Academic Press. ISBN 978-0-12-362830-5.
  • L. Fortnow, Steve Homer (2002/2003). A Short History of Computational Complexity. In D. van Dalen, J. Dawson, and A. Kanamori, editors, The History of Mathematical Logic. North-Holland, Amsterdam.
  • Tunstall, Lewis (5 July 2022). Natural Language Processing with Transformers: Building Language Applications with Hugging Face (2nd ed.). O'Reilly UK Ltd. p. Chapter 7. ISBN 978-1098136796.

External links

  • Question Answering Evaluation at TREC
  • Question Answering Evaluation at CLEF
