SemEval

SemEval (Semantic Evaluation) is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive.

SemEval (workshop overview)
Disciplines: Natural language processing, computational linguistics, semantics
Umbrella organization: ACL-SIGLEX
Founded: 1998 (Senseval)
Latest: SemEval-2015 (NAACL @ Denver, USA)
Upcoming: SemEval-2018
History:
  • Senseval-1: 1998 @ Sussex
  • Senseval-2: 2001 @ Toulouse
  • Senseval-3: 2004 @ Barcelona
  • SemEval-2007: 2007 @ Prague
  • SemEval-2010: 2010 @ Uppsala
  • SemEval-2012: 2012 @ Montreal
  • SemEval-2013: 2013 @ Atlanta
  • SemEval-2014: 2014 @ Dublin
  • SemEval-2015: 2015 @ Denver
  • SemEval-2016: 2016 @ San Diego

This series of evaluations provides a mechanism to characterize in more precise terms exactly what is needed to compute meaning. As such, the evaluations provide an emergent mechanism for identifying the problems and solutions involved in computing with meaning. These exercises have evolved to articulate more of the dimensions involved in our use of language. They began with apparently simple attempts to identify word senses computationally, and have evolved to investigate the interrelationships among the elements in a sentence (e.g., semantic role labeling), relations between sentences (e.g., coreference), and the nature of what we are saying (semantic relations and sentiment analysis).

The purpose of the SemEval and Senseval exercises is to evaluate semantic analysis systems. "Semantic analysis" refers to a formal analysis of meaning, and "computational" refers to approaches that in principle support effective implementation.[1]

The first three evaluations, Senseval-1 through Senseval-3, were focused on word sense disambiguation (WSD), each time growing in the number of languages offered in the tasks and in the number of participating teams. Beginning with the fourth workshop, SemEval-2007 (SemEval-1), the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation.[2]

Triggered by the conception of the *SEM conference, the SemEval community decided to hold the evaluation workshops yearly in association with the *SEM conference. It was also decided that not every evaluation task would run every year; for example, none of the WSD tasks were included in the SemEval-2012 workshop.

History

Early evaluation of algorithms for word sense disambiguation

From the earliest days, assessing the quality of word sense disambiguation algorithms had been primarily a matter of intrinsic evaluation, and “almost no attempts had been made to evaluate embedded WSD components”.[3] Only very recently (2006) had extrinsic evaluations begun to provide some evidence for the value of WSD in end-user applications.[4] Until 1990 or so, discussions of the sense disambiguation task focused mainly on illustrative examples rather than comprehensive evaluation. The early 1990s saw the beginnings of more systematic and rigorous intrinsic evaluations, including more formal experimentation on small sets of ambiguous words.[5]

Senseval to SemEval

In April 1997, Martha Palmer and Marc Light organized a workshop entitled Tagging with Lexical Semantics: Why, What, and How? in conjunction with the Conference on Applied Natural Language Processing.[6] At the time, there was a clear recognition that manually annotated corpora had revolutionized other areas of NLP, such as part-of-speech tagging and parsing, and that corpus-driven approaches had the potential to revolutionize automatic semantic analysis as well.[7] Kilgarriff recalled that there was "a high degree of consensus that the field needed evaluation", and several practical proposals by Resnik and Yarowsky kicked off a discussion that led to the creation of the Senseval evaluation exercises.[8][9][10]

SemEval's 3-, 2-, or 1-year cycle

After SemEval-2010, many participants felt that the 3-year cycle was too long a wait. Many other shared tasks, such as the Conference on Natural Language Learning (CoNLL) shared task and Recognizing Textual Entailment (RTE), run annually. For this reason, the SemEval coordinators gave task organizers the opportunity to choose between a 2-year and a 3-year cycle.[11] The SemEval community favored the 3-year cycle.
Although the votes within the SemEval community favored a 3-year cycle, the organizers and coordinators settled on splitting the SemEval tasks into two evaluation workshops. This was triggered by the introduction of the new *SEM conference. The SemEval organizers thought it would be appropriate to associate their event with *SEM and to co-locate the SemEval workshop with the *SEM conference. They received very positive responses (from the task coordinators, organizers, and participants) about the association with the yearly *SEM, and eight tasks were willing to switch to 2012. Thus were born SemEval-2012 and SemEval-2013. The current plan is to switch to a yearly SemEval schedule to associate it with the *SEM conference, though not every task needs to run every year.[12]

List of Senseval and SemEval Workshops

  • Senseval-1 took place in the summer of 1998 for English, French, and Italian, culminating in a workshop held at Herstmonceux Castle, Sussex, England on September 2–4.
  • Senseval-2 took place in the summer of 2001, and was followed by a workshop held in July 2001 in Toulouse, in conjunction with ACL 2001. Senseval-2 included tasks for Basque, Chinese, Czech, Danish, Dutch, English, Estonian, Italian, Japanese, Korean, Spanish and Swedish.
  • Senseval-3 took place in March–April 2004, followed by a workshop held in July 2004 in Barcelona, in conjunction with ACL 2004. Senseval-3 included 14 different tasks for core word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms, and subcategorization acquisition.
  • SemEval-2007 (Senseval-4) took place in 2007, followed by a workshop held in conjunction with ACL in Prague. SemEval-2007 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text. A special issue of Language Resources and Evaluation is devoted to the result.[13]
  • SemEval-2010 took place in 2010, followed by a workshop held in conjunction with ACL in Uppsala. SemEval-2010 included 18 different tasks targeting the evaluation of semantic analysis systems.
  • SemEval-2012 took place in 2012; it was associated with the new *SEM, the First Joint Conference on Lexical and Computational Semantics, and co-located with NAACL in Montreal, Canada. SemEval-2012 included 8 different tasks targeting the evaluation of computational semantic systems. However, there was no WSD task in SemEval-2012; the WSD-related tasks were scheduled for the upcoming SemEval-2013.
  • SemEval-2013 was associated with NAACL 2013, the Conference of the North American Chapter of the Association for Computational Linguistics, in Atlanta, Georgia, USA, and took place in 2013. It included 13 different tasks targeting the evaluation of computational semantic systems.
  • SemEval-2014 took place in 2014. It was co-located with COLING 2014, 25th International Conference on Computational Linguistics and *SEM 2014, Second Joint Conference on Lexical and Computational Semantics, Dublin, Ireland. There were 10 different tasks in SemEval-2014 evaluating various computational semantic systems.
  • SemEval-2015 took place in 2015. It was co-located with NAACL-HLT 2015, 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies and *SEM 2015, Third Joint Conference on Lexical and Computational Semantics, Denver, USA. There were 17 different tasks in SemEval-2015 evaluating various computational semantic systems.

SemEval Workshop framework

The framework of the SemEval/Senseval evaluation workshops emulates the Message Understanding Conferences (MUCs) and other evaluation workshops run by ARPA (the Advanced Research Projects Agency, later renamed the Defense Advanced Research Projects Agency, DARPA).

 
[Figure: SemEval framework, adapted from the MUC introduction]

Stages of SemEval/Senseval evaluation workshops[14]

  1. Firstly, all likely participants were invited to express their interest and participate in the exercise design.
  2. A timetable towards a final workshop was worked out.
  3. A plan for selecting evaluation materials was agreed.
  4. 'Gold standards' for the individual tasks were acquired; human annotators were often treated as the gold standard against which the precision and recall of computer systems are measured (a minimal scoring sketch follows this list). These 'gold standards' are what the computational systems strive towards. In WSD tasks, human annotators were set the task of generating a set of correct WSD answers (i.e. the correct sense for a given word in a given context).
  5. The gold standard materials, without answers, were released to participants, who then had a short time to run their programs over them and return their sets of answers to the organizers.
  6. The organizers then scored the answers and the scores were announced and discussed at a workshop.
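
As a concrete illustration of steps 4 and 6, the following is a minimal sketch of how system answers for a WSD task might be scored against a gold standard for precision and recall. The instance IDs and sense labels are hypothetical, invented only for illustration; they are not actual task data.

```python
# Minimal sketch: scoring WSD answers against a gold standard.
# Instance IDs and sense labels are hypothetical, for illustration only.

gold = {                               # gold standard: instance id -> correct sense
    "bank.n.001": "bank%1:14:00::",
    "bank.n.002": "bank%1:17:01::",
    "bank.n.003": "bank%1:14:00::",
}

system = {                             # system output: instance id -> predicted sense
    "bank.n.001": "bank%1:14:00::",    # correct
    "bank.n.002": "bank%1:14:00::",    # wrong
    # "bank.n.003" left unanswered: systems may abstain on hard instances
}

correct = sum(1 for inst, sense in system.items() if gold.get(inst) == sense)
precision = correct / len(system)      # correct answers / answers attempted
recall = correct / len(gold)           # correct answers / instances in the gold standard

print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is computed over the instances a system attempts, while recall is computed over all gold-standard instances, so an abstaining system can trade recall for precision.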

Semantic evaluation tasks

Senseval-1 and Senseval-2 focused on evaluating WSD systems for major languages for which corpora and computerized dictionaries were available. Senseval-3 looked beyond the lexeme and began to evaluate systems addressing wider areas of semantics, such as semantic roles (technically known as theta roles in formal semantics) and logic form transformation (in which the semantics of phrases, clauses, or sentences are commonly represented as first-order logic forms), and it explored the performance of semantic analysis in machine translation.
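
As an illustrative, hypothetical example of what a logic form transformation produces, a simple sentence can be flattened into a conjunction of first-order predicates over event and entity variables, roughly along these lines:

    "John gave Mary a book."
    ->  John(x1) ∧ give(e1, x1, x3, x2) ∧ Mary(x2) ∧ book(x3)

Here e1 stands for the giving event and x1–x3 for the participants; the exact notation used in the Senseval-3 task differs in detail, so this is only a sketch of the general idea.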

As the types of different computational semantic systems grew beyond the coverage of WSD, Senseval evolved into SemEval, where more aspects of computational semantic systems were evaluated.

Overview of Issues in Semantic Analysis

The SemEval exercises provide a mechanism for examining issues in semantic analysis of texts. The topics of interest fall short of the logical rigor that is found in formal computational semantics, attempting to identify and characterize the kinds of issues relevant to human understanding of language. The primary goal is to replicate human processing by means of computer systems. The tasks (shown below) are developed by individuals and groups to deal with identifiable issues, as they take on some concrete form.

The first major area in semantic analysis is the identification of the intended meaning at the word level (taken to include idiomatic expressions). This is word-sense disambiguation (a concept that is evolving away from the notion that words have discrete senses, but rather are characterized by the ways in which they are used, i.e., their contexts). The tasks in this area include lexical sample and all-word disambiguation, multi- and cross-lingual disambiguation, and lexical substitution. Given the difficulties of identifying word senses, other tasks relevant to this topic include word-sense induction, subcategorization acquisition, and evaluation of lexical resources.

The second major area in semantic analysis is the understanding of how different sentence and textual elements fit together. Tasks in this area include semantic role labeling, semantic relation analysis, and coreference resolution. Other tasks in this area look at more specialized issues of semantic analysis, such as temporal information processing, metonymy resolution, and sentiment analysis. The tasks in this area have many potential applications, such as information extraction, question answering, document summarization, machine translation, construction of thesauri and semantic networks, language modeling, paraphrasing, and recognizing textual entailment. In each of these potential applications, the contribution of the types of semantic analysis constitutes the most outstanding research issue.

For example, in the word sense induction and disambiguation task, there are three separate phases:

  1. In the training phase, evaluation task participants were asked to use a training dataset to induce the sense inventories for a set of polysemous words (a minimal clustering sketch follows this list). The training dataset consisted of a set of polysemous nouns/verbs and the sentence instances in which they occurred. No other resources were allowed apart from morphological and syntactic Natural Language Processing components, such as morphological analyzers, Part-Of-Speech taggers and syntactic parsers.
  2. In the testing phase, participants were provided with a test set for the disambiguating subtask using the induced sense inventory from the training phase.
  3. In the evaluation phase, answers to the testing phase were evaluated in both a supervised and an unsupervised framework.
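
The training phase described above amounts to clustering the contexts of each polysemous word so that every cluster plays the role of an induced sense. The following is a minimal sketch under that assumption, using TF-IDF context vectors and k-means from scikit-learn; the sentences and the number of clusters are hypothetical and are not the actual task data.

```python
# Minimal sketch of word sense induction: cluster the contexts of one
# polysemous word ("cold") so that each cluster acts as an induced sense.
# The example sentences and the cluster count are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

contexts = [
    "he caught a cold and stayed in bed all week",
    "the flu and the common cold share many symptoms",
    "a cold wind blew across the frozen lake",
    "the water was far too cold for swimming",
]

vectors = TfidfVectorizer().fit_transform(contexts)   # bag-of-words context vectors
induced = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for sentence, sense_id in zip(contexts, induced):
    print(f"induced sense {sense_id}: {sentence}")
```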

The unsupervised evaluation for WSI considered two measures: V-measure (Rosenberg and Hirschberg, 2007) and paired F-score (Artiles et al., 2009). The supervised evaluation follows that of the SemEval-2007 WSI task (Agirre and Soroa, 2007).
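
The following is a minimal sketch of these two unsupervised measures, assuming the gold and induced sense labels are available as parallel lists over the same instances; the labelings below are hypothetical. V-measure is taken from scikit-learn, and the paired F-score is computed directly from the instance pairs that share a cluster.

```python
# Minimal sketch of the two unsupervised WSI measures mentioned above.
# The gold and induced labelings are hypothetical.
from itertools import combinations
from sklearn.metrics import v_measure_score

gold_labels = ["s1", "s1", "s2", "s2", "s2"]   # gold sense of each instance
induced_labels = [0, 0, 1, 1, 0]               # induced cluster of each instance

def same_cluster_pairs(labels):
    """All unordered instance pairs assigned to the same cluster."""
    return {(i, j) for i, j in combinations(range(len(labels)), 2)
            if labels[i] == labels[j]}

gold_pairs = same_cluster_pairs(gold_labels)
induced_pairs = same_cluster_pairs(induced_labels)

precision = len(gold_pairs & induced_pairs) / len(induced_pairs)
recall = len(gold_pairs & induced_pairs) / len(gold_pairs)
paired_f = 2 * precision * recall / (precision + recall)

print(f"V-measure: {v_measure_score(gold_labels, induced_labels):.2f}")
print(f"paired F-score: {paired_f:.2f}")
```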

Senseval and SemEval tasks overview

The table below reflects the workshops' growth from Senseval to SemEval and gives an overview of which areas of computational semantics were evaluated throughout the Senseval/SemEval workshops.

Workshop | No. of tasks | Areas of study | Languages of data evaluated
Senseval-1 | 3 | Word Sense Disambiguation (WSD): lexical-sample WSD tasks | English, French, Italian
Senseval-2 | 12 | Word Sense Disambiguation (WSD): lexical-sample, all-words and translation WSD tasks | Czech, Dutch, English, Estonian, Basque, Chinese, Danish, Italian, Japanese, Korean, Spanish, Swedish
Senseval-3 | 16 (incl. 2 cancelled) | Logic Form Transformation, Machine Translation (MT) Evaluation, Semantic Role Labelling, WSD | Basque, Catalan, Chinese, English, Italian, Romanian, Spanish
SemEval-2007 | 19 (incl. 1 cancelled) | Cross-lingual, Frame Extraction, Information Extraction, Lexical Substitution, Lexical Sample, Metonymy, Semantic Annotation, Semantic Relations, Semantic Role Labelling, Sentiment Analysis, Time Expression, WSD | Arabic, Catalan, Chinese, English, Spanish, Turkish
SemEval-2010 | 18 (incl. 1 cancelled) | Coreference, Cross-lingual, Ellipsis, Information Extraction, Lexical Substitution, Metonymy, Noun Compounds, Parsing, Semantic Relations, Semantic Role Labeling, Sentiment Analysis, Textual Entailment, Time Expressions, WSD | Catalan, Chinese, Dutch, English, French, German, Italian, Japanese, Spanish
SemEval-2012 | 8 | Common Sense Reasoning, Lexical Simplification, Relational Similarity, Spatial Role Labelling, Semantic Dependency Parsing, Semantic and Textual Similarity | Chinese, English
SemEval-2013 | 14 | Temporal Annotation, Sentiment Analysis, Spatial Role Labeling, Noun Compounds, Phrasal Semantics, Textual Similarity, Response Analysis, Cross-lingual Textual Entailment, BioMedical Texts, Cross- and Multilingual WSD, Word Sense Induction, Lexical Sample | Catalan, French, German, English, Italian, Spanish
SemEval-2014 | 10 | Compositional Distributional Semantics, Grammar Induction for Spoken Dialogue Systems, Cross-Level Semantic Similarity, Sentiment Analysis, L2 Writing Assistant, Supervised Semantic Parsing, Clinical Text Analysis, Semantic Dependency Parsing, Sentiment Analysis in Twitter, Multilingual Semantic Textual Similarity | English, Spanish, French, German, Dutch
SemEval-2015 | 18 (incl. 1 cancelled) | Text Similarity and Question Answering, Time and Space, Sentiment, Word Sense Disambiguation and Induction, Learning Semantic Relations | English, Spanish, Arabic, Italian
SemEval-2016 | 14 | Textual Similarity and Question Answering, Sentiment Analysis, Semantic Parsing, Semantic Analysis, Semantic Taxonomy |
SemEval-2017 | 12[15] | Semantic comparison for words and texts; detecting sentiment, humor, and truth; parsing semantic structures |
SemEval-2018 | 12[16] | Affect and Creative Language in Tweets, Coreference, Information Extraction, Lexical Semantics, Reading Comprehension and Reasoning |

The Multilingual WSD task was introduced at the SemEval-2013 workshop.[17] The task is aimed at evaluating Word Sense Disambiguation systems in a multilingual scenario using BabelNet as the sense inventory. Unlike similar tasks such as cross-lingual WSD or the multilingual lexical substitution task, in which no fixed sense inventory is specified, Multilingual WSD uses BabelNet as its sense inventory. Prior to the development of BabelNet, a bilingual lexical-sample WSD evaluation task was carried out in SemEval-2007 on Chinese-English bitexts.[18]

The Cross-lingual WSD task was introduced in the SemEval-2007 evaluation workshop and re-proposed in the SemEval-2013 workshop.[19] To make it easier to integrate WSD systems into other Natural Language Processing (NLP) applications, such as Machine Translation and multilingual Information Retrieval, the cross-lingual WSD evaluation task was introduced as a language-independent and knowledge-lean approach to WSD. The task is an unsupervised Word Sense Disambiguation task for English nouns by means of parallel corpora. It follows the lexical-sample variant of the classic WSD task, restricted to only 20 polysemous nouns.
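
A minimal sketch of the knowledge-lean idea behind this task: in a word-aligned parallel corpus, the translations aligned to an ambiguous English noun act as its sense labels, so no predefined sense inventory or manually annotated data is needed. The toy English–French examples and the target noun "coach" below are hypothetical and are not the actual task data.

```python
# Minimal sketch of knowledge-lean, cross-lingual sense labelling:
# the French words aligned to the ambiguous English noun "coach" act as sense labels.
# The aligned examples below are hypothetical.
from collections import Counter

# Each entry: (English context, French word aligned to the target noun "coach")
aligned = [
    ("the coach praised his players after the match", "entraîneur"),
    ("the team hired a new coach last season",        "entraîneur"),
    ("we travelled to Paris by coach",                "autocar"),
    ("the coach left the station at noon",            "autocar"),
]

def disambiguate(sentence, corpus):
    """Pick the aligned translation whose contexts share the most words with the sentence."""
    words = set(sentence.split())
    scores = Counter()
    for context, translation in corpus:
        scores[translation] += len(words & set(context.split()))
    return scores.most_common(1)[0][0]

print(disambiguate("the coach arrived late at the bus station", aligned))
# -> 'autocar' under this toy data: the translation itself serves as the sense label
```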

It is worth noting that SemEval-2014 had only two tasks that were multilingual/cross-lingual: (i) the L2 Writing Assistant task, a cross-lingual WSD task covering English, Spanish, German, French, and Dutch, and (ii) the Multilingual Semantic Textual Similarity task, which evaluated systems on English and Spanish texts.

Areas of evaluation

The major tasks in semantic evaluation include the following areas of natural language processing. This list is expected to grow as the field progresses.[20]

The areas of study evaluated across the workshops, from Senseval-1 through SemEval-2017, include the following:

  • Bioinformatics / Clinical Text Analysis
  • Common Sense Reasoning (COPA)
  • Coreference Resolution
  • Noun Compounds (Information Extraction)
  • Ellipsis
  • Grammar Induction
  • Keyphrase Extraction (Information Extraction)
  • Lexical Simplification
  • Lexical Substitution (Multilingual or Cross-lingual)
  • Lexical Complexity
  • Metonymy (Information Extraction)
  • Paraphrases
  • Question Answering
  • Relational Similarity
  • Rumour and Veracity
  • Semantic Parsing
  • Semantic Relation Identification
  • Semantic Role Labeling
  • Semantic Similarity
  • Semantic Similarity (Cross-lingual)
  • Semantic Similarity (Multilingual)
  • Sentiment Analysis
  • Spatial Role Labelling
  • Taxonomy Induction/Enrichment
  • Textual Entailment
  • Textual Entailment (Cross-lingual)
  • Temporal Annotation
  • Twitter Analysis
  • Word Sense Disambiguation (Lexical Sample)
  • Word Sense Disambiguation (All-Words)
  • Word Sense Disambiguation (Multilingual)
  • Word Sense Disambiguation (Cross-lingual)
  • Word Sense Induction

Types of Semantic Annotations

SemEval tasks have created many types of semantic annotations, each with its own schema. In SemEval-2015 the organizers decided to group tasks together into several tracks, according to the type of semantic annotation each task aims to produce.[21] The types of semantic annotation involved in the SemEval workshops are:

  1. Learning Semantic Relations
  2. Question and Answering
  3. Semantic Parsing
  4. Semantic Taxonomy
  5. Sentiment Analysis
  6. Text Similarity
  7. Time and Space
  8. Word Sense Disambiguation and Induction

A task's track allocation is flexible; a task might develop into its own track. For example, the taxonomy evaluation task in SemEval-2015 was under the Learning Semantic Relations track, whereas SemEval-2016 had a dedicated Semantic Taxonomy track with a new Semantic Taxonomy Enrichment task.[22][23]

See also

  • List of computer science awards
  • Computational semantics
  • Natural language processing
  • Word sense
  • Word sense disambiguation
  • Semantic analysis (computational)

References

  1. ^ Blackburn, P., and Bos, J. (2005), Representation and Inference for Natural Language: A First Course in Computational Semantics, CSLI Publications. ISBN 1-57586-496-7.
  2. ^ Navigli, R (2009). "Word sense disambiguation". ACM Computing Surveys. 41 (2): 1–69. doi:10.1145/1459352.1459355. S2CID 461624.
  3. ^ Palmer, M., Ng, H.T., & Hoa, T.D. (2006), Evaluation of WSD systems, in Eneko Agirre & Phil Edmonds (eds.), Word Sense Disambiguation: Algorithms and Applications, Text, Speech and Language Technology, vol. 33. Amsterdam: Springer, 75–106.
  4. ^ Resnik, P. (2006), WSD in NLP applications, in Eneko Agirre & Phil Edmonds (eds.), Word Sense Disambiguation: Algorithms and Applications. Dordrecht: Springer, 299–338.
  5. ^ Yarowsky, D. (1992), Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. Proceedings of the 14th Conference on Computational Linguistics, 454–60. doi:10.3115/992133.992140
  6. ^ Palmer, M., & Light, M. (1999), Tagging with Lexical Semantics: Why, What, and How? ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: What, Why, and How? Archived 2010-07-15 at the Wayback Machine. Natural Language Engineering 5(2): i–iv.
  7. ^ Ng, H.T. (1997), Getting serious about word sense disambiguation. Proceedings of the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How? 1–7.
  8. ^ Philip Resnik and Jimmy Lin (2010). Evaluation of NLP Systems. In Alexander Clark, Chris Fox, and Shalom Lappin, editors. The Handbook of Computational Linguistics and Natural Language Processing. Wiley-Blackwell. 11:271.
  9. ^ Adam Kilgarriff and Martha Palmer (ed. 2000). Special Issue of Computers and the Humanities, SENSEVAL98:Evaluating Word Sense Disambiguation Systems. Kluwer, 34: 1–2.
  10. ^ Scott Cotton, Phil Edmonds, Adam Kilgarriff, and Martha Palmer (ed. 2001). SENSEVAL-2: Second International Workshop on Evaluating Word Sense Disambiguation Systems. SIGLEX Workshop, ACL03, Toulouse, France.
  11. ^ SIGLEX: Message Board (2010) Retrieved on Aug 15, 2012, from http://www.clres.com/siglex/messdisp.php?id=111
  12. ^ SemEval 3 Google Group post. Retrieved on Aug 15, 2012, from https://groups.google.com/forum/?fromgroups#!topic/semeval3/8YXMvVlH-CM%5B1-25%5D
  13. ^ Language Resources and Evaluation Volume 43, Number 2[dead link]
  14. ^ Kilgarriff, A. (1998). SENSEVAL: An Exercise in Evaluating Word Sense Disambiguation Programs. In Proc. LREC, Granada, May 1998, pp. 581–588.
  15. ^ "Tasks < SemEval-2017". alt.qcri.org. Retrieved 2018-05-04.
  16. ^ "Tasks < SemEval-2018". alt.qcri.org. Retrieved 2018-05-04.
  17. ^ Navigli, R., Jurgens, D., & Vannella, D. (2013, June). Semeval-2013 task 12: Multilingual word sense disambiguation. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), in conjunction with the Second Joint Conference on Lexical and Computational Semantics (* SEM 2013) (pp. 222-231).
  18. ^ Peng Jin, Yunfang Wu and Shiwen Yu. SemEval-2007 task 05: multilingual Chinese-English lexical sample. Proceedings of the 4th International Workshop on Semantic Evaluations, p.19-23, June 23–24, 2007, Prague, Czech Republic.
  19. ^ Lefever, E., & Hoste, V. (2013, June). Semeval-2013 task 10: Cross-lingual word sense disambiguation. In Second joint conference on lexical and computational semantics (Vol. 2, pp. 158-166).
  20. ^ SemEval Portal (n.d.). In ACLwiki. Retrieved August 12, 2010 from http://aclweb.org/aclwiki/index.php?title=SemEval_Portal
  21. ^ SemEval-2015 website. Retrieved Nov 14, 2014 http://alt.qcri.org/semeval2015/index.php?id=tasks
  22. ^ Georgeta Bordea, Paul Buitelaar, Stefano Faralli and Roberto Navigli. 2015. Semeval-2015 task 17: Taxonomy Extraction Evaluation (TExEval). In Proceedings of the 9th International Workshop on Semantic Evaluation. Denver, USA.
  23. ^ SemEval-2016 website. Retrieved Jun 4 2015 http://alt.qcri.org/semeval2016/

External links

  • Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL)
  • Semeval-2010 – Semantic Evaluation Workshop (endorsed by SIGLEX)
  • Senseval - international organization devoted to the evaluation of Word Sense Disambiguation Systems (endorsed by SIGLEX)
  • SemEval Portal on the Wiki of the Association for Computational Linguistics
  • Senseval / SemEval tasks:
    • Senseval-1 – the first evaluation exercise on word sense disambiguation systems; the lexical-sample task was evaluated on English, French and Italian
    • Senseval-2 – evaluated word sense disambiguation systems on three types of tasks (the all-words, lexical-sample and the translation task)
    • Senseval-3 – included tasks for word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms, and subcategorization acquisition.
    • SemEval-2007 – included tasks that were more elaborate than Senseval's, crossing different areas of study in Natural Language Processing
    • SemEval-2010 – added tasks that were from new areas of studies in computational semantics, viz., Coreference, Ellipsis, Keyphrase Extraction, Noun Compounds and Textual Entailment.
    • SemEval-2012 – co-located with the first *SEM conference; the Semantic Similarity task was promoted as the *SEM Shared Task
    • SemEval-2013 – SemEval moved from a 2–3 year cycle to an annual workshop
    • SemEval-2014 – the first time SemEval was co-located with a non-ACL event (COLING)
    • SemEval-2015 – the first SemEval with tasks categorized into various tracks
    • SemEval-2016 – the second SemEval without a WSD task (the first was in SemEval-2012)
    • *SEM – conference for SemEval-related papers other than task systems.
  • Message Understanding Conferences (MUCs)
  • BabelNet
  • Open Multilingual WordNet – Compilation of WordNets with Open licenses
