Contents

Reading Guide
Contributors
Foreword
Preface

1 Introduction (Eneko Agirre and Philip Edmonds)
  1.1 Word Sense Disambiguation
  1.2 A Brief History of WSD Research
  1.3 What is a Word Sense?
  1.4 Applications of WSD
  1.5 Basic Approaches to WSD
  1.6 State-of-the-Art Performance
  1.7 Promising Directions
  1.8 Overview of This Book
  1.9 Further Reading
  References

2 Word Senses (Adam Kilgarriff)
  2.1 Introduction
  2.2 Lexicographers
  2.3 Philosophy
    2.3.1 Meaning is Something You Do
    2.3.2 The Fregean Tradition and Reification
    2.3.3 Two Incompatible Semantics?
    2.3.4 Implications for Word Senses
  2.4 Lexicalization
  2.5 Corpus Evidence
    2.5.1 Lexicon Size
    2.5.2 Quotations
  2.6 Conclusion
  2.7 Further Reading
  Acknowledgments
  References

3 Making Sense About Sense (Nancy Ide and Yorick Wilks)
  3.1 Introduction
  3.2 WSD and the Lexicographers
  3.3 WSD and Sense Inventories
  3.4 NLP Applications and WSD
  3.5 What Level of Sense Distinctions Do We Need for NLP, If Any?
  3.6 What Now for WSD?
  3.7 Conclusion
  References

4 Evaluation of WSD Systems (Martha Palmer, Hwee Tou Ng and Hoa Trang Dang)
  4.1 Introduction
    4.1.1 Terminology
    4.1.2 Overview
  4.2 Background
    4.2.1 WordNet and Semcor
    4.2.2 The Line and Interest Corpora
    4.2.3 The DSO Corpus
    4.2.4 Open Mind Word Expert
  4.3 Evaluation Using Pseudo-Words
  4.4 Senseval Evaluation Exercises
    4.4.1 Senseval-1
      Evaluation and Scoring
    4.4.2 Senseval-2
      English All-Words Task
      English Lexical Sample Task
    4.4.3 Comparison of Tagging Exercises
  4.5 Sources of Inter-Annotator Disagreement
  4.6 Granularity of Sense: Groupings for WordNet
    4.6.1 Criteria for WordNet Sense Grouping
    4.6.2 Analysis of Sense Grouping
  4.7 Senseval-3
  4.8 Discussion
  References

5 Knowledge-Based Methods for WSD (Rada Mihalcea)
  5.1 Introduction
  5.2 Lesk Algorithm
    5.2.1 Variations of the Lesk Algorithm
      Simulated Annealing
      Simplified Lesk Algorithm
      Augmented Semantic Spaces
      Summary
  5.3 Semantic Similarity
    5.3.1 Measures of Semantic Similarity
    5.3.2 Using Semantic Similarity Within a Local Context
    5.3.3 Using Semantic Similarity Within a Global Context
  5.4 Selectional Preferences
    5.4.1 Preliminaries: Learning Word-to-Word Relations
    5.4.2 Learning Selectional Preferences
    5.4.3 Using Selectional Preferences
  5.5 Heuristics for Word Sense Disambiguation
    5.5.1 Most Frequent Sense
    5.5.2 One Sense Per Discourse
    5.5.3 One Sense Per Collocation
  5.6 Knowledge-Based Methods at Senseval-2
  5.7 Conclusions
  References

6 Unsupervised Corpus-Based Methods for WSD (Ted Pedersen)
  6.1 Introduction
    6.1.1 Scope
    6.1.2 Motivation
      Distributional Methods
      Translational Equivalence
    6.1.3 Approaches
  6.2 Type-Based Discrimination
    6.2.1 Representation of Context
    6.2.2 Algorithms
      Latent Semantic Analysis (LSA)
      Hyperspace Analogue to Language (HAL)
      Clustering By Committee (CBC)
    6.2.3 Discussion
  6.3 Token-Based Discrimination
    6.3.1 Representation of Context
    6.3.2 Algorithms
      Context Group Discrimination
      McQuitty’s Similarity Analysis
    6.3.3 Discussion
  6.4 Translational Equivalence
    6.4.1 Representation of Context
    6.4.2 Algorithms
    6.4.3 Discussion
  6.5 Conclusions and the Way Forward
  Acknowledgments
  References

7 Supervised Corpus-Based Methods for WSD (Lluís Màrquez, Gerard Escudero, David Martínez and German Rigau)
  7.1 Introduction to Supervised WSD
    7.1.1 Machine Learning for Classification
      An Example on WSD
  7.2 A Survey of Supervised WSD
    7.2.1 Main Corpora Used
    7.2.2 Main Sense Repositories
    7.2.3 Representation of Examples by Means of Features
    7.2.4 Main Approaches to Supervised WSD
      Probabilistic Methods
      Methods Based on the Similarity of the Examples
      Methods Based on Discriminating Rules
      Methods Based on Rule Combination
      Linear Classifiers and Kernel-Based Approaches
      Discourse Properties: The Yarowsky Bootstrapping Algorithm
    7.2.5 Supervised Systems in the Senseval Evaluations
  7.3 An Empirical Study of Supervised Algorithms for WSD
    7.3.1 Five Learning Algorithms Under Study
      Naïve Bayes (NB)
      Exemplar-Based Learning (kNN)
      Decision Lists (DL)
      AdaBoost (AB)
      Support Vector Machines (SVM)
    7.3.2 Empirical Evaluation on the DSO Corpus
      Experiments
  7.4 Current Challenges of the Supervised Approach
    7.4.1 Right-Sized Training Sets
    7.4.2 Porting Across Corpora
    7.4.3 The Knowledge Acquisition Bottleneck
      Automatic Acquisition of Training Examples
      Active Learning
      Combining Training Examples from Different Words
      Parallel Corpora
    7.4.4 Bootstrapping
    7.4.5 Feature Selection and Parameter Optimization
    7.4.6 Combination of Algorithms and Knowledge Sources
  7.5 Conclusions and Future Trends
  Acknowledgments
  References

8 Knowledge Sources for WSD (Eneko Agirre and Mark Stevenson)
  8.1 Introduction
  8.2 Knowledge Sources Relevant to WSD
    8.2.1 Syntactic
      Part of Speech (KS 1)
      Morphology (KS 2)
      Collocations (KS 3)
      Subcategorization (KS 4)
    8.2.2 Semantic
      Frequency of Senses (KS 5)
      Semantic Word Associations (KS 6)
      Selectional Preferences (KS 7)
      Semantic Roles (KS 8)
    8.2.3 Pragmatic/Topical
      Domain (KS 9)
      Topical Word Association (KS 10)
      Pragmatics (KS 11)
  8.3 Features and Lexical Resources
    8.3.1 Target-Word Specific Features
    8.3.2 Local Features
    8.3.3 Global Features
  8.4 Identifying Knowledge Sources in Actual Systems
    8.4.1 Senseval-2 Systems
    8.4.2 Senseval-3 Systems
  8.5 Comparison of Experimental Results
    8.5.1 Senseval Results
    8.5.2 Yarowsky and Florian (2002)
    8.5.3 Lee and Ng (2002)
    8.5.4 Martínez et al. (2002)
    8.5.5 Agirre and Martínez (2001a)
    8.5.6 Stevenson and Wilks (2001)
  8.6 Discussion
  8.7 Conclusions
  Acknowledgments
  References

9 Automatic Acquisition of Lexical Information and Examples (Julio Gonzalo and Felisa Verdejo)
  9.1 Introduction
  9.2 Mining Topical Knowledge About Word Senses
    9.2.1 Topic Signatures
    9.2.2 Association of Web Directories to Word Senses
  9.3 Automatic Acquisition of Sense-Tagged Corpora
    9.3.1 Acquisition by Direct Web Searching
    9.3.2 Bootstrapping from Seed Examples
    9.3.3 Acquisition via Web Directories
    9.3.4 Acquisition via Cross-Language Evidence
    9.3.5 Web-Based Cooperative Annotation
  9.4 Discussion
  Acknowledgments
  References

10 Domain-Specific WSD (Paul Buitelaar, Bernardo Magnini, Carlo Strapparava and Piek Vossen)
  10.1 Introduction
  10.2 Approaches to Domain-Specific WSD
    10.2.1 Subject Codes
    10.2.2 Topic Signatures and Topic Variation
      Topic Signatures
      Topic Variation
    10.2.3 Domain Tuning
      Top-down Domain Tuning
      Bottom-up Domain Tuning
  10.3 Domain-Specific Disambiguation in Applications
    10.3.1 User-Modeling for Recommender Systems
    10.3.2 Cross-Lingual Information Retrieval
    10.3.3 The MEANING Project
  10.4 Conclusions
  References

11 WSD in NLP Applications (Philip Resnik)
  11.1 Introduction
  11.2 Why WSD?
    Argument from Faith
    Argument by Analogy
    Argument from Specific Applications
  11.3 Traditional WSD in Applications
    11.3.1 WSD in Traditional Information Retrieval
    11.3.2 WSD in Applications Related to Information Retrieval
      Cross-Language IR
      Question Answering
      Document Classification
    11.3.3 WSD in Traditional Machine Translation
    11.3.4 Sense Ambiguity in Statistical Machine Translation
    11.3.5 Other Emerging Applications
  11.4 Alternative Conceptions of Word Sense
    11.4.1 Richer Linguistic Representations
    11.4.2 Patterns of Usage
    11.4.3 Cross-Language Relationships
  11.5 Conclusions
  Acknowledgments
  References

A Resources for WSD
  A.1 Sense Inventories
    A.1.1 Dictionaries
    A.1.2 Thesauri
    A.1.3 Lexical Knowledge Bases
  A.2 Corpora
    A.2.1 Raw Corpora
    A.2.2 Sense-Tagged Corpora
    A.2.3 Automatically Tagged Corpora
  A.3 Other Resources
    A.3.1 Software
    A.3.2 Utilities, Demos, and Data
    A.3.3 Language Data Providers
    A.3.4 Organizations and Mailing Lists

Index of Terms
Index of Authors and Algorithms

Content Summary

On the other hand, it is certainly possible that sufficiently separate senses can be identified using multi-lingual criteria (i.e., by identifying senses of the same homograph that have different translations in some significant number of other languages), as discussed in Section 3.3. For example, the two senses of paper cited above are translated in French as journal and papier, respectively; similarly, the two etymologically-related senses of nail (fingernail and the metal object that one hammers) are translated as ongle and clou. At the same time, there is a danger in relying on cross-lingualism as the basis of sense, since the same histori