000 05032nam a22005415i 4500
001 978-3-031-43811-0
003 DE-He213
005 20240730171520.0
007 cr nn 008mamaa
008 240313s2024 sz | s |||| 0|eng d
020 _a9783031438110
_9978-3-031-43811-0
024 7 _a10.1007/978-3-031-43811-0
_2doi
050 4 _aQA76.9.N38
072 7 _aUYQL
_2bicssc
072 7 _aCOM073000
_2bisacsh
072 7 _aUYQL
_2thema
082 0 4 _a006.35
_223
245 1 0 _aLinguistic Resources for Natural Language Processing
_h[electronic resource] :
_bOn the Necessity of Using Linguistic Methods to Develop NLP Software /
_cedited by Max Silberztein.
250 _a1st ed. 2024.
264 1 _aCham :
_bSpringer Nature Switzerland :
_bImprint: Springer,
_c2024.
300 _aXXII, 217 p. 118 illus., 101 illus. in color.
_bonline resource.
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
347 _atext file
_bPDF
_2rda
505 0 _aIn honor of Peter -- Foreword -- Preface -- About this book -- Part 1. Introduction -- 1. The Limitations of Corpus-based Methods in NLP -- Part 2 -- 2. Developing Linguistic-based NLP Software -- 3. Linguistic Resources for the Automatic Generation of Texts in Natural Language -- 4. Towards a More Efficient Arabic-French Translation -- 5. Linguistic Resources and Methods and Algorithms for Belarusian Natural Language Processing -- Part 3. Linguistic Resources for Low-resource Languages -- 6. A New Set of Linguistic Resources for Ukrainian -- 7. Formalization of the Quechua Morphology -- 8. The Challenging Task of Translating the Language of Tango -- 9. A Polylectal Linguistic Resource for Rromani -- Part 4. Processing Multiword Units: The Linguistic Approach -- 10. Using Linguistic Criteria to Define Multiword Units -- 11. A Linguistic Approach to English Phrasal Verbs -- 12. Analysis of Indonesian Multiword Expressions: Linguistic vs Data-driven Approach.
520 _aEmpirical - data-driven, neural network-based, probabilistic, and statistical - methods seem to be the modern trend. Recently, OpenAI's ChatGPT, Google's Bard and Microsoft's Sydney chatbots have been garnering a lot of attention for their detailed answers across many knowledge domains. In consequence, most AI researchers are no longer interested in trying to understand what common intelligence is or how intelligent agents construct scenarios to solve various problems. Instead, they now develop systems that extract solutions from massive databases used as cheat sheets. In the same manner, Natural Language Processing (NLP) software that uses training corpora associated with empirical methods is trendy, as most researchers in NLP today use large training corpora, always to the detriment of the development of formalized dictionaries and grammars. Not questioning the intrinsic value of many software applications based on empirical methods, this volume aims at rehabilitating the linguistic approach to NLP. In an introduction, the editor uncovers several limitations and flaws of using training corpora to develop NLP applications, even the simplest ones, such as automatic taggers. The first part of the volume is dedicated to showing how carefully handcrafted linguistic resources could be successfully used to enhance current NLP software applications. The second part presents two representative cases where data-driven approaches cannot be implemented simply because there is not enough data available for low-resource languages. The third part addresses the problem of how to treat multiword units in NLP software, which is arguably the weakest point of NLP applications today but has a simple and elegant linguistic solution.
It is the editor's belief that readers interested in Natural Language Processing will appreciate the importance of this volume, both for its questioning of the training corpus-based approaches and for the intrinsic value of the linguistic formalization and the underlying methodology presented.
650 0 _aNatural language processing (Computer science).
_94741
650 0 _aComputational linguistics.
_96146
650 0 _aArtificial intelligence.
_93407
650 0 _aDigital humanities.
_999054
650 1 4 _aNatural Language Processing (NLP).
_931587
650 2 4 _aComputational Linguistics.
_96146
650 2 4 _aArtificial Intelligence.
_93407
650 2 4 _aDigital Humanities.
_999056
700 1 _aSilberztein, Max.
_eeditor.
_4edt
_4http://id.loc.gov/vocabulary/relators/edt
_999057
710 2 _aSpringerLink (Online service)
_999059
773 0 _tSpringer Nature eBook
776 0 8 _iPrinted edition:
_z9783031438103
776 0 8 _iPrinted edition:
_z9783031438127
776 0 8 _iPrinted edition:
_z9783031438134
856 4 0 _uhttps://doi.org/10.1007/978-3-031-43811-0
912 _aZDB-2-SCS
912 _aZDB-2-SXCS
942 _cEBK
999 _c87646
_d87646