000 05326nam a22004935i 4500
001 978-3-031-02322-4
003 DE-He213
005 20240730164418.0
007 cr nn 008mamaa
008 220601s2020 sz | s |||| 0|eng d
020 _a9783031023224
_9978-3-031-02322-4
024 7 _a10.1007/978-3-031-02322-4
_2doi
050 4 _aTK5105.5-5105.9
072 7 _aUKN
_2bicssc
072 7 _aCOM043000
_2bisacsh
072 7 _aUKN
_2thema
082 0 4 _a004.6
_223
100 1 _aFerreira, Anderson A.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_984329
245 1 0 _aAutomatic Disambiguation of Author Names in Bibliographic Repositories
_h[electronic resource] /
_cby Anderson A. Ferreira, Marcos André Gonçalves, Alberto H. F. Laender.
250 _a1st ed. 2020.
264 1 _aCham :
_bSpringer International Publishing :
_bImprint: Springer,
_c2020.
300 _aXX, 126 p.
_bonline resource.
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
347 _atext file
_bPDF
_2rda
490 1 _aSynthesis Lectures on Information Concepts, Retrieval, and Services,
_x1947-9468
505 0 _aPreface -- Introduction -- The Author Name Disambiguation Task -- Foundations -- Taxonomy -- Heuristic-Based Hierarchical Clustering Disambiguation -- SAND: Self-Training Author Name Disambiguator -- Incremental Author Name Disambiguation -- Additional Methods for Author Name Disambiguation -- Bibliography -- Authors' Biographies.
520 _aThis book deals with a hard problem that is inherent to human language: ambiguity. In particular, we focus on author name ambiguity, a type of ambiguity that exists in digital bibliographic repositories, which occurs when an author publishes works under distinct names or distinct authors publish works under similar names. This problem may be caused by a number of reasons, including the lack of standards and common practices, and the decentralized generation of bibliographic content. As a consequence, the quality of the main services of digital bibliographic repositories such as search, browsing, and recommendation may be severely affected by author name ambiguity. The focal point of the book is on automatic methods, since manual solutions do not scale to the size of the current repositories or the speed at which they are updated. Accordingly, we provide an ample view on the problem of automatic disambiguation of author names, summarizing the results of more than a decade of research on this topic conducted by our group, which were reported in more than a dozen publications that received over 900 citations so far, according to Google Scholar. We start by discussing its motivational issues (Chapter 1). Next, we formally define the author name disambiguation task (Chapter 2) and use this formalization to provide a brief, taxonomically organized, overview of the literature on the topic (Chapter 3). We then organize, summarize and integrate the efforts of our own group on developing solutions for the problem that have historically produced state-of-the-art (by the time of their proposals) results in terms of the quality of the disambiguation results. Thus, Chapter 4 covers HHC - Heuristic-based Hierarchical Clustering, an author name disambiguation method that is based on two specific real-world assumptions regarding scientific authorship.
Then, Chapter 5 describes SAND - Self-training Author Name Disambiguator and Chapter 6 presents two incremental author name disambiguation methods, namely INDi - Incremental Unsupervised Name Disambiguation and INC - Incremental Nearest Cluster. Finally, Chapter 7 provides an overview of recent author name disambiguation methods that address new specific approaches such as graph-based representations, alternative predefined similarity functions, visualization facilities and approaches based on artificial neural networks. The chapters are followed by three appendices that cover, respectively: (i) a pattern matching function for comparing proper names and used by some of the methods addressed in this book; (ii) a tool for generating synthetic collections of citation records for distinct experimental tasks; and (iii) a number of datasets commonly used to evaluate author name disambiguation methods. In summary, the book organizes a large body of knowledge and work in the area of author name disambiguation in the last decade, hoping to consolidate a solid basis for future developments in the field.
650 0 _aComputer networks.
_931572
650 1 4 _aComputer Communication Networks.
_984332
700 1 _aGonçalves, Marcos André.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_984333
700 1 _aLaender, Alberto H. F.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_984334
710 2 _aSpringerLink (Online service)
_984336
773 0 _tSpringer Nature eBook
776 0 8 _iPrinted edition:
_z9783031002298
776 0 8 _iPrinted edition:
_z9783031011948
776 0 8 _iPrinted edition:
_z9783031034503
830 0 _aSynthesis Lectures on Information Concepts, Retrieval, and Services,
_x1947-9468
_984338
856 4 0 _uhttps://doi.org/10.1007/978-3-031-02322-4
912 _aZDB-2-SXSC
942 _cEBK
999 _c85649
_d85649