Literary Detective Work on the Computer
Computational linguistics can uncover mysteries in text that are not always obvious on visual inspection. For example, computer analysis of writing style can indicate who the true author of a text might be in cases of disputed authorship or suspected plagiarism. The theoretical background to authorship attribution is presented in a step-by-step manner, and comprehensive reviews of the field are given in two specialist areas: the writings of William Shakespeare and his contemporaries, and the various writing styles seen in religious texts. The final chapter looks at the progress computers have made in the decipherment of lost languages. This book is written for students and researchers of general linguistics, computational and corpus linguistics, and computer forensics. It will inspire future researchers to study these topics for themselves, and gives sufficient detail on the methods and resources to get them started.
[Natural Language Processing, 12] 2014. x, 283 pp.
Publishing status: Available
Published online on 28 April 2014
© John Benjamins Publishing Company
Table of Contents
Preface | pp. ix–x
Chapter 1. Author identification | pp. 1–58
Chapter 2. Plagiarism and spam filtering | pp. 59–98
Chapter 3. Computer studies of Shakespearean authorship | pp. 99–148
Chapter 4. Stylometric analysis of religious texts | pp. 149–206
Chapter 5. Computers and decipherment | pp. 207–258
Index | pp. 281–283
“Interesting, packed and wide-ranging.”
Prof. Ward E.Y. Elliott, Claremont McKenna College
“This book will prove a valuable resource for anyone wishing to gain a working knowledge of the methods and achievements of computational stylometry. It covers a wide range of studies in the field, explaining the main results and the techniques used to find them in an accessible manner. A strong point is that it includes a number of worked examples showing, with the aid of small-scale data sets, how some of the more important quantitative methods can be implemented. A further strength is its use of the public-domain system R to illustrate how certain important ideas could be put into practice.
Both newcomers and experienced bardographers will find much of interest in Chapter 3, which gives a dispassionate, empirically grounded overview of a number of key studies of the authorship of the Shakespearean canon. It also includes a clear step-by-step exposition of how Bayes's Rule may be used in investigations of this kind. Chapter 5, on decipherment, contains fascinating accounts of the attempts to decipher the Rongorongo glyphs of Rapanui (Easter Island) and the ancient seals of the lost Indus Valley civilization, among others -- introducing the basic notions of Information Theory and Markov modelling as it does so.”
Dr. Richard S. Forsyth, Freelance researcher
“In my view this is an excellent and much-missed overview of, and introduction to, the use of statistical tests, methods and approaches to language decipherment and recognition. The in-depth discussions of the methods employed in the so-far unsuccessful decipherment of Rongorongo and the Indus Valley texts is an especially engaging read.”
Dr. Jarle Ebeling, University of Oslo
“The chapter, illustrated with several examples of problems in New Testament Studies, will provide a good introduction and overview of the subject area, targeted at computer scientists who are specialists neither in statistics nor in biblical studies. The emphasis on the methods of multivariate statistical analysis, such as correspondence analysis, is a welcome feature.”
Very Rev. Dr. Andris Abakuks, Birkbeck College
“Very comprehensive and easy to read.”
Dr. Paul D. Clough, University of Sheffield
“This book is a valuable repository of techniques, methods, tasks, cases, and background relevant to computational stylometry. I admire the way in which Oakes’ interpretation of his own research and that of others supports deeper understanding of the tasks tackled.”
Walter Daelemans, University of Antwerp, in Digital Scholarship in the Humanities, 2016
Cited by 19 other publications
Mohamed, Emad
Buckland, Warren
ÇETİN BAYCANLAR, Sema & B. Tahir TAHİROĞLU
Kopotev, Mikhail, Andrey Rostovtsev & Mikhail Sokolov
Singh, Manan & Kavi Narayana Murthy
Schneider, Gerold
2020. Changes in society and language. In Corpora and the changing society [Studies in Corpus Linguistics, 96], pp. 29 ff.
Schneider, Gerold
2022. Medical topics and style from 1500 to 2018. In Corpus Pragmatic Studies on the History of Medical Discourse [Pragmatics & Beyond New Series, 330], pp. 49 ff.
Schneider, Gerold
2024. Digital Dickens. In Crossing Boundaries through Corpora [Studies in Corpus Linguistics, 119], pp. 62 ff.
Grieve, Jack, Isobelle Clarke, Emily Chiang, Hannah Gideon, Annina Heini, Andrea Nini & Emily Waibel
Može, Sara & Emad Mohamed
Nini, Andrea
Urbina Nájera, Argelia B., Jorge de la Calleja & Ma. Auxilio Medina
Franklin, Emma & Michael Oakes
Mealand, David L.
Altmeyer, Stefan, Constantin Klein, Barbara Keller, Christopher F. Silver, Ralph W. Hood & Heinz Streib
2015. Subjective definitions of spirituality and religion. International Journal of Corpus Linguistics 20:4, pp. 526 ff.
Klaussner, Carmen, John Nerbonne & Çağrı Çöltekin
This list is based on CrossRef data as of 21 December 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
Subjects
Main BIC Subject
CFX: Computational linguistics
Main BISAC Subject
LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics / General