219-7677
10
7500817
John Benjamins Publishing Company
Marketing Department / Karin Plijnaar, Pieter Lamers
onix@benjamins.nl
201608250348
ONIX title feed
eng
01
EUR
76015968
03
01
01
JB
John Benjamins Publishing Company
01
JB code
Z 195 Eb
15
9789027268457
06
10.1075/z.195
13
2015019027
DG
002
02
<TitleType>01</TitleType>
<TitleText textformat="02">How to do Linguistics with R</TitleText>
<Subtitle textformat="02">Data exploration and statistical analysis</Subtitle>
01
z.195
01
https://benjamins.com
02
https://benjamins.com/catalog/z.195
1
A01
Natalia Levshina
Levshina, Natalia
Natalia
Levshina
Université catholique de Louvain
01
eng
454
xi
443
LAN009000
v.2006
CFX
2
24
JB Subject Scheme
LIN.COGN
Cognition and language
24
JB Subject Scheme
LIN.COMPUT
Computational & corpus linguistics
24
JB Subject Scheme
LIN.THEOR
Theoretical linguistics
06
01
This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. <i>How to do Linguistics with R: Data exploration and statistical analysis</i> is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools.<br />This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a>
05
Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real <i>linguistic</i> problems take centre stage throughout the book – the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves.
John Newman, University of Alberta
05
This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis techniques used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource.
Benedikt Szmrecsanyi, University of Leuven
04
09
01
https://benjamins.com/covers/475/z.195.png
04
03
01
https://benjamins.com/covers/475_jpg/9789027212245.jpg
04
03
01
https://benjamins.com/covers/475_tif/9789027212245.tif
06
09
01
https://benjamins.com/covers/1200_front/z.195.hb.png
07
09
01
https://benjamins.com/covers/125/z.195.png
25
09
01
https://benjamins.com/covers/1200_back/z.195.hb.png
27
09
01
https://benjamins.com/covers/3d_web/z.195.hb.png
10
01
JB code
z.195.ack
xi
xii
2
Article
1
<TitleType>01</TitleType>
<TitleText textformat="02">Acknowledgements</TitleText>
10
01
JB code
z.195.intro
1
6
6
Article
2
<TitleType>01</TitleType>
<TitleText textformat="02">Introduction</TitleText>
10
01
JB code
z.195.c1
7
20
14
Article
3
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 1. What is statistics?</TitleText>
<Subtitle textformat="02">Main statistical notions and principles</Subtitle>
01
What is statistics? What can and cannot statistics do for you? How to formulate and test research hypotheses? What kind of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will also learn about different types of variables, parametric and non-parametric tests, <i>p</i>-values and many other things which you will need in order to understand explanations provided in the following chapters.
10
01
JB code
z.195.c2
21
40
20
Article
4
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 2. Introduction to R</TitleText>
01
In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions.
10
01
JB code
z.195.c3
41
68
28
Article
5
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText>
01
This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution.
10
01
JB code
z.195.c4
69
86
18
Article
6
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText>
<Subtitle textformat="02">proportions and their visualizations</Subtitle>
01
This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English.
10
01
JB code
z.195.c5
87
114
28
Article
7
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 5. Comparing two groups</TitleText>
<Subtitle textformat="02"><i>t</i>-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle>
01
Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric <i>t</i>-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores.
10
01
JB code
z.195.c6
115
138
24
Article
8
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText>
<Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle>
01
Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition.
10
01
JB code
z.195.c7
139
170
32
Article
9
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText>
<Subtitle textformat="02">Linear regression</Subtitle>
01
After the previous chapter has introduced some basic elements of regression analysis, this chapter will provide a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other hand. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, to perform its diagnostics and to interpret the results. You will also learn how to carry out non-parametric linear regression with the help of bootstrap. The case study investigates the relationship between reaction times in a lexical decision task, and such factors as word length, corpus frequency and part of speech of lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model.
10
01
JB code
z.195.c8
171
198
28
Article
10
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText>
<Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle>
01
This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis.
10
01
JB code
z.195.c9
199
222
24
Article
11
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText>
<Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle>
01
This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s <i>V</i> and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the <i>χ</i><sup>2</sup>-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition <i>over</i> and the verb <i>see</i> in different registers.
10
01
JB code
z.195.c10
223
240
18
Article
12
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 10. Association measures</TitleText>
<Subtitle textformat="02">collocations and collostructions</Subtitle>
01
Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or <i>ΔP</i>. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction.
10
01
JB code
z.195.c11
241
252
12
Article
13
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText>
01
This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of <i>quite</i> + ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis.
10
01
JB code
z.195.c12
253
276
24
Article
14
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText>
<Subtitle textformat="02">Binomial logistic regression</Subtitle>
01
In this chapter you will learn how to model the speaker’s choice between two near-synonymous words or constructions on the basis of contextual features. The most popular statistical tool that is used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools.
10
01
JB code
z.195.c13
277
290
14
Article
15
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText>
01
This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: <i>let</i>, <i>allow</i> and <i>permit</i>.
10
01
JB code
z.195.c14
291
300
10
Article
16
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText>
01
This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions <i>make + V</i>, <i>cause + to V</i> and <i>have + V</i> and identifies the set of independent semantic variables that are important for distinguishing between the constructions.
10
01
JB code
z.195.c15
301
322
22
Article
17
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText>
01
This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters.
10
01
JB code
z.195.c16
323
332
10
Article
18
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText>
<Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle>
01
This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration.
10
01
JB code
z.195.c17
333
350
18
Article
19
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 17. Language and space</TitleText>
<Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle>
01
This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world.
10
01
JB code
z.195.c18
351
366
16
Article
20
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText>
<Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle>
01
In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation.
10
01
JB code
z.195.c19
367
386
20
Article
21
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText>
<Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle>
01
This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study focuses on register variation of English Basic Colour Terms by using Simple Correspondence Analysis, which can be used for visualization of bivariate categorical data in two-dimensional contingency tables. In the second case study of German lexical categories <i>Stuhl</i> ‘chair’ and <i>Sessel</i> ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables.
10
01
JB code
z.195.c20
387
394
8
Article
22
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText>
01
This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers <i>will</i> and <i>be going to</i> by comparing the frequencies of infinitives that follow the markers.
10
01
JB code
z.195.epi
395
396
2
Article
23
<TitleType>01</TitleType>
<TitleText textformat="02">Epilogue</TitleText>
10
01
JB code
z.195.app1
397
408
12
Article
24
<TitleType>01</TitleType>
<TitleText textformat="02">The most important R objects and basic operations with them</TitleText>
<TitlePrefix>The </TitlePrefix>
<TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix>
10
01
JB code
z.195.app2
409
424
16
Article
25
<TitleType>01</TitleType>
<TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText>
10
01
JB code
z.195.refs
425
432
8
Article
26
<TitleType>01</TitleType>
<TitleText textformat="02">References</TitleText>
10
01
JB code
z.195.si
433
440
8
Article
27
<TitleType>01</TitleType>
<TitleText textformat="02">Subject Index</TitleText>
10
01
JB code
z.195.ri
441
443
3
Article
28
<TitleType>01</TitleType>
<TitleText textformat="02">Index of R functions and packages</TitleText>
02
JBENJAMINS
John Benjamins Publishing Company
01
John Benjamins Publishing Company
Amsterdam/Philadelphia
NL
04
20151125
2015
John Benjamins B.V.
02
WORLD
13
15
9789027212245
01
JB
3
John Benjamins e-Platform
03
jbe-platform.com
09
WORLD
21
01
06
Institutional price
00
105.00
EUR
R
01
05
Consumer price
00
36.00
EUR
R
01
06
Institutional price
00
88.00
GBP
Z
01
05
Consumer price
00
30.00
GBP
Z
01
06
Institutional price
inst
00
158.00
USD
S
01
05
Consumer price
cons
00
54.00
USD
S
412015967
03
01
01
JB
John Benjamins Publishing Company
01
JB code
Z 195 Hb
15
9789027212245
13
2015016708
BB
<TitleType>01</TitleType>
<TitleText textformat="02">How to do Linguistics with R</TitleText>
<Subtitle textformat="02">Data exploration and statistical analysis</Subtitle>
01
z.195
01
https://benjamins.com
02
https://benjamins.com/catalog/z.195
1
A01
Natalia Levshina
Levshina, Natalia
Natalia
Levshina
Université catholique de Louvain
01
eng
454
xi
443
LAN009000
v.2006
CFX
2
24
JB Subject Scheme
LIN.COGN
Cognition and language
24
JB Subject Scheme
LIN.COMPUT
Computational & corpus linguistics
24
JB Subject Scheme
LIN.THEOR
Theoretical linguistics
06
01
This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. <i>How to do Linguistics with R: Data exploration and statistical analysis</i> is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools.<br />This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a>
05
Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real <i>linguistic</i> problems take centre stage throughout the book – the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves.
John Newman, University of Alberta
05
This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis techniques used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource.
Benedikt Szmrecsanyi, University of Leuven
04
09
01
https://benjamins.com/covers/475/z.195.png
04
03
01
https://benjamins.com/covers/475_jpg/9789027212245.jpg
04
03
01
https://benjamins.com/covers/475_tif/9789027212245.tif
06
09
01
https://benjamins.com/covers/1200_front/z.195.hb.png
07
09
01
https://benjamins.com/covers/125/z.195.png
25
09
01
https://benjamins.com/covers/1200_back/z.195.hb.png
27
09
01
https://benjamins.com/covers/3d_web/z.195.hb.png
10
01
JB code
z.195.ack
xi
xii
2
Article
1
<TitleType>01</TitleType>
<TitleText textformat="02">Acknowledgements</TitleText>
10
01
JB code
z.195.intro
1
6
6
Article
2
<TitleType>01</TitleType>
<TitleText textformat="02">Introduction</TitleText>
10
01
JB code
z.195.c1
7
20
14
Article
3
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 1. What is statistics?</TitleText>
<Subtitle textformat="02">Main statistical notions and principles</Subtitle>
01
What is statistics? What can and cannot statistics do for you? How to formulate and test research hypotheses? What kind of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will also learn about different types of variables, parametric and non-parametric tests, <i>p</i>-values and many other things which you will need in order to understand explanations provided in the following chapters.
10
01
JB code
z.195.c2
21
40
20
Article
4
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 2. Introduction to R</TitleText>
01
In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions.
10
01
JB code
z.195.c3
41
68
28
Article
5
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText>
01
This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution.
10
01
JB code
z.195.c4
69
86
18
Article
6
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText>
<Subtitle textformat="02">proportions and their visualizations</Subtitle>
01
This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English.
10
01
JB code
z.195.c5
87
114
28
Article
7
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 5. Comparing two groups</TitleText>
<Subtitle textformat="02"><i>t</i>-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle>
01
Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric <i>t</i>-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores.
10
01
JB code
z.195.c6
115
138
24
Article
8
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText>
<Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle>
01
Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition.
10
01
JB code
z.195.c7
139
170
32
Article
9
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText>
<Subtitle textformat="02">Linear regression</Subtitle>
01
After the previous chapter has introduced some basic elements of regression analysis, this chapter will provide a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other hand. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, to perform its diagnostics and to interpret the results. You will also learn how to carry out non-parametric linear regression with the help of bootstrap. The case study investigates the relationship between reaction times in a lexical decision task, and such factors as word length, corpus frequency and part of speech of lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model.
10
01
JB code
z.195.c8
171
198
28
Article
10
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText>
<Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle>
01
This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis.
10
01
JB code
z.195.c9
199
222
24
Article
11
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText>
<Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle>
01
This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s <i>V</i> and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the <i>χ</i><sup>2</sup>-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition <i>over</i> and the verb <i>see</i> in different registers.
10
01
JB code
z.195.c10
223
240
18
Article
12
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 10. Association measures</TitleText>
<Subtitle textformat="02">collocations and collostructions</Subtitle>
01
Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or <i>ΔP</i>. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction.
10
01
JB code
z.195.c11
241
252
12
Article
13
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText>
01
This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of <i>quite + </i>ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis.
10
01
JB code
z.195.c12
253
276
24
Article
14
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText>
<Subtitle textformat="02">Binomial logistic regression</Subtitle>
01
In this chapter you will learn how to model the speaker’s choice between two near-synonymous words or constructions on the basis of contextual features. The most popular statistical tool used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools.
10
01
JB code
z.195.c13
277
290
14
Article
15
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText>
01
This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: <i>let</i>, <i>allow</i> and <i>permit</i>.
10
01
JB code
z.195.c14
291
300
10
Article
16
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText>
01
This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions, <i>make + V</i>, <i>cause + to V</i> and <i>have + V</i>, and identifies the set of independent semantic variables that are important for distinguishing between the constructions.
10
01
JB code
z.195.c15
301
322
22
Article
17
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText>
01
This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters.
10
01
JB code
z.195.c16
323
332
10
Article
18
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText>
<Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle>
01
This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration.
10
01
JB code
z.195.c17
333
350
18
Article
19
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 17. Language and space</TitleText>
<Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle>
01
This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world.
10
01
JB code
z.195.c18
351
366
16
Article
20
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText>
<Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle>
01
In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation.
10
01
JB code
z.195.c19
367
386
20
Article
21
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText>
<Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle>
01
This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study focuses on register variation of English Basic Colour Terms by using Simple Correspondence Analysis, which can be used for visualization of bivariate categorical data in two-dimensional contingency tables. In the second case study of German lexical categories <i>Stuhl</i> ‘chair’ and <i>Sessel</i> ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables.
10
01
JB code
z.195.c20
387
394
8
Article
22
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText>
01
This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers <i>will</i> and <i>be going to</i> by comparing the frequencies of infinitives that follow the markers.
10
01
JB code
z.195.epi
395
396
2
Article
23
<TitleType>01</TitleType>
<TitleText textformat="02">Epilogue</TitleText>
10
01
JB code
z.195.app1
397
408
12
Article
24
<TitleType>01</TitleType>
<TitleText textformat="02">The most important R objects and basic operations with them</TitleText>
<TitlePrefix>The </TitlePrefix>
<TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix>
10
01
JB code
z.195.app2
409
424
16
Article
25
<TitleType>01</TitleType>
<TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText>
10
01
JB code
z.195.refs
425
432
8
Article
26
<TitleType>01</TitleType>
<TitleText textformat="02">References</TitleText>
10
01
JB code
z.195.si
433
440
8
Article
27
<TitleType>01</TitleType>
<TitleText textformat="02">Subject Index</TitleText>
10
01
JB code
z.195.ri
441
443
3
Article
28
<TitleType>01</TitleType>
<TitleText textformat="02">Index of R functions and packages</TitleText>
02
JBENJAMINS
John Benjamins Publishing Company
01
John Benjamins Publishing Company
Amsterdam/Philadelphia
NL
04
20151125
2015
John Benjamins B.V.
02
WORLD
01
245
mm
02
174
mm
08
995
gr
01
JB
1
John Benjamins Publishing Company
+31 20 6304747
+31 20 6739773
bookorder@benjamins.nl
01
https://benjamins.com
01
WORLD
US CA MX
21
4
16
01
02
JB
1
00
105.00
EUR
R
02
02
JB
1
00
111.30
EUR
R
01
JB
10
bebc
+44 1202 712 934
+44 1202 712 913
sales@bebc.co.uk
03
GB
21
16
02
02
JB
1
00
88.00
GBP
Z
01
JB
2
John Benjamins North America
+1 800 562-5666
+1 703 661-1501
benjamins@presswarehouse.com
01
https://benjamins.com
01
US CA MX
21
16
01
gen
02
JB
1
00
158.00
USD
180016118
03
01
01
JB
John Benjamins Publishing Company
01
JB code
Z 195 Pb
15
9789027212252
13
2015016708
BC
<TitleType>01</TitleType>
<TitleText textformat="02">How to do Linguistics with R</TitleText>
<Subtitle textformat="02">Data exploration and statistical analysis</Subtitle>
01
z.195
01
https://benjamins.com
02
https://benjamins.com/catalog/z.195
1
A01
Natalia Levshina
Levshina, Natalia
Natalia
Levshina
Université catholique de Louvain
01
eng
454
xi
443
LAN009000
v.2006
CFX
2
24
JB Subject Scheme
LIN.COGN
Cognition and language
24
JB Subject Scheme
LIN.COMPUT
Computational & corpus linguistics
24
JB Subject Scheme
LIN.THEOR
Theoretical linguistics
06
01
This book provides a linguist with a statistical toolkit for exploration and analysis of linguistic data. It employs R, a free software environment for statistical computing, which is increasingly popular among linguists. <i>How to do Linguistics with R: Data exploration and statistical analysis</i> is unique in its scope, as it covers a wide range of classical and cutting-edge statistical methods, including different flavours of regression analysis and ANOVA, random forests and conditional inference trees, as well as specific linguistic approaches, among which are Behavioural Profiles, Vector Space Models and various measures of association between words and constructions. The statistical topics are presented comprehensively, but without too much technical detail, and illustrated with linguistic case studies that answer non-trivial research questions. The book also demonstrates how to visualize linguistic data with the help of attractive informative graphs, including the popular ggplot2 system and Google visualization tools.<br />This book has a companion website: <a href="http://doi.org/10.1075/z.195.website">http://doi.org/10.1075/z.195.website</a>
05
Levshina’s book achieves something few other books on doing linguistics with R have achieved. She has written a book that makes sense even for novice users of R and for linguists not accustomed to statistical computing. Levshina writes in a pedagogically sensitive style, in friendly language, and with just the right amount of explanatory prose to lead the reader to insightful analyses. Taken together, the chapters introduce the reader to a sparkling variety of statistical methods. Best of all, from the point of view of linguists, real <i>linguistic</i> problems take centre stage throughout the book – the statistical methods are the means to answer intriguing linguistic questions, not an end in themselves.
John Newman, University of Alberta
05
This is a fantastic textbook: extremely comprehensive (the book surveys almost all major analysis techniques used in the linguistic literature – from descriptive statistics over regression analysis to semantic vector space modeling), well-written, and with a much appreciated emphasis on good data visualization. Both beginning and more experienced quantitative linguists will find this book an invaluable resource.
Benedikt Szmrecsanyi, University of Leuven
04
09
01
https://benjamins.com/covers/475/z.195.png
04
03
01
https://benjamins.com/covers/475_jpg/9789027212245.jpg
04
03
01
https://benjamins.com/covers/475_tif/9789027212245.tif
06
09
01
https://benjamins.com/covers/1200_front/z.195.pb.png
07
09
01
https://benjamins.com/covers/125/z.195.png
25
09
01
https://benjamins.com/covers/1200_back/z.195.pb.png
27
09
01
https://benjamins.com/covers/3d_web/z.195.pb.png
10
01
JB code
z.195.ack
xi
xii
2
Article
1
<TitleType>01</TitleType>
<TitleText textformat="02">Acknowledgements</TitleText>
10
01
JB code
z.195.intro
1
6
6
Article
2
<TitleType>01</TitleType>
<TitleText textformat="02">Introduction</TitleText>
10
01
JB code
z.195.c1
7
20
14
Article
3
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 1. What is statistics?</TitleText>
<Subtitle textformat="02">Main statistical notions and principles</Subtitle>
01
What is statistics? What can and cannot statistics do for you? How do you formulate and test research hypotheses? What kinds of statistical tests are there? These and many other questions are discussed in this chapter. In addition, you will learn about different types of variables, parametric and non-parametric tests, <i>p</i>-values and many other things which you will need in order to understand the explanations provided in the following chapters.
10
01
JB code
z.195.c2
21
40
20
Article
4
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 2. Introduction to R</TitleText>
01
In this chapter you will learn to install the basic distribution of R, as well as add-on packages. The chapter also introduces the basics of R syntax and demonstrates how to perform simple operations with different R objects. Special attention is paid to importing and exporting your own data to and from R and saving your graphical output. You will also be able to interpret error messages and warnings that R may give you and search for additional information on R functions.
10
01
JB code
z.195.c3
41
68
28
Article
5
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 3. Descriptive statistics for quantitative variables</TitleText>
01
This chapter shows how to compute basic descriptive statistics for a quantitative variable. You will learn the most popular measures of central tendency (the mean, the median and the mode) and dispersion (variance, standard deviation, range, IQR, median absolute deviation). The chapter will also demonstrate how to produce different graphs (box-and-whisker plots, histograms, density plots, Q–Q plots, line charts), which visualize univariate distributions and help one determine whether a variable is normally distributed. From the case studies you will learn how to analyse the distribution of word lengths in a sample, to detect suspicious values in subjects’ reaction times in a lexical decision task, and to correct some problems with the shape of a distribution.
10
01
JB code
z.195.c4
69
86
18
Article
6
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 4. How to explore qualitative variables</TitleText>
<Subtitle textformat="02">proportions and their visualizations</Subtitle>
01
This chapter demonstrates how to explore a categorical variable with the help of tables of counts and proportions. As in the previous chapter, graphs (pie charts, bar plots and dot charts) will play a very important role. You will also learn how to change values of a categorical variable. In addition, we will discuss how one can use Deviation of Proportions to measure dispersion of words in a corpus. This approach will be illustrated by a case study of the Basic Colour Terms in English.
10
01
JB code
z.195.c5
87
114
28
Article
7
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 5. Comparing two groups</TitleText>
<Subtitle textformat="02"><i>t</i>-test and Wilcoxon and Mann-Whitney tests for independent and dependent samples</Subtitle>
01
Do language learners who are taught by an innovative method show better results than those who are taught traditionally? Do speakers of one language variety speak faster than speakers of another variety? Do people of one gender use more hedging constructions than people of another? In this chapter, you will learn how to make such comparisons using the parametric <i>t</i>-test and the non-parametric Wilcoxon and Mann-Whitney tests for dependent and independent samples. You will learn how to compute the standard error and confidence intervals for the mean. The case studies will involve differences between high- and low-frequency nouns with regard to the number of associations that they trigger and their abstractness/concreteness scores.
10
01
JB code
z.195.c6
115
138
24
Article
8
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 6. Relationships between two quantitative variables</TitleText>
<Subtitle textformat="02">Correlation analysis with elements of linear regression modelling</Subtitle>
01
Will your knowledge of statistics improve as you read more and more books on the subject? Is there a relationship between the length of a word and its frequency? Does grammatical proficiency of children depend on the number of lexical items which they have mastered? Does the number of phonemes in a language depend on the number of speakers? All these questions involve correlation between two variables. This chapter explains the principles of correlation analysis and demonstrates how it can be carried out using popular parametric and non-parametric tests. You will also learn how to produce correlograms and scatter plots with a regression line. Some fundamental notions of regression analysis, such as residuals, homo- and heteroscedasticity, will be introduced. The case studies investigate the relationship between word frequency and mean reaction time in a lexical decision task and the correlation between vocabulary size and grammatical proficiency in first language acquisition.
10
01
JB code
z.195.c7
139
170
32
Article
9
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 7. More on frequencies and reaction times</TitleText>
<Subtitle textformat="02">Linear regression</Subtitle>
01
The previous chapter introduced some basic elements of regression analysis; this chapter provides a more thorough discussion of linear regression. This method enables one to model and explain the relationships between one or more explanatory variables at any level of measurement, on the one hand, and one ratio- or interval-scaled response variable, on the other. In addition, one can investigate interactions between explanatory variables. You will learn how to fit a multiple linear regression model, perform its diagnostics and interpret the results. You will also learn how to carry out non-parametric linear regression with the help of the bootstrap. The case study investigates the relationship between reaction times in a lexical decision task and such factors as word length, corpus frequency and part of speech of the lexical stimuli. In contrast with the previous case studies, all these factors are tested here simultaneously in a multiple linear regression model.
10
01
JB code
z.195.c8
171
198
28
Article
10
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 8. Finding differences between several groups</TitleText>
<Subtitle textformat="02">Sign language, linguistic relativity and ANOVA</Subtitle>
01
This chapter introduces ANOVA (analysis of variance), a special case of linear regression with binary or categorical independent variables. This method is widely used in experimental linguistics, when the researcher compares several groups of experimental objects that undergo different treatments. In this chapter you will learn several types of ANOVA: one-way ANOVA with one factor as an independent variable, factorial ANOVA with two or more categorical independent variables, and repeated-measures and mixed ANOVA. The methods are illustrated by three case studies. The first two focus on grammatical features of an emergent sign language. The third case study deals with cross-linguistic differences in time conceptualization, which are interpreted as evidence in favour of the linguistic relativity hypothesis.
10
01
JB code
z.195.c9
199
222
24
Article
11
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 9. Measuring associations between two categorical variables</TitleText>
<Subtitle textformat="02">Conceptual metaphors and tests of independence</Subtitle>
01
This chapter focuses on associations between two categorical variables. You will learn how to measure the association strength using odds ratios, Cramér’s <i>V</i> and the φ-coefficient. You will also learn how to test whether the association is statistically significant with the help of the <i>χ</i><sup>2</sup>-test and the Fisher exact test. Bar plots, mosaic plots and association plots are used as visualization tools for cross-tabulated data. All these concepts and tools will be illustrated by case studies of metaphoric and non-metaphoric uses of the preposition <i>over</i> and the verb <i>see</i> in different registers.
10
01
JB code
z.195.c10
223
240
18
Article
12
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 10. Association measures</TitleText>
<Subtitle textformat="02">collocations and collostructions</Subtitle>
01
Collocations, as well as colligations and other co-occurrence patterns, play an important role in corpus linguistics, psycholinguistics and usage-based grammar and lexicology. To measure the degree of attraction between words and other units, one can use diverse association measures, such as collostructional strength, Pointwise Mutual Information or <i>ΔP</i>. From this chapter you will learn how to compute a variety of association measures using a small set of different co-occurrence frequencies. The case study is based on co-occurrence frequencies of different verbs in the Russian ditransitive construction.
10
01
JB code
z.195.c11
241
252
12
Article
13
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 11. Geographic variation of quite: Distinctive collexeme analysis</TitleText>
01
This chapter introduces distinctive collexeme analysis, which employs bidirectional association measures discussed in the previous chapter. This method is based on the co-occurrence frequencies of words that occur in two near-synonymous constructions, or in two or more dialectal or diachronic variants of the same construction. Here we will compare the variants of <i>quite + </i>ADJ constructions in different national varieties of English. We will first present a canonical distinctive collexeme analysis with only two varieties, British and American English, and then will show how this approach can be extended to more lects, presenting a unified approach to multiple distinctive collexeme analysis.
10
01
JB code
z.195.c12
253
276
24
Article
14
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 12. Probabilistic multifactorial grammar and lexicology</TitleText>
<Subtitle textformat="02">Binomial logistic regression</Subtitle>
01
In this chapter you will learn how to model the speaker’s choice between two near-synonymous words or constructions on the basis of contextual features. The most popular statistical tool used to create such models is logistic regression. The approach is illustrated by a case study of two Dutch causative auxiliaries. As in the case of linear regression, you will learn how to create, test and interpret a logistic model with the help of different R tools.
10
01
JB code
z.195.c13
277
290
14
Article
15
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 13. Multinomial (polytomous) logistic regression models of three and more near synonyms</TitleText>
01
This chapter continues the discussion of logistic regression models, which can be used to predict the speaker’s choice between different near synonyms or variants. This time you will learn to model situations when the number of possible outcomes is greater than two. Such models are called multinomial, or polytomous. The method will be illustrated with a case study of three near synonyms: <i>let</i>, <i>allow</i> and <i>permit</i>.
10
01
JB code
z.195.c14
291
300
10
Article
16
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 14. Conditional inference trees and random forests</TitleText>
01
This chapter discusses conditional inference trees and random forests. These are non-parametric tree-structure models of regression and classification that can serve as an alternative to multiple regression. They are especially useful in the presence of many high-order interactions and in situations when the sample size is small, but the number of predictors is large. You will learn how to fit such models, interpret their results and evaluate their quality. The case study that illustrates the techniques deals with three English causative constructions, <i>make + V</i>, <i>cause + to V</i> and <i>have + V</i>, and identifies the set of independent semantic variables that are important for distinguishing between the constructions.
10
01
JB code
z.195.c15
301
322
22
Article
17
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 15. Behavioural profiles, distance metrics and cluster analysis</TitleText>
01
This chapter presents the Behavioural Profiles approach, which involves the comparison of contextual features of words or constructions in a corpus. The chapter also discusses several clustering algorithms, which are based on different distance metrics. Cluster analysis is a family of techniques that can help you discover groups of similar objects in the data. Several popular methods of cluster validation and diagnostics are discussed, which involve the computation of average silhouette widths and multiscale bootstrap resampling. The chapter also demonstrates how to interpret clusters with the help of the snake plot and effect size measures. In addition, you will learn to create and interpret scree plots, which are useful for determining the optimal number of clusters.
10
01
JB code
z.195.c16
323
332
10
Article
18
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 16. Introduction to Semantic Vector Spaces</TitleText>
<Subtitle textformat="02">Cosine as a measure of semantic similarity</Subtitle>
01
This chapter introduces Semantic Vector Spaces, another distributional approach to semantics. This method originates in Natural Language Processing. Unlike Behavioural Profiles discussed in the previous chapter, it uses automatically extracted co-occurrences of target words and contextual features. The characteristic features of the method are weighted co-occurrence frequencies and the use of the cosine as the most popular similarity measure. This chapter provides a general introduction to the method, with a case study of English cooking verbs as an illustration.
10
01
JB code
z.195.c17
333
350
18
Article
19
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 17. Language and space</TitleText>
<Subtitle textformat="02">Dialects, maps and Multidimensional Scaling</Subtitle>
01
This chapter introduces another popular method that deals with distance matrices. This method is called Multidimensional Scaling. It is a dimensionality reduction technique that represents distances between objects in a low-dimensional space. You will learn how to perform different types of metric and non-metric scaling and carry out the diagnostics of solutions by using the scree plot, the Shepard plot and goodness-of-fit measures. The chapter also shows how one can use R for creation of geographical maps with points and text labels. Finally, you will learn how to measure the correlation between two distance matrices with the help of the Mantel test. The case studies are based on geographic coordinates and several linguistic features of varieties of English all over the world.
10
01
JB code
z.195.c18
351
366
16
Article
20
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 18. Multidimensional analysis of register variation</TitleText>
<Subtitle textformat="02">Principal Components Analysis and Factor Analysis</Subtitle>
01
In this chapter you will learn about Principal Components Analysis and Factor Analysis. The aim of these methods is to reduce a large number of correlated quantitative variables to a small set of underlying dimensions. You will learn how to use these methods to perform corpus-based multidimensional analysis of register variation.
10
01
JB code
z.195.c19
367
386
20
Article
21
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 19. Exemplars, categories, prototypes</TitleText>
<Subtitle textformat="02">Simple and multiple correspondence analysis</Subtitle>
01
This chapter introduces Correspondence Analysis. It is similar to PCA, but is designed for visualization and exploration of bivariate and multivariate categorical data. The first case study explores register variation of English Basic Colour Terms using Simple Correspondence Analysis, a method for visualizing bivariate categorical data in two-dimensional contingency tables. In the second case study, on the German lexical categories <i>Stuhl</i> ‘chair’ and <i>Sessel</i> ‘armchair’, you will learn how to perform Multiple Correspondence Analysis with higher-dimensional tables.
10
01
JB code
z.195.c20
387
394
8
Article
22
<TitleType>01</TitleType>
<TitleText textformat="02">Chapter 20. Constructional change and motion charts</TitleText>
01
This chapter introduces motion charts as a method for dynamic visualization of language change. More specifically, they enable one to detect and explore changes in the use of constructions by visualizing the relative frequencies of different lexemes that fill in the constructional slots. The method is illustrated with a case study that explores the changes in the use of future markers <i>will</i> and <i>be going to</i> by comparing the frequencies of infinitives that follow the markers.
10
01
JB code
z.195.epi
395
396
2
Article
23
<TitleType>01</TitleType>
<TitleText textformat="02">Epilogue</TitleText>
10
01
JB code
z.195.app1
397
408
12
Article
24
<TitleType>01</TitleType>
<TitleText textformat="02">The most important R objects and basic operations with them</TitleText>
<TitlePrefix>The </TitlePrefix>
<TitleWithoutPrefix textformat="02">most important R objects and basic operations with them</TitleWithoutPrefix>
10
01
JB code
z.195.app2
409
424
16
Article
25
<TitleType>01</TitleType>
<TitleText textformat="02">Main plotting functions and graphical parameters in R</TitleText>
10
01
JB code
z.195.refs
425
432
8
Article
26
<TitleType>01</TitleType>
<TitleText textformat="02">References</TitleText>
10
01
JB code
z.195.si
433
440
8
Article
27
<TitleType>01</TitleType>
<TitleText textformat="02">Subject Index</TitleText>
10
01
JB code
z.195.ri
441
443
3
Article
28
<TitleType>01</TitleType>
<TitleText textformat="02">Index of R functions and packages</TitleText>
02
JBENJAMINS
John Benjamins Publishing Company
01
John Benjamins Publishing Company
Amsterdam/Philadelphia
NL
04
20151125
2015
John Benjamins B.V.
02
WORLD
01
240
mm
02
170
mm
08
850
gr
01
JB
1
John Benjamins Publishing Company
+31 20 6304747
+31 20 6739773
bookorder@benjamins.nl
01
https://benjamins.com
01
WORLD
US CA MX
21
261
12
01
02
JB
1
00
36.00
EUR
R
02
02
JB
1
00
38.16
EUR
R
01
JB
10
bebc
+44 1202 712 934
+44 1202 712 913
sales@bebc.co.uk
03
GB
21
12
02
02
JB
1
00
30.00
GBP
Z
01
JB
2
John Benjamins North America
+1 800 562-5666
+1 703 661-1501
benjamins@presswarehouse.com
01
https://benjamins.com
01
US CA MX
21
12
12
01
gen
02
JB
1
00
54.00
USD