Monday, September 23, 2013

On dialects, ransom notes and like, California

One area of linguistic research that I find interesting is dialectology, the study of linguistic variation of a single language across a region. A dialect is typically considered a regional variety of a language, and dialectologists study, among other aspects, differences in pronunciation, lexicon (word choice) and sentence construction across varieties.

To determine whether you have a dialect or a language, one question to consider is that of mutual intelligibility. That is, can a speaker of variety X understand a speaker of variety Y and vice versa? Comprehension may be higher on one side than the other, so speaker X might be able to understand more of speaker Y than Y can of X, or the opposite. Such asymmetry doesn’t necessarily mean that the two are distinct languages; they may instead be variants with a related origin. As an example of this concept, I was reminded of the media response to a witness’ speech style in a recent, heavily covered trial. I know that this example is not exactly contemporary at this point, but I do think it adds to the argument that AAVE (African American Vernacular English) is still largely received as a bastardized version of standard English rather than a distinct rule-governed variant, due to its lesser intelligibility to non-speakers. The social implications of who speaks (or has access to) this particular variety only complicate matters further, resulting in a diminished sense of shared cultural identity vis-à-vis language.

This brief diatribe was only meant to show that differences do in fact exist across a community of speakers, in this case the community being the population of the US whose native language is English. But such a concept is hardly news: a nation-wide dialect survey conducted by Harvard in 2003 catalogued differences in the sounds of English across the United States, as well as particular grammatical constructions, through participant surveys (How do you pronounce “aunt” or the second vowel in “pajamas”? Do you say “Where are you?” or “Where you at?”). What spurred this entry was actually a little light research inspired by one of the talks I heard earlier this month on forensic dialectology, given by Jack Grieve at the International Summer School in Forensic Linguistic Analysis in Mainz, Germany.

The forensic aspect of dialectology lies in analyzing written or spoken samples of text (by an unknown author) for distinctive features that might point to a certain region and so help narrow down a list of suspects, so-called “author profiling”. As an example, a case was presented wherein a noted linguist was asked to help police in a search for a kidnapper using only a ransom note. As a sidebar, such cases can be a challenge because authorities usually have only a single suspect text to work with when trying to establish a profile. Linguists can aid in the process by analyzing such texts, as Roger Shuy did in this particular Illinois case, for linguistic clues as to who could have done it. Once a profile has been established, samples from other suspects can be compared to the text in question and further analyzed to determine the likelihood of shared authorship, playing a potentially decisive role in solving crimes that involve such texts. This field is also limited by aspects such as genre, which is to say, the type of text. A ransom note is different from an email, which is different from a report, which is different from shorthand notes of a meeting. All are types of texts which can be analyzed, but each differs in style and register, adding another layer of complexity to any attempt at profiling. However, the approach can still be (and has been) used to further the investigative process. The ransom note from the case Shuy helped in contained a demand for money and instructions (excerpt from note):

“No kops! Come alone! Put it [the diaper bag of money] in the green trash kan on the devil strip at the corner of 18th and Carlson”

Based on the note alone, Shuy asked the investigating officers if any of their suspects were educated males from Akron, Ohio. As it happened, only one suspect fit this profile, and he turned out to be the person they were searching for. Shuy deduced that “kops” and “kan” were intentional misspellings, an attempt to disguise the writer's level of proficiency (the note contained correct spellings of “daughter” and “diaper”, words arguably less intuitive to spell than the two that were misspelled). The regional clue was the use of “devil strip” to describe a patch of grass between the sidewalk and street, known by that term only in Akron, Ohio. (Hitt, 26)

Dr. Grieve presented lexical variations across the United States using site-restricted web searches. What’s an example of lexical variation? Among others presented, consider (as a native speaker of American English): do you refer to the object as a “trash can” or a “garbage can”? Or the meal you eat in the evening as “dinner” or “supper”? First, such a lexical pair (two words referring to the same object) is established; then a specific kind of internet search is carried out. Newspaper websites were used in this case as they contain a fairly standard form of English, are published daily and are widely available on the internet, thus providing a large enough corpus to draw conclusions from. The search shows how many times the term “trash can” appeared on the LA Times website, which was then compared to the frequency of “garbage can” on the same website. The results of the LA Times search were:
[Slide table, columns “Typed into Google” and “Number of pages”: one term of the pair accounted for 72% of the pages and the other for 28%.]
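
For the programmatically inclined, the arithmetic behind such a comparison can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not Dr. Grieve's actual tooling: the function names are hypothetical, the query format relies on the common `site:` search operator with `latimes.com` assumed as the newspaper's domain, and the page counts are invented placeholders rather than real search results.

```python
def site_query(term: str, site: str) -> str:
    """Build a site-restricted search string for an exact phrase."""
    return f'site:{site} "{term}"'


def variant_shares(hits: dict[str, int]) -> dict[str, float]:
    """Convert raw page counts for competing terms into proportions."""
    total = sum(hits.values())
    if total == 0:
        raise ValueError("no hits for any variant")
    return {term: count / total for term, count in hits.items()}


# Placeholder page counts, invented for illustration only (not real data).
example_hits = {"trash can": 300, "garbage can": 100}

# The strings one would type into a search engine, one per variant.
for term in example_hits:
    print(site_query(term, "latimes.com"))

# Share of pages on which each variant appears, relative to the pair.
print(variant_shares(example_hits))
```

Repeating the same calculation for one newspaper per city gives a proportion for each location, which is the kind of figure that can then be placed on a map.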

Imagine me sitting in a conference room, one of only a handful of native speakers of American English, specifically from the West, thinking to myself, “Yeah, I would say trash can.” The other example that came to my mind was “pop” and “soda,” a distinction I learned of on a trip to Illinois in my youth for a family reunion. Dr. Grieve also commented on this pair as a classic example, but mentioned that web searches might not be the best way to establish such variance, as one can use the word “pop” without referring to a fizzy beverage (as in “pop music”).

If you’re still with me, by now you’ve got an idea of dialectology, and have become acquainted with a method of data collection that uses site-restricted searches of newspaper websites to map variance of vocabulary across a region. Now to the real topic, Californian English.

As I mentioned, Dr. Grieve’s work interested me, and a perusal of his website led me to another study, conducted on Californian English by Costanza Asnaghi. The study begins by stating its goals: to determine if modern written Californian English contains regional variation, and to map and describe these written dialects. To answer the question of “Why study Californian English?” Asnaghi points out that dialect studies tend to consider “The West,” comprising 10 (or more, depending on which map or study you’re consulting) western states, as one region of speakers. Furthermore, California is the most populous state (12% of the US population) and, with its large area, has multiple population centers (Sacramento, San Francisco, San Jose, Fresno, Los Angeles, San Diego), which makes it grist for such a study of regional variance. Asnaghi also notes that previous research on Californian English is incomplete, which, given the considerable population of speakers, leaves a gap in our knowledge of American regional dialects. She further points to the varied landscape in California (from deserts to mountains to forests and coastlines) as another potentially contributing factor in regional variation. The study aims to expand on previously conducted research (Bright, Elizabeth. A Word Geography of California and Nevada. 1971), which mapped dialectal variations within the state but offered little explanation of the differences, and to look for any statistically significant distinctions, be they north/south, inland/coastal or urban/rural.

The study was done using site-restricted web searches of word pairs such as the ones previously mentioned, using 245 Californian newspapers from 176 cities. Below you see a list of the pairs examined, taken from Asnaghi’s presentation. The slide that follows shows the result for one of those pairs, as well as a proportional measurement of frequency for the less frequent of the two terms on the LA Times website, namely “pail.”

Using all kinds of statistical models and scatter plot maps, which I have yet to fully comprehend, the study concluded that there is observable regional lexical variation in standard written Californian English. (The third image is one such map, showing the spatial clustering of buddy vs. pal: the redder the dot, the higher the frequency of the first term, and the bluer the dot, the more frequent the second.) Asnaghi highlights the advantages of this kind of study and argues for the legitimacy of this method of data collection and analysis, citing other nation-wide dialect studies conducted in a similar fashion, while also noting its limitations and questions for further research.

“Hella Nor Cal or Totally So Cal? The Perceptual Dialectology of California” (Bucholtz et al.) is another study which piqued my interest when I was a student, as I was being taught by the professor at the time. It examined how Californians themselves perceive the regional differences, which tend to fall into the geographical North/South distinction characterized in the title. This difference only became apparent to me once I was studying in another city, as I had previously been surrounded only by a certain population of speakers and thus never thought of my speech style as marked in any way, when in fact it totally is!

Works Cited:
Asnaghi, Costanza. 2012, October 8. Dissertation: Patterns of Lexical Variation in California English in Newspaper Writing. Università Cattolica del Sacro Cuore, Milan, Italy and Department of Linguistics, KU Leuven, Belgium. Supervisors: Prof. Dirk Speelman, Prof. Maria Luisa Maggioni, Dr. Jack Grieve.

Hitt, Jack. 2012, July 23. Words on Trial: Using Linguistics to Solve Crimes. The New Yorker, 24-30.

Jack Grieve and Costanza Asnaghi. A lexical dialect survey of American English using site-restricted web searches. Presented at the American Dialect Society Annual Meeting, Boston, United States, January 4, 2013.

Mary Bucholtz, Nancy Bermudez, Victor Fung, Lisa Edwards, Rosalva Vargas. Hella Nor Cal or Totally So Cal? The Perceptual Dialectology of California. Journal of English Linguistics December 2007 vol. 35 no. 4 325-352 accessed September 23, 2013.

McWhorter, John. Rachel Jeantel Explained, Linguistically. TIME Ideas under LAW. June 28, 2013 accessed September 22, 2013.

Rickford, John. “Rachel Jeantel’s language in the Zimmerman trial.” The Language Log under language and culture, variation. July 10, 2013 accessed September 22, 2013.

Slide images taken from Grieve and Asnaghi’s presentations