This seminar/colloquium series is presented by the Department of Linguistics and the Division of Continuing Studies, University of Victoria.
All sessions are free and open to the public: no pre-registration is required.
Colloquia Spring 2012
| Date | Time | Location | Presenter | Topic |
| May 16th, 2012 | 2:00-3:30pm | David Strong Building, C126 | Malcah Yaeger-Dror | Some considerations for an appropriate protocol for archivable corpora |
| March 23rd, 2012 | 3:30pm | CLE-C108 | Robert Podesva | At the Intersection of Race and Region: The Case of the PIN-PEN Merger in DC |
* For further information, please contact Dr. Hossein Nassaji (nassaji@uvic.ca).
Bio: Robert Podesva holds a Ph.D. and an M.A. in linguistics from Stanford University and a B.A. in linguistics from Cornell University, and was formerly a faculty member at Georgetown University. Currently an Assistant Professor of Linguistics at Stanford University, he teaches a variety of courses on sociolinguistic variation, language and identity, and sociophonetics. His research examines the social significance of phonetic variation in the domains of vowels, consonants, prosody, and voice quality. He is particularly interested in how individuals draw on phonetic resources to construct identity, most notably gender, sexuality, race, and their intersections. His current projects investigate the linguistic practices of U.S. politicians, gay professionals, and residents of communities in Northern California and Washington, D.C. He has co-edited two volumes on the topic of language and sexuality and is currently editing a collection on research methods in linguistics.
The PIN-PEN merger is well attested in both Southern (Brown 1991, Baranowski 2007) and African American (Bailey and Thomas 1998, Rickford 1999, Thomas 2007) varieties of English. Its patterning in Washington, DC – where North meets South and where African American speakers constitute a racial majority – suggests that even though the merger is prevalent among African American speakers, it is losing ground.
Previous sociophonetic studies have appealed to a variety of techniques for measuring mergers, including the impressionistic inspection of vowel formant plots, the difference in mean F1 or F2 between vowels, the Euclidean distance between vowel means (Harrington 2006), and the Pillai trace, a measure of overlap between two vowel distributions (Hay et al. 2006, Hall-Lew 2009). Based on an analysis of the three acoustic measures, I show that multiple measures are needed to adequately capture PIN-PEN patterns in DC. Pillai traces and Euclidean distances reveal that African American speakers merge PIN and PEN to a greater extent than white speakers, while differences in mean F1 show that younger women merge less, suggesting a change in progress that begins in the dimension of vowel height. Thus, while measures like Euclidean distance and the Pillai trace are informative, a traditional measure like the distance in mean F1 captures other crucial information.
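As a rough illustration of the three acoustic measures named above, the following Python sketch computes the difference in mean F1, the Euclidean distance between vowel means, and a Pillai trace for two sets of (F1, F2) tokens. It is not the analysis used in the study: the function names, the token format, and the sample formant values are all assumptions made for the example. The Pillai trace is computed in its standard MANOVA form, trace(H(H+E)^-1), where H and E are the between-vowel and within-vowel sums-of-squares-and-cross-products matrices; values near 0 indicate heavy overlap (merger) and values near 1 indicate distinct distributions.

```python
from math import hypot
from statistics import fmean


def mean_point(tokens):
    """Return (mean F1, mean F2) for a list of (F1, F2) tokens."""
    return fmean(f1 for f1, _ in tokens), fmean(f2 for _, f2 in tokens)


def sscp(tokens, center):
    """Sums of squares and cross-products around `center`.

    The symmetric 2x2 matrix [[a, b], [b, c]] is returned as (a, b, c).
    """
    a = sum((f1 - center[0]) ** 2 for f1, _ in tokens)
    b = sum((f1 - center[0]) * (f2 - center[1]) for f1, f2 in tokens)
    c = sum((f2 - center[1]) ** 2 for _, f2 in tokens)
    return a, b, c


def merger_measures(pin, pen):
    """Compute three merger measures for two lists of (F1, F2) tokens."""
    m_pin, m_pen = mean_point(pin), mean_point(pen)

    # Difference in mean F1 (vowel height), PEN minus PIN.
    f1_difference = m_pen[0] - m_pin[0]

    # Euclidean distance between the two vowel means in F1-F2 space.
    euclidean = hypot(m_pen[0] - m_pin[0], m_pen[1] - m_pin[1])

    # Within-vowel (E) and total (T = H + E) SSCP matrices.
    e = [x + y for x, y in zip(sscp(pin, m_pin), sscp(pen, m_pen))]
    t = sscp(pin + pen, mean_point(pin + pen))
    h = [ti - ei for ti, ei in zip(t, e)]  # between-vowel SSCP

    # Pillai trace = trace(H T^-1), written out for the 2x2 case.
    det_t = t[0] * t[2] - t[1] ** 2
    pillai = (h[0] * t[2] - 2 * h[1] * t[1] + h[2] * t[0]) / det_t

    return {
        "f1_difference": f1_difference,
        "euclidean_distance": euclidean,
        "pillai": pillai,
    }


if __name__ == "__main__":
    # Invented formant values in Hz, for illustration only.
    pin_tokens = [(430, 2050), (445, 1990), (420, 2100), (450, 2020)]
    pen_tokens = [(520, 1900), (540, 1860), (510, 1950), (555, 1880)]
    print(merger_measures(pin_tokens, pen_tokens))
```

Because the three numbers summarize different things (height only, distance between centers, and overlap of whole distributions), two speakers can look alike on one measure and differ on another, which is the motivation for reporting more than one.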
I attribute the fact that African American speakers exhibit the merger while white speakers do not to the longstanding history of racial segregation in DC. Since the largest influx of African Americans into DC, decades before the Great Migration, the city and its suburbs have remained largely segregated (Manning 1998), a situation that has maintained distinct merger patterns. The finding that the merger is on its way out is likely due to more recent trends of inter-ethnic interaction, facilitated by discourses of diversity that are sweeping several of the district’s neighborhoods. I suggest that the PIN-PEN merger is losing ground in DC, even in the speech of African Americans, because it has not been enregistered (Agha 2003) as a distinctive feature of African American English, given its greater recognition as a feature of Southern varieties of English.
Bio: Malcah Yaeger-Dror is a research scientist in the Cognitive Sciences program at the University of Arizona. She is also affiliated with the Linguistic Data Consortium (
Sociolinguists, corpus linguists and field linguists have a ‘common denominator’ set of issues which they address as they develop the corpora that they then analyze. As linguists have determined that larger corpora are needed (Kendall 2011), it has become obvious that any one individual researcher may not be able to develop a corpus that is sufficient; for example, now that we realize that it may be important to return many years later to monitor for linguistic ‘change in real time’, it seems much more important to archive data for later ‘mining’. It turns out there are several different tasks involved in acquiring an archivable corpus: the two earliest tasks should be provided for in the protocol for the collection of the corpus. One, of course, is determining how to develop a high-quality corpus and save the data, whether the corpus is of articulation, sound files, or text. The other requires careful choice of the speaker demographics and of the social situation(s) in which all the interactions should take place. The coding for these features is often referred to as the ‘metadata’ for the archive. A series of workshops has recently taken place at the NWAV and LSA conferences (http://projects.ldc.upenn.edu/NSF_Coding_Workshop_LSA/index.html), with discussions focused principally on the preferred granularity of the metadata for a subset of categories that are necessary for linguistic fieldwork generally, and for accurate analysis of language or speech variation under field research conditions, in areas such as the analysis of First Nations speech communities, laboratory phonology, sociolinguistics, psycholinguistics, language change, corpus linguistics, or even 'Human Language Technology'. The workshop participants demonstrated that the metadata variables discussed provide necessary information whether research is being carried out in North or South America, Europe, Asia, the Middle East, or Africa.
The participants agreed that protocols should be developed, and unified, to permit later data sharing. Protocols do not always clarify all of the demographic and interactive/situational features which will be needed to permit later comparisons among speakers. As a result, the metadata often do not specify important information, and even when one can go back to get the information, there may be too few speakers in a given cell to permit subsequent analysis. After a short review of how we got to the point of realizing the need for larger corpora, this paper will focus on an issue which research protocols do not always address: the elicitation and encoding of demographic, situational and attitudinal ‘metadata’ for accurate linguistic research, with an eye toward standardization of the coding to facilitate subsequent data sharing through future research archives. We present evidence that the methodological issues behind coding choices apply generally to speech collection, and should become a prominent aspect of any field corpus gathering protocol.
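To make the point about sparse cells concrete, here is a minimal Python sketch of a per-recording metadata record and a check for under-populated cells. The field names, category labels, and the minimum of two speakers are hypothetical choices for the example; the abstract does not specify a schema, and a real protocol would settle these categories and their granularity in advance.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerMetadata:
    """One recording's metadata (hypothetical fields, for illustration)."""
    speaker_id: str
    age_group: str   # demographic
    sex: str         # demographic
    region: str      # demographic
    situation: str   # situational, e.g. "interview" or "conversation"


def cell_counts(records, fields):
    """Count distinct speakers in each combination of the given fields."""
    seen = set()
    counts = Counter()
    for record in records:
        cell = tuple(getattr(record, name) for name in fields)
        # Count each speaker once per cell, even with several recordings.
        if (record.speaker_id, cell) not in seen:
            seen.add((record.speaker_id, cell))
            counts[cell] += 1
    return counts


def sparse_cells(counts, minimum):
    """Return observed cells with fewer than `minimum` speakers.

    Cells with no speakers at all never appear in `counts`, so they
    would need to be checked against a planned design separately.
    """
    return sorted(cell for cell, n in counts.items() if n < minimum)


if __name__ == "__main__":
    records = [
        SpeakerMetadata("s1", "18-30", "F", "DC", "interview"),
        SpeakerMetadata("s1", "18-30", "F", "DC", "conversation"),
        SpeakerMetadata("s2", "18-30", "F", "DC", "interview"),
        SpeakerMetadata("s3", "60+", "M", "DC", "interview"),
    ]
    counts = cell_counts(records, ("age_group", "sex"))
    print(sparse_cells(counts, minimum=2))
```

Running a check like this while collection is still under way, rather than after archiving, is one way to notice an under-filled cell while it can still be corrected.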