*To join the LEAP mailing list: https://lists.uchicago.edu/web/info/langpsych
*All times are in Central Time
Spring quarter: Fridays, 11:00am-12:20pm, details will be sent via email
April 1 – Stephanie Reyes (UChicago, Linguistics)
Title: Characterizing the Scope Interpretations of English-Tagalog Bilingual Speakers in Toronto and Manila
In the heritage language literature, a common approach to assessing the language proficiency of heritage speakers is to compare them to the baseline language group. In the North American context, a baseline language is understood as a language variety spoken by first-generation adult immigrants, while their children are considered heritage speakers. The baseline language typically serves as the input to heritage speakers and understanding the language of the baseline is crucial for inquiry into heritage language learning. To further elucidate the characterization of baseline speakers, this qualifying paper study compares baseline speakers with an understudied group, homeland speakers. Homeland speakers are speakers of the language spoken in the home country of baseline speakers. Most studies on homeland languages have concentrated on monolingual speakers in the homeland. Tagalog homeland speakers in Manila, and their baseline counterparts in Toronto, provide an interesting case study since both groups are bilingual in Tagalog and English. This project studies these groups by investigating their scope ambiguity interpretations, as ambiguity resolution is a common indicator of linguistic ability. Despite increasing investigations of scope ambiguities in the literature, there is limited comparative data on the scope judgments of bilingual baseline speakers and bilingual homeland speakers. In this presentation, I present preliminary results from sentence-picture rating experiments on the availability of scope interpretations among baseline speakers in Toronto and homeland speakers in Manila.
April 15 – Pamela Sugrue (UChicago, Linguistics), rescheduled to May 6
April 29 – Wesley Orth (Northwestern University, Linguistics)
Title: Scope Economy and online sentence processing: How comparative conditions constrain parsing
The grammatical properties of a language constrain the design of the parser used in online sentence processing. Quantification provides a unique test case for how the properties of the grammar influence sentence processing. As Quantifier Raising (QR) in English is a covert movement, the parser has little information about whether raising is possible or necessary. In addition to this challenge, quantification is subject to a grammatical condition called Scope Economy. Scope Economy states that the post-QR structure must yield a new interpretation relative to the pre-QR structure; otherwise, QR yields an ungrammatical structure. This comparative economy rule, combined with the scant information available to the parser, creates a puzzle: if determining whether QR is grammatical requires performing QR, how can we design a parser that minimizes the frequency of ungrammatical QRs and ensures we can produce all grammatical QRs? In this talk I discuss what parsers that can handle QR and Scope Economy need to look like and present experimental data from polarity illusions and ellipsis to help narrow the range of potential parsers.
May 6 – Pamela Sugrue (UChicago, Linguistics)
Title: Probing Word Vector Semantics
Distributional semantic representations of words, also called word vectors or word embeddings, are a popular way of representing the meanings of words in computational systems. Word embeddings are numerical (vector) representations of words, calculated (in various ways, depending on the specific model architecture) using the co-occurrence of all words over large corpora, such that words become located in semantic space near to other, semantically similar, words. The core intuition behind representing meaning this way is the Distributional Hypothesis (Harris 1954), which holds that word meaning is a function of how words are used. While these models are popular in computational systems and perform impressively at many natural language processing tasks, open questions remain about what specific aspects of word meaning word vectors come to encode (Turton et al. 2020, Utsumi 2020, Lebani and Lenci 2021, among others). Rubinstein et al. (2015) propose that taxonomic properties are better represented in distributional semantic models than other properties. The research I will discuss probes the semantic information encoded in vector representations of English adjectives, specifically investigating whether adjective embeddings encode gradability, lexical dimensionality, and subjectivity, and offers evidence in support of Rubinstein et al.’s hypothesis.
May 13 – Tom Schoenemann (Indiana University, Anthropology & Cognitive Science Program)
Title: Brain and language from an evolutionary perspective
Language is one of the most important behavioral adaptations of the human lineage. Understanding the evolution of language is therefore central to understanding the human condition. One part of this involves taking seriously some basic broadly applicable principles of evolutionary change – the most important being that evolutionary process favors the elaboration and modification of pre-existing abilities. This in turn makes essential the exploration of possible language-relevant cognitive and behavioral patterns in our closest relatives. It also means that we should expect overlap and integration of different cognitive systems to be a central feature of language evolution, rather than a set of highly domain-specific modules. Previous studies demonstrating categorical perception in mammals (and monkeys in particular) will be used to illustrate these points. Critiques of ape language studies will also be discussed, and a new statistical analysis of responses of one particularly intriguing subject, the bonobo Kanzi, will be presented arguing that he does in fact appear to understand how argument structure is coded in English grammar using arbitrary word-order rules. Clues about the evolution of language from the fossil endocranial record will be reviewed. Brain size itself is associated with several important behavioral dimensions central to language: The complexity of the social environment itself, as well as the expected richness of conceptual understanding. Thus, as brain size increased during our evolutionary history, hominins with increasingly interesting and rich conceptual understanding lived in increasingly complex and interactive social environments. 
Finally, two areas of research relevant to language evolution will be discussed: 1) explorations of the hypothesis that Broca’s area (which appears to have long predated the human lineage) may have evolved to pay special attention to sequential pattern information in the environment, and 2) the possible overlap of brain circuitry underlying language and those involved in stone tool manufacturing.
Winter quarter: Fridays, 11:00am-12:20pm, details will be sent via email
*Schedule may be subject to change
January 14 – Hyunji Hayley Park (UIUC, Linguistics)
Title: Pitfalls and possibilities: What NLP systems are missing out on
Despite recent advancements in language modeling and NLP in general, there are still many areas where NLP systems face difficulty. With NLP research disproportionally dedicated to English and a few other high-resource languages, the effect of morphology on NLP systems is clearly an under-studied area. Most high-resource languages such as English and Chinese utilize little morphology, encoding more information syntactically (e.g. word order) than morphologically (e.g. case inflection). Morphologically rich languages like Turkish and St. Lawrence Island Yupik use much more variation in word forms to encode meaning and have flexible or free word order. Regarding this issue, I present two studies that augment the existing data to investigate how morphology interacts with NLP systems. First, I compile a parallel Bible corpus and a linguistic typology database to study the effect of morphology on LSTM language modeling difficulty. The results show that morphological complexity, characterized by higher word type counts, makes a language harder to model. Subword segmentation methods such as BPE and Morfessor mitigate the effect of morphology for some languages, but not for others. Even when they do, they still lag behind morpheme segmentation methods based on FSTs. Next, I develop the first dependency treebank for St. Lawrence Island Yupik and demonstrate how morphology interacts with syntax in this morphologically rich language. I argue that the Universal Dependencies (UD) guidelines, which focus on word-level annotations only, should be extended to morpheme-level annotations for morphologically rich languages. As for another area that requires further research, I present a recent study on long document classification. Several methods have been proposed for the task of long document classification using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches.
In this paper, I provide a comprehensive evaluation of existing models’ relative efficacy against various datasets and baselines, both in terms of accuracy and in terms of time and space overheads. Our results show that existing models often fail to outperform simple baseline models and yield inconsistent performance across the datasets. The findings also emphasize that future studies should consider comprehensive baselines and datasets that better represent the task of long document classification to develop robust models.
Hyunji Hayley Park, Katherine J. Zhang, Coleman Haley, Kenneth Steimel, Han Liu, Lane Schwartz. 2021. Morphology Matters: A Multilingual Language Modeling Analysis. Transactions of the Association for Computational Linguistics, 9: 261–276.
Hyunji Hayley Park, Lane Schwartz, and Francis M. Tyers. 2021. Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), pages 131–142, Online. Association for Computational Linguistics.
Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah. Under Review. Efficient Classification of Long Documents Using Transformers.
February 11 – Tal Linzen (NYU, Linguistics and Data Science)
Title: Inductive biases for the acquisition of syntactic transformations in neural networks: an interim update
One of the fundamental questions in the cognitive science of language concerns the conditions under which computational learners generalize from their input in a similar way to humans: by examining the effects of different learning assumptions on the generalizations acquired by the learner, we can construct hypotheses about the constraints and biases that underlie human learning. In this meeting, I will describe an ongoing long-term project with Bob Frank (Yale) and other colleagues, whose goal is to investigate this question using one type of behavior — syntactic mappings between related forms (e.g., a declarative and a question) — and a broad class of learners based on artificial neural networks. I have two goals for this session: first, to give an overview of the project and some of our results; and second, to obtain feedback about the ways this project (and others like it) can engage more productively with syntacticians, whether it is by addressing specific questions they are concerned with, by constructing tests for computational learners that incorporate challenging and subtle syntactic generalizations from a variety of languages, or in any other way.
February 25 – Daniel Lam (UChicago, Linguistics)
Title: The encoding, maintenance and retrieval of complex linguistic representations in working memory
Previous studies have shown that representationally complex referents are encoded into working memory (WM) more slowly but are retrieved faster (Hofmeister, 2011; Karimi & Ferreira, 2016). However, the cost of maintaining complex representations is still not well understood. In a series of experiments, we investigated the cost of encoding, maintaining and retrieving complex representations, such as coordinated noun phrases (those doctors and nurses) and modified noun phrases (those compassionate ER nurses) in WM. While we replicated the facilitatory effect during retrieval, the slowdown during encoding was not consistent across our experiments. More critically, for the first time, our experiments demonstrated that maintaining complex representations in WM is less costly than maintaining their simple counterparts. Furthermore, we found that WM maintenance cost is reduced because complex target noun phrases are more distinct from other competing referents in WM than simple ones. We also carried out two self-paced reading experiments to investigate how similarity between sub-representations in a coordinated noun phrase influences the cost of WM processes. We found that noun phrases combining similar referents (the dog, the cat and the hamster) showed facilitated encoding and maintenance, relative to those combining dissimilar referents (the truck, the TV and the hamster). Overall, our results show that the semantic elaborateness of complex representations and the similarity between their sub-representations can reduce WM costs, especially maintenance cost, and provide new perspectives on this understudied WM process.
March 11 – Casey Ferrara (UChicago, Psychology)
Title: Iconic Modification in Sign and Gesture
Iconic and gradient “language play” is common in daily communication (e.g. “It’s been a loooooooong day”). However, gradience and iconicity are often excluded from traditional accounts of language. Research on spoken language has found that iconic words such as ideophones are particularly prone to this phenomenon. In sign languages, which have an abundance of visual iconicity, we might expect this phenomenon (“iconic modification”) to occur more frequently. The research I’ll be discussing has aimed at establishing (1) to what extent iconic modification of lexical signs occurs in ASL, (2) what factors determine this distribution, and (3) how signers’ iconic modification compares with silent gesture.
Fall quarter: Fridays, 11:00am-12:20pm, details will be sent via email
October 8 – Sanghee Kim (UChicago, Linguistics)
October 22 – Lucas Fagen (UChicago, Linguistics)
Paper discussion – Cheung, Hartley & Monaghan (2021)
This paper explores how caregivers adapt their gestural cues to referential uncertainty using computational modeling. We will discuss this paper from the perspective of (but not limited to) language acquisition, processing, evolution, and computational modeling.
November 5 – Marisa Casillas (UChicago, Comparative Human Development)
Title: Chatterlab spotlight: Yélî Dnye phonological development
This is an investigation into the phonological development of children acquiring Yélî Dnye, an isolate language of Papua New Guinea. With 56 consonants and 34 vowels, Yélî Dnye has one of the largest recorded phonological inventories in the Pacific. Consonants are densely packed into a few pockets of acoustic/articulatory space, featuring some contrasts that are rare among the world’s languages (e.g., dental vs. post-alveolar stops), understudied in previous work (e.g., doubly-articulated consonants), or a unique combination of the two (e.g., doubly-articulated dental-vs.-post-alveolar stops). I have been using a variety of experiment-based methods to tap into children’s knowledge of these phones between ages 6 months and 12 years and will describe some preliminary results supporting a pattern of marked development for typologically rare phones and phonological contrasts.
November 19 – Rob Voigt (Northwestern University, Linguistics)
Title: Language and Immigration: Historical Views from Personal and Political Speech
In this talk I will discuss two projects on the relationship between language and historical trajectories of immigration in the United States. In the first, we examine the impact of arrival as a refugee on linguistic attainment. Prior work on language attainment has relied on self-reported measures of fluency; by contrast, we apply computational methods to a large archive of recorded oral history interviews. Our findings show that refugee migrants achieved higher levels of English proficiency than did economic migrants, even in a time without the official support programs that modern refugees receive. In the second, we use methods based on contextual embeddings to analyze U.S. Congressional speeches and Presidential communications related to immigration from 1880 to the present. We find that political speech about immigration is now much more positive on average than in the past, but that political parties have become increasingly polarized in their expressed tone toward immigration, with Republicans significantly more likely to use language suggestive of dehumanizing metaphors such as Vermin and Disease. Together, these projects provide insight into changing experiences of immigration and attitudes towards immigrants in the US, and the linguistic mechanisms by which these changes play out.
Spring quarter: Fridays, 11:30am-1pm, Zoom details sent via email
- April 30 – Pamela Sugrue (UChicago)
- May 14 – Natalie Dowling (UChicago) – **3:30pm-5pm**
- May 21 – Richard Lewis (Michigan)
- June 4 – Anne Pycha (University of Wisconsin-Milwaukee) & Michelle Cohn (UC Davis)
Winter quarter: Fridays, 11:30am-1pm, virtual meetings (Zoom details sent via email)
Week 2, January 22 – Daniel Lam (UChicago)
Title: The encoding, maintenance and retrieval of complex linguistic structures in working memory
Previous studies (Hofmeister, 2011; Vasishth and Hofmeister, 2014) have shown that the encoding of more complex modified NPs (e.g. the alleged Venezuelan communist) is slower than that of simpler unmodified NPs (e.g. the communist), while the reverse pattern was found for retrieval. This was attributed to more effort for elaboration during encoding, which in turn leads to increased salience and decreased interference during retrieval. Hofmeister (2011, Experiment 3) also showed that the facilitation during retrieval occurs when the features in the complex structures are related (e.g. the cruel dictator) and not when they are unrelated (e.g. the lovable dictator). In three experiments, we further explore the interaction between representational complexity and WM processes by exploiting the complexity of coordinated structures (e.g. those nurses and doctors) relative to uncoordinated structures (e.g. those doctors). Experiments 1 and 2 found, contrary to results from Hofmeister (2011), that complexity instead facilitates encoding and has no effect during retrieval. We also discovered some facilitation during the maintenance stage that depends on the number of competing referents concurrently held in WM. In Experiment 3, we also explored how feature similarity between the conjuncts in complex coordinated NPs can help the encoding stage.
Week 4, February 5 – Ljiljana Progovac (Wayne State)
Title: What use is half a sentence? Grammar caught in the act of natural/sexual selection
This lecture focuses on the evolution of grammar, advocating a gradual/incremental emergence of hierarchical structure and transitivity, as subject not only to cultural innovation, but also to natural/sexual selection forces. I present a precise syntactic reconstruction of the initial, proto-grammar stage, characterized as an intransitive, flat, absolutive-like, two-slot mold, unable to distinguish subjects from objects. Even this crude grammar offers clear and substantial communicative benefits over no grammar at all, and reveals, through its limits, reasons and rationale for evolving more complex grammars. The particular uses to which this proto-grammar can be put even today (e.g. insult: cry-baby, kill-joy; naming: rattle-snake; proverbial wisdoms: Monkey see, monkey do; First come, first serve(d)) reveal why this cultural invention (of coining binary compositions) would have been highly adaptive at the dawn of language. With the goal of shedding concrete light on how biological evolution would have begun to shape the genetic make-up that supports human language, a specific sexual selection scenario will be considered. By identifying insult (verbal aggression) as relevant for early language evolution, this proposal directly engages the recent hypothesis of human self-domestication, whose main postulate is a gradual reduction in reactive physical aggression, yielding several testable hypotheses. This approach also identifies testable hypotheses in the arena of neuroimaging, from which some fMRI experimental results will be reported.
Week 6, February 19 – Thomas Sostarics (Northwestern)
Title: Pragmatic Contributions of the L*L-L% Contour in American English
Pierrehumbert and Hirschberg’s seminal 1990 paper proposes a compositional approach for understanding the meaning of nuclear pitch contours in American English. While their approach to a speaker’s choice of “tune” is seemingly rooted in pragmatic notions of common ground and epistemic stance, little empirical work has been done to validate or enrich this claim (Büring 2016; Prieto and Borràs-Comes 2018). While some contours have received considerable attention in the literature, others have received little-to-none. I will be presenting recent results from my qualifying paper investigating the meaning of the rarely-mentioned L*L-L% contour as it contrasts with H*L-L%, the so-called default intonation contour for assertions and declaratives. Despite being “default,” there seem to be particular discourse contexts in which H*L-L% is a less felicitous choice of tune compared to L*L-L%, which we investigate through a series of perception experiments.
Week 9, March 12 – Claire Bergey (UChicago)
Title: Learning communicative acts in children’s conversations: A Hidden Topic Markov Model analysis of the CHILDES corpus
Over their first years of life, children learn not just the words of their native languages, but how to use them to communicate. Because manual annotation of communicative intent does not scale to large corpora, our understanding of communicative act development is limited to case studies of a few children at a few time points. We present an approach to automatic identification of communicative acts using a Hidden Topic Markov Model, applying it to the CHILDES database. We first describe qualitative changes in parent-child communication over development, and then use our method to demonstrate two large-scale features of communicative development: (1) children develop a parent-like repertoire of our model’s communicative acts rapidly, their learning rate peaking around 12 months of age, and (2) this period of steep repertoire change coincides with the highest predictability between parents’ acts and children’s, suggesting that structured interactions play a role in learning to communicate. We then present some preliminary analyses about the relationship between form and function in language acquisition, looking at the correspondence between these communicative acts and sentence frames.
Autumn quarter: Fridays, 11:30am-1pm, virtual meetings (Zoom details sent via email)
Week 2, October 9 – Daniel Lam (UChicago) & Eszter Ronai (UChicago)
Welcome & Bayesian statistics
Week 4, October 23 – Hayoung Song (UChicago)
Title: Predicting attentional engagement during narratives and its consequences for event memory
The degree to which we are engaged in narratives fluctuates over time. What drives these changes in engagement, and how do they affect what we remember? Behavioral studies showed that people experienced similar fluctuations in engagement during a television show or an audio-narrated story and were more engaged during emotional moments. Functional MRI experiments revealed that changes in a pattern of functional brain connectivity predicted changes in how engaged people were in the narratives. This predictive brain network not only was related to a validated neuromarker of sustained attention, but also predicted what narrative events people recalled after the MRI scan. Together these findings reveal a robust, generalizable neural signature of engagement dynamics and elucidate relationships between narrative engagement, sustained attention, and event memory.
Week 6, November 6 – Emre Hakguder, Aurora Martinez del Rio, Casey Ferrara & Sanghee Kim (UChicago)
Title: Identifying the Correlations Between the Lexical Semantics and Phonology of ASL: A Vector Space Approach
In this study, we create a semantic vector space model (VSM) for lexical word meaning in ASL and use it to investigate whether there is a relationship between the semantic and phonological properties of signs. We hypothesize that clusters of ASL words that are related in meaning are likely to have phonological similarities, basing our hypothesis on the many observations that transparent iconicity can be found at different levels of the grammar in sign languages, including at the lexical level. We show that the more neatly the ASL lexicon is semantically organized, the greater the phonological similarity within its clusters.
Week 10, December 4 – Andras Molnar (Carnegie Mellon)
Title: The demand for, and avoidance of, information
We apply a previously developed “information gap” framework (Golman and Loewenstein, 2018) to make sense of and predict information seeking and avoidance. The resulting theory posits that, beyond the conventional desire for information as an input to decision making, two additional motives contribute to the demand for information: curiosity – the desire to fill information gaps, i.e., to answer specific questions that capture attention; and motivated attention – the desire to savor good news and ignore bad news, i.e., to obtain information we like thinking about and avoid information we do not like thinking about. Five experiments (N = 2,361) test three of the primary hypotheses derived from the theory about the demand for information. People are more inclined to acquire information: (1) when it seems more important, even when the information is not instrumental for decision making (Experiments 1A & 1B); (2) when it is more salient, manipulated by how recently the information gap was opened (Experiments 2A & 2B); and (3) when it has higher valence – i.e., when individuals anticipate receiving more favorable news (Experiment 3). This set of findings cannot be explained by alternative models of information acquisition or avoidance.
Spring quarter: Fridays, 11.00am-12.30pm, virtual meetings
- April 24 – Josef Klafka (Carnegie Mellon)
- May 1 – Jinghua Ou (UChicago)
- May 22 3:30pm-5pm – Daniel Lam (UChicago)
- May 29 3:30pm-5pm – Laura Stigliano (UChicago)
Winter quarter: Fridays, 11.00am-12.30pm, Cobb Lecture Hall Room 107 (unless otherwise noted)
- 24 January: Sanghee Kim (UChicago)
- 7 February: Yi Ting Huang (Maryland)
- 21 February: Ryan Lepic (UChicago)
- 28 February: Eszter Ronai (UChicago) *meeting 3:30pm-4:30pm in Rosenwald 208*
- 13 March: Ljiljana Progovac (Wayne State) — postponed due to COVID-19
Autumn quarter: Fridays, 11.30am-1pm, Cobb Lecture Hall Room 302
- 11 October: Adrian Staub (UMass Amherst)
- October 25: Laura Stigliano & Eszter Ronai (UChicago)
joint meeting with Morphology and Syntax
- 1 November: Brian Dillon (UMass Amherst)
- Thursday, 7 November, 12.30-2pm in SS 302: Jenny Lu (UChicago)
- 15 November: Anisia Popescu (University of Potsdam)
Spring quarter: Thursdays, 11.00-12.30 in Cobb 116
- 18 April, Aurora Martinez del Rio
- 25 April, Ryan Lepic
- May 24, Jon Sprouse (UConn)
joint meeting with Morphology and Syntax
- May 31, Masaya Yoshida (Northwestern)
joint meeting with Morphology and Syntax
- 6 June, Casey Ferrara
Winter quarter: even week Fridays 1:30-2:50 in Cobb 203
- January 18, Andrea E. Martin (Max Planck Institute for Psycholinguistics)
- February 1, Ryan Lepic
- February 15, Eszter Ronai
- March 1, Daniel Lam
- March 15, Josef Klafka
Autumn quarter
- October 12, Interest meeting
- October 19, Ryan Lepic
- October 26, Daniel Lam
- November 13, Eszter Ronai and Jimmy Waller
- December 7, Madeline Meyers and Claire Bergey