2023–24
Spring quarter: Fridays, 11:00am-12:20pm
5 April – Benjamin Morris (PhD Candidate, Psychology, UChicago)
Title: Thinking “Um…” out loud: Children’s inferences from speech disfluencies
Abstract: Conversation is profoundly structured by an expectation of timeliness. As a result, adults derive a range of inferences when a speaker pauses or produces disfluencies (e.g., “um”) in speech—inferring a speaker’s knowledge, willingness, and more. In this presentation, I will discuss 3 studies (total n = 305) exploring when and how children ages 4-9 use disfluencies to infer a speaker’s knowledge and preferences. Our mental states leak out in all manner of cues during conversation, and across these studies, I hope to paint a picture of how children begin to make use of that information to become smooth conversationalists, and also to learn about their social worlds.
12 April – Zejian Lyu (MACSS student, UChicago)
Title: Connecting the Language Computed and Text Collected, a glance at the usage of computational content analysis in social language study
Abstract: Computational content analysis is a highly promising yet underexplored method for the social study of language. Although quite a few projects and papers leverage this method and make remarkable explorations, this newly emerging and somewhat alienated method is far from mature. Researchers in general lack an understanding of what it can do, what it can’t, and what should be done in the foreseeable future. To address this topic, my presentation will provide a glance at the possible directions for leveraging computational content analysis, drawing on examples from my current projects and recent remarkable publications. I intend to dive into how these methods are configured (in a comprehensible way) and how they could be adjusted for a few of the practical or theoretical needs illustrated. What’s more, I would also like to articulate my own understanding of these methods and offer hints on how we could expand our imagination for research projects with the possibilities brought by these methods.
19 April – Melinh Lai (Asst. Instructional Professor, Cognitive Science, UChicago)
Title: The fate of the unexpected: Downstream consequences of prediction violations
Abstract: Amid increasing interest in the role of prediction in language comprehension, there remains a gap in our understanding of what happens when predictions are disconfirmed. Are unexpected words harder to process and encode because of interference from the original prediction? Or, because of their relevance for learning, do expectation violations strengthen the representations of unexpected words? In two experiments, we used event-related potentials to probe the downstream consequences of prediction violations.
10 May – Eugene Yu Ji (Postdoctoral Teaching Fellow, Cognitive Science, UChicago)
Title: Two Models of Metalinguistics
Abstract: In this presentation, I will introduce two innovative cognitive-driven, corpus-based models for investigating “metasemantic presupposition” and “metapragmatic presupposition,” integrating cognition and computation with metalinguistic theory in sociolinguistics and linguistic anthropology (Silverstein 1976, 1993; Urban 2006, 2018). The first model develops a distributional metric in hyperbolic coordinates, mapping metasemantic presupposition as co-textual relatedness and metapragmatic presupposition as co-textual dependency. The second model adopts a dynamical-system approach (Niyogi 2006) to map metasemantic and metapragmatic presuppositions conditioned upon meta-level awareness of synchronization of linguistic categories over time. Empirically, the first model is applied to relatively long-term (hundreds to thousands of years) shifts of semantic categories with case studies including color, kinship, smell, and shape-based classifiers based on ancient and modern Chinese texts. The second model is tested against diachronic change data from the Google Ngram corpus of written American English, modeling shorter-term changes spanning decades to centuries in semantic categories related to color, kinship, smell, and political and religious identities. Together, this research advances cognitive-driven computational models informed by the contemporary literature of sociolinguistics and linguistic anthropology, and also proposes a novel framework for generating empirical hypotheses that contribute to bridging micro-level psycholinguistic and macro-level sociolinguistic phenomena.
24 May – Parker Robbins (PhD student, Linguistics, UChicago) — Cobb 115 and on Zoom
Title: Must speakers perceive to achieve?: Investigating the role of feedback perception in audience design
Abstract: Speakers tailor communication to a particular listener based on what they believe that listener knows, a skill called audience design (AD; Clark & Murphy 1982). For example, a UChicago student who is giving directions to a campus visitor may refrain from referring to Regenstein Library as “the Reg” because they assume that a visitor may not understand that abbreviation. In general, speakers are very good at AD (Horton 2005), but sometimes they make mistakes: For example, Navarro et al. 2020 found that speakers under high cognitive load were more likely to make egocentric AD errors in a referential communication task; speakers with higher fluid intelligence and working memory capacity were overall less likely to make errors, however. Though not as well studied as the aforementioned cognitive factors, social factors—especially the ability to make use of feedback cues from listeners—have been shown to be another important part of successful AD (i.a. Horton 2005, Krahmer & Swerts 2005). In this presentation, I will outline a study design that aims to investigate the extent to which failing to perceive feedback cues contributes to failures in audience design. This is joint work with Monica Do and Serpil Karabuklu.
Winter quarter: Fridays, 11:00am-12:20pm
19 January – ON ZOOM ONLY – Melissa Baese-Berk (Assoc. Professor, Linguistics, UChicago)
Title: Adaptation to unfamiliar speech
Abstract: It is often more difficult for listeners to understand accents they are unfamiliar with. Often this difficulty manifests in terms of decreased intelligibility, or ability to transcribe the speech the listener hears. However, with relatively limited exposure, listeners can improve their perception of this speech. In this talk, I present a series of studies that are designed to investigate the factors that impact initial perception of unfamiliar accents, how listeners can adapt to these accents, and how this adaptation generalizes to novel talkers and accents. I will also discuss some consequences of adaptation for later processing (e.g., comprehension and memory). I will conclude with recommendations for future work in this area, including ways these lab-based studies can be translated to real-world practice.
9 February – Sanghee Kim (PhD Candidate, Linguistics, UChicago)
Title: Encoding discourse structure information during language comprehension: Evidence from web-based visual world paradigm experiments
Abstract: This study explores the way discourse structure-related information is used during the encoding of linguistic representations, using the distinction between main and subordinate information as a case study. We use two contrasting constructions: (a) “The singers[+main] who admired the violinists[+main] invited their mentors to the party”; and (b) “The singers[+main], who admired the violinists[+subordinate], invited their mentors to the party.” While both contain discourse-main information, (b) includes discourse-subordinate information in the clause, (the singers) admired the violinists. Importantly, the singers and the violinists are both plausible antecedents for their, but the overlap in discourse structure information between the two NPs differs: (a) has an overlap ({[+main], [+main]}); (b) has no overlap ({[+main], [+subordinate]}). We found evidence through two web-based eye-tracking experiments using a visual world paradigm that the overlap in discourse structure leads to a competition effect between the two NPs, evidenced by smaller eye-gaze differences between the two NPs in (a) compared to (b). We also find that this competition effect manifests early, even before the relevant information needs to be retrieved, i.e., before pronoun resolution. I will also talk about my experience in using PCIbex for conducting a visual world paradigm eye-tracking experiment, including the caveats and practical challenges as well as the benefits of using it.
23 February – Aaron White (Assoc. Professor, Linguistics, University of Rochester)
Title: Identifying lexical semantic categories from gradient inference judgments (joint work with Benjamin Kane and William Gantt)
Abstract: Gaps in logically possible patterns of lexically triggered inferences have long played an important role in semantic theory because they suggest potentially deep constraints on lexicalization. The clearest cases of such gaps tend to be found among closed classes–e.g. the apparent lack of non-conservative quantificational determiners and non-convex color terms–and it is often quite challenging to clearly establish similar gaps among more open classes–even in the lexicons of extremely well-studied languages. One conclusion that might be drawn from this difficulty is that it is hard to find inferential gaps in open classes because they do not exist–at least not in the same way as for closed classes.
In this talk, I argue against this conclusion. I suggest, instead, that such gaps do exist–in spades–but that they are often difficult to discover due to the gradience inherent in inference judgments. I take as a case study lexically triggered inferences that are associated with predicates’ intensional properties—in particular, lexically triggered veridicality inferences, neg(ation)-raising inferences, doxastic inferences, and bouletic inferences. These inferences are of interest for at least two reasons. First, they have been argued to display apparent correlations with each other across lexical items—potentially suggesting some core set of lexical properties that interact to give rise to them. Second, they have been argued to correlate with morphosyntactic distribution—potentially suggesting that said lexical properties may be formally represented, rather than solely a byproduct of how conceptual representations interact with pragmatic reasoning.
I report on the collection of three datasets that attempt to capture these inferences across over 1,000 predicates of English as well as a dataset that captures the acceptability of those predicates in a variety of syntactic contexts. I then develop a clustering model that attempts to discover underlying classes of predicates in terms of their inference judgments while appropriately modeling potential sources of gradience in those judgments. I show that, if the number of classes to assume is determined by selecting the clustering that best predicts the acceptability judgments with the minimum number of clusters, a small, compact set of intuitive classes is revealed. I argue that this finding suggests not only that there are clear classes of predicates–and gaps between them–but also that these classes are plausibly semantic, given that they correlate with syntactic distribution.
1 March – Yuchen Jin (PhD Student, Comp. Human Development, UChicago)
Title: When does word learning happen? The contribution of overhearable speech
Abstract: When does word learning happen? Fruitful research has been done on how children learn words from caregivers’ speech that is directly addressed to them. But we know little about overhearable speech, which constitutes a considerable proportion of children’s linguistic input: neither its linguistic features nor its impact on child language development. The current research therefore aims to investigate whether and when overhearable speech contributes to everyday word learning. In study 1, we used a “transcripts-informed parental survey” to estimate the frequency distribution of common household object labels. Based on the findings, in study 2, we conducted an eye-tracking experiment to assess whether children of 18, 24 and 30 months old understand words that mostly appear in overhearable speech. I can only share the preliminary results, since the data analysis is still ongoing. I would really love to hear your suggestions on everything, including data analysis, interpretation, and presentation, as well as further directions.
Autumn quarter: Fridays, 11:00am-12:20pm
6 October – Planning Meeting — ONLINE ONLY!
Join us on Zoom to meet your new coordinator and faculty sponsors, discuss plans for this year’s workshop, and get to know fellow LEAPers! The Zoom link will be sent out by email.
13 October at 3:30 pm – Matt Goldrick (Professor, Linguistics, Northwestern)
This talk will be jointly hosted by LEAP and the Language Variation and Change workshop (LVC) and will take place at 3:30 pm in Cobb Hall 115.
Title: Mechanisms of language control at multiple levels of linguistic structure
Abstract: Multilingual speakers have the amazing capacity to shift between quite different codes for communication. What cognitive mechanisms allow speakers to fluently produce an intended language, even when one language is much more difficult to access than another? I’ll discuss data from experiments eliciting isolated words and connected speech from bilingual speakers that suggests lexical inhibition – a temporary reduction in the accessibility of lexical items in the easier-to-access language – plays a key role in allowing bilinguals to fluently shift languages. I’ll then discuss phonetic data suggesting that different mechanisms may be at play in the processing of sound structure.
3 November – Lucas Fagen (PhD Student, Linguistics, UChicago)
Title: Pronominal and reflexive resolution in noncomplementary environments
Abstract: Binding theory aims to explain the finding that pronouns and reflexives are often in complementary distribution. Syntactic environments in which complementarity does not hold have thus attracted attention. Descriptively, complementarity in English appears strongest when pronouns, reflexives, and their antecedents are coarguments of the same predicate. Experimental literature on anaphora has found that the structural restrictions posited by binding theory influence resolution, but the majority of studies have only tested coargument contexts. I’ll present data from experiments examining pronominal and reflexive resolution across a wider range of environments than prior studies have tested: coarguments, picture noun phrases, comparatives, coordination, and prepositional phrases. Although complementarity is indeed strongest in coargument positions, results show substantial gradience across environments. I’ll discuss this finding in the context of the theoretical literature on anaphora.
17 November – Dave Kush (Asst. Professor, Linguistics, Univ. of Toronto)
Title: Grammatical prediction in active dependency resolution: Insights from cataphora
Abstract: Real-time dependency resolution is an active process. Nearly all researchers agree that active dependency resolution relies, to some extent, on prediction: comprehenders appear to commit to analyses in advance of unambiguous confirmatory evidence. Researchers disagree, however, on how far in advance prediction occurs, what portions of linguistic representation(s) are predicted, and how to characterize the mechanisms that subserve predictive processes. In this talk, I’ll present results from a series of collaborative studies on the processing of cataphora in Norwegian, Dutch, and English to probe the limits of prediction. I’ll argue (i) that comprehenders can make predictions earlier than is commonly assumed, (ii) that fine-grained predictions are made above the lexical level, and (iii) that predictive mechanisms are (relatively) grammatically faithful. I discuss how these results support a model of hierarchical prediction as inference to the best analysis across multiple levels of linguistic representation.
1 December – Casey Ferrara (PhD Candidate, Psychology, UChicago)
Title: Show and Tell: Children’s use of depiction in sign and silent gesture
Abstract: Depiction is a powerful tool for communicating visual information. However, to depict a specific entity, event, or scene, we often need to get creative and make use of non-standardized forms (e.g., playing with the forms of words, incorporating gestures, etc.). In this study I investigate the ways that children develop depictive strategies in the manual modality using (1) deaf children acquiring ASL, and (2) hearing children using silent gesture. I’ll be presenting newly collected data from hearing and deaf children and reviewing preliminary findings.
2022–23
Spring quarter: Fridays, 11:00am-12:20pm, details will be sent via email
March 31 – Bart de Boer (Professor, Artificial Intelligence Research Group, Vrije Universiteit Brussel)
Title: Agent-based modeling of language evolution
Abstract: Agent-based modeling is a powerful computational technique to investigate linguistic questions: it uses techniques from artificial intelligence to simulate populations of language users and it has been used successfully to investigate questions related to language emergence and language change. This talk will give a brief introduction to agent-based modeling, will review some results from past research and will present examples of ongoing research in this area. It will attempt to show how these computer simulations can be used to shed light on real-world linguistic questions.
April 14 – Monica L. Do (Assistant Professor, Linguistics, UChicago)
Title: Ordering Nouns and Adjectives in Naturally Emerging Sign Language, Homesign, and Elicited Silent Gesture
Abstract: Although some languages order adjectives before nouns (e.g., English, “orange cat”) and others order nouns before adjectives (e.g., French, “chat orange”), overall there is a typological preference across languages for one of these two orders: Noun-Adjective. We use two complementary approaches to ask whether new languages display the Noun-Adjective order, thus providing evidence for an ordering bias that humans bring to language. In Study 1, we show a bias towards the Noun-Adjective order in signers of Nicaraguan Sign Language, an emerging language. In that study, we also take a first look at the ordering biases among homesigning individuals. In Study 2, we corroborate the bias towards the Noun-Adjective order in silent gesturers (English speakers asked to gesture without speech). The parallels between our naturalistic and experimental investigations validate using experimental approaches like the silent gesture paradigm to explore language emergence, and provide evidence for a bias to order nouns before adjectives in emerging languages.
* This is collaborative work with Simon Kirby, Susan Goldin-Meadow, Laura Horton, Natasha Abner, Molly Flaherty, Marie Coppola, and Ann Senghas.
April 14 – William Croft (Professor, Linguistics, U of New Mexico)
This talk has been canceled due to unforeseen circumstances.
May 5 – Lalchand Pandia (PhD Student, Computer Science, UChicago)
Title: Do Natural Language Inference models understand monotonicity reasoning?
Abstract: Natural Language Inference (NLI) models based on the pretrain-and-finetune paradigm achieve state-of-the-art results on current benchmarks. NLI involves different kinds of lexical and logical inferences. In this work, we specifically focus on monotonicity reasoning. Monotonicity reasoning is a type of logical inference which involves replacing one term by another that is either less or more specific. In this project, we look at how well current NLI models perform on such inferences. For this purpose, we created a synthetic dataset with comparative determiners. We find that the performance of many state-of-the-art models is substantially worse on these inferences. We also delve deeper to understand some of the potential reasons for failure.
* This is collaborative work with Lucas Fagen, Sanghee Kim, and Allyson Ettinger.
May 12 – Chih-chan Tien (PhD Student, Computer Science, UChicago)
This talk has been canceled due to unforeseen circumstances.
May 12 – Mourad Heddaya (PhD Student, Computer Science, UChicago)
Work in progress discussion.
We aim to investigate the role of meaning in next token prediction. The central task is to define access to meaning. We will then measure the contribution of meaning compared to other predictive cues.
May 23 – Lucas Fagen & Sanghee Kim (PhD Student, Linguistics, UChicago)
General discussion on language evolution, acquisition, and processing.
We will discuss overall topics in language evolution, acquisition, and processing, as well as future directions for the workshop meetings.
Winter quarter: Fridays, 11:00am-12:20pm, details will be sent via email
January 13 – Mourad Heddaya (PhD Student, Computer Science, UChicago)
Title: Language of Bargaining
Abstract: Bilateral bargaining dates back millennia and directs substantial amounts of modern economic trade. Generally speaking, the activity involves two interested parties communicating with one another over issues of concern. And yet, there is scarce research on how natural language actively structures these negotiations. For example, are there linguistic elements that are reliably associated with success? In this research, we investigate the natural language of bilateral bargaining in a controlled experimental setting. In this environment, subjects bargain over the price of a house for sale. Parties share some common information (e.g., prices of comparable houses) but also some private information (e.g., the maximum the buyer is willing to pay). The treatment in the experiment is the manner in which subjects communicate: either through alternating, written numeric offers or unstructured, verbal communication. Despite the two contrasting forms of communication, we find that the average agreed prices of the two treatments are virtually identical. But the likelihood of reaching agreement rises significantly under the open communication treatment. When subjects can talk, fewer offers are exchanged, negotiations finish faster, and the variance of possible prices that subjects agree on drops substantially. Preliminary evidence suggests that buyers are more successful (i.e., negotiate a lower price) when they are less polite to sellers, are more emotionally distant in their language, talk more about their personal situation and budget, and exhibit more confidence. On the other hand, sellers are more successful (i.e., negotiate a higher price) when they are more polite to buyers, talk more about the house characteristics, use more social words like “we,” and hedge less in their language. We further investigate how language affects the trajectory of offers and whether we can develop an algorithm that can predict the likelihood of either party succeeding.
January 27 – Marie-Catherine de Marneffe (Associate Professor, Linguistics, FNRS – UCLouvain – Ohio State University)
Title: Investigating Reasons for Disagreement in Natural Language Inference
Abstract: Current practices of operationalizing annotations in crowdsourced datasets for natural language understanding (NLU) too often assume one single label per item. In this talk, I argue that NLU should investigate disagreement in annotations – human label variation (Plank 2022) – to fully capture human interpretations of language. I investigate how human label variation in natural language inference (NLI) arises, focusing on linguistic phenomena present in the sentences that lead to different interpretations. I also explore two modeling approaches for detecting items with potential disagreement (a 4-way classification with a Complicated label in addition to the three standard NLI labels, and a multilabel classification approach), and evaluate whether these approaches recall the possible interpretations in the data.
February 24 – Aurora Martinez del Rio (PhD Candidate, Linguistics, UChicago)
Title: Repetition reduction across the ASL lexicon
Abstract: In some theories of language production (Fowler & Housum 1978, Aylett & Turk 2004), phonetic reduction has been associated with balancing articulatory effort and comprehension. However, the distinct articulatory constraints of different linguistic systems may allow for different possibilities for reducing articulatory effort, in turn having a distinct impact on an interlocutor’s perception of reduced forms. In my presentation, I will be presenting results from a project examining how fingerspelled words and core lexical signs in American Sign Language (ASL) reduce as they are repeated. I compare patterns in reduction across these two systems within the ASL lexicon, looking both at the realization of reduction trends in production, as well as the corresponding impact of this reduction on perception. These two different parts of the ASL lexicon have different structural and articulatory properties, leading to predictions that they will exhibit distinct patterns in reduction. In contrast to initial predictions, the findings from the language production analysis show strikingly similar patterns in duration reduction between fingerspelling and lexical signs. However, contrasting with the results from production, findings from a perception experiment suggest an unequal impact of reduction on how signers of ASL perceive reduction in each of these systems.
March 3 – Kanishka Misra (PhD Candidate, Purdue University)
Title: COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models
Abstract: A characteristic feature of human semantic cognition is the ability to not only store and retrieve the properties of concepts observed through experience, but to also facilitate the inheritance of properties (e.g., can breathe) from superordinate concepts (animal) to their subordinates (dog)—i.e. demonstrate property inheritance. In this work, I will present COMPS, a collection of minimal pair sentences that jointly tests pre-trained language models (PLMs) on their ability to attribute properties to concepts and their ability to demonstrate property inheritance behavior. Analyses of 22 different PLMs on COMPS reveal that they can easily distinguish between concepts on the basis of a property when they are trivially different but find it relatively difficult when concepts are related on the basis of explicit knowledge representations. Further analyses find that PLMs can show behaviors suggesting successful property inheritance in simple contexts, but fail in the presence of distracting information, which decreases the performance of many models sometimes even below chance. This lack of robustness in demonstrating simple reasoning raises important questions about PLMs’ capacity to make correct inferences even when they appear to possess the prerequisite knowledge.
Fall quarter: Fridays, 11:00am-12:20pm, details will be sent via email
September 30 – Sanghee Kim (PhD Candidate, Linguistics, UChicago)
Title: “No, they did not”: Dialogue response dynamics in pre-trained language models
Abstract:
A critical component of competence in language is being able to identify relevant components of an utterance and reply appropriately. In this paper we examine the extent of such dialogue response sensitivity in pre-trained language models, conducting a series of experiments with a particular focus on sensitivity to dynamics involving phenomena of at-issueness and ellipsis. We find that models show clear sensitivity to a distinctive role of embedded clauses, and a general preference for responses that target main clause content of prior utterances. However, the results indicate mixed and generally weak trends with respect to capturing the full range of dynamics involved in targeting at-issue versus not-at-issue content. Additionally, models show fundamental limitations in grasp of the dynamics governing ellipsis, and response selections show clear interference from superficial factors that outweigh the influence of principled discourse constraints.
* This is joint work with Lang Yu (Meta) and Allyson Ettinger (UChicago).
October 21 – Lucas Fagen (PhD Student, Linguistics, UChicago)
Title: Testing variation across exclusive modifiers
Abstract:
Exclusive modifiers, which convey that some proposition is true and alternatives are false, form a lexical class subject to considerable semantic variation. In joint work with Eszter Ronai, I present the first experimental assessment of variation among English exclusives, focusing on only, just, and merely. Our testing ground is the scalar diversity phenomenon: the observation that scalar expressions vary in how likely they are to lead to exclusionary inferences. Across 51 different lexical scales, we find the highest rates of exclusionary inference with merely and lowest with just. Interpreted in the context of the semantic literature on exclusives, our results suggest that exclusives vary systematically by at least two parameters, scale structure and strength of exclusion: compared to only, merely is restricted to rank-order scales, while just excludes via a weaker semantic operation.
November 4 – Jennifer Arnold (Professor, Psychology & Neuroscience, UNC-Chapel Hill)
Title: You learn what you hear: Discourse exposure guides pronoun comprehension
Abstract:
It is well known that listeners adapt to the most frequent patterns in language at the sound, word, and syntactic levels (Chang et al., 2012). Do people also track the frequency of referential structures and use this frequency to guide interpretation? Referential structures are more complex, and require identifying the relation between a word (e.g. a pronoun), the entity it refers to, and how that entity has been presented in the recent context. We examine this question in the domain of pronoun comprehension.
Our first question is whether people adapt to the most frequent pronoun-antecedent structure in the current context. If so, it would require categorization at some level for the purpose of counting frequency. Thus, our second question is whether people represent these relations in terms of narrow or broad categorizations of pronouns and antecedents. We demonstrate that indeed comprehenders do adapt to pronoun-antecedent structures. These representations are specific to anaphoric 3rd person pronouns (and not names or I/you pronouns), but they generalize from he to she pronouns and vice versa. People also generalize across different antecedent types. This suggests that the frequency of pronoun-antecedent structures underlies contextual biases in pronoun comprehension, and that these frequencies can be updated based on recent exposure. It also shows that the representations are stored at a grain that is limited to anaphoric reference, but broad enough to include all anaphoric pronouns and multiple antecedent types.
November 18 – Lisa Pearl (Professor, Language Science, UC Irvine)
** — Meeting at 12 pm (CT) — **
Title: Adventures in computational modeling for syntactic acquisition: A look at syntactic islands
Abstract:
Computational cognitive modeling is a tool we can use to concretely explore theories, because it allows us to generate predictions that we can compare against empirical data. Here, I use this tool to investigate the acquisition of sophisticated knowledge of syntax involving wh-dependencies, sometimes called syntactic islands. For instance, in English, knowledge of syntactic islands causes adults to strongly dislike this wh-question: Who did Lily think the kitty for ___ was pretty?
I discuss the acquisition modeling approach I use, and investigate acquisition theories that learn this knowledge as a by-product of learning about wh-dependencies more generally. I find that these theories indeed can explain how children could learn about syntactic islands implicitly by learning about the building blocks of wh-dependencies more generally, how children across dialects could learn, and how children could come with less built-in knowledge about the building blocks and still succeed.
2021–22
Spring quarter: Fridays, 11:00am-12:20pm, details will be sent via email
April 1 – Stephanie Reyes (UChicago, Linguistics)
Title: Characterizing the Scope Interpretations of English-Tagalog Bilingual Speakers in Toronto and Manila
In the heritage language literature, a common approach to assessing the language proficiency of heritage speakers is to compare them to the baseline language group. In the North American context, a baseline language is understood as a language variety spoken by first-generation adult immigrants, while their children are considered heritage speakers. The baseline language typically serves as the input to heritage speakers and understanding the language of the baseline is crucial for inquiry into heritage language learning. To further elucidate the characterization of baseline speakers, this qualifying paper study compares baseline speakers with an understudied group, homeland speakers. Homeland speakers are speakers of the language spoken in the home country of baseline speakers. Most studies on homeland languages have concentrated on monolingual speakers in the homeland. Tagalog homeland speakers in Manila, and their baseline counterparts in Toronto, provide an interesting case study since both groups are bilingual in Tagalog and English. This project studies these groups by investigating their scope ambiguity interpretations, as ambiguity resolution is a common indicator of linguistic ability. Despite increasing investigations of scope ambiguities in the literature, there is limited comparative data on the scope judgments of bilingual baseline speakers and bilingual homeland speakers. In this presentation, I present preliminary results from sentence-picture rating experiments on the availability of scope interpretations among baseline speakers in Toronto and homeland speakers in Manila.
April 15 – Pamela Sugrue (UChicago, Linguistics) (rescheduled to May 6)
April 29 – Wesley Orth (Northwestern University, Linguistics)
Title: Scope Economy and online sentence processing: How comparative conditions constrain parsing
The grammatical properties of a language constrain the design of the parser used in online sentence processing. Quantification provides a unique test case for how the properties of the grammar influence sentence processing. As Quantifier Raising (QR) in English is a covert movement, the parser has little information about whether raising is possible or necessary. In addition to this challenge, quantification is subject to a grammatical condition called Scope Economy. Scope Economy states that the post-QR structure must yield a new interpretation relative to the pre-QR structure; otherwise QR yields an ungrammatical structure. This comparative economy rule combined with the scant information available to the parser creates a puzzle: If determining whether QR is grammatical requires performing QR, how can we design a parser that minimizes the frequency of ungrammatical QRs and ensures we can produce all grammatical QRs? In this talk I discuss what parsers that can handle QR and Scope Economy need to look like and present experimental data from polarity illusions and ellipsis to help narrow the range of potential parsers.
May 6 – Pamela Sugrue (UChicago, Linguistics)
Title: Probing Word Vector Semantics
Distributional semantic representations of words, also called word vectors or word embeddings, are a popular way of representing the meanings of words in computational systems. Word embeddings are numerical (vector) representations of words, calculated (in various ways, depending on the specific model architecture) using the co-occurrence of all words over large corpora, such that words become located in semantic space near to other, semantically similar, words. The core intuition behind representing meaning this way is the Distributional Hypothesis (Harris 1954), which holds that word meaning is a function of how words are used. While these models are popular in computational systems and perform impressively at many natural language processing tasks, open questions remain about what specific aspects of word meaning word vectors come to encode (Turton et al. 2020, Utsumi 2020, Lebani and Lenci 2021, among others). Rubinstein et al. (2015) propose that taxonomic properties are better represented in distributional semantic models than other properties. The research I will discuss probes the semantic information encoded in vector representations of English adjectives, specifically investigating whether adjective embeddings encode gradability, lexical dimensionality, and subjectivity, and offers evidence in support of Rubinstein et al.’s hypothesis.
May 13 – Tom Schoenemann (Indiana University, Anthropology & Cognitive Science Program)
Title: Brain and language from an evolutionary perspective
Language is one of the most important behavioral adaptations of the human lineage. Understanding the evolution of language is therefore central to understanding the human condition. One part of this involves taking seriously some basic broadly applicable principles of evolutionary change – the most important being that evolutionary process favors the elaboration and modification of pre-existing abilities. This in turn makes essential the exploration of possible language-relevant cognitive and behavioral patterns in our closest relatives. It also means that we should expect overlap and integration of different cognitive systems to be a central feature of language evolution, rather than a set of highly domain-specific modules. Previous studies demonstrating categorical perception in mammals (and monkeys in particular) will be used to illustrate these points. Critiques of ape language studies will also be discussed, and a new statistical analysis of responses of one particularly intriguing subject, the bonobo Kanzi, will be presented arguing that he does in fact appear to understand how argument structure is coded in English grammar using arbitrary word-order rules. Clues about the evolution of language from the fossil endocranial record will be reviewed. Brain size itself is associated with several important behavioral dimensions central to language: The complexity of the social environment itself, as well as the expected richness of conceptual understanding. Thus, as brain size increased during our evolutionary history, hominins with increasingly interesting and rich conceptual understanding lived in increasingly complex and interactive social environments. 
Finally, two areas of research relevant to language evolution will be discussed: 1) explorations of the hypothesis that Broca’s area (which appears to have long predated the human lineage) may have evolved to pay special attention to sequential pattern information in the environment, and 2) the possible overlap of brain circuitry underlying language and those involved in stone tool manufacturing.
Winter quarter: Fridays, 11:00am-12:20pm, details will be sent via email
January 14 – Hyunji Hayley Park (UIUC, Linguistics)
Title: Pitfalls and possibilities: What NLP systems are missing out on
Despite recent advancements in language modeling and NLP in general, there are still many areas where NLP systems face difficulty. With NLP research disproportionally dedicated to English and a few other high-resource languages, the effect of morphology on NLP systems is clearly an under-studied area. Most high-resource languages such as English and Chinese utilize little morphology, encoding more information syntactically (e.g. word order) than morphologically (e.g. case inflection). Morphologically rich languages like Turkish and St. Lawrence Island Yupik use far more variation in word forms to encode meaning and have flexible or free word order. Regarding this issue, I present two studies that augment the existing data to investigate how morphology interacts with NLP systems. First, I compile a parallel Bible corpus and a linguistic typology database to study the effect of morphology on LSTM language modeling difficulty. The results show that morphological complexity, characterized by higher word type counts, makes a language harder to model. Subword segmentation methods such as BPE and Morfessor mitigate the effect of morphology for some languages, but not for others. Even when they do, they still lag behind morpheme segmentation methods based on FSTs. Next, I develop the first dependency treebank for St. Lawrence Island Yupik and demonstrate how morphology interacts with syntax in this morphologically rich language. I argue that the Universal Dependencies (UD) guidelines, which focus on word-level annotations only, should be extended to morpheme-level annotations for morphologically rich languages. As for another area that requires further research, I present a recent study on long document classification. Several methods have been proposed for the task of long document classification using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches.
In this paper, I provide a comprehensive evaluation of existing models’ relative efficacy against various datasets and baselines, both in terms of accuracy and in terms of time and space overheads. Our results show that existing models often fail to outperform simple baseline models and yield inconsistent performance across the datasets. The findings also emphasize that future studies should consider comprehensive baselines and datasets that better represent the task of long document classification to develop robust models.
References:
Hyunji Hayley Park, Katherine J. Zhang, Coleman Haley, Kenneth Steimel, Han Liu, Lane Schwartz. 2021. Morphology Matters: A Multilingual Language Modeling Analysis. Transactions of the Association for Computational Linguistics, 9: 261–276.
Hyunji Hayley Park, Lane Schwartz, and Francis M. Tyers. 2021. Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), pages 131–142, Online. Association for Computational Linguistics.
Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah. Under Review. Efficient Classification of Long Documents Using Transformers.
February 11 – Tal Linzen (NYU, Linguistics and Data Science)
Title: Inductive biases for the acquisition of syntactic transformations in neural networks: an interim update
One of the fundamental questions in the cognitive science of language concerns the conditions under which computational learners generalize from their input in a similar way to humans: by examining the effects of different learning assumptions on the generalizations acquired by the learner, we can construct hypotheses about the constraints and biases that underlie human learning. In this meeting, I will describe an ongoing long-term project with Bob Frank (Yale) and other colleagues, whose goal is to investigate this question using one type of behavior — syntactic mappings between related forms (e.g., a declarative and a question) — and a broad class of learners based on artificial neural networks. I have two goals for this session: first, to give an overview of the project and some of our results; and second, to obtain feedback about the ways this project (and others like it) can engage more productively with syntacticians, whether it is by addressing specific questions they are concerned with, by constructing tests for computational learners that incorporate challenging and subtle syntactic generalizations from a variety of languages, or in any other way.
February 25 – Daniel Lam (UChicago, Linguistics)
Title: The encoding, maintenance and retrieval of complex linguistic representations in working memory
Previous studies have shown that representationally complex referents are encoded more slowly into working memory (WM) but are retrieved faster (Hofmeister, 2011; Karimi & Ferreira, 2016). However, the cost of maintaining complex representations is still not well understood. In a series of experiments, we investigated the cost of encoding, maintaining and retrieving complex representations, such as coordinated noun phrases (those doctors and nurses) and modified noun phrases (those compassionate ER nurses) in WM. While we replicated the facilitatory effect during retrieval, the slowdown during encoding was not consistent across our experiments. More critically, for the first time, our experiments demonstrated that maintaining complex representations in WM is less costly than maintaining their simple counterparts. Furthermore, we found that WM maintenance cost is reduced because complex target noun phrases are more distinct from other competing referents in WM than simple ones. Additionally, we carried out two self-paced reading experiments to investigate how similarity between sub-representations in a coordinated noun phrase influences the cost of WM processes. We found that noun phrases combining similar referents (the dog, the cat and the hamster) showed facilitated encoding and maintenance, relative to those combining dissimilar referents (the truck, the TV and the hamster). Overall, our results showed that the semantic elaborateness of complex representations and the similarity between their sub-representations can reduce WM costs, especially maintenance cost, and they provide new perspectives on this understudied WM process.
March 11 – Casey Ferrara (UChicago, Psychology)
Title: Iconic Modification in Sign and Gesture
Iconic and gradient “language play” is common in daily communication (e.g. “It’s been a loooooooong day”). However, gradience and iconicity are often excluded from traditional accounts of language. Research on spoken language has found that iconic words such as ideophones are particularly prone to this phenomenon. In sign languages, which have an abundance of visual iconicity, we might expect this phenomenon (“iconic modification”) to occur more frequently. The research I’ll be discussing has aimed at establishing (1) to what extent iconic modification of lexical signs occurs in ASL, (2) what factors determine this distribution, and (3) how signers’ iconic modification compares with silent gesture.
Fall quarter: Fridays, 11:00am-12:20pm, details will be sent via email
October 8 – Sanghee Kim (UChicago, Linguistics)
October 22 – Lucas Fagen (UChicago, Linguistics)
Paper discussion – Cheung, Hartley & Monaghan (2021)
This paper explores how caregivers adapt their gestural cues to referential uncertainty using computational modeling. We will discuss this paper from the perspective of (but not limited to) language acquisition, processing, evolution, and computational modeling.
November 5 – Marisa Casillas (UChicago, Comparative Human Development)
Title: Chatterlab spotlight: Yélî Dnye phonological development
This is an investigation into the phonological development of children acquiring Yélî Dnye, an isolate language of Papua New Guinea. With 56 consonants and 34 vowels, Yélî Dnye has one of the largest recorded phonological inventories in the Pacific. Consonants are densely packed into a few pockets of acoustic/articulatory space, featuring some contrasts that are rare among the world’s languages (e.g., dental vs. post-alveolar stops), understudied in previous work (e.g., doubly-articulated consonants), or a unique combination of the two (e.g., doubly-articulated dental-vs.-post-alveolar stops). I have been using a variety of experiment-based methods to tap into children’s knowledge of these phones between ages 6 months and 12 years and will describe some preliminary results supporting a pattern of marked development for typologically rare phones and phonological contrasts.
November 19 – Rob Voigt (Northwestern University, Linguistics)
Title: Language and Immigration: Historical Views from Personal and Political Speech
In this talk I will discuss two projects on the relationship between language and historical trajectories of immigration in the United States. In the first, we examine the impact of arrival as a refugee on linguistic attainment. Prior work on language attainment has relied on self-reported measures of fluency; by contrast, we apply computational methods to a large archive of recorded oral history interviews. Our findings show that refugee migrants achieved higher levels of English proficiency than did economic migrants, even in a time without the official support programs that modern refugees receive. In the second, we use methods based on contextual embeddings to analyze U.S. Congressional speeches and Presidential communications related to immigration from 1880 to the present. We find that political speech about immigration is now much more positive on average than in the past, but that political parties have become increasingly polarized in their expressed tone toward immigration, with Republicans significantly more likely to use language suggestive of dehumanizing metaphors such as Vermin and Disease. Together, these projects provide insight into changing experiences of immigration and attitudes towards immigrants in the US, and the linguistic mechanisms by which these changes play out.
2020–21
Spring quarter: Fridays, 11:30am-1pm, Zoom details sent via email
- April 30 – Pamela Sugrue (UChicago)
- May 14 – Natalie Dowling (UChicago) – **3:30pm-5pm**
- May 21 – Richard Lewis (Michigan)
- June 4 – Anne Pycha (University of Wisconsin-Milwaukee) & Michelle Cohn (UC Davis)
Winter quarter: Fridays, 11:30am-1pm, virtual meetings (Zoom details sent via email)
Week 2, January 22 – Daniel Lam (UChicago)
Title: The encoding, maintenance and retrieval of complex linguistic structures in working memory
Previous studies (Hofmeister, 2011; Vasishth and Hofmeister, 2014) have shown that the encoding of more complex modified NPs (e.g., the alleged Venezuelan communist) is slower than that of simpler unmodified NPs (e.g., the communist), while the reverse pattern was found for retrieval. This was attributed to more effort for elaboration during encoding, which in turn leads to increased salience and decreased interference during retrieval. Hofmeister (2011, Experiment 3) also showed that the facilitation during retrieval occurs when the features in the complex structures are related (e.g., the cruel dictator) and not when they are unrelated (e.g., the lovable dictator). In three experiments, we further explore the interaction between representational complexity and WM processes by exploiting the complexity of coordinated structures (e.g., those nurses and doctors) relative to uncoordinated structures (e.g., those doctors). Experiments 1 and 2 found, contrary to results from Hofmeister (2011), that complexity instead facilitates encoding and has no effect on retrieval. We also discovered some facilitation during the maintenance stage that depends on the number of competing referents concurrently held in WM. In Experiment 3, we explored how feature similarity between the conjuncts in complex coordinated NPs can help the encoding stage.
Week 4, February 5 – Ljiljana Progovac (Wayne State)
Title: What use is half a sentence? Grammar caught in the act of natural/sexual selection
This lecture focuses on the evolution of grammar, advocating a gradual/incremental emergence of hierarchical structure and transitivity, as subject not only to cultural innovation, but also to natural/sexual selection forces. I present a precise syntactic reconstruction of the initial, proto-grammar stage, characterized as an intransitive, flat, absolutive-like, two-slot mold, unable to distinguish subjects from objects. Even this crude grammar offers clear and substantial communicative benefits over no grammar at all, and reveals, through its limits, the reasons and rationale for evolving more complex grammars. The particular uses to which this proto-grammar can be put even today (e.g. insult: cry-baby, kill-joy; naming: rattle-snake; proverbial wisdoms: Monkey see, monkey do; First come, first serve(d)) reveal why this cultural invention (of coining binary compositions) would have been highly adaptive at the dawn of language. With the goal of shedding concrete light on how biological evolution would have begun to shape the genetic make-up that supports human language, a specific sexual selection scenario will be considered. By identifying insult (verbal aggression) as relevant for early language evolution, this proposal directly engages the recent hypothesis of human self-domestication, whose main postulate is a gradual reduction in reactive physical aggression, yielding several testable hypotheses. This approach also identifies testable hypotheses in the arena of neuroimaging, from which some fMRI experimental results will be reported.
Week 6, February 19 – Thomas Sostarics (Northwestern)
Title: Pragmatic Contributions of the L*L-L% Contour in American English
Pierrehumbert and Hirschberg’s seminal 1990 paper proposes a compositional approach for understanding the meaning of nuclear pitch contours in American English. While their approach to a speaker’s choice of “tune” is seemingly rooted in pragmatic notions of common ground and epistemic stance, little empirical work has been done to validate or enrich this claim (Büring 2016; Prieto and Borràs-Comes 2018). While some contours have received considerable attention in the literature, others have received little-to-none. I will be presenting recent results from my qualifying paper investigating the meaning of the rarely-mentioned L*L-L% contour as it contrasts with H*L-L%, the so-called default intonation contour for assertions and declaratives. Despite being “default,” there seem to be particular discourse contexts in which H*L-L% is a less felicitous choice of tune compared to L*L-L%, which we investigate through a series of perception experiments.
Week 9, March 12 – Claire Bergey (UChicago)
Title: Learning communicative acts in children’s conversations: A Hidden Topic Markov Model analysis of the CHILDES corpus
Over their first years of life, children learn not just the words of their native languages, but how to use them to communicate. Because manual annotation of communicative intent does not scale to large corpora, our understanding of communicative act development is limited to case studies of a few children at a few time points. We present an approach to automatic identification of communicative acts using a Hidden Topic Markov Model, applying it to the CHILDES database. We first describe qualitative changes in parent-child communication over development, and then use our method to demonstrate two large-scale features of communicative development: (1) children develop a parent-like repertoire of our model’s communicative acts rapidly, their learning rate peaking around 12 months of age, and (2) this period of steep repertoire change coincides with the highest predictability between parents’ acts and children’s, suggesting that structured interactions play a role in learning to communicate. We then present some preliminary analyses about the relationship between form and function in language acquisition, looking at the correspondence between these communicative acts and sentence frames.
Autumn quarter: Fridays, 11:30am-1pm, virtual meetings (Zoom details sent via email)
Week 2, October 9 – Daniel Lam (UChicago) & Eszter Ronai (UChicago)
Welcome & Bayesian statistics
Week 4, October 23 – Hayoung Song (UChicago)
Title: Predicting attentional engagement during narratives and its consequences for event memory
The degree to which we are engaged in narratives fluctuates over time. What drives these changes in engagement, and how do they affect what we remember? Behavioral studies showed that people experienced similar fluctuations in engagement during a television show or an audio-narrated story and were more engaged during emotional moments. Functional MRI experiments revealed that changes in a pattern of functional brain connectivity predicted changes in how engaged people were in the narratives. This predictive brain network not only was related to a validated neuromarker of sustained attention, but also predicted what narrative events people recalled after the MRI scan. Together these findings reveal a robust, generalizable neural signature of engagement dynamics and elucidate relationships between narrative engagement, sustained attention, and event memory.
Week 6, November 6 – Emre Hakguder, Aurora Martinez del Rio, Casey Ferrara & Sanghee Kim (UChicago)
Title: Identifying the Correlations Between the Lexical Semantics and Phonology of ASL: A Vector Space Approach
In this study, we create a semantic vector space model (VSM) for lexical word meaning in ASL and use it to investigate whether there is a relationship between the semantic and phonological properties of signs. We hypothesize that clusters of ASL words that are related in meaning are likely to have phonological similarities, basing our hypothesis on the many observations that transparent iconicity can be found at different levels of the grammar in sign languages, including at the lexical level. We show that the more neatly the ASL lexicon is semantically organized, the greater the phonological similarity within its clusters.
Week 10, December 4 – Andras Molnar (Carnegie Mellon)
Title: The demand for, and avoidance of, information
We apply a previously developed “information gap” framework (Golman and Loewenstein, 2018) to make sense of and predict information seeking and avoidance. The resulting theory posits that, beyond the conventional desire for information as an input to decision making, two additional motives contribute to the demand for information: curiosity – the desire to fill information gaps, i.e., to answer specific questions that capture attention; and motivated attention – the desire to savor good news and ignore bad news, i.e., to obtain information we like thinking about and avoid information we do not like thinking about. Five experiments (N = 2,361) test three of the primary hypotheses derived from the theory about the demand for information. People are more inclined to acquire information: (1) when it seems more important, even when the information is not instrumental for decision making (Experiments 1A & 1B); (2) when it is more salient, manipulated by how recently the information gap was opened (Experiments 2A & 2B); and (3) when it has higher valence – i.e., when individuals anticipate receiving more favorable news (Experiment 3). This set of findings cannot be explained by alternative models of information acquisition or avoidance.
2019–20
Spring quarter: Fridays, 11.00am-12.30pm, virtual meetings
- April 24 – Josef Klafka (Carnegie Mellon)
- May 1 – Jinghua Ou (UChicago)
- May 22 3:30pm-5pm – Daniel Lam (UChicago)
- May 29 3:30pm-5pm – Laura Stigliano (UChicago)
Winter quarter: Fridays, 11.00am-12.30pm, Cobb Lecture Hall Room 107 (unless otherwise noted)
- 24 January: Sanghee Kim (UChicago)
- 7 February: Yi Ting Huang (Maryland)
- 21 February: Ryan Lepic (UChicago)
- 28 February: Eszter Ronai (UChicago) *meeting 3:30pm-4:30pm in Rosenwald 208*
- 13 March: Ljiljana Progovac (Wayne State) — postponed due to COVID-19
Autumn quarter: Fridays, 11.30am-1pm, Cobb Lecture Hall Room 302
- 11 October: Adrian Staub (UMass Amherst)
- October 25: Laura Stigliano & Eszter Ronai (UChicago)
  joint meeting with Morphology and Syntax
- 1 November: Brian Dillon (UMass Amherst)
- Thursday, 7 November, 12.30-2pm in SS 302: Jenny Lu (UChicago)
- 15 November: Anisia Popescu (University of Potsdam)
2018–19
Spring quarter: Thursdays, 11.00-12.30 in Cobb 116
- 18 April, Aurora Martinez del Rio
- 25 April, Ryan Lepic
- May 24, Jon Sprouse (UConn)
  joint meeting with Morphology and Syntax
- May 31, Masaya Yoshida (Northwestern)
  joint meeting with Morphology and Syntax
- 6 June, Casey Ferrara
Winter quarter: even week Fridays 1:30-2:50 in Cobb 203
- January 18, Andrea E. Martin (Max Planck Institute for Psycholinguistics)
- February 1, Ryan Lepic
- February 15, Eszter Ronai
- March 1, Daniel Lam
- March 15, Josef Klafka
Autumn quarter:
- October 12, Interest meeting
- October 19, Ryan Lepic
- October 26, Daniel Lam
- November 13, Eszter Ronai and Jimmy Waller
- December 7, Madeline Meyers and Claire Bergey