Resources

R

R colab notebook
Base R Cheat Sheet
dplyr full reference (we'll only use some of these functions)
dplyr vignette
dplyr & tidyr cheat sheet
ggplot full reference
ggplot2 overview and more learning resources
ggplot2 cheat sheet
ggplot2 workshop part 1 (youtube webinar, if you want to go further!)

Datasets

Human Brain Evolution

DeSilva, J. M., Traniello, J. F., Claxton, A. G., & Fannin, L. D. (2021). When and why did human brains decrease in size? A new change-point analysis and insights from brain evolution in ants. Frontiers in Ecology and Evolution, 712.

Data downloaded from the supplemental materials in DeSilva et al (2021); to use:

data <- read.csv('https://kathrynschuler.com/datasci-langmind-datasets/human-brain-evolution/data.csv')

Stanford's Wordbank

Cross-linguistic trajectories of two words, ball and dog, taken from wordbank.stanford.edu. To use this dataset:

data <- read.csv('https://kathrynschuler.com/ling172/datasets/crosslinguistic-dog-ball.csv')

Nettle's Language diversity

A long format dataset that is most useful in wide format. Data taken from Appendix 1 in:

Nettle, D. (1998). Explaining Global Patterns of Language Diversity. Journal of Anthropological Archaeology, 17, 354–374.

To use this dataset, you’ll need the jvcasillas/untidydata package:

install.packages("devtools")
devtools::install_github("jvcasillas/untidydata")

Then load the package and call

library(untidydata)
language_diversity
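
Because this dataset arrives in long format but is most useful wide, reshaping is usually the first step. Here is a minimal sketch using tidyr's pivot_wider() on a tiny made-up data frame. The column names (Country, Measurement, Value) and all values are assumptions for illustration only, so check names(language_diversity) before adapting it.

```r
library(tidyr)

# Hypothetical stand-in for language_diversity: these column names and
# values are invented for illustration and may not match the real dataset.
long <- data.frame(
  Country     = rep(c("A", "B"), each = 3),
  Measurement = rep(c("Langs", "Area", "Population"), times = 2),
  Value       = c(10, 500, 2000, 25, 300, 1500)
)

# Reshape to one row per country and one column per measurement
wide <- pivot_wider(long, names_from = Measurement, values_from = Value)
wide
```

With the real data, the same call should work once names_from and values_from point at the columns that hold the measurement labels and their values.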

Exam study guides

Project guidelines

Welcome to your data science project!

Throughout the semester, you'll be applying what you learn to a data science project that is of particular interest to you (and your group!). You'll need to select a project within the bounds of linguistics or cognitive science, but other than that the topic is up to you.

Types of projects

Projects can be one of two types:

A group project: you join a group with 1 or 2 other students in your lab section (max group size is 3 students total). You and your group replicate a classic study in linguistics or cognitive science by reconstructing the data and analysis from the published paper.
A solo project: you work alone and either (1) replicate a classic study in linguistics or cognitive science by reconstructing the data and analysis from the published paper (same as the group project version, you just work alone) OR (2) work on an original research project in which you collect data yourself.

Reasons for doing a solo project would include: (1) you are required to, usually in order to use this class in a specific way toward your major or minor; (2) you are conducting (or want to conduct!) a research project in linguistics or cognitive science either as part of a class, a thesis, or independently; and you want to use this class to help you do the data science on that project.

Classic studies

Wondering how to find a classic study? Here are a few suggestions:

Look to your previous classes! Does anything stand out? Any interesting things you heard about in class that you'd like to explore more?
Look in intro textbooks! Introductory textbooks in a given field often describe classic studies in an accessible way. That would be a great place to start to figure out what studies you might want to replicate. They will reference the original research article.
Look to social media! Have you read about anything in the news or on social media that you'd like to dive into? You can usually find the reference to the original research study somewhere in a news article.
Use our lists below! We've put together a list of possible classics in Cognitive Science and Linguistics that you might be interested in.
Ask on Ed! Describe your interest on Ed and we can try to direct you to some specific papers that way.

Linguistics

Sociolinguistics (from Dr. Meredith Tamminga, Associate Professor of Linguistics)

Tagliamonte & D'Arcy 2007 on the "be like" quotative
Hay & Drager 2010 speech perception study involving kiwis and kangaroos!

Language Evolution (from Dr. Gareth Roberts, Associate Professor of Linguistics)

Galantucci, B. (2005). An experimental study of the emergence of human communication systems. Cognitive Science, 29(5), 737-767.
Garrod, S., Fay, N., Lee, J., Oberlander, J., & MacLeod, T. (2007). Foundations of representation: where might graphical symbol systems come from? Cognitive Science, 31(6), 961-987.
Scott-Phillips, T. C., Kirby, S., & Ritchie, G. R. (2009). Signalling signalhood and the emergence of communication. Cognition, 113(2), 226-233.
Sneller, B. and Roberts, G. (2018) Why some behaviors spread while others don't: A laboratory simulation of dialect contact. Cognition 170C: 298–311.

Semantics (from Dr. Florian Schwarz, Associate Professor of Linguistics)

Bott & Noveck’s 2004 experiment on scalar implicatures (some vs. all elephants are mammals)

And from Dr. Schuler - You could look to some of the papers from my graduate seminars on topics:

Cognitive Science

List provided by Dr. Russell Richie, Associate Director of Cognitive Science and mindCORE programs

Introduction

Donders (1868) How long does it take to make a decision? (subtractive method)

Cognitive Revolution and the Computational Theory of Mind

Latent learning with rats in a maze (rats learn even if not rewarded/punished). Tolman and Honzik 1930.
Magical number seven. Miller 1956
Behrend & Bitterman (1961) – simple reinforcement learning example of fish learning food probabilities from two feeders

Modularity

Firestone and Scholl (2016): This is a BBS piece and doesn’t have original data itself, but discusses a lot of studies of alleged top down effects of cognition on perception. Students may find it interesting to replicate those.

Judgment and Decision-Making: Are we 'good' at reasoning?

Wason selection task. Wason, P. C. (1968).
People are better at logical reasoning if the problem fits into a ‘schema’ (e.g., permission schema). Cheng and Holyoak (1985)
Conjunction fallacy (aka Linda problem) -- Tversky and Kahneman 1981
Base rate neglect (Tom problem) -- Kahneman & Tversky (1973)
Availability heuristic - Tversky & Kahneman, 1973
Anchoring heuristic – Tversky and Kahneman 1974. Ariely et al 2003.

Judgment and Decision-Making: Behavioral Economics

Certainty effect – Tversky and Kahneman 1986
Loss aversion - Kahneman, D. & Tversky, A. (1979). "Prospect Theory: An Analysis of Decision under Risk".
Framing/epidemic problem. Tversky and Kahneman 1981.
Mental accounting – Heath and Soll 1996 (journal of consumer research)
Decoy effect -- Huber, Joel; Payne, John W.; Puto, Christopher (1982)
Intransitive preferences – Tversky 1969

Language structure

Eimas et al 1971 on categorical perception in infants
Werker and Tees 1984 on losing the ability to distinguish non-native speech category contrasts.
Bias towards hearing clicks at phrase boundaries -- Ladefoged and Broadbent (1960) and Fodor and Bever (1965)

Language comprehension

When listening to or reading words, people search for and activate candidate words in parallel, rather than serially. Tanenhaus et al 1979
Cohort theory vs TRACE model of word recognition: Allopenna et al 1998
Evidence for interactive theory of sentence processing:
i. Visual world context: Tanenhaus et al. (1995); Trueswell et al. (1999)
ii. Verb bias: Snedeker and Trueswell 2004
iii. Prosody: Snedeker and Trueswell 2003
iv. Real world knowledge: Chambers et al 2004

Language acquisition

Word segmentation
i. Conditional probabilities between syllables: Saffran et al 1996
ii. Stress patterns: Jusczyk et al 1999
Word learning
i. How to aggregate information across word-referent pairings?
Global cross-situational word learning: Yu and Smith 2007
Hypothesis testing/Propose but verify: Trueswell et al 2013
Hybrid Pursuit model, developed by Charles and others at Penn.
ii. Noun bias: Bates et al 1994
iii. Human simulation paradigm: Gleitman et al 1999
iv. Syntactic bootstrapping: Naigles, 1990; Hirsh-Pasek et al 1996; Yuan and Fisher 2009
Rule learning and regularization
i. Artificial language learning and the Tolerance Principle: Schuler et al 2021 😉
ii. Deaf children exposed to sign language early outperform their hearing parents (Newport, 1990)

Language and thought

Effect of cross-linguistic differences on color perception (or perhaps just decision-making!) – Winawer et al 2007
“Whorf hypothesis is supported in the right visual field but not the left” – Gilbert et al 2006
“Does categorical perception in the left hemisphere depend on language?” – Holmes and Wolff 2012
Can you represent spatial relations (e.g., left/right) if you don’t have words for them? See Brown and Levinson 1993 for ‘yes’, but then see Li and Gleitman 2002 for a counterargument with Penn undergrads!

Neuroscience – Methods

Dead salmon fMRI study, showing risks of research degrees of freedom – Bennett et al (2009, NeuroImage)
Different neurons/patches of superior temporal gyrus encode different phonetic features (evidence from direct electrode recordings) -- Mesgarani et al 2014

Neuroscience – Plasticity

Long term potentiation – Bliss and Lømo 1973
Beatrice Gelber studies on conditioning in single-celled organisms (paramecia) -- Gershman et al 2021 in eLIFE.
Critical periods:
i. Hubel and Wiesel 1964 studies with suturing cat eyes shut, on critical period effects in visual development
ii. Newport 1990 showing age of acquisition/critical period effects of language acquisition in Deaf people acquiring ASL (who don’t have another L1, thus suggesting critical period effects in learning additional languages as an adult are not merely interference from L1).
Ferrets with the visual pathway rewired to auditory cortex can still see (i.e., brain areas have some flexibility in the inputs they can take). Von Melchner et al 2000
London cab drivers have greater grey matter volume in the posterior hippocampus than London bus drivers. Maguire et al 2006 argue that needing to navigate flexibly increases hippocampal volume.

Cognitive Development - Object perception

Infants perceive objects as unitary (if two things move in tandem but their connection is occluded, infants assume a single object) -- Valenza, E., Leo, I., Gava, L., & Simion, F. (2006). Child Development, 77, 1810–1821.
Infants know objects persist when they disappear (object permanence): Wang et al 2004
Infants know objects move continuously through space and time (Aguiar & Baillargeon, 1998; Johnson et al 2003)
When objects violate physics, infants test them appropriately (drop objects that defy gravity; bang objects that pass through others)…Stahl and Feigenson 2015.
Infants detect shape changes first in development, then pattern, then color (Wilcox, 1999)

Cognitive Development - Understanding of agents

Infants can use repeated reaching to infer a goal – Woodward 1998
Infants expect goal-directed actions to be efficient – Liu et al 2019; Gergely and Csibra 2002
Infants expect successful agents to be happy – Skerry and Spelke 2014.
Do babies prefer agents that help, over agents that hinder? Hamlin et al 2007 says ‘yes’. But see failed replications, Schlingloff et al. 2020, Salvadori et al 2015! (FYI, there is an ongoing multi-site replication: https://manybabies.github.io/MB4/)

Reinforcement Learning

Blocking effect: Suppose an animal already knows that A (chime) predicts B (food). If X (light) is presented simultaneously with A, animal won’t learn association between X and B. Kamin 1968.
Dopamine spikes don’t accompany receipt of reward; they reflect changes in expectations of reward/punishments! Schultz et al 1997

Concepts and categories

Production Tasks (list members of a category): asked to list examples of birds (“robin, sparrow … parrot … ostrich, penguin”), people are relatively consistent, naming typical members first
Rating tasks (rate members of a category): people agree about what counts as a typical member
Sentence Verification: “A robin is a bird” is verified faster than “A penguin is a bird”
Picture Identification: people are faster to identify typical members of a category. Posner and Keele 1968
Missing prototype effect – Posner and Keele 1968
Odd (even) numbers show prototypicality effects in rating tasks and sentence verification tasks! Armstrong, Gleitman, and Gleitman 1983.
Blind and sighted people have similar organization of visual verbs. Bedny et al 2019.

Number cognition

Monkeys can count! Brannon and Terrace, 1998; Cantlon and Brannon, 2006.
So can infants! Feigenson et al. (2002)
Ants count their steps in order to navigate! Wittlinger et al 2006
Number words/symbols seem to be necessary to exactly represent large number concepts (like 56). Two papers:
i. Exact and Approximate Arithmetic in an Amazonian Indigene Group. Pica et al 2004.
ii. Number cognition in Deaf Nicaraguan homesigners. Spaepen et al 2011.

Collective cognition and behavior

The spread of behavior in an online social network – Centola (2010). Behavior spreads more quickly in clustered social networks than in networks where people are connected randomly, even though it takes fewer steps to get between any two people in a random network!

Final project submission

Due on Gradescope by May 10th at 11:59pm (no late submissions possible: grades are due May 12!)

Your final project submission will essentially be a more formal, cleaned-up report of your preregistration and project checkpoint 3. Your submission should be a Google Colab notebook (see sample!) and include the following sections:

Introduction - in the introduction, provide a brief background and summary of the research question(s) addressed in the original paper. Describe (again very briefly) how the researchers addressed this question(s) in their original work. Then describe which aspect of the paper your group chose to replicate.
Method - your method section should include at least 3 subsections
1. Participants - describe the participants in the portion of the study you are replicating and how you simulated that participant structure (if your data was not available). Include the R code that generates the participant structure (or summarizes it if your data was available) and show this in a table with R code.
2. Procedure - describe in more detail the procedure of the portion of the study you are replicating. What did the researchers do (or have their participants do) in the experiment? On each trial? What were the stimuli like? Did the researchers code or summarize the data in any way? Explain that here. Include the R code that generates the data, including the trial-by-trial data. If you imported data, include R code that generates tables to indicate the number of trials, etc.
3. Analysis - describe in more detail the analysis you and your group have opted to conduct. First, restate the research question you are addressing and its null hypothesis. You should describe your planned analysis, which should include at least a linear or logistic regression to test your null hypothesis. If you will remove outliers or trim data before analysis, describe that here. Include the R code for outlier removal and model building here.
Results - begin by recreating the figure (or creating a figure if one did not exist) that summarizes your research question. Then run the anova() function on your model in R and interpret the results in text (Chapter 15 in the book is helpful for this!). Then, use summary() to get the regression coefficients and interpret those results in the text (Chapter 7 in the book is helpful for this!). Finally, get your model's predictions with the predict() function and add the model predictions to your original figure.
Conclusions - Finally, include a few sentences to remind the reader what you set out to do, what you found, and whether the results from your model were the same as or different from the original research finding. Briefly summarize what this might mean.
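
To make the Method and Results steps above concrete, here is a minimal sketch in base R of one way they could fit together: simulate a participant structure, generate trial-by-trial data, trim outliers, fit a linear regression, and then call anova(), summary(), and predict(). The design (20 participants, two conditions, 10 trials each), the variable names, the 50 ms effect, and the 3 SD trimming rule are all made-up assumptions for illustration, not taken from any paper or required by the project.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical design: 20 participants split evenly across 2 conditions.
# Every name and number here is a placeholder, not taken from any paper.
participants <- data.frame(
  participant = factor(1:20),
  condition   = rep(c("control", "treatment"), each = 10)
)

# Table summarizing the participant structure
table(participants$condition)

# Trial-by-trial data: one row per participant per trial (10 trials each).
# merge() with no shared columns returns every participant-trial combination.
trials <- merge(participants, data.frame(trial = 1:10))

# Simulated reaction times (ms) with an assumed 50 ms condition effect
trials$rt <- rnorm(
  nrow(trials),
  mean = ifelse(trials$condition == "treatment", 550, 500),
  sd   = 40
)

# Example trimming rule (an assumption): drop trials more than
# 3 standard deviations from the overall mean
trimmed <- subset(trials, abs(rt - mean(rt)) < 3 * sd(rt))

# Linear regression testing the null hypothesis of no condition effect
model <- lm(rt ~ condition, data = trimmed)

anova(model)    # F test for the condition term
summary(model)  # regression coefficients

# Model predictions, stored alongside the data so they can be added to a figure
trimmed$predicted <- predict(model)
```

For a binary outcome (e.g., correct vs. incorrect), glm() with family = binomial would take the place of lm() for a logistic regression.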

Here is a sample final project submission, based on the Petitto & Marentette 1991 article you all read for Project Checkpoint 01, plus a blank R notebook if you'd like to start there:

