Research

Publications and presentations from tinlab are listed below. Lab members are highlighted.

A systematic framework for generating novel experimental hypotheses from language models
Kanishka Misra and Najoung Kim
arXiv


RExBench: Can coding agents autonomously implement AI research extensions?
Nicholas Edwards,* Yukyung Lee,* Yujun Audrey Mao, Yulu Qin, Sebastian Schuster,† and Najoung Kim.† (*,†Equal contribution)
ACL 2026


Are they lovers or friends? Evaluating LLMs’ Social Reasoning in English and Korean Dialogue
Eunsu Kim, Junyeong Park, Juhyun Oh, Kiwoong Park, Seyoung Song, A. Seza Doğruöz, Najoung Kim,* and Alice Oh.* (*Equal contribution)
ACL 2026


Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity
Arkadiy Saakyan, Najoung Kim, Smaranda Muresan, and Tuhin Chakrabarty
ICLR 2026


Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
Yulu Qin,* Dheeraj Varghese,* Adam Dahlgren Lindström, Lucia Donatelli, Kanishka Misra,† and Najoung Kim.† (*,†Equal contribution)
NeurIPS 2025


CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists
Yukyung Lee, JoongHoon Kim, Jaehee Kim, Hyowon Cho, Jaewook Kang, Pilsung Kang, and Najoung Kim
EMNLP 2025


Mechanistic Understanding of Entity Tracking in Natural Language involving Multiple Operations
Zilu (Peter) Tang, Qiao Zhao, Gabriel Franco, Geneva Yang, Angelos Poulis, Derry Wijaya, Aaron Mueller, Sebastian Schuster, and Najoung Kim
NEMI 2025


Is analogy enough to draw novel adjective-noun inferences?
Hayley Ross, Kathryn Davidson, and Najoung Kim
SCiL 2025


Implicit mechanisms for symbol manipulation in RNNs
Aditya Yedetore and Najoung Kim
NENLP 2025


Transformers Struggle to Learn to Search Without In-context Exploration
Abulhair Saparov, Srushti Pawar, Shreyas Pimpalgaonkar, Nitish Joshi, Richard Yuanzhe Pang, Vishakh Padmakumar, Seyed Mehran Kazemi, Najoung Kim,* and He He.* (*Equal contribution)
ICLR 2025


Fake reefs are sometimes reefs and sometimes not, but are always compositional
Hayley Ross, Najoung Kim, and Kathryn Davidson
ELM 3 (2025)


Semantic Training Signals Promote Hierarchical Syntactic Generalization in Neural Networks
Aditya Yedetore and Najoung Kim
EMNLP 2024


Code Pretraining Improves Entity Tracking Abilities of Language Models
Najoung Kim,* Sebastian Schuster,* and Shubham Toshniwal.* (*Equal contribution)
arXiv


Personas as a Way to Model Truthfulness in Language Models
Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, and He He
EMNLP 2024


Is artificial intelligence still intelligence? LLMs generalize to novel adjective-noun pairs, but don’t mimic the full human distribution
Hayley Ross, Kathryn Davidson, and Najoung Kim
GenBench @ EMNLP 2024 👑 Best Paper Award


Structural Generalization of Modification in Adult Learners of an Artificial Language
Najoung Kim and Paul Smolensky
CogSci 2024


Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation
Katherine M. Collins, Najoung Kim, Yonatan Bitton, Verena Rieser, Shayegan Omidshafiei, Yushi Hu, Sherol Chen, Senjuti Dutta, Minsuk Chang, Kimin Lee, Youwei Liang, Georgina Evans, Sahil Singla, Gang Li, Adrian Weller, Junfeng He, Deepak Ramachandran, and Krishnamurthy Dj Dvijotham
AIES 2024


Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, and Yoon Kim
NAACL 2024


Syn-(QA)^2: Evaluating False Assumptions in Long-tail Questions with Synthetic QA Datasets
Ashwin Daswani, Rohan Sawant, and Najoung Kim
arXiv


Abstraction via exemplars? A representational case study on lexical category inference in BERT
Kanishka Misra and Najoung Kim
BUCLD 47 (2023)


SLOG: A Structural Generalization Benchmark for Semantic Parsing
Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, and Najoung Kim
EMNLP 2023


Inverse scaling can become U-shaped
Jason Wei,* Najoung Kim,* Yi Tay, and Quoc V. Le (*Equal contribution)
EMNLP 2023


Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples
Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim,* and He He.* (*Equal contribution)
NeurIPS 2023


BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information
Mehran Kazemi, Quan Yuan, Deepti Bhatia, Najoung Kim, Xin Xu, Vaiva Imbrasaite, and Deepak Ramachandran
NeurIPS 2023 (Datasets and Benchmarks)


Inverse Scaling: When Bigger Isn’t Better
Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, and Ethan Perez
TMLR 2023 👑 Featured Certification


Finding Structure in One Child’s Linguistic Experience
Wentao Wang, Wai Keen Vong, Najoung Kim, and Brenden M. Lake
Cognitive Science 2023


(QA)^2: Question Answering with Questionable Assumptions
Najoung Kim,* Phu Mon Htut,* Samuel R. Bowman, and Jackson Petty (*Equal contribution)
ACL 2023


Entity Tracking in Language Models
Najoung Kim* and Sebastian Schuster* (*Equal contribution)
ACL 2023 👑 Area Chair Award


LAMBADA: Backward Chaining for Automated Reasoning in Natural Language
Seyed Mehran Kazemi, Najoung Kim, Deepti Bhatia, Xin Xu, and Deepak Ramachandran
ACL 2023


Reconstruction Probing
Najoung Kim, Jatin Khilnani, Alex Warstadt, and Abed Qaddoumi
Findings of ACL 2023


Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models
Najoung Kim, Tal Linzen, and Paul Smolensky
arXiv (2022)


Compositional Linguistic Generalization in Artificial Neural Networks
Najoung Kim
PhD Dissertation, Johns Hopkins University (2021)


Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim, Ellie Pavlick, Burcu Karagol Ayan, and Deepak Ramachandran
ACL 2021


Testing for Grammatical Category Abstraction in Neural Language Models
Najoung Kim and Paul Smolensky
SCiL 2021


COGS: A Compositional Generalization Challenge Based on Semantic Interpretation
Najoung Kim and Tal Linzen
EMNLP 2020


Implicit Discourse Relation Classification: We Need to Talk About Evaluation
Najoung Kim, Song Feng, Chulaka Gunasekara, and Luis A. Lastras
ACL 2020


Maximize presupposition and the Korean demonstrative ku
Sadhwi Srinivas, Najoung Kim, and Kyle Rawlins
LSA 2020


Compositionality as Directional Consistency in Sequential Neural Networks
Najoung Kim and Tal Linzen
NeurIPS 2019 Workshop on Context and Compositionality


Probing What Different NLP Tasks Teach Machines About Function Word Comprehension
Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, Tom McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Sam Bowman, and Ellie Pavlick
*SEM 2019 👑 Best Paper Award


How to Get Past Sesame Street: Sentence-Level Pretraining Beyond Language Modeling
Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, and Samuel R. Bowman
ACL 2019


Automatic Scoring of Semantic Fluency
Najoung Kim, Jung-Ho Kim, Maria K. Wolters, Sarah E. MacPherson, and Jong C. Park
Frontiers in Psychology (2019)


Predicting the Argumenthood of English Prepositional Phrases
Najoung Kim, Kyle Rawlins, Benjamin Van Durme, and Paul Smolensky
AAAI 2019


What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Sam Bowman, Dipanjan Das, and Ellie Pavlick
ICLR 2019


Prosodic and Linguistic Analysis of Semantic Fluency Data: A Window into Speech Production and Cognition
Maria K. Wolters, Najoung Kim, Jung-Ho Kim, Sarah E. MacPherson, and Jong C. Park
Interspeech 2018


Enhanced Sign Language Transcription System via Hand Tracking and Pose Estimation
Jung-Ho Kim, Najoung Kim, Hancheol Park, and Jong C. Park
Journal of Computing Science and Engineering, vol. 10, no. 3


A Morphological Approach to the Longitudinal Detection of Dementia
Najoung Kim and Jong C. Park
HCI Korea 2016