AI Seminar: "Revealing Hidden Vulnerabilities in Long-Context Large Language Models" by Yue Dong

MRB Seminar Room
ABSTRACT:

Large Language Models are increasingly deployed in applications that require reasoning over long and complex contexts, such as extended documents, multi-turn interactions, retrieved evidence, and multimodal inputs. While these capabilities make LLMs more powerful, they also introduce new and underexplored safety risks. In long-context settings, safety-relevant signals can be diluted or overridden, boundaries between context segments can break down, and harmful influence can emerge only when information is recombined during reasoning.

In this talk, I will highlight recent research uncovering hidden vulnerabilities in long-context LLMs, including hallucinations, alignment failures, and adversarial weaknesses across both text and multimodal systems. These findings suggest that many existing safety evaluations and defenses, which are often designed for short and self-contained inputs, are insufficient for long-context reasoning. Addressing these challenges requires new benchmarks, interpretability tools, and defense strategies for safer and more reliable LLMs.
 

BIO:

Yue Dong is an Assistant Professor of Computer Science at the University of California, Riverside. Her research focuses on building controllable, trustworthy, and efficient large language models. She has published over 40 peer-reviewed papers in leading venues including ACL, ICLR, ICML, TACL, NAACL, EMNLP, and AAAI. Her recent work spans hallucination reduction, efficient post-training, and AI safety and robustness, including red-teaming and alignment of multimodal language models. Her research has received multiple recognitions, including a Best Paper Award at the 2023 SoCal NLP Symposium for work on multimodal LLM safety. Prior to joining UC Riverside, she completed PhD research internships at Google, Microsoft, and AI2.

Type: Seminars
Target Audience: Students, Faculty, Staff
Admission: Free