Sunghwan Kim

MS Student at Yonsei University

kimsh8564[at]yonsei.ac.kr

Hi! I am a first-year M.S. student at the Language and AGI Lab, advised by Jinyoung Yeo. Previously, I received a B.S. in Materials Science & Engineering from Yonsei University in Aug. 2024.

I aim to build human-like intelligent systems that can autonomously learn, reason, and adapt to diverse environments. My recent research interests include (i) reinforcement learning (RL) for solving long-horizon tasks and (ii) developing intelligent systems that learn through interaction with the environment. Additionally, I focus on analyzing language models to identify their limitations and room for improvement.

News

Sep 18, 2025	🎉 Our “Web-Shepherd” got accepted to NeurIPS 2025 as a Spotlight!
May 23, 2025	Our “Web-Shepherd” and “Embodied Agents Meet Personalization” papers are released!
May 17, 2025	🎉 Our “Reward Model Evaluation” and “LLM Meets Scene Graph” got accepted to ACL 2025!
Mar 11, 2025	I will join Microsoft Research Asia (MSRA) as a research intern!
Jan 23, 2025	🎉 Our work “World Model for Web Agent” got accepted to ICLR 2025!
Sep 21, 2024	🎉 Our work “Think-and-Execute” got accepted to EMNLP 2024 and “Cactus” got accepted to EMNLP 2024 Findings!
Aug 14, 2024	🏆 Our paper has been selected as an outstanding paper at ACL 2024! 🏆

Selected Publications

† indicates equal contribution.

Reward Model

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Hyungjoo Chae^†, Sunghwan Kim^†, Junhee Cho^†, Seungone Kim, Seungjun Moon, Gyeom Hwangbo, Dongha Lim, Minjin Kim, Yeonjun Hwang, Minju Gwak, Dongwook Choi, Minseok Kang, Gwanhoon Im, ByeongUng Cho, Hyojun Kim, Jun Hee Han, Taeyoon Kwon, Minju Kim, Beong-woo Kwak, Dongjin Kang, and 1 more author

NeurIPS 2025 (Spotlight)

arXiv
Reward Model

Rethinking Reward Model Evaluation Through the Lens of Reward Overoptimization

Sunghwan Kim^†, Dongjin Kang^†, Taeyoon Kwon, Hyungjoo Chae, Dongha Lee, and Jinyoung Yeo

ACL 2025 (Oral)

arXiv
Interaction

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Hyungjoo Chae, Namyoung Kim, Kai Tzu-iunn Ong, Minju Gwak, Gwanwoo Song, Jihoon Kim, Sunghwan Kim, Dongha Lee, and Jinyoung Yeo

ICLR 2025

arXiv Code
Dialogue

Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation

Dongjin Kang^†, Sunghwan Kim^†, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, and Jinyoung Yeo

ACL 2024

🏆 Outstanding Paper Award 🏆

Abs arXiv


Emotional Support Conversation (ESC) is a task aimed at alleviating individuals’ emotional distress through daily conversation. Given its inherent complexity and non-intuitive nature, the ESConv dataset incorporates support strategies to facilitate the generation of appropriate responses. Recently, despite the remarkable conversational ability of large language models (LLMs), previous studies have suggested that they often struggle to provide useful emotional support. Hence, this work first analyzes the results of LLMs on ESConv, revealing challenges in selecting the correct strategy and a notable preference for a specific strategy. Motivated by these findings, we explore the impact of the inherent preference in LLMs on providing emotional support, and we observe that exhibiting high preference for specific strategies hinders effective emotional support, degrading robustness in predicting the appropriate strategy. Moreover, we conduct a methodological study to offer insights into the approaches LLMs need in order to serve as proficient emotional supporters. Our findings emphasize that (1) low preference for specific strategies hinders the progress of emotional support, (2) external assistance helps reduce preference bias, and (3) existing LLMs alone cannot become good emotional supporters. These insights suggest promising avenues for future research to enhance the emotional intelligence of LLMs.