Byung-Kwan Lee
“Standing on the shoulders of giants.” — Isaac Newton

Research Interest

Building efficient, high-performing Vision-Language Models (VLMs), with a focus on:

Collaboration Requests & Job Applications

If your research interests align with mine and you already have a draft idea to discuss and develop together, feel free to reach out about collaboration. For university collaborations, alignment of research interests is the primary criterion.

We are also actively looking for talented internship and full-time candidates. Preferred qualifications are strong first or co-first author publications at top-tier main-track conferences (not workshops) and deep expertise in one area. In my view, a strong profile means five to ten first or co-first author papers at top-tier venues (CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, ACL, EMNLP). I do not consider other conferences or journals, because I am not familiar with them. If you meet these criteria, click "Open Gmail" to open a pre-filled draft, attach your resume, and send it. Please keep your cold email brief.

University Collaboration with NVIDIA

NVIDIA Research Intern Application

NVIDIA Research Full-time Application

Work Experience

NVIDIA Research Scientist Oct. 2025 — Current
  • LEAD AXPO: Agent eXplorative Policy Optimization [Completed] Agentic RL training that recovers tool usage through tool-call resampling, improving multimodal reasoning performance against larger baselines.
    Under Review · U.S. Patent Application 2026 · ArXiv Release · Internal Release
  • LEAD Masking Teacher and Reinforcing Student [Completed] Mask-progressive RL distillation that gradually unmasks teacher weights and uses offline RL with accuracy & distillation rewards.
    CVPR 2026 Accept · U.S. Non-Provisional Patent Filed 2026 · ArXiv Release · Internal Release
  • LEAD GenRecal [Completed] Cross-architecture VLM distillation via a Recalibrator that aligns heterogeneous token representations regardless of vocabulary, token splits, or index ordering.
    Under Review · U.S. Patent Upgraded to Non-Provisional 2025 · Internal Tech Transfer
  • LEAD Unified RL & Imitation Learning for VLMs [Completed] Combines RL with adversarial imitation, using an LLM-based discriminator and multi-teacher guidance to build lightweight yet powerful VLMs.
    NeurIPS 2025 Accept · U.S. Patent Upgraded to Non-Provisional 2025 · ArXiv Release · Internal Release
NVIDIA Research Intern Oct. 2024 — Oct. 2025
  • LEAD VLsI: Verbalized Layers-to-Interactions [Completed] Layer-wise distillation using intermediate verbalizers, enabling small VLMs (2B/7B) to align with large VLMs' reasoning progression and outperform GPT-4V.
    CVPR 2025 Accept · U.S. Non-Provisional Patent Filed 2025 · ArXiv Release · Internal Release · Internal Tech Transfer
  • LEAD GenRecal [Initiated] Initial design of the Recalibrator framework for cross-tokenizer VLM distillation; first proof-of-concept and internal demo.
    U.S. Provisional Patent Filed 2025 · ArXiv Release · Internal Release
  • LEAD Unified RL & Imitation Learning for VLMs [Initiated] First formulation of the RL + adversarial imitation training pipeline; initial experiments and team setup.
    U.S. Provisional Patent Filed 2025

Education

KAIST Mar. 2020 — Aug. 2025
Ph.D., School of Electrical Engineering GPA 3.77 / 4.3
Dissertation: Building High-performing, Efficient-size Vision Language Models: Merge, Modify, and Distill  [Link]  [Degree Certificate]
KAIST Mar. 2018 — Feb. 2020
M.S., The Cho Chun Sik Graduate School of Green Transportation GPA 3.72 / 4.3
Thesis: Training Encoder-Attention through Fully-Connected CRFs for Efficient End-to-End Lane Detection Model  [Link]  [Degree Certificate]
Hanyang University Mar. 2014 — Feb. 2018
B.S., Mathematics and Electronic Engineering GPA 3.86 / 4.5

Publications

Overall: 16 Accepts · 4 Pending · 2 Tech Reports
Computer Vision: 5 CVPR · 2 ICCV · 1 ECCV · 1 ICIP
Machine Learning: 3 NeurIPS · 1 ICLR
NLP: 1 ACL · 1 EMNLP
Journal: 1 Pattern Recognition
Publication Profile
  1. AXPO
    "Agent Explorative Policy Optimization for Multimodal Agentic Reasoning"
    Minki Kang, Shizhe Diao, Ryo Hachiuma, Sung Ju Hwang, Pavlo Molchanov, Yu-Chiang Frank Wang, Byung-Kwan Lee
    Under Review   [Paper (Coming Soon)] [Project]
    U.S. Patent Application, NVIDIA Research, 2026
  2. Hide to See
    "Hide to See: Reasoning-prefix Masking for Visual-anchored Thinking in VLM Distillation"
    Seonghoon Yu, Dongjun Nam, Byung-Kwan Lee†, Jeany Son†
    Under Review   [Paper][Code]
  3. DSTP
    "Why and When Visual Token Pruning Fails? A Study on Relevant Visual Information Shift in MLLMs Decoding"
    Jiwan Kim, Kibum Kim, Wonjoong Kim, Byung-Kwan Lee, Chanyoung Park
    Under Review   [Paper][Project]
  4. GenRecal
    "GenRecal: Generation after Recalibration from Large to Small Vision Language Models"
    Byung-Kwan Lee, Ryo Hachiuma, Yong Man Ro, Yu-Chiang Frank Wang, Yueh-Hua Wu
    Under Review   [Paper][Project]
    U.S. Patent Application Filed (Non-Provisional), NVIDIA Research, 2025
  5. MTRS
    "Masking Teacher and Reinforcing Student for Distilling Vision-Language Models"
    Byung-Kwan Lee, Yu-Chiang Frank Wang, Ryo Hachiuma
    Computer Vision and Pattern Recognition (CVPR), 2026   [Paper]
    U.S. Patent Application Filed (Non-Provisional), NVIDIA Research, 2026
  6. R-TAP
    "Recursive Think-Answer Process for LLMs and VLMs"
    Byung-Kwan Lee*, Youngchae Chee*, Yong Man Ro
    Computer Vision and Pattern Recognition (CVPR) Findings, 2026   [Paper][Project]
  7. RefineBench
    "RefineBench: Evaluating Refinement Capability in Language Models"
    Young-Jun Lee*, Seungone Kim*, Byung-Kwan Lee, Minkyeong Moon, Yechan Hwang, Jong Myoung Kim, Graham Neubig, Sean Welleck, Ho-Jin Choi
    International Conference on Learning Representations (ICLR), 2026   [Paper][Project]
    Best Runner-Up Award (Oral, Top 1%), Multi-Turn Interactions in LLMs Workshop @ NeurIPS 2025   [Link]
  8. RIL
    "Unified Reinforcement and Imitation Learning for Vision-Language Models"
    Byung-Kwan Lee, Ryo Hachiuma, Yong Man Ro, Yu-Chiang Frank Wang, Yueh-Hua Wu
    Neural Information Processing Systems (NeurIPS), 2025   [Paper][Project]
    U.S. Patent Application Filed (Non-Provisional), NVIDIA Research, 2025
  9. MultiVerse
    "MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models"
    Young-Jun Lee, Byung-Kwan Lee, Jianshu Zhang, Yechan Hwang, Byungsoo Ko, Han-Gyu Kim, Dongyu Yao, Xuankun Rong, Eojin Joo, Seung-Ho Han, Bowon Ko, Ho-Jin Choi
    IEEE/CVF International Conference on Computer Vision (ICCV), 2025   [Paper][Project]
    Workshop for Knowledge-Intensive Multimodal Reasoning, ICCV 2025   [Link]
  10. VLsI
    "VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models"
    Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang, Yong Man Ro, Yueh-Hua Wu
    Computer Vision and Pattern Recognition (CVPR), 2025   [Paper][Project]
    U.S. Patent Application Filed (Non-Provisional), NVIDIA Research, 2025
  11. Phantom
    "Phantom of Latent for Large Language and Vision Models"
    Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro
    Technical Report   [Paper][Code][HF Model]
  12. TroL
    "TroL: Traversal of Layers for Large Language and Vision Models"
    Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro
    Empirical Methods in Natural Language Processing (EMNLP), 2024   [Paper][Code][HF Model]
  13. Meteor
    "Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models"
    Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro
    Neural Information Processing Systems (NeurIPS), 2024   [Paper][Code][HF Model]
    31st Samsung HumanTech Paper Awards in Computer Science & Engineering
  14. MoAI
    "MoAI: Mixture of All Intelligence for Large Language and Vision Models"
    Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro
    European Conference on Computer Vision (ECCV), 2024   [Paper][Code][HF Model]
  15. CoLLaVO
    "CoLLaVO: Crayon Large Language and Vision mOdel"
    Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro
    Findings of the Association for Computational Linguistics (ACL), 2024   [Paper][Code][HF Model]
    2024 KCC XAI Workshop, Best Paper Awards
  16. Causal Unsupervised Segmentation
    "Causal Unsupervised Semantic Segmentation"
    Junho Kim*, Byung-Kwan Lee*, Yong Man Ro
    Pattern Recognition (journal)   [Paper][Code]
  17. Adversarial Double ML
    "Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning"
    Byung-Kwan Lee*, Junho Kim*, Yong Man Ro
    IEEE/CVF International Conference on Computer Vision (ICCV), 2023   [Paper][Code]
  18. C2Cap
    "Mitigating Dataset Bias in Image Captioning through CLIP Confounder-free Captioning Network"
    YeonJu Kim, Junho Kim, Byung-Kwan Lee, Sebin Shin, Yong Man Ro
    IEEE International Conference on Image Processing (ICIP), 2023   [Paper][Code]
  19. Causal Adversarial Instruments
    "Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression"
    Junho Kim*, Byung-Kwan Lee*, Yong Man Ro
    Computer Vision and Pattern Recognition (CVPR), 2023   [Paper][Code]
  20. Masking Adversarial Damage
    "Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network"
    Byung-Kwan Lee*, Junho Kim*, Yong Man Ro
    Computer Vision and Pattern Recognition (CVPR), 2022   [Paper][Code]
  21. Information Bottleneck
    "Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck"
    Junho Kim*, Byung-Kwan Lee*, Yong Man Ro
    Neural Information Processing Systems (NeurIPS), 2021   [Paper][Code]
  22. Hierarchical Bayesian Defense
    "Towards Adversarial Robustness of Bayesian Neural Network through Hierarchical Variational Inference"
    Byung-Kwan Lee, Youngjoon Yu, Yong Man Ro
    Technical Report   [Paper][Code]

Reviewer Experience

Journal

Conference

Invited Talks & Awards
