Boyan(Bryan) Han - Homepage

About Me

I am actively looking for PhD opportunities for 2027.

I am currently a Research Assistant at the AGI Lab, Westlake University, advised by Prof. Chi Zhang. Concurrently, I serve as a Remote Research Intern at the NLP Lab, University of California, Merced, under the supervision of Prof. Yiwei Wang. Prior to this, I received my B.Eng. degree in Artificial Intelligence from Hebei University of Technology.

My research focuses on multimodal diffusion LLMs (dLLMs) and LLMs, reinforcement learning, and dLLM/LLM agents.

Download my CV

Publications

Dynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models

Boyan(Bryan) Han, Yiwei Wang, Yi Song, Yujun Cai, Chi Zhang

ACL 2026 (Main)

Paper / Code

Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment

Zhixue Song, Boyan(Bryan) Han, Yiwei Wang, Chi Zhang

ACL 2026 (Findings)

Paper / Code

SQLAgent: Learning to Explore Before Generating as a Data Engineer

Wenjia Jiang, Yiwei Wang, Boyan(Bryan) Han, Joey Tianyi Zhou, Chi Zhang

ACL 2026 (Findings)

Paper / Code

Learning When to Parallelize: Structure-Aware RL for Logical Reasoning in Diffusion LLMs

Boyan(Bryan) Han, Yiwei Wang, Xuechen Wang, Yi Song, Keyu Chen, Chi Zhang

Submitted

Education

Hebei University of Technology

B.Eng. in Artificial Intelligence

Sep 2020 – Jun 2024

Experience

AGI Lab, Westlake University

Research Assistant

Advisor: Prof. Chi Zhang

Apr 2025 – Present

NLP Lab, University of California, Merced

Remote Research Intern

Advisor: Prof. Yiwei Wang

May 2025 – Present

Research Projects

Optimizing Diffusion LLMs via Logic-Aware Reinforcement Learning Reward Design
Lead Project Member · Nov 2025 – Jan 2026
Advisors: Prof. Chi Zhang, Prof. Yiwei Wang

Proposed and implemented a logic-aware RL reward design to address autoregressive degradation in complex reasoning tasks.
Constructed a multi-dimensional reward mechanism (parallelism, dependency, and result-correctness rewards) that guides the optimization of diffusion LLMs toward respecting logical structure while decoding in parallel.
Systematically evaluated on the ListOps reasoning benchmark, exploring robustness across various positional sampling algorithms. Significantly enhanced generation quality for complex logical tasks while substantially improving parallel decoding efficiency.
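As a rough illustration of how a reward with these three components could be combined, here is a minimal sketch. It is not the project's actual implementation: the weights, the function names, the step-based parallelism measure, and the (prerequisite, dependent) pair encoding are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical weights; the real balance between terms is not stated in the source.
    parallelism: float = 0.2
    dependency: float = 0.3
    correctness: float = 0.5


def parallelism_reward(num_tokens: int, num_steps: int) -> float:
    """Fraction of decoding steps saved relative to one-token-per-step decoding."""
    if num_tokens <= 1:
        return 0.0
    steps = min(max(num_steps, 1), num_tokens)
    return (num_tokens - steps) / (num_tokens - 1)


def dependency_reward(unmask_step: dict[int, int],
                      pairs: list[tuple[int, int]]) -> float:
    """Share of (prerequisite, dependent) position pairs decoded in a valid order.

    A pair counts as satisfied when the prerequisite position was unmasked
    strictly before the dependent position.
    """
    if not pairs:
        return 1.0
    satisfied = sum(1 for pre, dep in pairs if unmask_step[pre] < unmask_step[dep])
    return satisfied / len(pairs)


def correctness_reward(prediction: str, target: str) -> float:
    """1.0 for an exact match on the final answer, else 0.0."""
    return 1.0 if prediction.strip() == target.strip() else 0.0


def total_reward(num_tokens: int, num_steps: int,
                 unmask_step: dict[int, int], pairs: list[tuple[int, int]],
                 prediction: str, target: str,
                 weights: RewardWeights = RewardWeights()) -> float:
    """Weighted sum of the three reward terms (an assumed combination rule)."""
    return (weights.parallelism * parallelism_reward(num_tokens, num_steps)
            + weights.dependency * dependency_reward(unmask_step, pairs)
            + weights.correctness * correctness_reward(prediction, target))
```

In this sketch, decoding five tokens in three steps with one of two dependency pairs respected and a correct answer would score 0.2 * 0.5 + 0.3 * 0.5 + 0.5 * 1.0 under the assumed weights.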

Training-Free Format Control for Large Diffusion Language Models
Independent Researcher · Jun 2025 – Oct 2025
Advisors: Prof. Chi Zhang, Prof. Yiwei Wang

Implemented a training-free inference algorithm for large diffusion language models that uses token-level softmax scores to guide dynamic anchor selection, preserving output structure across diverse scenarios in zero-shot settings.
Conducted rigorous evaluations on mathematical reasoning (GSM8K, MATH) and JSON benchmarks against Dream-7B and static infilling baselines.
Significantly enhanced format adherence while preserving reasoning accuracy: achieved over 70% format retention in mathematical reasoning and maintained a stable ~80% valid JSON generation rate across various extraction methods.
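One way to picture confidence-guided anchor selection is the sketch below. It is an illustration under stated assumptions, not the method from the paper: the function names, the `min_confidence` threshold, and the rule "place the anchor at the masked position where the model assigns the anchor token the highest softmax probability" are all hypothetical.

```python
import math
from typing import Optional


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over one position's vocabulary logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_anchor_position(logits_per_position: list[list[float]],
                           anchor_token_id: int,
                           masked_positions: list[int],
                           min_confidence: float = 0.0) -> Optional[int]:
    """Pick the masked position most confident about the anchor token.

    Returns None when no masked position reaches min_confidence
    (both the threshold and this fallback are assumptions).
    """
    best_pos: Optional[int] = None
    best_prob = 0.0
    for pos in masked_positions:
        prob = softmax(logits_per_position[pos])[anchor_token_id]
        if prob > best_prob:
            best_pos, best_prob = pos, prob
    if best_pos is None or best_prob < min_confidence:
        return None
    return best_pos


# Toy example: 3 positions, vocabulary of 3 tokens, anchor token id 1.
toy_logits = [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 1.0, 0.0]]
chosen = select_anchor_position(toy_logits, anchor_token_id=1,
                                masked_positions=[0, 1, 2])
```

Because the anchor position is recomputed from the current logits rather than fixed in advance, a format token such as a closing tag can move as the generated content grows or shrinks, which is the contrast with static infilling that this sketch is meant to convey.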

Skills

Programming: Python, C, Shell/Bash, Markdown

Frameworks & Ecosystem: PyTorch, HuggingFace, WandB, Tree-sitter

Soft Skills: Synergistic Collaborator, Intrinsically Motivated, Intellectually Curious

AI-Assisted Development: 100M+ tokens/month vibe coding