Human-centered evaluation for AI agents
Name pronunciation: “Juh Soon”
I work at the boundary of academic AI research and product execution: turning models, evaluations, and human workflows into systems people can trust, inspect, and use.
Research threads I want the site to make easy to scan, cite, and discuss.
Evaluation
Benchmarks, rubric design, and product telemetry for measuring whether agentic systems behave well outside curated demos.
Interaction
Design patterns that expose uncertainty, provenance, and revision control without making advanced AI tools feel heavy.
Systems
Translating papers into resilient product architectures: retrieval, feedback loops, eval harnesses, and monitoring.
Selected product directions for translating research capability into useful workflows.
Research tooling
A product concept for collecting papers, extracting claims, comparing evidence, and producing citable research notes.
Enterprise AI
Dashboards, datasets, and release gates for teams deciding when an AI feature is good enough to ship.
Workflow design
Review queues, editable drafts, confidence thresholds, and intervention models for practical AI operations.
Knowledge systems
Structuring internal knowledge so retrieval and generation systems can stay grounded, current, and auditable.
“The work I want this site to foreground is research that survives contact with messy product reality: measurable, inspectable, and useful to the people making decisions.”
Reusable building blocks for the academic/product profile.
Concise positioning for publications, preprints, talks, and ongoing questions.
View research
Case-study slots for shipped tools, prototypes, evaluations, and strategy work.
View products
Essays that connect model capability, evaluation culture, and product judgment.
Read writing
Clear entry points for collaboration with labs, founders, and product teams.
Start a conversation
A practical note on choosing eval metrics before the prototype starts shaping the research question.
Why legibility, interruption, and repair often matter more than an impressive autonomous run.
Patterns for making model confidence useful without burying users in instrumentation.
How to keep academic rigor alive when the work also has to ship.