Yupeng Han

GPU performance engineer focused on CUDA, low-latency systems, and large-scale compute optimization. Recently expanding into LLM inference systems, with emphasis on transformer serving, KV-cache trade-offs, batching, roofline analysis, and distributed communication.

Professional Experience

  • Staff Software Engineer, Plus AI
  • Senior GPU Engineer, EBots
  • R&D Engineer, Trifo
  • Research Engineer, CMU Robotics Institute