About

Hey there! I am an undergraduate researcher at Stanford University where I study mathematics and computer science. Currently, I work on AI safety and alignment research, investigating ways to make AI systems more reliable, interpretable, and aligned with human values.

Through my work at the ML Alignment & Theory Scholars (MATS) program, I focused on detecting and preventing potentially dangerous capabilities in AI systems. My research ranges from evaluating model behaviors to developing tools for safety testing, with contributions to frameworks used by major AI organizations including DeepMind, Anthropic, and the UK AI Safety Institute.

Previously, I’ve explored various aspects of AI through projects in adversarial reinforcement learning, language-controlled robotics, and automated theorem-proving. These diverse experiences helped me recognize the crucial importance of ensuring AI systems remain safe and beneficial as they become more capable.

Off the grid, I love to play Ultimate Frisbee and take pride in representing Stanford as co-captain of its club team, Huck-Syndrome.

In my free time, I enjoy making pottery and performing improv. Whether I’m making (occasionally lopsided) mugs or thinking on my feet during a scene, these creative outlets help me stay balanced and bring some fun into my week.

Kai Fronsdal AI Researcher
