Background in Physics (BSc) and Data Science (MSc). Preparing for PhD-level research while building evaluation-driven AI systems, data pipelines, and agent tooling.
- Agent Architectures & Skill Optimization: Designing inspectable agent workflows, reusable skill artifacts, and evaluation loops that improve behavior without relying on ad hoc prompt tweaking.
- LLM Evaluation & Alignment: Studying behavioral failure modes such as sycophancy, reward hacking, prompt bloat, and noisy benchmark selection.
- Reinforcement Learning & World Models: Exploring how agents learn, plan, update internal representations, and make decisions under uncertainty.
- Text & Multilingual Networks: Using graph theory and NLP to map semantic, rhetorical, and behavioral structure across languages.
- Human-Centered AI Systems: Building tools that reduce cognitive load while keeping decisions, provenance, and failure modes visible.
- Bayesian Evolutionary Skill Optimization: Research prototype for optimizing LLM agent skills with Bayesian surrogate modeling, acquisition-based candidate selection, validation gates, and Pareto-style trade-offs.
- Resume Optimizer: LLM-based job-application pipeline with conversational UX, Supabase-backed state, and provenance/event tracking.
- LLM Text Network Analysis: Graph and NLP pipeline for extracting semantic structure from multilingual text datasets.
- SycoBench: Benchmarking and evaluation of sycophancy and related behavioral tendencies in language models.


