Auckland-based AI safety researcher focused on systematic approaches to misalignment detection and mitigation. 2nd place winner of the prestigious Palisade Research AI Misalignment Bounty, demonstrating reproducible misalignment behaviors in advanced AI models including o3 and GPT-5.
Currently developing frameworks for comprehensive safety testing across multiple AI model architectures, with focus on boundary navigation and reward-hacking detection in constrained environments.
Unified AI Misalignment Framework Comprehensive system for systematic AI safety testing across multiple model implementations and reasoning paradigms. Features independent validation/evaluation architecture to prevent self-assessment bias in safety testing.
LLM RAG Prompt Injections Security research examining prompt injection vulnerabilities in retrieval-augmented generation systems.
- AI Safety Testing: Framework development for systematic misalignment detection
- Model Security: Prompt injection research and mitigation strategies
- Research Infrastructure: Containerized, reproducible AI research environments
- Multi-Model Analysis: Comparative safety evaluation across AI architectures
Languages: Python, JavaScript, Bash AI/ML: OpenAI API, Anthropic API, LiteLLM Web Development: Full-stack web applications, responsive design Infrastructure: Docker, Docker Compose Research Tools: Systematic evaluation frameworks, automated testing pipelines Professional Development: Mission Ready HQ Full Stack Developer (August 2025)
Contributing to AI safety research through Approxiom Research, focusing on boundary navigation behaviors and architectural vulnerabilities in AI safety systems. Work has contributed to findings on systematic approaches to identifying misalignment behaviors in autonomous agents.
Palisade Research Misalignment Bounty - 2nd Place Winner
- Demonstrated reproducible misalignment behaviors in o3 and GPT-5 models
- Identified AI agents' ability to overcome permission constraints and perform reward-hacking
- Developed systematic methodology for testing boundary navigation in constrained environments
- Contributed to understanding of architectural vulnerabilities in advanced AI systems
- Built multi-provider AI testing framework - Architected Docker-based system supporting OpenAI, Anthropic APIs with independent validation architecture, eliminating self-evaluation bias across 3+ model implementations
- Developed automated code analysis suite - Created Python tools for technical debt assessment, duplicate code detection, and complexity analysis, reducing manual code review time and improving codebase maintainability
- Implemented secure API routing system - Designed environment-configurable model selection preventing security vulnerabilities, with containerized deployment supporting multiple AI providers and failover mechanisms