Job Description:
The Senior AI Test Engineer is responsible for validating the quality, reliability, safety, performance, and governance of AI systems, including LLMs, RAG pipelines, and AI agents. Unlike a traditional QA role, this position focuses on LLM evaluation, behavioral testing, hallucination and bias detection, AI test automation, and alignment with Responsible AI governance standards.
Key Responsibilities:
· Define AI test strategy for LLMs, AI agents, and orchestration workflows
· Design and maintain golden evaluation datasets, prompt test cases, and benchmarking frameworks for LLMs and RAG systems
· Perform regression testing across prompt updates, agent logic changes, multi-step agent workflows, and model upgrades
· Measure and analyze accuracy, consistency, latency, throughput, and cost efficiency of AI services
· Integrate AI test suites into CI/CD pipelines with automated quality gates
· Conduct red-team testing to identify safety, compliance, security, and prompt-injection vulnerabilities
· Provide actionable insights through evaluation reports, metrics, and release readiness assessments
Qualifications:
· 7–8 years of experience in AI Test Automation
· Proven experience in testing AI/ML/LLM-based systems
· Strong understanding of prompt behavior, model evaluation, dataset curation, and risk management
· Hands-on experience with:
  · Python, Pytest, Java
  · Test automation tools (Selenium, Playwright)
  · AI evaluation tools and frameworks
  · Azure AI Services / Azure OpenAI, CI/CD pipelines, and GitHub Actions
  · Test automation integration into DevOps workflows