
[Remote] QA Engineer, AI Products

Work from home Full-time role Hiring

Note: This is a remote position open to candidates in the USA. MDCalc is a leading medical reference widely used by clinicians to improve patient outcomes. They are seeking a QA Engineer to ensure the quality and reliability of their AI-powered features, focusing on testing LLM-based systems and collaborating with cross-functional teams.

Responsibilities

  • Design and execute test strategies for LLM-powered features, including prompt regression testing, output evaluation, and hallucination detection
  • Build and maintain automated evaluation pipelines (eval sets, golden datasets, LLM-as-judge frameworks) to catch quality regressions in non-deterministic outputs
  • Perform black-box and exploratory testing of MDCalc's AI features across web and mobile, with particular attention to clinical accuracy, safety, and edge cases
  • Define quality metrics for AI outputs (accuracy, faithfulness, relevance, safety, latency, cost) and establish thresholds for release readiness
  • Collaborate cross-functionally with engineers, product managers, ML/AI engineers, and clinical reviewers to define what 'good' looks like for AI responses
  • Investigate and triage AI failure modes, distinguishing model issues, prompt issues, retrieval issues, and integration bugs
  • Participate in team discussions, offering feedback on testability, risks, prompt design, and guardrails
  • Help develop QA strategies to expand future testing capacity, automation, and evaluation coverage as the AI product surface grows

Skills

  • 5+ years of experience in software QA, with at least 1 year of hands-on testing of LLM-based or AI/ML-powered features
  • Strong understanding of QA principles, test case creation/documentation, and best practices for both deterministic and non-deterministic systems
  • Hands-on experience with LLM tooling and concepts: prompt engineering, RAG systems, evaluation frameworks (e.g., Promptfoo, Braintrust, LangSmith, DeepEval, Ragas, OpenAI Evals), and LLM APIs (OpenAI, Anthropic, etc.)
  • Experience designing automated qualitative evaluation approaches, including LLM-as-judge, rubric-based scoring, semantic similarity checks, and golden dataset regression testing
  • Proficiency with test automation tools, with a focus on Playwright
  • Strong SQL skills for data validation, test data creation, and verifying data integrity across systems
  • Familiarity with token usage, latency profiling, and cost monitoring as quality signals
  • Eagerness to learn quickly and a positive, solutions-oriented attitude
  • Clear and concise communicator, able to surface issues, blockers, and risks effectively, including ambiguous or probabilistic failures
  • Self-motivated, proactive, and able to manage time and priorities independently

Benefits

  • Medical, Dental, & Vision Coverage, with option to extend to your dependents
  • Company-sponsored short-term insurance
  • Fully paid 8-week parental leave after 6 months of employment
  • Company-sponsored 401k, after 3 months of employment
  • Unlimited vacation for salaried roles - we trust you to take the time you need
  • Bi-annual company offsites to connect, reflect, and plan together
  • Work from home monthly stipend
  • A culture of fun and motivated team members who believe in a greater mission here at MDCalc

Company Overview

  • MDCalc is used by over two-thirds of US physicians and provides free access to 800+ medical scores, calculations, and algorithms. It was founded in 2005 and is headquartered in New York, New York, USA, with a workforce of 11-50 employees. Its website is https://www.mdcalc.com.