healthcare-eval-harness
OfficialPatient safety evaluation harness for healthcare application deployments. Automated test suites for CDSS accuracy, PHI exposure, clinical workflow integrity, and integration compliance. Blocks deployments on safety failures.
What this skill does
When applied, it prepends a system prompt before your request is sent — no extra calls and no change to how you are billed beyond the added tokens.
--- name: healthcare-eval-harness description: Patient safety evaluation harness for healthcare application deployments. Automated test suites for CDSS accuracy, PHI exposure, clinical workflow integrity, and integration compliance. Blocks deployments on safety failures. origin: Health1 Super Speciality Hospitals — contributed by Dr. Keyur Patel version: "1.0.0" --- # Healthcare Eval Harness — Patient Safety Verification Automated verification system for healthcare application deployments. A single CRITICAL failure blocks deployment. Patient safety is non-negotiable. > **Note:** Examples use Jest as the reference test runner. Adapt commands for your framework (Vitest, pytest, PHPUnit, etc.) — the test categories and pass thresholds are framework-agnostic. ## When to Use - Before any deployment of EMR/EHR applications - After modifying CDSS logic (drug interactions, dose validation, scoring) - After changing database schemas that touch patient data - After modifying authentication or access control - During CI/CD pipeline configuration for healthcare apps - After resolving merge conflicts in clinical modules ## How It Works The eval harness runs five test categories in order. The first three (CDSS Accuracy, PHI Exposure, Data Integrity) are CRITICAL gates requiring 100% pass rate — a single failure blocks deployment. The remaining two (Clinical Workflow, Integration) are HIGH gates requiring 95%+ pass rate. Each category maps to a Jest test path pattern. The CI pipeline runs CRITICAL gates with `--bail` (stop on first failure) and enforces coverage thresholds with `--coverage --coverageThreshold`. ### Eval Categories **1. CDSS Accuracy (CRITICAL — 100% required)** Tests all clinical decision support logic: drug interaction pairs (both directions), dose validation rules, clinical scoring vs published specs, no false negatives, no silent failures. ```bash npx jest --testPathPattern='tests/cdss' --bail --ci --coverage ``` **2. PHI Exposure (CRITICAL — 100%
Use this skill
Add a "skill" field with the skill’s ID to your chat completion request. It is applied server-side before your prompt is sent — no extra calls.
{
"model": "gpt-4o-mini",
"skill": "imp-4417fdb2-bb10-4e27-bab8-cea70dc28753",
"messages": [{ "role": "user", "content": "…" }]
}Install the skill, enable it in your dashboard and (optionally) limit it to specific models. It then applies automatically to every matching request — with no "skill" field to send each time.
Set it up in your dashboardMore skills
Set up and use 1Password CLI for sign-in, desktop integration, and reading or injecting secrets.
Create, view, edit, delete, search, move, or export Apple Notes via the memo CLI on macOS.
List, add, edit, complete, or delete Apple Reminders and reminder lists via remindctl.
Create, search, and manage Bear notes via grizzly CLI.
Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI.
BluOS CLI (blu) for discovery, playback, grouping, and volume.
Capture frames or clips from RTSP/ONVIF cameras.
Search, install, update, sync, or publish agent skills with the ClawHub CLI and registry.