Portfolio · 2026

Quality Assurance Data Manager who built
an internal data QA platform.

Sole developer of Vantage, Ideon's internal data QA platform. Validates pipeline outputs, runs source-to-target reconciliation across staged data layers, and catches regressions before they reach downstream BI and operational consumers.

5+ years at Ideon. Most of my work converts manual data checks into automated, CI-gated test suites. I lean on LLM tooling where it speeds that work up.

Role Quality Assurance Data Manager

Location New York, NY

Contact Get in touch ↗

Resume Open resume ↗

System shape · how the platform connects

742

Carrier / Audience combos
run daily & on-demand

14+

Tools shipped on
Vantage as sole dev

103

SQL queries gated
by SQL-suite CI

43k+

SBCs indexed and
queryable via SAGE

01 · About

From data ops, to QA, to building the platform.

A non-linear path that turned out to be the right one. Each role left me closer to the systems I now design.

I'm a QA Data Manager at Ideon, focused on data quality. Five-plus years in. Sole developer of Vantage, our internal data QA platform. It validates pipeline outputs, runs source-to-target reconciliation across staged data layers (a Bronze / Silver / Gold medallion architecture), and catches regressions before they reach downstream BI and operational consumers.

Most of my work converts manual data checks into automated, CI-gated test suites. The shape is consistent across tools: define a contract, encode it as a test, gate deployment on the result, surface failures to the people who can act on them, and only escalate to a human when the system can't recover on its own.

I started at Ideon in 2020 as a Data Operations Coordinator, standardizing QHPs, SBCs, and rate files across dozens of sources. That work defined the data contracts that everything downstream now depends on. I grew into QA Coordinator, then in 2024 stepped up as Quality Assurance Data Manager and started building Vantage from scratch.

I lean on LLM tooling where it speeds that work up (Bedrock-backed validators, code generators, and repair loops), but only with sandboxing, structured outputs, and audit trails sitting underneath. Most of the engineering work is in the plumbing, not the model.

02 · Featured Projects

Things I've shipped, in detail.

Every project below is in production at Ideon. Click through for the architecture, the why, and what it actually does.

Flagship platform

Vantage

The platform everything else lives on. 14+ tools, one auth model, one CI/CD gate, one security posture.

Internal QA engineering platform · sole developer

Live in production

I architected Vantage from scratch as the unified home for Ideon's internal QA tooling. It runs on AWS Elastic Beanstalk with Cognito OAuth 2.0 / PKCE auth and role-based access, and uses a layered architecture that separates test definition, execution, and adaptation concerns so changes in one layer never cascade through the others. Beyond shipping the tools themselves, I designed the security model, the deployment pipeline, and the runtime control plane.

Designed a mandatory CI/CD quality gate: ruff, mypy, pip-audit, bandit, pytest, Selenium browser tests, and a custom SQL-suite check. No deployment proceeds without all checks green.
Implemented a layered production security model: PKCE S256 (RFC 7636/9700), stateless HMAC-SHA256 CSRF, RS256 JWT against the Cognito JWKS endpoint with 1-hour TTL cache, HttpOnly + Secure + SameSite=Lax cookies, ALB-aware per-user rate limits, Nginx scanner-probe blocking, CORS exact-allowlist.
Ran and resolved a full dependency security audit and a bandit code-scan audit to a clean reviewed baseline. pip-audit and bandit gate every CI build going forward.
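The PKCE S256 step named in the security model can be sketched with the standard library alone. This is a minimal illustration of the RFC 7636 verifier/challenge derivation; the function names are hypothetical and are not taken from Vantage.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Return a random URL-safe code verifier (43 characters for 32 bytes)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(verifier)) with padding stripped."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

The client sends the challenge with `code_challenge_method=S256` on the authorization request and presents the verifier on the token request, so an intercepted authorization code is useless without the verifier.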

Python · Flask · Plotly Dash · Gunicorn · Nginx · AWS EB · Cognito · OAuth 2.0 / PKCE · RS256 JWT · CloudWatch

benefitwatch/registry.py · hsa_name_flag_consistency contract

-- Plans with "HSA" in the name must carry
-- the hsa_eligible = 'Yes' flag. Any row returned
-- is a labeling drift or misingested benefit.
SELECT plan_id, plan_name, hsa_eligible
FROM medical.plans
WHERE plan_year = 2026
  AND plan_name ILIKE '%HSA%'
  AND (hsa_eligible IS NULL OR hsa_eligible <> 'Yes');

What the tools speak

The tools below all run on Vantage, grouped by what they do.

Each is a contract. BenefitWatch is 103 of them, written as Athena SQL like this one. ARIA applies the same idea to SBC PDFs with explicit regex checks. The CI gate refuses to ship if any of them break.

Plan Validation

Standardized, auditable checks that gate plan data before it reaches downstream systems.

— 01

Data Quality

Daily & on-demand

Scale 742 combos / night

Coverage 103 SQL queries

Engine AWS Athena (Trino)

Lives in Vantage

BenefitWatch

5-BATCH ATHENA VALIDATION ENGINE

Vantage's regression suite: a single-file query registry powering 742 unique carrier/audience combinations, run daily and on demand.

BenefitWatch is the operational core of Ideon's data-quality monitoring. A 5-batch Athena validation engine runs from a single-file registry of 103 SQL queries across ticket-ingestion, QA-ingestion, rate-ingestion, and daily rule-engine batches, with cost-tagged queries and dashboards offering pagination, sorting, filtering, and per-row dismissal. A separate Analytics layer reads S3 run history to surface active, new, resolved, and recurring quality flags across all carriers.
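One way to derive the active, new, resolved, and recurring buckets from run history is plain set arithmetic over flag keys. The sketch below assumes each run has already been reduced to a set of flag identifiers, ordered oldest to newest. The bucket definitions and the `classify_flags` name are illustrative assumptions, not BenefitWatch's actual rules.

```python
def classify_flags(runs: list[set[str]]) -> dict[str, set[str]]:
    """Bucket flags by comparing the latest run with earlier ones.

    runs is ordered oldest to newest; each element is the set of flag
    keys raised by that run.
    """
    if not runs:
        return {"active": set(), "new": set(), "resolved": set(), "recurring": set()}
    latest = runs[-1]
    previous = runs[-2] if len(runs) > 1 else set()
    earlier = set().union(*runs[:-2]) if len(runs) > 2 else set()
    return {
        "active": set(latest),
        # New: never seen in any earlier run.
        "new": latest - previous - earlier,
        # Resolved: raised by the previous run, gone from the latest one.
        "resolved": previous - latest,
        # Recurring: present now, absent last run, but seen before that.
        "recurring": (latest - previous) & earlier,
    }


history = [{"a", "b"}, {"b"}, {"a", "b", "c"}]
buckets = classify_flags(history)
```

Here flag `c` is new, while `a` counts as recurring because it disappeared for one run and came back.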

Custom SQL-suite CI check validates the registry: get_queries() returns a non-empty dict[str, str] for every batch×audience, no stale years or unresolved tokens, benefit and rate batches reference the correct tables, and every query parses as valid Trino SQL via sqlglot.
Per-row dismissal with audit logs so dismissed flags don't reappear. Operators get a clean queue without losing history.
Trend dashboards make recurring defect categories visible across processing seasons, replacing one-off spot-checks with a continuous signal.
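The registry checks described above can be sketched roughly as follows. This is not the actual SQL-suite check: the `check_registry` name, the `{placeholder}` token style, and the `plan_year` column pattern are assumptions made for illustration. The sqlglot parse step runs only when sqlglot is installed.

```python
import re

try:  # sqlglot is optional here; the parse check is skipped when it is missing
    import sqlglot
    from sqlglot.errors import ParseError
except ImportError:
    sqlglot = None

TOKEN = re.compile(r"\{[^{}]*\}")              # leftover template placeholders (assumed style)
YEAR = re.compile(r"plan_year\s*=\s*(\d{4})")  # hard-coded plan years (assumed column)


def check_registry(queries: dict[str, str], current_year: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means pass."""
    if not queries:
        return ["registry is empty"]
    problems = []
    for name, sql in queries.items():
        if not isinstance(sql, str) or not sql.strip():
            problems.append(f"{name}: empty query")
            continue
        if TOKEN.search(sql):
            problems.append(f"{name}: unresolved token")
            continue
        if any(int(year) != current_year for year in YEAR.findall(sql)):
            problems.append(f"{name}: stale year")
        if sqlglot is not None:
            try:
                sqlglot.parse_one(sql, read="trino")
            except ParseError as exc:
                problems.append(f"{name}: does not parse ({exc})")
    return problems


sample = {
    "ok": "SELECT plan_id FROM medical.plans WHERE plan_year = 2026",
    "stale": "SELECT plan_id FROM medical.plans WHERE plan_year = 2024",
    "templated": "SELECT plan_id FROM {table}",
}
problems = check_registry(sample, current_year=2026)
```

A CI step built on this idea would fail the build whenever the returned list is non-empty.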

Athena · Trino SQL · sqlglot · S3 · Plotly Dash · Pandas · Data-Driven Testing

— 02

Data Quality

In production

Scope Entire dataset

Cadence Daily + per-ticket

Priority fields 14 high-priority

Workflows Two: daily & ticket

WREN

CHANGE-LOG MONITOR FOR THE FULL DATASET

Watches every change to the data and queues it for QA approval before it propagates — with priority-tiered flag categorization so reviewers aren't drowning in noise.

WREN tracks every data change across the entire dataset and gates it on QA sign-off. The novel piece is the priority-tiered flag categorization: high-priority flags get triaged first; lower-priority ones are batched. Two workflows run side-by-side. The daily zero-tolerance scan focuses on 14 high-priority fields we've deemed critical based on direct customer feedback — nothing on those fields ships without explicit approval. The ticket-level change-set view surfaces every change made during a given ticket's lifecycle so the reviewer can sign off on the whole bundle at once.

Two cadences, one engine. Daily zero-tolerance scan for the 14 priority fields + a ticket-level diff view for everything else — same flag taxonomy, different triggers.
Priority tiers, not a single severity. High-priority flags get expedited routing; lower-priority ones batch up for end-of-day review. Customer feedback drove the priority list.
Same flag categorization extended to ARIA. Once WREN proved out, the same priority-tier model went into ARIA so its SBC reviewer queue stays focused on what matters.
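The two-cadence, priority-tier idea can be sketched as a pair of small functions over one change record. The field names, the `Change` shape, and the queue labels below are hypothetical stand-ins; the real 14-field priority list is not shown on this page.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the 14 high-priority fields.
PRIORITY_FIELDS = frozenset({"deductible", "oop_max", "plan_name"})


@dataclass(frozen=True)
class Change:
    ticket: str
    field: str
    old: str
    new: str


def route(changes: list[Change]) -> dict[str, list[Change]]:
    """Split changes into an expedited queue and an end-of-day batch."""
    queues: dict[str, list[Change]] = {"expedite": [], "batch": []}
    for change in changes:
        tier = "expedite" if change.field in PRIORITY_FIELDS else "batch"
        queues[tier].append(change)
    return queues


def by_ticket(changes: list[Change]) -> dict[str, list[Change]]:
    """Group every change under its ticket for bundle sign-off."""
    grouped: dict[str, list[Change]] = {}
    for change in changes:
        grouped.setdefault(change.ticket, []).append(change)
    return grouped


changes = [
    Change("T-1", "deductible", "1000", "1500"),
    Change("T-1", "network_tier", "A", "B"),
    Change("T-2", "plan_name", "X", "Y"),
]
queues = route(changes)
bundles = by_ticket(changes)
```

Both views read the same records, which mirrors the "same flag taxonomy, different triggers" design: only the grouping differs.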

Change-log Monitor · Priority Flagging · QA Approval Gates · Audit Trail · Athena · S3

— 03

Data Quality

In production

Throughput 50+ plans / batch

Engine pdfplumber + regex

Checks Name · dates · values

Input SBC PDFs on S3

ARIA

AUTOMATED REVIEW AND INTELLIGENCE AGENT

Automated data profiling over semi-structured SBC PDFs — deterministic, regex-based, no LLM in the hot path.

ARIA discovers SBC (Summary of Benefits and Coverage) PDFs on S3, uses pdfplumber to extract coverage period, plan name, and benefit cost-share fields, then runs regex-based checks covering name alignment, coverage dates, deductibles, and benefit values. Three categories of structured checks per plan, processed at 50+ plans per batch with audit-logged dismissals. Every flag, override, and reviewer decision is traceable. Now runs the same priority-tiered flag categorization that WREN pioneered so the reviewer queue stays focused on what matters.

Deliberately not an LLM. Regex on a parsed-text pipeline is faster, cheaper, and fully deterministic, and the audit trail is meaningful precisely because the rules are explicit.
Layered architecture: discovery, extraction, and validation are independently testable concerns, the same test-automation patterns the rest of Vantage uses.
Audit-logged dismissals mean every flag is traceable back to the reviewer who handled it; dismissed flags don't re-fire on the next run.
Priority-tiered flags from WREN. ARIA borrowed WREN's flag taxonomy so high-priority issues get triaged first and the queue doesn't drown reviewers in low-signal noise.
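A rough sketch of the regex-check stage, operating on text that has already been extracted from a PDF (the pdfplumber step is omitted). The patterns, the `check_sbc` name, the flag labels, and the sample text are assumptions for illustration, not ARIA's actual rules.

```python
import re
from datetime import datetime

# Accept a hyphen or an en dash between the two dates.
PERIOD = re.compile(r"Coverage Period:\s*(\d{2}/\d{2}/\d{4})\s*[-\u2013]\s*(\d{2}/\d{2}/\d{4})")
DEDUCTIBLE = re.compile(r"overall deductible\?\s*\$([\d,]+)", re.IGNORECASE)


def check_sbc(text: str, expected: dict) -> list[str]:
    """Compare fields parsed from SBC text with the database record."""
    flags = []
    # Name alignment: the stored plan name should appear in the document.
    if expected["plan_name"].lower() not in text.lower():
        flags.append("name_mismatch")
    # Coverage dates: the period must exist and start in the expected year.
    period = PERIOD.search(text)
    if period is None:
        flags.append("coverage_period_missing")
    else:
        start = datetime.strptime(period.group(1), "%m/%d/%Y")
        if start.year != expected["plan_year"]:
            flags.append("coverage_year_mismatch")
    # Benefit values: the printed deductible must match the stored one.
    deductible = DEDUCTIBLE.search(text)
    if deductible is None:
        flags.append("deductible_missing")
    elif int(deductible.group(1).replace(",", "")) != expected["deductible"]:
        flags.append("deductible_mismatch")
    return flags


sample_text = (
    "Acme Silver 3000 HSA\n"
    "Coverage Period: 01/01/2026 - 12/31/2026\n"
    "What is the overall deductible? $3,000 individual\n"
)
record = {"plan_name": "Acme Silver 3000 HSA", "plan_year": 2026, "deductible": 2500}
flags = check_sbc(sample_text, record)
```

Because every rule is an explicit pattern, the same input always produces the same flags, which is what makes an audit trail over them meaningful.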

pdfplumber · Regex · Data Profiling · S3 · PDF Extraction · Audit Trail

— 04

Data Quality

In production

Input Rates, areas, benefits

Checks Curves · period · audience · issuer

Approach Standardized rules

Lives in Vantage

QHP Validation

STANDARDIZED QHP RATE & BENEFIT CHECKS

Ingests QHP rates, service areas, and benefits and runs a standardized battery of checks before the data reaches downstream systems.

QHP Validation takes a carrier's qualified health plan rates, service areas, and benefits and runs them through one standardized set of checks — invalid rate curves, wrong year/quarter, wrong audience, wrong issuer, missing data, and similar contract violations. It replaces ad-hoc, source-by-source QHP review with a single repeatable gate, so the same problems get caught the same way every time.

Standardized checks in one pass: invalid curves, wrong year/quarter, wrong audience, wrong issuer, and missing-data detection.
One consistent gate for every source — QHP review stops depending on who happens to be looking at it.
Runs on the same Vantage data and auth as the other validators, so its flags land in the same review workflow.

QHP · Rate Curves · Validation · ACA · Vantage

Metrics & Reporting

Live operational metrics and season reporting, built on the same data the validation tools produce.

— 05

Metrics

In production

Tracks Publishing rate

Against Acquisition targets

Scope Current year/quarter

Lives in Vantage

Quote Metrics

PUBLISHING RATE VS. ACQUISITION TARGETS

Tracks publishing rate against acquisition targets for the active year/quarter, so Ops always knows how close the season is to plan.

Quote Metrics measures publishing rate against the acquisition targets set for the current year/quarter combination. It gives the operations team a live read on progress toward the season's goals instead of waiting for an end-of-season tally.

Publishing rate vs. acquisition target for the active year/quarter, at a glance.
Turns the season goal into a single number Ops can watch day to day.

Metrics · Publishing Rate · Acquisition Targets

— 06

Metrics

In production

Source Plan-validation output

Surfaces Flag analysis

Scope All validators

Lives in Vantage

Data Ops Quality

VALIDATION METRICS & FLAG ANALYSIS

Validation metrics and flag analysis across the plan-validation output, brought into one quality view.

Data Ops Quality aggregates the validation metrics and flag analysis produced by the plan-validation tools — BenefitWatch, WREN, ARIA, and QHP Validation — into a single view: what's flagging, how often, and where defects cluster. Quality trends become visible across the whole pipeline rather than tool by tool.

One quality view across every validator instead of separate dashboards.
Flag analysis surfaces where defects concentrate and whether they're trending up or down.

Metrics · Flag Analysis · Data Quality

— 07

Metrics

In production

Data Live + historical Jira

API Jira REST v3

Outputs Per-processor reports, YoY

Feedback From QA comments

Season Reports

PER-PROCESSOR JIRA REPORTING + YEAR-OVER-YEAR BENCHMARKS

Per-processor season reports and year-over-year benchmarks built from Jira data — with feedback pulled straight from QA comments.

Season Reports uses the Jira REST v3 API to build per-processor reports for the active build season, drawing feedback from QA comments and generating data analysis over the underlying ticket data. It also benchmarks each season against prior years, so recurring defect categories and capacity shifts stay visible instead of being re-discovered every cycle.

Live Jira data via REST v3 with JQL — no scheduled exports, no stale dashboards.
Per-processor report cards with feedback drawn from the QA comments on each ticket.
Year-over-year benchmarks treat prior seasons as the baseline, so what didn't change is itself a signal.

Jira REST v3 · JQL · Pandas · Plotly · YoY Benchmarking

— 08

Metrics

In production

Purpose Override metric queries

Use case Data anomalies / bugs

Keeps Reported metrics accurate

Lives in Vantage

Manual Metric Override

MANUAL OVERRIDE FOR SQL METRIC QUERIES

Manual overrides for the SQL metric queries behind the dashboards — for when data is storing oddly or a bug is hiding it.

Manual Metric Override lets an operator step in and override the SQL metric queries that feed the dashboards when the underlying data is storing weirdly or isn't surfacing because of a system bug. Reported metrics stay accurate during the gap while the root cause is tracked down and fixed.

A controlled escape hatch for when the data — not the query — is the problem.
Keeps reported numbers trustworthy while a storage or ingestion bug is being fixed.

SQL · Metrics · Manual Override · Operations

Operations

Day-to-day tooling that replaces ad-hoc spreadsheets and scattered references.

— 09

Operations

In production

Type Quick-access links

Audience Operations team

Content Reference docs

Lives in Vantage

Documents

QUICK-ACCESS HUB FOR OPS REFERENCES

A quick-access hub for the documents and references the operations team reaches for every day.

Documents is a curated hub of quick-access links to the references, runbooks, and resources the operations team uses most. It puts the scattered, frequently-needed docs in one predictable place inside Vantage.

One predictable home for the docs Ops actually uses, instead of hunting across drives.

Operations · Quick Links · Reference Hub

— 10

Operations

In production

Replaces Tracking Google Sheet

Tracks Carrier component status

Customizable Yes

Lives in Vantage

Acquisition Sheet

CUSTOMIZABLE CARRIER-COMPONENT TRACKER

A customizable, in-app replacement for the Google Sheet the team used to track carrier component statuses.

Acquisition Sheet is an improved, customizable version of the Google Sheet the team relied on to track carrier component statuses. It keeps the familiar tracking workflow but moves it into Vantage, with the structure, controls, and customization a shared spreadsheet could never offer.

Same carrier-component tracking the team already knows, now native to Vantage.
Customizable structure replaces a brittle, hard-to-govern shared spreadsheet.

Operations · Carrier Tracking · Customizable

Salesforce

Real-time Salesforce tooling for the support team.

— 11

Integrations

In production

API Salesforce REST

Query SOQL

Write Direct field writeback

Users QDS Support team

CLIO

CASE LIFECYCLE & ISSUE OPERATIONS

A real-time Salesforce case queue with inline editing, auto-fill, and direct write-back. It is the daily driver for QDS Support.

CLIO is a real-time Salesforce case queue built for the QDS Support team. It pulls cases via the Salesforce REST API using SOQL, provides inline field editing directly inside the dashboard, auto-fills case fields from the description text, and writes changes back to Salesforce via the same REST API. No tab-switching, no copy-paste.

Reduced the click-cost of casework: inline editing means support reps update Salesforce without ever leaving the queue.
Auto-fill demonstrates light rules-based extraction, pattern-matching on description text to surface the right defaults.
Clean REST client pattern: SOQL for reads, REST writeback for updates, all behind Vantage auth.
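The read/write split can be sketched as two request builders that never touch the network. The API version, the function names, and the example SOQL are assumptions; only the general shape of the Salesforce REST query and sObject-row endpoints is relied on here.

```python
import json
from urllib.parse import quote

API = "/services/data/v60.0"  # API version is an assumption


def soql_read(instance_url: str, soql: str) -> tuple[str, str]:
    """Return (method, url) for a SOQL query against the REST API."""
    return "GET", f"{instance_url}{API}/query?q={quote(soql)}"


def case_writeback(instance_url: str, case_id: str, fields: dict) -> tuple[str, str, str]:
    """Return (method, url, json_body) for a partial update of one Case."""
    url = f"{instance_url}{API}/sobjects/Case/{case_id}"
    return "PATCH", url, json.dumps(fields)


base = "https://example.my.salesforce.com"  # placeholder instance URL
read_request = soql_read(base, "SELECT Id, Subject FROM Case WHERE IsClosed = false")
write_request = case_writeback(base, "500000000000001", {"Status": "Working"})
```

Keeping request construction separate from the HTTP call makes both paths easy to unit-test without a live org.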

Salesforce REST · SOQL · Writeback · Plotly Dash · Real-Time

— 12

Analytics

In production

Approach Rules-based

Taxonomy ACA domain keywords

LLM? No, by design

Source Salesforce cases

Salesforce Ticket Quality

RULES-BASED CS TICKET CLASSIFICATION

Deliberately not an LLM. A domain-specific ACA keyword taxonomy that classifies CS tickets faster, cheaper, and more predictably.

Not every problem needs an LLM. Salesforce Ticket Quality classifies customer-support tickets by error reason and solution summary using a domain-specific ACA keyword taxonomy. It is fully rules-based and fully deterministic, with zero inference cost. The dashboard surfaces error and resolution patterns across the support pipeline, giving leadership visibility into recurring CS friction.

Rules-based by design: structured ACA terminology classifies reliably without LLM cost, latency, or non-determinism.
Demonstrates judgment about when not to reach for AI, an underrated discipline in 2026.
Backed by the same Vantage auth/access model as the LLM-augmented tools.
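A keyword-taxonomy classifier of this kind can be very small. The taxonomy below is a made-up example with three categories; the real ACA taxonomy and its scoring rules are not shown on this page.

```python
# Hypothetical taxonomy: error reason -> keywords that signal it.
TAXONOMY = {
    "rates": ("rate", "premium", "age band"),
    "benefits": ("deductible", "copay", "coinsurance"),
    "service_area": ("county", "zip", "service area"),
}


def classify(text: str) -> str:
    """Return the reason with the most keyword hits, or 'unclassified'."""
    lowered = text.lower()
    scores = {
        reason: sum(lowered.count(keyword) for keyword in keywords)
        for reason, keywords in TAXONOMY.items()
    }
    # max() keeps the first best entry, so ties resolve in taxonomy order.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


label = classify("Premium looks wrong for age band 40")
```

The same ticket text always yields the same label, and adding a category is a one-line change to the taxonomy.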

Rules-Based Classification · ACA Taxonomy · Salesforce REST · Pandas · Plotly Dash

Automation

LLM-driven rate-file processing — moving from the Adapter Agent to the new, wizard-driven THEA.

— 13

AI / LLM

Live · phasing out

Trigger Jira ticket tagged 'agent'

Isolation Subprocess sandbox

Repairs Up to 3 LLM attempts

Impact ~15 min → 30 sec / ticket

Adapter Agent

SELF-REPAIRING ACA RATES PROCESSING

An agentic loop with real fallback logic — and a real outcome metric. Per-ticket processing dropped from ~15 minutes of manual work to under 30 seconds.

Being phased out in favor of the new, wizard-driven THEA. When a Jira ticket is tagged agent, the Adapter Agent picks it up, pulls the matching script from the carrier library, runs it in an isolated subprocess, samples 3 plans at age 60 as a sanity check, attaches the output back to the ticket, POSTs results to the database, and routes the ticket to QA for fuller review. If the script fails mid-run, the agent attempts up to 3 repairs on AWS Bedrock Claude before escalating to a human.

~15 minutes of manual work → under 30 seconds. The processing-time delta is the headline metric, and the agent escalates only what it genuinely couldn't recover.
Subprocess sandboxing contains LLM-generated repairs: the model can suggest anything, but it can only break its own subprocess. The host stays clean.
Age-60 sample as a built-in canary. Output isn't trusted until three plans at a fixed age look right, then results POST to the database and the ticket routes to QA.
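The run, repair, and escalate loop can be sketched as below. A stub function stands in for the Bedrock call, and the names `run_sandboxed`, `run_with_repairs`, and `fake_repair` are hypothetical. A separate interpreter process gives process isolation only; restricting filesystem or network access would need additional controls not shown here.

```python
import subprocess
import sys
from typing import Callable

MAX_REPAIRS = 3


def run_sandboxed(source: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run script source in a separate, isolated-mode interpreter process."""
    return subprocess.run(
        [sys.executable, "-I", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def run_with_repairs(source: str, repair: Callable[[str, str], str]) -> tuple[bool, str, int]:
    """Return (ok, stdout, repairs_used); escalate to a human when ok is False."""
    result = run_sandboxed(source)
    repairs = 0
    while result.returncode != 0 and repairs < MAX_REPAIRS:
        # In production this call would send the source and stderr to the model.
        source = repair(source, result.stderr)
        repairs += 1
        result = run_sandboxed(source)
    return result.returncode == 0, result.stdout, repairs


def fake_repair(source: str, stderr: str) -> str:
    """Stand-in for the model: fixes one known typo."""
    return source.replace("pritn", "print")


ok, out, used = run_with_repairs("pritn('rates ok')", fake_repair)
```

A crash in the generated code surfaces as a non-zero return code in the child process, so the host process only ever sees an exit status and captured output.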

Agentic Pipelines · LLM Code Repair · Subprocess Sandboxing · Jira REST v3 · Bedrock Claude · Human-in-the-loop

— 14

AI / LLM

In production

Input Any rate source (+ crosswalk)

Engine Claude Haiku 4.5

Output Age-banded / family-tiered template

Learns Stores carrier scripts

THEA

TEMPLATE HARVESTING & EXTRACTION AUTOMATION

A guided wizard: upload any rate source and THEA explores it, generates the right template with Haiku 4.5, and learns from every run.

THEA is a guided wizard for turning any carrier rate source into a clean, typed template. You upload a rate file (and a crosswalk if one applies) and THEA uses Claude Haiku 4.5 to explore the source, find the information it needs, and generate a defined template — age-banded or family-tiered — based on what the source actually contains. When it finishes, THEA asks you to confirm the template is complete, learns from any corrections, and stores the correctly generated carrier script for reuse. Think of it as Claude Cowork shaped for rate processing, so outsourced team members can put Claude to work on real files without ever being on the team account.

Guided wizard: upload any rate source (plus a crosswalk if applicable) and THEA works out how to process it.
Haiku 4.5 generates the right template shape — age-banded or family-tiered — from what it finds in the source.
Confirms completeness, learns from mistakes, and stores correct carrier scripts so the next file of that type is instant.
Lets outsourced teammates use Claude on real work without needing a seat on the team account.

Claude Haiku 4.5 · Guided Wizard · Template Generation · Crosswalks · Self-learning · Carrier Scripts

Under Development & Future Tools

Live and usable, but not the current focus — future ideas that still need some love.

— 15

AI / LLM

Live · no active development

Function 1 Docs Q&A over internal flows

Function 2 PDF → ingest-ready records

Coverage ~15% of carrier universe

Roadmap SageMaker training in flight

SAGE

SBC ANALYSIS AND GENERATION ENGINE

Two functions under one engine: documentation Q&A on our internal grammar, and an LLM-assisted PDF-to-ingest translator that covers ~15% of carriers today.

SAGE pairs two related capabilities. (1) Answers questions about our internal ingestion flows and data-grammar rules over internal documentation. (2) Parses carrier PDFs with pdfplumber, maps fields into our internal schema (the same target ARIA uses), then uses AWS Bedrock Claude with grammar rules, regex patterns, and embedded LLM instructions to translate extracted data into ingest-ready records. The translation path is under active development and covers roughly 15% of the carrier universe; we're training an AWS SageMaker model on our own data to extend coverage.

Two related capabilities under one engine. The docs Q&A surfaces the grammar; the PDF translator applies it.
Shared internal schema with ARIA means SAGE-translated records land in the same shape downstream consumers already trust.
SageMaker model in training on real ingestion data to push coverage beyond what Bedrock prompt-engineering reaches.

Bedrock Claude · AWS SageMaker · pdfplumber · Grammar Rules · Regex · RAG-adjacent

— 16

AI / LLM

Live · no active development

Model Claude Haiku 4.5

Answers Coverage / availability

Users Customer success

Status Maintained

CSM Assistant

COVERAGE & DATA-AVAILABILITY ASSISTANT

A chat assistant that answers data-availability and coverage-gap questions for customer success, backed by Claude Haiku.

CSM Assistant lets customer success managers ask plain-language questions about data availability, coverage gaps, and ingestion status and get answers in seconds, without pinging data ops. It's live and usable today but isn't under active development right now.

Self-serve answers on coverage and ingestion status for CSMs.
Built on Claude Haiku, and available as an interactive demo in the Explore section below.

Claude Haiku · Coverage · Customer Success

03 · Explore

Try the tools, not just read about them.

Interactive mockups of what these tools actually look like in production. Same data shape, sanitized values, no real plans. Start with ARIA, the SBC review agent.

ARIA — SBC Review

A simulated run from last night's batch. Click any finding to inspect the structured checks, the extracted-vs-database diff, and the audit trail. Dismissing a flag updates the queue and the summary counts in real time.

Sample data · not real plans

04 · Skills

Five core disciplines, grounded in production work.

The same five categories the resume uses. Every tag below shows up in a tool that ships in the projects section.

Data Quality & Testing

Source-to-target reconciliation, contract-style validation, anomaly & schema-drift detection, and the recurring-defect tracking that closes the loop with data operations.

Source-to-Target Reconciliation · Data Profiling · Anomaly & Schema-Drift Detection · Referential Integrity · Null-Pattern Analysis · Aggregation Validation · UAT Support · Issue Logging & Trend Reporting · Custom Data Contracts (sqlglot) · dbt / Great Expectations Patterns

SQL & Data

Large-scale Athena/Trino SQL plus the analytics layer on top: interactive Dash, Pandas/openpyxl pipelines, and domain-specific ACA taxonomies.

Advanced SQL · Athena / Trino · Pandas · openpyxl · Plotly / Dash Dashboards · YoY Benchmarking · Rules-Based Classification · Domain Taxonomies

Engineering & Cloud

Production Python on AWS, Flask behind Gunicorn and Nginx, subprocess sandboxing for untrusted code, TTL caching, runtime feature flags, and CI/CD quality gates.

Python (production) · Flask · REST APIs · Nginx · Gunicorn · Subprocess Sandboxing · Runtime Feature Flags · TTL Caching · AWS Elastic Beanstalk · Athena · S3 · Cognito · Bedrock · CloudWatch · Selenium · CI/CD Quality Gates

AI Tooling

Bedrock-backed pipelines in production, RAG over mixed document sets, structured tool use to constrain agent behavior, and LLM repair loops that recover broken jobs without paging a human.

AWS Bedrock (Claude) · AWS SageMaker (in progress) · RAG over Mixed Documents · LLM Repair Loops · Few-Shot Prompting · Structured Tool Use

Integrations & Security

External APIs (Jira, Salesforce, GitHub) and a production security model from first principles: PKCE, RS256 JWT against Cognito JWKS, stateless HMAC CSRF, ALB-aware rate limiting, and CVE-clean dependencies on every build.

Jira REST v3 · Salesforce REST (SOQL + Write-back) · GitHub · Postman · OAuth 2.0 / PKCE · RS256 JWT / JWKS · HMAC-SHA256 CSRF · pip-audit / bandit in CI · ALB-Aware Rate Limiting

05 · Experience

Five-plus years at Ideon, three roles, one trajectory.

Data operations → quality assurance → engineering. Each role was the foundation for the next.

Feb 2024 – Present Ideon · New York, NY

Quality Assurance Data Manager

Sole developer of Vantage · Data QA & platform engineering

Built and now operate Vantage, an internal data QA platform on AWS that validates pipeline outputs across staged data layers, mirroring a Bronze/Silver/Gold medallion architecture. Designed the CI/CD quality gate, the production security model, and every tool that lives on the platform.

Vantage

Built on AWS Elastic Beanstalk, Athena, S3, and Cognito. Cognito OAuth 2.0 / PKCE auth with role-based access. Layered architecture separates test definition, execution, and reporting.
BenefitWatch

Automated data validation engine, single-file registry of 103 SQL tests covering source-to-target reconciliation, business-logic validation, and aggregation checks. Runs 742 unique carrier/audience combos daily and on-demand against the Data Operations system. Pagination, sorting, filtering, and per-row dismissal in the failure dashboard. Cost-tagged Athena queries so each test's actual cost is visible.
Custom CI gate enforcing data contracts

Same job as dbt tests or Great Expectations, built bespoke. The SQL-suite check validates that every batch×audience returns a non-empty dict[str, str], no stale years or unresolved tokens, every batch references the correct table, and every query parses as valid Trino SQL via sqlglot. Paired with ruff, mypy, pip-audit, bandit, pytest, and Selenium browser checks. No deployment proceeds without all checks green.
Adapter Agent

Built to automate ACA rates processing. When a Jira ticket is tagged agent, it pulls the matching script from the carrier library, runs it in an isolated subprocess, samples 3 plans at age 60 as a sanity check, POSTs results to the database, and routes the ticket to QA. If the script fails mid-run, the agent attempts up to 3 LLM repairs on AWS Bedrock Claude before escalating to a human. Per-ticket processing dropped from ~15 minutes of manual work to under 30 seconds.
ARIA: Automated Review and Intelligence Agent

Automated data profiling over semi-structured SBC PDFs. Discovers files on S3, uses pdfplumber to extract coverage period, plan name, and benefit cost-share fields, then runs regex-based checks for name alignment, coverage dates, deductibles, and benefit values. Processes 50+ plans per batch with audit-logged dismissals.
BenefitWatch Analytics

Quality-trend engine reading S3 run history to surface active, new, resolved, and recurring data quality flags across all carriers. It doubles as the data-quality issue log and recurring-defect tracker.
SAGE: SBC Analysis and Generation Engine

Two functions. (1) Answers questions about internal ingestion flows and data-grammar rules over internal documentation. (2) Parses carrier PDFs with pdfplumber, maps fields into our internal schema, then uses Bedrock Claude with embedded grammar rules and regex patterns to translate extracted data into ingest-ready records. Covers ~15% of the carrier universe today; we're training an AWS SageMaker model on our data to extend coverage.
THEA: Template Harvesting & Extraction Automation

Takes Excel or PDF rate files, uses Bedrock Claude to generate a Python extraction script for each, runs that script in a sandboxed subprocess, and produces typed CSV. THEA runs its own validations and surfaces the CSV for analyst review; on approval, it POSTs the rates to the database and ships the generated script to the S3 carrier library alongside the existing ones.
Production security and reliability

pip-audit and bandit on every CI build catch dependency CVEs and security anti-patterns before deploy. At runtime: PKCE S256, stateless HMAC-SHA256 CSRF, RS256 JWT validation against Cognito JWKS, ALB-aware per-user rate limiting.
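A stateless HMAC-SHA256 CSRF token can be sketched as a timestamp plus a keyed MAC, so the server verifies it without storing anything. The token layout, the binding to a session identifier, the one-hour lifetime, and the function names are all assumptions for illustration, not the Vantage implementation.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET = secrets.token_bytes(32)  # in practice, a server-side secret from config


def _mac(session_id: str, ts: str) -> str:
    """HMAC-SHA256 over the session id and issue time."""
    return hmac.new(SECRET, f"{session_id}:{ts}".encode(), hashlib.sha256).hexdigest()


def issue_token(session_id: str, now: Optional[float] = None) -> str:
    """Return '<issued_at>.<mac>' bound to this session."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}.{_mac(session_id, ts)}"


def verify_token(token: str, session_id: str, max_age: int = 3600,
                 now: Optional[float] = None) -> bool:
    """Recompute the MAC and compare in constant time; reject expired tokens."""
    try:
        ts, mac = token.split(".", 1)
        issued = int(ts)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if current - issued > max_age:
        return False
    return hmac.compare_digest(mac.encode(), _mac(session_id, ts).encode())


token = issue_token("session-1", now=1000)
```

Because the MAC covers both the session identifier and the issue time, a token lifted from one session or replayed after expiry fails verification.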

Jun 2021 – Feb 2024 Ideon · New York, NY

Data Quality Assurance Coordinator

Validation engineering · Predecessor work to BenefitWatch

Built the predecessor SQL validation routines that became BenefitWatch, where I started treating queries and schemas as testable artifacts rather than one-off scripts. Applied automated and manual validation techniques (regression and smoke) across the carrier system under test.

Predecessor SQL validation routines that directly informed BenefitWatch's registry pattern.
Regression and smoke testing across carriers; partnered with Engineering and Data Operations on root-cause analysis to reduce recurring defect categories.
Mentored teammates on automation tools and led annual training on data standards.

Jul 2020 – Jun 2021 Ideon · New York, NY

Data Operations Coordinator

Data contract foundations · QHPs, SBCs, rates

Standardized health-insurance data (QHPs, SBCs, rates) across multiple sources and formats. The data contracts I defined in this role became the basis for the automated contract-style validation I build against today.

Standardized QHPs, SBCs, and carrier rate data across multiple sources and formats.
Defined the data contracts that became the basis for automated contract-style validation.
Diagnosed and resolved validation script failures; prepared datasets for testing and operational use.

06 · Side Projects

What I build outside of work.

Same engineering instincts, different domain. The patterns I use at Ideon — structured ingestion, idempotent updates, static-export frontends, LLMs only where they earn their keep — show up here too.

Personal project

F1 Live Dashboard

A self-updating Formula 1 race-results dashboard. SQLite + a Python post-race agent + a data-driven Chart.js frontend. The original was a single HTML file with hardcoded arrays; this version rebuilds it as a real ingestion pipeline.

Solo build · SQLite · Python agent · static-JSON frontend

Open the full live dashboard ↗

Live at /f1 · Phase 4 in progress

A SQLite database with 9 tables (circuits, drivers, teams, races, results, standings, lap records, regulation changes, agent runs) holds every season's data. A Python post-race update agent fetches results from the public Jolpica/Ergast and OpenF1 APIs after each race weekend, applies INSERT OR REPLACE for idempotent updates, and exports data/data.json for the frontend to read. The same static-JSON-export pattern Vantage uses internally.

Idempotent ingestion. Every write uses INSERT OR REPLACE on a unique key, so re-running the agent on the same round is safe. An agent_runs table logs every run with status, rows updated, and notes — same audit-trail discipline as BenefitWatch.
Decoupled frontend. The agent writes one data.json; the frontend does a single fetch() on load. No server, no API to maintain.
LLM only where it earns its keep. Phase 4 wires Claude into agent/check_news.py to detect regulation and calendar changes from news copy — the kind of unstructured signal that doesn't come down a JSON API.
Two update modes. Structured (API-fetched results, standings, lap records) and contextual (LLM-summarised regulation/calendar diffs). Each writes to its own table with provenance.
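The idempotent-write pattern can be shown with an in-memory SQLite database. The table layout, the `upsert_results` name, and the driver codes are placeholders for illustration rather than the project's real schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    " season INTEGER, round INTEGER, driver TEXT, position INTEGER,"
    " PRIMARY KEY (season, round, driver))"
)


def upsert_results(rows: list[tuple[int, int, str, int]]) -> None:
    """Write results; re-running with the same keys overwrites, never duplicates."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO results (season, round, driver, position)"
            " VALUES (?, ?, ?, ?)",
            rows,
        )


race = [(2026, 1, "AAA", 1), (2026, 1, "BBB", 2)]  # placeholder driver codes
upsert_results(race)
upsert_results(race)  # second run of the agent on the same round
row_count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

The primary key on (season, round, driver) is what makes the second call a no-op in effect: each row replaces itself instead of being appended.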

SQLite · Python · Chart.js · Jolpica/Ergast API · OpenF1 API · Claude API · Static JSON · Idempotent Ingestion

Live read from data.json

The exact JSON the agent exports after the most recent race weekend. Re-rendered here in the portfolio's own theme.

Panels: Driver Standings · Constructor Standings

07 · Growing

Things I'm exploring next.

Reserved space for new work. I'll keep adding here as projects ship.

Exploring

Evaluation harnesses for LLM pipelines

Designing repeatable eval suites for ARIA-style structured-check validators: fixed inputs, judged outputs, and regression detection across model versions.

LLM Evals · Regression Detection

Exploring

Observability for agentic systems

Per-step traces, escalation rates, and recovery success metrics for the Adapter Agent, turning agent behavior into a first-class operational signal.

Observability · Agentic Systems

08 · Contact

Let's talk.

Questions about the work, the platform, or anything else on this site. The fastest way to reach me is email.

Get in touch

tylerhenn52@gmail.com

linkedin.com/in/tyler-henn52 ↗

Location

New York, NY

Education

B.S. Computer Science
& Information Security
John Jay College · CUNY

Resume

Download PDF ↓