Komodo Bill Type Coverage for SSP Encounters
Date: 2026-04-20 Source tables:
tq_dev.internal_dev_csong_ssp.medical_headers_bothtq_dev.internal_dev_csong_ssp.medical_service_lines_bothtq_dev.internal_dev_csong_ssp.ssp_groupings_v3(ssp-grouper anchor codes)
Scope: year >= 2024 (matches the SSP pricing pipeline's claim filter).
Question
How many distinct encounters (encounter_key) fall outside the SSP pricing
pipeline's current encounter-matching filters?
Specifically:
- Critical access (
bill_type_codestarts with85or86) — CAH claims are classifiedclaim_pos = 'other'and so are skipped byssp_encounter_matches_op(filtersclaim_pos = 'OP'). Coverage is measured against OP SSPs (CPT/HCPCS anchors) since the bulk of CAH volume is outpatient lab / imaging / ER / E&M rather than inpatient stays. - ASC (
bill_type_codestarts with83) — skipped byssp_encounter_matches_op(filtersclaim_pos = 'OP'); coverage measured against OP SSPs (CPT/HCPCS anchors). - Physician-group / professional only —
claim_pos = 'PRO'visits whosevisit_idhas no facility (non-PRO) claim at all.
Headline numbers
| Category | SSPs considered | Distinct encounters | Matched | Match rate |
|---|---|---|---|---|
| Critical access (85/86) | OP (CPT/HCPCS) SSPs | 15,199,390 | 12,286,564 | 80.8% |
| ASC (83) | OP (CPT/HCPCS) SSPs | 20,308,509 | 16,211,546 | 79.8% |
| PRO-only (no facility) | OP (CPT/HCPCS) SSPs | 2,600,727,525 | ~1.68 B (0.1% sample) | 64.7% |
Match rate = share of the category's distinct encounters that join
to at least one SSP anchor code. All three categories match CPT/HCPCS
procedure codes on medical_service_lines_both against the CPT/HCPCS
anchors in ssp_groupings_v3. An encounter is counted as "matched" if
any of its service lines lands on an in-scope anchor — it tells us how
many of these otherwise-skipped encounters the SSP pipeline could
capture if its filters were widened.
CAH volume is dominated by outpatient labs, ER, imaging, and E&M — an
OP-anchored scope picks up ~81% of CAH encounters, vs. the ~2% a
DRG-based IP scope would reach (only 2.5% of CAH encounters report a
diagnosis_group at all).
Baseline medical_headers_both distribution (year >= 2024)
| Category | bill_type_code prefix | claim_pos | Distinct encounters |
|---|---|---|---|
| Hospital outpatient | 13% | OP | 358,584,676 |
| Hospital inpatient | 11% | IP | 21,418,872 |
| Critical access | 85%, 86% | other | 15,199,390 |
| ASC | 83% | other | 20,308,509 |
| Professional | (null) | PRO | 3,886,009,642 |
| All rows | — | — | 4,590,706,144 |
Visit composition (year >= 2024)
| Visit category | Distinct visits | PRO rows | non-PRO rows |
|---|---|---|---|
| PRO-only visit (no facility claim) | 2,349,235,787 | 2,600,727,525 | 0 |
| Facility-only visit | 301,111,613 | 0 | 340,964,439 |
| PRO + facility visit | 112,022,650 | 1,285,282,119 | 363,732,063 |
The PRO-only visit class is the overwhelming majority of the medical_headers_both
volume — 2.35B visits vs 413M visits that have at least one facility claim.
1. Critical access (bill_type_code LIKE '85%' OR '86%')
CAH claims are classified claim_pos = 'other' and so are skipped by
the SSP pricing pipeline's OP matcher. Match CAH encounters to OP
(CPT/HCPCS) SSPs via medical_service_lines_both — this picks up the
outpatient lab / imaging / ER / E&M volume that makes up the bulk of
CAH activity.
- Distinct encounters: 15,199,390
- Encounters matched to at least one OP (CPT/HCPCS) SSP: 12,286,564 (80.8%)
- Encounter × SSP pairs: 20,429,874
- Distinct OP SSPs touched: 164
Top SSPs for CAH encounters (full TSV: cah_by_op_ssp.tsv):
| OP SSP | Encounters |
|---|---|
| LA.0.blood_test | 5,269,785 |
| EM.0.ER | 2,342,327 |
| RA.0.xray | 1,999,065 |
| CO.0.op_clinic | 1,820,698 |
| LA.0.urinalysis | 1,231,499 |
| RA.0.CT | 1,033,688 |
| RA.0.cardiography | 1,013,636 |
| RH.0.pt | 998,516 |
| CO.0.IV_push | 748,931 |
| RA.0.ultrasound | 551,359 |
| RA.0.mammogram | 444,699 |
| CO.0.IV_infusion | 442,112 |
| RA.0.mri | 291,262 |
| BE.0.substance_abuse | 228,818 |
| RA.0.echocardiography | 175,861 |
Blood tests, ER visits, outpatient clinic E&M, imaging, and physical therapy dominate CAH OP volume.
2. ASC (bill_type_code LIKE '83%')
- Distinct encounters: 20,308,509
- Encounters that match at least one SSP anchor code: 16,211,546 (79.8%)
- Encounter × SSP pairs: 19,387,025
Top SSPs for ASC encounters (full TSV: op_ssps_by_bill_category.tsv):
| SSP | Encounters |
|---|---|
| GA.0.colonoscopy | 7,021,310 |
| GA.0.egd | 3,055,884 |
| OP.0.cataract_removal | 2,469,360 |
| NU.0.nerve_block | 1,374,260 |
| MS.0.surgical_knee_repair | 403,606 |
| EM.1.m_other_factors | 352,183 |
| MS.0.surgical_shoulder_repair | 334,902 |
| EN.0.tonsil_removal | 285,765 |
| UR.0.cystourethroscopy | 283,085 |
| MS.0.excision_hand_wrist | 242,359 |
3. Physician-group / PRO-only visits
PRO-only = a visit_id whose only rows in medical_headers_both are
claim_pos = 'PRO'.
Full-population counts:
- Distinct PRO-only visits: 2,349,235,787
- Distinct PRO-only encounters: 2,600,727,525
Coverage on a 0.1% visit-hash sample (2,349,170 visits → 2,599,490 encounters; sample matches the full population to 4 significant digits):
- Encounters matched to at least one SSP anchor: 1,682,396 / 2,599,490 = 64.7%
- Extrapolated: ~1.68 B PRO-only encounters map to an SSP.
- Encounter × SSP pairs in sample: 1,943,351 (many PRO-only encounters touch only one SSP vs. CAH/ASC which often touch several).
Top SSPs for PRO-only encounters (sample counts × 1,000 ≈ full population;
full TSV: pro_only_by_ssp.tsv):
| SSP | Sample encounters | Extrapolated |
|---|---|---|
| CO.0.op_clinic | 754,966 | 754,966,000 |
| LA.0.blood_test | 238,142 | 238,142,000 |
| RH.0.pt | 199,284 | 199,284,000 |
| BE.0.psychotherapy | 127,893 | 127,893,000 |
| RA.0.xray | 90,911 | 90,911,000 |
| LA.0.urinalysis | 69,125 | 69,125,000 |
| RA.0.cardiography | 53,399 | 53,399,000 |
| CO.0.vaccine | 51,071 | 51,071,000 |
| RA.0.ultrasound | 48,995 | 48,995,000 |
| EM.0.ER | 48,341 | 48,341,000 |
These are dominated by office-visit / E&M-style anchors (office clinic, psychotherapy, PT) that legitimately wouldn't have a facility claim.
All OP SSPs by bill-type category
All 167 SSPs with CPT/HCPCS anchor codes (OP-style). Columns are full-population
distinct encounters (for ASC) or 0.1% visit-hash sample counts with ×1000
extrapolation (for PRO-only). CAH per-SSP counts are reported separately in
section 1 (cah_by_op_ssp.tsv) — same OP-anchor join rule. Sorted by
ASC + (PRO-only × 1,000) descending.
Full TSV: op_ssps_by_bill_category.tsv.
| OP SSP | ASC | PRO-only (sample) | PRO-only (×1000) |
|---|---|---|---|
| CO.0.op_clinic | 18,568 | 754,966 | 754,966,000 |
| LA.0.blood_test | 60,096 | 238,142 | 238,142,000 |
| RH.0.pt | 15,572 | 199,284 | 199,284,000 |
| BE.0.psychotherapy | 5,022 | 127,893 | 127,893,000 |
| RA.0.xray | 36,548 | 90,911 | 90,911,000 |
| LA.0.urinalysis | 57,896 | 69,125 | 69,125,000 |
| RA.0.cardiography | 21,338 | 53,399 | 53,399,000 |
| CO.0.vaccine | 563 | 51,071 | 51,071,000 |
| RA.0.ultrasound | 109,914 | 48,995 | 48,995,000 |
| EM.0.ER | 6,742 | 48,341 | 48,341,000 |
| RA.0.CT | 8,322 | 28,928 | 28,928,000 |
| LA.0.pathology | 135,058 | 23,898 | 23,898,000 |
| RH.0.st | 262 | 20,852 | 20,852,000 |
| RA.0.mammogram | 1,611 | 19,342 | 19,342,000 |
| RA.0.mri | 6,106 | 16,084 | 16,084,000 |
| GA.0.colonoscopy | 7,021,310 | 6,349 | 6,349,000 |
| RA.0.echocardiography | 1,100 | 12,922 | 12,922,000 |
| MS.0.arthrocentesis | 101,963 | 11,930 | 11,930,000 |
| BE.0.psych_eval | 389 | 7,893 | 7,893,000 |
| ON.0.chemotherapy | 386 | 7,608 | 7,608,000 |
| DE.0.actinic_destruction | 161 | 7,377 | 7,377,000 |
| GA.0.egd | 3,055,884 | 3,787 | 3,787,000 |
| CO.0.IV_infusion | 13,161 | 6,648 | 6,648,000 |
| OP.0.cataract_removal | 2,469,360 | 3,247 | 3,247,000 |
| EM.0.critical_care | 124 | 5,662 | 5,662,000 |
| DE.0.skin_biopsy | 1,457 | 5,337 | 5,337,000 |
| CO.0.IV_push | 12,777 | 5,243 | 5,243,000 |
| NU.0.nerve_block | 1,374,260 | 3,815 | 3,815,000 |
| EM.0.wound_treatment | 44,416 | 5,095 | 5,095,000 |
| RA.0.CTA | 556 | 4,758 | 4,758,000 |
| RA.0.dexa | 849 | 4,546 | 4,546,000 |
| RA.0.audiology | 24 | 3,808 | 3,808,000 |
| EM.0.debridement | 10,079 | 3,369 | 3,369,000 |
| UR.0.cystourethroscopy | 283,085 | 2,548 | 2,548,000 |
| MS.0.trigger_point_injection | 37,511 | 2,599 | 2,599,000 |
| OB.0.fetal_biophysical | 46 | 2,386 | 2,386,000 |
| BE.0.substance_abuse | 2,400 | 2,250 | 2,250,000 |
| NU.0.sleep_study | 1,867 | 2,239 | 2,239,000 |
| RH.0.ot | 9,509 | 2,160 | 2,160,000 |
| RA.0.mpi | 76 | 2,051 | 2,051,000 |
| CO.0.allergy_testing | 5 | 1,762 | 1,762,000 |
| LA.0.genomic | 633 | 1,725 | 1,725,000 |
| RA.0.PET | 29 | 1,720 | 1,720,000 |
| ON.0.imrt_planning | 379 | 1,719 | 1,719,000 |
| DE.0.skin_lesion | 73,067 | 1,606 | 1,606,000 |
| EN.0.laryngoscopy | 17,595 | 1,369 | 1,369,000 |
| DE.0.cyst_aspiration | 63,937 | 1,227 | 1,227,000 |
| DE.0.mohs | 3,145 | 1,057 | 1,057,000 |
| CA.0.cardiac_catheterization | 24,403 | 837 | 837,000 |
| MS.0.knee_replacement | 205,531 | 623 | 623,000 |
| MS.0.surgical_shoulder_repair | 334,902 | 314 | 314,000 |
| MS.0.surgical_knee_repair | 403,606 | 243 | 243,000 |
| DE.0.lesion_biopsy_percutaneous | 8,146 | 604 | 604,000 |
| EN.0.tonsil_removal | 285,765 | 273 | 273,000 |
| MS.0.excision_hand_wrist | 242,359 | 305 | 305,000 |
| ON.0.imrt | 51 | 543 | 543,000 |
| MS.0.hip_replacement | 116,081 | 380 | 380,000 |
| CA.0.varicose_ablation | 12,592 | 467 | 467,000 |
| RH.0.cardiac | 27 | 470 | 470,000 |
| EN.0.sinus_surgery | 100,273 | 366 | 366,000 |
| RE.0.prostate_biopsy | 81,464 | 378 | 378,000 |
| RE.0.hysteroscopy | 144,089 | 303 | 303,000 |
| GA.0.cholecystectomy | 51,956 | 340 | 340,000 |
| DE.0.fine_needle_biopsy | 1,794 | 324 | 324,000 |
| EN.0.nasal_repair_surg | 165,190 | 145 | 145,000 |
| GA.0.hernia_repair | 73,938 | 234 | 234,000 |
| OB.1.delivery | 119 | 268 | 268,000 |
| RA.0.cardiac_PET | 1 | 261 | 261,000 |
| MS.0.ns_wrist_fracture | 4,329 | 253 | 253,000 |
| MS.0.laminectomy_lumbar | 47,162 | 210 | 210,000 |
| CA.0.PCI | 6,025 | 251 | 251,000 |
| OP.0.retinal_repair | 113,245 | 142 | 142,000 |
| PU.0.bronchoscopy | 3,572 | 248 | 248,000 |
| RE.0.hysterectomy | 13,316 | 219 | 219,000 |
| GA.0.sigmoidoscopy | 73,037 | 150 | 150,000 |
| RE.0.ovarian_surgery | 43,696 | 161 | 161,000 |
| HE.1.bone_marrow_biopsy | 3,429 | 201 | 201,000 |
| MS.0.surgical_shoulder_replacement | 36,771 | 166 | 166,000 |
| MS.0.cervical_ACDF | 19,718 | 176 | 176,000 |
| BE.0.psychoses | 1,928 | 192 | 192,000 |
| DE.0.tumor_local_excision | 59,221 | 132 | 132,000 |
| HE.1.CVC | 17,553 | 172 | 172,000 |
| NU.0.spinal_neurostimulator | 66,897 | 118 | 118,000 |
| HE.0.lymph_node_biopsy | 18,560 | 165 | 165,000 |
| UR.0.circumcision | 29,790 | 152 | 152,000 |
| MS.0.spine_fusion_lumbar | 10,786 | 152 | 152,000 |
| MS.0.surg_acl_repair | 99,463 | 50 | 50,000 |
| MS.0.surgical_hip_fracture | 27,362 | 122 | 122,000 |
| GA.0.appendectomy | 2,561 | 139 | 139,000 |
| MS.0.ns_hand_fracture | 1,806 | 139 | 139,000 |
| MS.0.ns_foot_fracture | 557 | 139 | 139,000 |
| RA.0.EPS | 587 | 138 | 138,000 |
| ON.0.SRS | 32 | 130 | 130,000 |
| DE.0.lesion_localization | 2,107 | 125 | 125,000 |
| ON.1.mastectomy | 23,852 | 101 | 101,000 |
| MS.0.ns_tibia_fracture | 643 | 108 | 108,000 |
| CA.0.pacemaker_insertion | 3,753 | 95 | 95,000 |
| GA.0.colectomy | 32 | 96 | 96,000 |
| MS.0.surg_hand_arthroplasty | 42,718 | 53 | 53,000 |
| OP.0.glaucoma_surg | 41,536 | 49 | 49,000 |
| NU.0.SNS_insertion | 26,390 | 63 | 63,000 |
| CA.0.rhythm_monitor | 7,995 | 81 | 81,000 |
| BE.0.electroconvulsive | 30 | 87 | 87,000 |
| RA.0.vascular_embolization | 1,221 | 78 | 78,000 |
| MS.0.knee_hip_revision | 2,002 | 76 | 76,000 |
| ME.0.thyroid_parathyroid_surgery | 10,685 | 64 | 64,000 |
| GA.0.liver_biopsy | 1,227 | 73 | 73,000 |
| NU.0.SNS_programming | 2,029 | 69 | 69,000 |
| MS.0.ns_arm_fracture | 173 | 70 | 70,000 |
| EN.0.middle_ear_mastoid_surg | 28,576 | 41 | 41,000 |
| UR.0.av_access_dialysis | 9,840 | 54 | 54,000 |
| HE.0.lymphadenectomy | 2,483 | 59 | 59,000 |
| ON.0.mastectomy | 6,492 | 54 | 54,000 |
| RE.0.cervical_excision | 13,467 | 47 | 47,000 |
| CA.0.ICD_replacement | 4,280 | 55 | 55,000 |
| PU.0.thoracoscopy | 91 | 58 | 58,000 |
| UR.0.turp | 8,887 | 48 | 48,000 |
| MS.0.ns_shoulder_fracture | 34 | 52 | 52,000 |
| MS.0.ns_ankle_fracture | 320 | 47 | 47,000 |
| ON.1.breast_reconstruction | 8,766 | 38 | 38,000 |
| BE.0.SA_pharmacotherapy | 0 | 46 | 46,000 |
| RE.0.prostatectomy | 890 | 43 | 43,000 |
| MS.0.VCF_treatment | 3,564 | 40 | 40,000 |
| CA.0.pacemaker_replacement | 3,052 | 33 | 33,000 |
| MS.0.surg_ankle_repair | 20,812 | 15 | 15,000 |
| CA.0.carotid_revascularization | 32 | 33 | 33,000 |
| RA.0.parathyroid_scan | 9 | 32 | 32,000 |
| UR.0.hydrocelectomy | 9,510 | 21 | 21,000 |
| UR.0.nephrectomy | 174 | 30 | 30,000 |
| DE.0.pilonidal_cyst | 11,713 | 17 | 17,000 |
| RA.0.mra | 13 | 25 | 25,000 |
| RA.0.mre | 0 | 25 | 25,000 |
| RE.0.colpopexy_pelvic_floor | 807 | 23 | 23,000 |
| MS.0.ns_nose_fracture | 10,427 | 10 | 10,000 |
| PU.0.vena_cava_filter | 154 | 20 | 20,000 |
| RA.0.qMRI | 1 | 19 | 19,000 |
| UR.0.bladder_neck | 3,978 | 15 | 15,000 |
| ON.0.breast_recon_revision | 7,404 | 11 | 11,000 |
| MS.0.ns_knee_fracture | 18 | 17 | 17,000 |
| NU.0.intrathecal_pump | 4,157 | 12 | 12,000 |
| UR.0.urethroplasty | 4,153 | 11 | 11,000 |
| EN.0.parotid_excision | 3,016 | 12 | 12,000 |
| CA.0.leadless_pacemaker | 11 | 14 | 14,000 |
| DE.0.panniculectomy | 3,090 | 10 | 10,000 |
| GA.0.pancreatectomy | 1 | 13 | 13,000 |
| MS.0.arthrotomy_hand_wrist | 5,841 | 7 | 7,000 |
| MS.0.cervical_disc_replacement | 7,515 | 5 | 5,000 |
| CA.0.cardiac_biopsy | 38 | 12 | 12,000 |
| DE.0.tumor_radical_resection | 677 | 11 | 11,000 |
| RE.0.myomectomy | 1,673 | 10 | 10,000 |
| MS.0.ns_thigh_fracture | 63 | 11 | 11,000 |
| UR.1.kidney_transplant | 0 | 11 | 11,000 |
| MS.0.surg_ankle_replacement | 1,544 | 9 | 9,000 |
| CA.0.thromboembolectomy | 18 | 10 | 10,000 |
| MS.0.ns_spine_fracture | 0 | 9 | 9,000 |
| MS.0.ns_hip_fracture | 10 | 8 | 8,000 |
| MS.0.ns_pelvic_fracture | 200 | 7 | 7,000 |
| NU.0.cranial_neurostimulator | 453 | 6 | 6,000 |
| RA.0.mrs | 1 | 4 | 4,000 |
| DE.0.lesion_biopsy_open | 403 | 3 | 3,000 |
| MS.0.ns_collarbone_fracture | 8 | 2 | 2,000 |
| RA.0.ablation | 327 | 0 | 0 |
| MS.0.arthrotomy_foot_ankle | 77 | 0 | 0 |
| MS.0.ns_jaw_fracture | 10 | 0 | 0 |
| MS.0.ns_sternum_fracture | 4 | 0 | 0 |
| MS.0.ns_facial_fracture | 3 | 0 | 0 |
| RE.0.brachytherapy | 0 | 0 | 0 |
SQL
Baseline distribution
SELECT
SUBSTR(bill_type_code, 1, 2) AS bt_prefix,
claim_pos,
COUNT(*) AS row_count,
COUNT(DISTINCT encounter_key) AS encounter_count
FROM tq_dev.internal_dev_csong_ssp.medical_headers_both
WHERE year >= 2024
GROUP BY 1, 2
ORDER BY row_count DESC;
Visit composition
WITH visit_pos AS (
SELECT
visit_id,
COUNT_IF(claim_pos = 'PRO') AS pro_rows,
COUNT_IF(claim_pos != 'PRO') AS nonpro_rows
FROM tq_dev.internal_dev_csong_ssp.medical_headers_both
WHERE year >= 2024
GROUP BY visit_id
)
SELECT
CASE
WHEN pro_rows > 0 AND nonpro_rows = 0 THEN 'pro_only_visit'
WHEN pro_rows > 0 AND nonpro_rows > 0 THEN 'pro_plus_facility_visit'
WHEN pro_rows = 0 AND nonpro_rows > 0 THEN 'facility_only_visit'
END AS visit_category,
COUNT(*) AS n_visits,
SUM(pro_rows) AS total_pro_rows,
SUM(nonpro_rows) AS total_nonpro_rows
FROM visit_pos
GROUP BY 1
ORDER BY n_visits DESC;
CAH encounters by OP (CPT/HCPCS) SSP
Same join shape as the ASC query below — CAH headers → service lines → CPT/HCPCS anchors.
WITH anchors AS (
SELECT DISTINCT ssp_grouper, code AS anchor_code
FROM tq_dev.internal_dev_csong_ssp.ssp_groupings_v3
WHERE ssp_grouper IS NOT NULL AND type IN ('CPT', 'HCPCS')
),
cah_headers AS (
SELECT DISTINCT encounter_key
FROM tq_dev.internal_dev_csong_ssp.medical_headers_both
WHERE year >= 2024
AND (bill_type_code LIKE '85%' OR bill_type_code LIKE '86%')
),
matched AS (
SELECT h.encounter_key, a.ssp_grouper
FROM cah_headers h
JOIN tq_dev.internal_dev_csong_ssp.medical_service_lines_both msl
ON h.encounter_key = msl.encounter_key
AND msl.year >= 2024
JOIN anchors a ON msl.procedure_code = a.anchor_code
WHERE msl.procedure_code IS NOT NULL
)
SELECT ssp_grouper, COUNT(DISTINCT encounter_key) AS encounter_count
FROM matched
GROUP BY 1
ORDER BY encounter_count DESC;
ASC encounters by OP (CPT/HCPCS) SSP
WITH anchors AS (
SELECT DISTINCT ssp_grouper, code AS anchor_code
FROM tq_dev.internal_dev_csong_ssp.ssp_groupings_v3
WHERE ssp_grouper IS NOT NULL AND type IN ('CPT', 'HCPCS')
),
asc_headers AS (
SELECT encounter_key
FROM tq_dev.internal_dev_csong_ssp.medical_headers_both
WHERE year >= 2024
AND bill_type_code LIKE '83%'
),
matched AS (
SELECT h.encounter_key, a.ssp_grouper
FROM asc_headers h
JOIN tq_dev.internal_dev_csong_ssp.medical_service_lines_both msl
ON h.encounter_key = msl.encounter_key
AND msl.year >= 2024
JOIN anchors a ON msl.procedure_code = a.anchor_code
WHERE msl.procedure_code IS NOT NULL
)
SELECT ssp_grouper, COUNT(DISTINCT encounter_key) AS encounter_count
FROM matched
GROUP BY 1
ORDER BY encounter_count DESC;
PRO-only visit encounters by SSP (0.1% visit-hash sample)
rand()-based visit sampling blew the per-node memory limit because the
GROUP BY visit_id runs across all 2.35 B PRO-only visits before sampling.
Switching to a deterministic visit-hash sample avoids the hash-builder
blowup — every row is filtered out in the scan, so the GROUP BY only sees
a 0.1% slice. Results are materialized into tmp_pro_only_visit_sample so
the downstream SSP join reuses the same sample.
-- Step 1: build the sample table (0.1% of visit_ids via xxhash64 mod 1000)
CREATE OR REPLACE TABLE tq_dev.internal_dev_csong_ssp.tmp_pro_only_visit_sample AS
WITH sampled_headers AS (
SELECT visit_id, encounter_key, claim_pos
FROM tq_dev.internal_dev_csong_ssp.medical_headers_both
WHERE year >= 2024
-- xxhash64(visit_id) mod 1000 == 0 → deterministic 0.1% sample of visit_ids
AND MOD(ABS(from_base(SUBSTR(to_hex(xxhash64(CAST(visit_id AS VARBINARY))), 1, 14), 16)), 1000) = 0
),
pro_only AS (
SELECT visit_id
FROM sampled_headers
GROUP BY visit_id
HAVING COUNT_IF(claim_pos != 'PRO') = 0
)
SELECT DISTINCT h.encounter_key, h.visit_id
FROM sampled_headers h
JOIN pro_only p ON h.visit_id = p.visit_id
WHERE h.claim_pos = 'PRO';
-- Step 2: overall PRO-only match rate
WITH anchors AS (
SELECT DISTINCT code AS anchor_code
FROM tq_dev.internal_dev_csong_ssp.ssp_groupings_v3
WHERE ssp_grouper IS NOT NULL AND type IN ('CPT', 'HCPCS')
),
sample_encs AS (
SELECT encounter_key
FROM tq_dev.internal_dev_csong_ssp.tmp_pro_only_visit_sample
),
matched AS (
SELECT DISTINCT s.encounter_key
FROM sample_encs s
JOIN tq_dev.internal_dev_csong_ssp.medical_service_lines_both msl
ON s.encounter_key = msl.encounter_key AND msl.year >= 2024
JOIN anchors a ON msl.procedure_code = a.anchor_code
)
SELECT
(SELECT COUNT(*) FROM sample_encs) AS total_sample,
(SELECT COUNT(*) FROM matched) AS matched_sample;
-- Step 3: PRO-only match by SSP (multiply by 1000 for full-population estimate)
WITH anchors AS (
SELECT DISTINCT ssp_grouper, code AS anchor_code
FROM tq_dev.internal_dev_csong_ssp.ssp_groupings_v3
WHERE ssp_grouper IS NOT NULL
AND type IN ('CPT', 'HCPCS')
),
matched AS (
SELECT s.encounter_key, a.ssp_grouper
FROM tq_dev.internal_dev_csong_ssp.tmp_pro_only_visit_sample s
JOIN tq_dev.internal_dev_csong_ssp.medical_service_lines_both msl
ON s.encounter_key = msl.encounter_key
AND msl.year >= 2024
JOIN anchors a ON msl.procedure_code = a.anchor_code
)
SELECT ssp_grouper, COUNT(DISTINCT encounter_key) AS encounter_count_sample
FROM matched
GROUP BY 1
ORDER BY encounter_count_sample DESC;