Overview
Before: Charge-Based Severity Assignment
This approach is simple and is what CMS does today, but it has two problems:
- Circularity: The charge depends on which tier we assigned, and the tier in turn depends on the charge.
- Availability: We might not have charges in the future!
The Resource Intensity Index (RII)
Instead, we look at what the clinical encounter actually looked like (ICU admission, length of stay, ventilator use, number of providers involved, etc.) and derive a severity score from those signals.
For every diagnosis and procedure code (ICD-10-CM and ICD-10-PCS) within an SSP, we compute 14 features from claims data.
We then compress those 14 signals into a single intensity score and group codes into tiers by that score.
The 14 features
Features are listed in approximate order of importance.
| Column | Type | Definition | Why it signals intensity |
|---|---|---|---|
| avg_length_of_stay | Length of Stay | Mean days from statement_from to statement_to | LOS is the primary driver of inpatient cost; longer stays compound every other per-day resource category and are the most consistent predictor of overall acuity |
| rate_icu | Critical Care | Share of encounters with ICU rev codes (0200–0209) | ICU care requires the highest nursing ratios, continuous monitoring, and specialized equipment; the single strongest per-day cost driver |
| rate_mechanical_ventilation | Critical Care | Share with ICD-10-PCS 5A19* (ventilation) | Invasive ventilation mandates ICU-level care, respiratory therapy, and sedation management; a near-certain indicator of extreme acuity |
| rate_dialysis | Critical Care | Share with dialysis rev codes (0800–0859) or CPT 90935–90947 | Acute renal failure requiring dialysis signals multi-organ involvement and substantially increases nursing, nephrologist, and supply costs |
| avg_relative_claim_charge | Cost | Charge ÷ avg charge for same SSP at same provider | Normalizing to the same SSP at the same facility isolates how much more resource-intensive a code's encounters are relative to a typical case, removing hospital price-level confounding |
| avg_work_rvu | RVU | Mean summed work RVU across service lines | Captures physician time, technical skill, and clinical effort using a standardized national scale; a price-level-independent measure of procedural complexity |
| avg_organ_system_count | Comorbidity Burden | Mean number of distinct ICD-10 organ system chapters present | More organ systems involved means broader disease involvement, more subspecialists, and greater care coordination complexity; strongly correlated with LOS and total cost |
| rate_or | Procedural Intensity | Share with OR rev codes (0360–0369) | OR time involves surgeons, anesthesiologists, scrub techs, and sterile supplies; one of the largest single-event cost drivers |
| rate_multiple_or_days | Procedural Intensity | Share with OR on more than one calendar day | Multiple OR days indicate complications, staged procedures, or wound management; each return to the OR is a discrete high-cost event |
| avg_n_distinct_rev_codes | Utilization Breadth | Mean count of distinct revenue codes per encounter | More revenue code types means a wider variety of services rendered; captures overall care complexity more holistically than any single service |
| avg_n_j_code_lines | Pharmacy / Infusion | Mean J-code or rev-0636 drug lines | J-codes represent high-cost injectable and infused medications; a higher count signals both pharmaceutical cost and clinical complexity |
| avg_n_distinct_hcps | Care Team Size | Mean distinct NPIs (claim-level + service-line) | More distinct providers billing for a case reflects greater subspecialty involvement; complex, high-acuity cases routinely involve hospitalists, surgeons, and multiple consultants |
| avg_n_imaging_lines | Imaging | Mean imaging service lines (CT rev 0350–0359; MRI rev 0610–0619) | More imaging studies mean higher radiology costs and suggest a diagnostic workup intensive enough to require repeated or multimodal evaluation |
| rate_transfusion | Transfusion | Share with transfusion rev codes (0380–0399) or CPT 36430 | Transfusions indicate significant blood loss, severe anemia, or surgical complications; all are markers of high-acuity encounters with elevated supply costs |
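
To make the table's definitions concrete, here is a minimal sketch of how two of the features (avg_length_of_stay and rate_icu) could be aggregated from encounter records to one row per code. The `Encounter` record, its field names, and the helper functions are hypothetical and not the actual claims schema; the sketch also assumes all encounters passed in already belong to a single SSP.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Encounter:
    """Hypothetical flattened claim record; field names are illustrative only."""
    codes: list[str]        # ICD-10-CM / ICD-10-PCS codes on the claim
    statement_from: date
    statement_to: date
    rev_codes: list[str]    # four-digit revenue codes as strings


def is_icu(rev_code: str) -> bool:
    # ICU revenue codes 0200-0209, as defined in the feature table.
    return 200 <= int(rev_code) <= 209


def code_features(encounters: list[Encounter]) -> dict[str, dict[str, float]]:
    """Aggregate encounter-level signals up to one feature row per code."""
    by_code: dict[str, list[Encounter]] = defaultdict(list)
    for enc in encounters:
        for code in set(enc.codes):   # count each encounter once per code
            by_code[code].append(enc)

    features = {}
    for code, encs in by_code.items():
        features[code] = {
            # Mean days from statement_from to statement_to
            "avg_length_of_stay": mean(
                (e.statement_to - e.statement_from).days for e in encs
            ),
            # Share of encounters with at least one ICU revenue code
            "rate_icu": mean(
                1.0 if any(is_icu(r) for r in e.rev_codes) else 0.0 for e in encs
            ),
        }
    return features


# Toy encounters, invented for illustration
encounters = [
    Encounter(["J96.01", "5A1945Z"], date(2024, 1, 1), date(2024, 1, 11), ["0200", "0360"]),
    Encounter(["J96.01"], date(2024, 2, 1), date(2024, 2, 5), ["0120"]),
    Encounter(["I10"], date(2024, 3, 1), date(2024, 3, 3), ["0120"]),
]
features = code_features(encounters)
```

The remaining rate_* and avg_* features would follow the same pattern: a per-encounter flag or count, averaged over all encounters carrying the code.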
Other possible features
- average number of lines
- average total units
- anesthesia (rate and time units)
- recovery room rev codes
- observation rev codes
- ED rev codes
- complexity modifiers (e.g. 22, 23, 24)
- CT and MRI rate
- discharge to SNF/rehab rev codes
- average claim charge amount (not normalized to same SSP/provider)
- pharmacy cost intensity
- readmission rate
- mortality
Method 1 — Latent Variable Model (unsupervised)
- Uses all 14 correlated signals to extract a single dimension capturing shared variation.
- Assumes the resulting latent variable empirically aligns with overall resource intensity.
- Effectively answers: “What one number best summarizes all 14 signals?”
Pros:
- It is not anchored to charges; it relies on the 14 signals to find the underlying dimension of intensity
Cons:
- The latent variable might not perfectly align with true resource intensity
- The model prioritizes high-variance signals, which may not always be the strongest indicators of resource intensity
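
The source does not say which latent variable model is used. As one illustration of "one number that best summarizes all signals", the sketch below takes the first principal component of the standardized features, computed with plain power iteration. The choice of PCA, the z-scoring step, the function names, and the toy data (three signals instead of 14) are all assumptions made for the example.

```python
from statistics import mean, pstdev


def standardize(rows: list[list[float]]) -> list[list[float]]:
    """Z-score each column so signals are on a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]   # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def first_component(z: list[list[float]], iters: int = 200) -> list[float]:
    """Leading eigenvector of the covariance matrix, via power iteration."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Orient the axis so that a higher score means more intense
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def latent_scores(rows: list[list[float]]) -> list[float]:
    """Project each code's standardized features onto the first component."""
    z = standardize(rows)
    v = first_component(z)
    return [sum(a * b for a, b in zip(row, v)) for row in z]


# Toy data: one row per code; columns stand in for three of the 14 signals
rows = [
    [2.0, 0.05, 0.01],
    [4.0, 0.10, 0.05],
    [7.0, 0.40, 0.20],
    [12.0, 0.80, 0.60],
]
scores = latent_scores(rows)
```

Whether to standardize first is itself a modeling decision: without it, signals measured on larger scales would dominate the component.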
Method 2 — Regression (supervised)
- Uses 13 clinical signals to predict the 14th: relative charge
- The predicted charge becomes the intensity score
Pros:
- Explicitly targets the outcome (charges)
Cons:
- Depends on charges, which may not align with true resource intensity
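
As a rough illustration of the supervised variant, the sketch below fits an ordinary least-squares model of relative charge on clinical signals and uses the fitted values as the intensity score. The source does not specify the regression type, so plain OLS is an assumption, as are the function names and the toy data (two predictors instead of 13).

```python
def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):   # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(features: list[list[float]], target: list[float]) -> list[float]:
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + row for row in features]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, target)) for i in range(p)]
    return solve(xtx, xty)


def predict(coefs: list[float], features: list[list[float]]) -> list[float]:
    """Predicted relative charge, used here as the intensity score."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in features]


# Toy data: one row per code; columns stand in for two of the 13 clinical signals
clinical = [[2.0, 0.05], [4.0, 0.10], [7.0, 0.40], [12.0, 0.80], [5.0, 0.50]]
relative_charge = [0.6, 0.8, 1.3, 2.1, 1.2]
coefs = fit_ols(clinical, relative_charge)
intensity = predict(coefs, clinical)
```

Using the predicted rather than the observed charge means a code's score reflects only the part of its charge that the clinical signals can explain.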
From score to tier
Both methods produce the same output: a continuous intensity score per code.
We then split the scores into categorical tiers. The number of tiers is chosen automatically per SSP based on how cleanly the codes separate.
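
The source does not describe how the number of tiers is chosen. Purely as a stand-in, the sketch below cuts the sorted scores wherever the gap to the next score is much larger than the typical gap, so the tier count falls out of how cleanly the scores separate. The gap heuristic, the `gap_factor` parameter, and the code labels are assumptions for illustration, not the method actually used.

```python
from statistics import median


def assign_tiers(scores: dict[str, float], gap_factor: float = 3.0) -> dict[str, int]:
    """Start a new tier wherever the gap between neighbors is unusually large.

    Hypothetical heuristic: a gap counts as a tier boundary when it exceeds
    gap_factor times the median gap between consecutive sorted scores.
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1])
    gaps = [b[1] - a[1] for a, b in zip(ordered, ordered[1:])]
    typical = median(gaps) if gaps else 0.0

    tiers: dict[str, int] = {}
    tier = 1
    for i, (code, _) in enumerate(ordered):
        if i > 0 and typical > 0 and gaps[i - 1] > gap_factor * typical:
            tier += 1
        tiers[code] = tier
    return tiers


# Toy intensity scores for six codes within one SSP
scores = {"A": 0.10, "B": 0.12, "C": 0.15, "D": 0.90, "E": 0.95, "F": 2.40}
tiers = assign_tiers(scores)
```

With these toy scores the two large gaps (before D and before F) produce three tiers; an SSP whose scores are evenly spread would stay in a single tier under this heuristic.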
Data source
All signals are derived from Komodo institutional claims.