
OPG Base Rate Tuning

Summary

Problems:

SEs have raised concerns about the accuracy of some OPG base rates. In particular:

  • some OPG rates appear worse than raw rates
  • some OPG rates are too low
  • some OPG rates are inferred from a very small sample of posted MRF rates

Solutions:

  1. Tighten the conditions under which OPG base rates are imputed.
  2. Implement a lower outlier bound for OP rates in general (effective starting v2.3.0)

Background

The OPG (Outpatient Procedure Grouper) base rate imputation tier uses payer-specific mappings from HCPCS codes to Outpatient Procedure Groups to infer the corresponding base rates. These mappings are sourced directly from our first-party data.

The CLD methodology for making these inferences is described in detail here.

The following fields are used in the Clear Rates pipeline to evaluate candidate base rates.

Definitions:

  • opg: the opg group name
  • opg_base_rate: the base rate for the opg group
    • this field is only populated if the OPG imputation conditions are met
  • opg_candidate_base_rate: the candidate base rate for the opg group
    • this field is always populated with the mode rate for the opg group
  • opg_n_freq: the frequency count of the mode rate for the opg group
  • opg_n_total: the total count of rates available for the opg group
  • opg_n_total_possible: the total possible count for the opg group

Solution

Current Logic:

  • opg_base_rate is populated whenever opg_n_freq / opg_n_total > 0.8

The current logic results in some OPG base rates being inferred from a very small number of posted MRF rates. For example, if there are only 2 posted MRF rates for an OPG group, and both rates are the same, then the OPG base rate will be set to that rate, even though the sample size is very small. We used to have a condition that required opg_n_total > 15, but this was removed in a previous update.

Proposed Logic: populate opg_base_rate if either of the following is true

  • opg_n_freq / opg_n_total > 0.8 AND opg_n_total > 100
  • opg_n_freq / opg_n_total_possible > 0.5

We should bring back the minimum sample size condition and raise it from 15 to 100. This ensures that we have enough posted MRF rates to make a reliable inference. In addition, we should add a new condition that requires the frequency count to be more than 50% of the total possible count. This helps in cases where the total count is small but the mode rate still covers most of the group. As an example, some OPG groups have only 5 possible rates; if 3 of those rates are the same (3/5 = 0.6 > 0.5), the OPG base rate will be set to that rate, even though the sample size is small.
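The proposed rule can be written as a small predicate. This is a minimal sketch for illustration only: the function name and keyword parameters are hypothetical, and the default thresholds are the ones listed under Proposed Logic (0.8, 100, 0.5).

```python
def should_impute_opg_base_rate(
    opg_n_freq: int,
    opg_n_total: int,
    opg_n_total_possible: int,
    min_share: float = 0.8,
    min_total: int = 100,
    min_possible_share: float = 0.5,
) -> bool:
    """Return True if either proposed imputation condition holds (hypothetical helper)."""
    # Guard against division by zero when a group has no rates at all.
    if opg_n_total <= 0 or opg_n_total_possible <= 0:
        return False
    # Condition 1: the mode rate dominates a sufficiently large sample.
    dominant_and_large = (
        opg_n_freq / opg_n_total > min_share and opg_n_total > min_total
    )
    # Condition 2: the mode rate covers more than half of all possible codes.
    covers_possible = opg_n_freq / opg_n_total_possible > min_possible_share
    return dominant_and_large or covers_possible


# Two identical posted rates in a 440-code group: imputed under the current logic,
# rejected under the proposed logic.
print(should_impute_opg_base_rate(2, 2, 440))
# Small group where the mode covers 3 of 5 possible codes: accepted.
print(should_impute_opg_base_rate(3, 3, 5))
```

Both comparisons are strict, matching the `>` operators in the conditions above, so a group with exactly 100 rates or exactly an 80% share does not qualify.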

Analysis

The purpose of this analysis is to evaluate the following:

  1. With the proposed logic, how many OPG base rates would we lose?
  2. Are the proposed conditions reasonable?
    • what is the distribution of opg_n_total_possible?
    • what is the distribution of opg_n_freq / opg_n_total_possible?
  3. Are the imputations reasonable?
  4. Are imputed OPGs monotonic increasing?

1. Number of OPG Base Rates Lost

version    n_opg_imputed_rates
old        90324394
new        63009127

We would lose 90,324,394 - 63,009,127 = 27,315,267 OPG base rates (roughly 27m of 90m, or about 30%).

Code
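import pandas as pd

# trino_conn is assumed to be an open Trino connection created earlier in the session
df = pd.read_sql("""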
SELECT COUNT(*) n_opg_imputed_rates, 'old' as version
FROM tq_dev.internal_dev_csong_cld_v2_2_2.tmp_int_imputations_derived_2025_09
WHERE 1.00 * opg_n_freq / opg_n_total > 0.8
UNION ALL
SELECT COUNT(*) n_opg_imputed_rates, 'new' as version
FROM tq_dev.internal_dev_csong_cld_v2_2_2.tmp_int_imputations_derived_2025_09
WHERE (1.00 * opg_n_freq / opg_n_total > 0.8 AND opg_n_total > 100)
OR (1.00 * opg_n_freq / opg_n_total_possible > 0.5)
""", con=trino_conn)

2. Are the Proposed Conditions Reasonable?

There are 66 unique OPG groups. The table below shows the distribution of the total possible number of codes per OPG group. On average, a group has about 440 possible codes; the median is about 318.

         opg_n_total_possible
count    66
mean     440.303
std      429.128
min      3
1%       3
10%      22.5
20%      64
30%      155
40%      251
50%      317.5
60%      449
70%      566
80%      663
90%      985
95%      1187.75
99%      1845.45
max      2045

In general, we impute if more than 80% of reported codes share the same rate and there are more than 100 reported codes.

Alternatively, we also impute if more than 50% of the total possible codes have the same rate. For smaller groups (e.g. 10% of groups have fewer than about 22 possible codes), this would require only about 11 matching codes.
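To make the second condition concrete, the sketch below computes the smallest mode count that satisfies opg_n_freq / opg_n_total_possible > 0.5 for a few group sizes. The helper name is hypothetical; the group sizes 3, 64, and 2045 are the min, 20th percentile, and max from the table above, and 21 stands in for a group just under the 10th percentile.

```python
import math


def min_matching_codes(opg_n_total_possible: int, threshold: float = 0.5) -> int:
    """Smallest opg_n_freq with opg_n_freq / opg_n_total_possible > threshold."""
    # The comparison is strict, so take the floor and add one.
    return math.floor(threshold * opg_n_total_possible) + 1


for n_possible in (3, 21, 64, 2045):
    print(n_possible, min_matching_codes(n_possible))
```

For the smallest groups the requirement is only 2 matching codes, which is worth keeping in mind when judging whether 0.5 is strict enough on its own.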

Isn't this just a guess?

How can we be sure that these conditions are reasonable? We could analyze the imputed rates and check whether "bad" imputations are related to opg_n_freq / opg_n_total or to opg_n_total_possible.

One approach is to tune params such that we minimize the number of non-monotonic groups (see section 4).
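One way this tuning could look is a small grid search that, for each parameter set, reports both how many rates are kept and how many groups still contain a decrease. This is only a sketch under assumptions: the DataFrame is toy data, the function name and grid values are hypothetical, and the column names mirror those in the imputation table. Reporting the kept count alongside the violation count matters, because minimizing violations alone would favor discarding everything.

```python
import itertools

import pandas as pd

KEYS = ["payer_id", "network_id", "provider_id"]


def evaluate_thresholds(df, min_share, min_total, min_possible_share):
    """Return (rates kept, groups with at least one decrease) for one parameter set."""
    share = df["opg_n_freq"] / df["opg_n_total"]
    possible_share = df["opg_n_freq"] / df["opg_n_total_possible"]
    keep = ((share > min_share) & (df["opg_n_total"] > min_total)) | (
        possible_share > min_possible_share
    )
    kept = df[keep].sort_values(KEYS + ["opg"])
    # A violation is a rate lower than the previous kept OPG's rate in the same group.
    diffs = kept.groupby(KEYS)["opg_candidate_base_rate"].diff()
    n_bad_groups = kept.loc[diffs < 0, KEYS].drop_duplicates().shape[0]
    return len(kept), n_bad_groups


# Toy data (hypothetical): provider 100 has a dip at OPG 2 backed by only 3 rates.
toy = pd.DataFrame({
    "payer_id": [1] * 6,
    "network_id": [10] * 6,
    "provider_id": [100, 100, 100, 200, 200, 200],
    "opg": [1, 2, 3, 1, 2, 3],
    "opg_candidate_base_rate": [500.0, 400.0, 900.0, 300.0, 600.0, 800.0],
    "opg_n_freq": [190, 3, 180, 150, 160, 170],
    "opg_n_total": [200, 3, 200, 160, 170, 180],
    "opg_n_total_possible": [440] * 6,
})

rows = []
for min_total, min_possible_share in itertools.product([0, 15, 100], [0.2, 0.5]):
    n_kept, n_bad = evaluate_thresholds(toy, 0.8, min_total, min_possible_share)
    rows.append({
        "min_total": min_total,
        "min_possible_share": min_possible_share,
        "n_kept": n_kept,
        "n_non_monotonic_groups": n_bad,
    })
print(pd.DataFrame(rows))
```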

3. Are the imputations reasonable?

United

The plot below shows box plots for United OPGs, where each unit represents a provider-network.

[Figure: box plots of United OPG base rates, one observation per provider-network]

Sample Providers

Here is a random sample of 10 provider-networks, color-coded by their slope.

Note one of the examples "dips" from OPG 3 to OPG 4, then goes back up for OPG 5. We should look for decreases like this, as they may indicate bad imputations.
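A check for this kind of dip on a single provider-network can be sketched as follows. The function name and the example rates are hypothetical and chosen only to reproduce the OPG 3 to OPG 4 pattern described above.

```python
from __future__ import annotations


def find_dips(rates_by_opg: dict[int, float]) -> list[tuple[int, int]]:
    """Return (previous_opg, opg) pairs where the rate decreases."""
    ordered = sorted(rates_by_opg.items())
    return [
        (prev_opg, opg)
        for (prev_opg, prev_rate), (opg, rate) in zip(ordered, ordered[1:])
        if rate < prev_rate
    ]


# Hypothetical rates: the rate falls from OPG 3 to OPG 4, then recovers at OPG 5.
example = {1: 800.0, 2: 1200.0, 3: 2100.0, 4: 1700.0, 5: 2600.0}
print(find_dips(example))  # [(3, 4)]
```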

[Figure: OPG base rates for 10 sampled provider-networks, color-coded by slope]

4. Are imputed OPGs monotonic increasing?

About 10% of payer-network-provider groups have at least one violation, where the candidate base rate for an OPG group is less than the candidate base rate for the previous OPG group.

We should review these violations and remove them.

monotonic_increasing_groups    non_monotonic_groups
23542                          1970
WITH filtered AS (
SELECT
payer_id,
network_id,
provider_id,
CAST(opg AS integer) AS opg,
opg_candidate_base_rate
FROM tq_dev.internal_dev_csong_cld_v2_2_2.tmp_int_imputations_derived_2025_09
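WHERE
-- Outer parentheses are required: AND binds tighter than OR, and the
-- opg < 8 filter must apply to both imputation conditions.
(
(
1.0 * opg_n_freq / opg_n_total > 0.8
AND opg_n_total > 100
)
OR (
1.0 * opg_n_freq / opg_n_total_possible > 0.5
)
)
AND CAST(opg AS integer) < 8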
),
with_violations AS (
SELECT
payer_id,
network_id,
provider_id,
CASE
WHEN opg_candidate_base_rate < LAG(opg_candidate_base_rate) OVER (
PARTITION BY payer_id, network_id, provider_id
ORDER BY opg
)
THEN 1
ELSE 0
END AS violation
FROM filtered
),
group_flags AS (
SELECT
payer_id,
network_id,
provider_id,
MAX(violation) AS has_violation
FROM with_violations
GROUP BY
payer_id,
network_id,
provider_id
)
SELECT
SUM(CASE WHEN has_violation = 0 THEN 1 ELSE 0 END) AS monotonic_increasing_groups,
SUM(CASE WHEN has_violation = 1 THEN 1 ELSE 0 END) AS non_monotonic_groups
FROM group_flags

OPG Group Bounds

In the table below, we see the distribution of OPG group bounds for UHC hospitals.

  • p05 and p95 are the 5th and 95th percentiles of posted MRF rates for each OPG group.
  • med_p05 and med_p95 are the medicare 5th and 95th percentiles for each OPG group.
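As an illustration of how such bounds could be derived, the sketch below computes p05 and p95 per OPG group with pandas and flags rates that fall below p05. The data is made up, and using p05 as the lower outlier bound is an assumption made here for illustration; the source does not state which bound v2.3.0 uses.

```python
import pandas as pd

# Toy posted rates for two OPG groups (hypothetical values).
rates = pd.DataFrame({
    "opg": [1] * 5 + [2] * 5,
    "rate": [100.0, 200.0, 300.0, 400.0, 500.0,
             1000.0, 1100.0, 1200.0, 1300.0, 1400.0],
})

# 5th and 95th percentiles of posted rates for each OPG group.
bounds = rates.groupby("opg")["rate"].agg(
    p05=lambda s: s.quantile(0.05),
    p95=lambda s: s.quantile(0.95),
)

# Attach the bounds to each rate and flag rates below the group's p05.
flagged = rates.merge(bounds, left_on="opg", right_index=True)
flagged["below_lower_bound"] = flagged["rate"] < flagged["p05"]
print(bounds)
print(flagged[flagged["below_lower_bound"]])
```

The same aggregation could be run on Medicare rates to produce med_p05 and med_p95 for comparison.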

[Table: distribution of OPG group bounds for UHC hospitals]

Here are 20 example contracts:

[Figure: 20 example contracts]