Overview
Goals:
- Maximize rate object coverage and rate accuracy
- Allow flexible rate canonicalization depending on the user's error tolerance
Plans:
- Build logic to create imputation tiers, evaluate validation scores, and select a canonical rate based on tier-hierarchy and validation scores
- QA the imputation, validation, and canonicalization logic
- Build a framework (i.e., identify metrics and set up a tool) to evaluate the coverage contribution and accuracy of imputation tiers; the results will be used to add, adjust, or remove imputation tiers and to re-order hierarchies
- Evaluate each tier, identify potential improvements, and implement them
- Build logic to compute probabilistic confidence scores at runtime, which will replace tier-hierarchy and validation scores as the basis for selecting the canonical rate
- Research additional imputation methodologies
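The first plan item (selecting a canonical rate from the tier-hierarchy and validation scores) can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Candidate` structure, the function name `select_canonical`, the tier names, and the integer scores are all hypothetical. It assumes a higher validation score is better and that the tier-hierarchy is used only to break ties between candidates with the same score.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One available rate for a rate object (hypothetical structure)."""
    tier: str               # name of a raw rate field or imputation tier
    rate: float
    validation_score: int   # manually assigned; higher is better (assumption)


def select_canonical(candidates: Sequence[Candidate],
                     tier_hierarchy: Sequence[str]) -> Optional[Candidate]:
    """Pick the highest validation score; break ties by tier-hierarchy order."""
    # Position in the hierarchy: a lower index means a more preferred tier.
    rank = {tier: i for i, tier in enumerate(tier_hierarchy)}
    # Ignore candidates whose tier is not listed in the hierarchy.
    eligible = [c for c in candidates if c.tier in rank]
    if not eligible:
        return None
    return min(eligible, key=lambda c: (-c.validation_score, rank[c.tier]))


# Illustrative names and numbers only.
hierarchy = ["hospital_raw", "payer_raw", "tier_a", "tier_b"]
candidates = [
    Candidate("tier_a", 118.0, validation_score=2),
    Candidate("payer_raw", 120.0, validation_score=2),
    Candidate("tier_b", 95.0, validation_score=1),
]
canonical = select_canonical(candidates, hierarchy)
print(canonical.tier, canonical.rate)  # payer_raw 120.0
```

Here `tier_a` and `payer_raw` share the top validation score, so the hierarchy order decides in favor of `payer_raw`. Once probabilistic confidence scores are available, the sort key could be swapped for the confidence score without changing the rest of the function.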
Definitions:
- Imputation Tier: an estimated rate, substitute rate, or derived rate that may be used when a rate object does not have a raw rate available
- Canonicalization / Canonical Rate: the canonical rate is the "best" rate that we've identified for the rate object; canonicalization is the process of selecting it
- Validation: a rate is validated if:
  - (a) the payer and hospital reported a similar rate
  - (b) the rate is within the [x, y] percentiles of claims (CBSA-level, NPI-level)
  - (c) the rate is within [x%, y%] of the Medicare rate (state-level, NPI-level)
- Validation Score: a manually assigned score that depends on the validation methodology; validations based on the payer and hospital reporting similar rates receive the highest score
- Tier-Hierarchy: an ordered list of raw rate fields and imputation tiers; the order determines which rate is used when multiple raw rate fields or imputation tiers are available with the same validation score
- Accuracy / Confidence Score: metrics used to estimate the likelihood that a rate is correct
- Coverage / Coverage Contribution: a rate object is "covered" by a tier if that tier provides a rate for it. "Coverage" is the number or percentage of rate objects covered by the tier. "The coverage contribution of tier X beyond tier Y" is the amount of coverage that X offers for rate objects that are not covered by tier Y. Typically, coverage contribution will be used to evaluate an imputation tier's contribution beyond raw rates: a tier with high accuracy is not valuable if it only offers coverage where rates are already available. Another way to assess this overlap would be "coverage correlation".
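The coverage definitions above map naturally onto set operations. The sketch below is illustrative only: the function names, tier names, and rate-object IDs are hypothetical, and it assumes coverage is reported as a fraction of all rate objects (the definition equally allows a raw count).

```python
from typing import Set


def coverage(covered: Set[str], all_objects: Set[str]) -> float:
    """Fraction of rate objects for which the tier provides a rate."""
    if not all_objects:
        raise ValueError("all_objects must not be empty")
    return len(covered & all_objects) / len(all_objects)


def coverage_contribution(x: Set[str], y: Set[str],
                          all_objects: Set[str]) -> float:
    """Fraction of all rate objects covered by tier X but not by tier Y."""
    if not all_objects:
        raise ValueError("all_objects must not be empty")
    return len((x - y) & all_objects) / len(all_objects)


# Illustrative rate-object IDs only.
all_objects = {"ro1", "ro2", "ro3", "ro4", "ro5"}
raw = {"ro1", "ro2"}              # objects with a raw rate
tier_a = {"ro1", "ro2", "ro3"}    # adds ro3 beyond raw rates
tier_b = {"ro1", "ro2"}           # only overlaps with raw rates

print(coverage(tier_a, all_objects))                    # 0.6
print(coverage_contribution(tier_a, raw, all_objects))  # 0.2
print(coverage_contribution(tier_b, raw, all_objects))  # 0.0
```

In this toy data `tier_b` has nonzero coverage but zero contribution beyond raw rates, which is the case the definition flags as not valuable regardless of accuracy.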