Skip to main content

Spines

Labs Spine

Inclusion Criteria

  • The Labs spine was created based on common taxonomy codes associated with Lab NPIs. The Life Sciences team has been using these in their products via the provider_profiles table, which we then validated in claims data by looking at the taxonomies that commonly bill lab codes, plus made it refreshable.
TAXONOMY_SPECS = [
{'code': '291U00000X'},
{'code': '293D00000X'},
{'code': '292200000X'},
{'code': '291900000X'}
]
  • These taxonomy codes are looped through the 15 taxonomy switch columns in tq_production.reference_legacy.ref_cms_nppes_npi, searching for the first Y primary taxonomy value per type 2 NPI. If none are reported, it falls back on the first non-null taxonomy found. Taxonomy details come from tq_production.reference_legacy.ref_nucc_taxonomy_code.
CREATE OR REPLACE TABLE {{ schema }}.tmp_nppes_taxonomy_labs_{{ sub_version }} AS
SELECT
CAST(npi AS VARCHAR) AS npi,
provider_organization_name_legal_business_name,
provider_other_organization_name,
provider_business_practice_location_address_state_name,
provider_first_line_business_practice_location_address,
provider_business_practice_location_address_city_name,
provider_business_practice_location_address_postal_code,
entity_type_code,
{% set num_codes = 15 %}
CASE
{% for i in range(1, num_codes + 1) %}
WHEN healthcare_provider_primary_taxonomy_switch_{{ i }} = 'Y'
AND healthcare_provider_taxonomy_code_{{ i }} IS NOT NULL
THEN healthcare_provider_taxonomy_code_{{ i }}
{% endfor %}
ELSE COALESCE(
{% for i in range(1, num_codes + 1) %}
healthcare_provider_taxonomy_code_{{ i }}{{ "," if i < num_codes else "" }}
{% endfor %}
)
END AS selected_taxonomy_code
FROM {{ ref_cms_nppes_npi }};

  • Lab billing is nuanced and requires surfacing rates for all payers, rather than limiting to the payers in the state that the NPI is located. In order to target the high impact labs, we are including a field in spines called national_payer_coverage to be used as an identifier in Clear Rates. This also allows a clean framework to standardize naming conventions (i.e. Quest Diagnostics Inc and Quest Diagnostics will both group to the same name/provider_id). The inclusion list came from Guardant Health's requested provider list.
NATIONAL_PROVIDERS = {
"QUEST DIAGNOSTICS": "Quest Diagnostics",
"LABORATORY CORP": "LabCorp",
"GUARDANT HEALTH": "Guardant Health",
"BIOREFERENCE": "BioReference",
"MYRIAD GENETIC": "Myriad Genetics",
"TEMPUS": "Tempus",
"CARIS": "Caris",
"NEOGENOMICS": "NeoGenomics",
"INIVATA": "Inivata", # Acquired by NeoGenomics but still in NPPES
"FOUNDATION MEDICINE": "Foundation Medicine",
"NATERA": "Natera",
"BILLIONTOONE": "BillionToOne",
"EXACT SCIENCES": "Exact Sciences",
"INFORM DIAGNOSTICS": "Inform Diagnostics"
}
WITH provider_grouping AS (
-- Map each provider name to its canonical grouped name based on wildcard patterns
SELECT
provider_name,
provider_state,
CASE
{%- for provider in national_providers %}
WHEN UPPER(provider_name) LIKE '{{ provider }}%' THEN '{{ national_providers[provider] }}'
{%- endfor %}
ELSE provider_name
END as canonical_provider_name
FROM {{ schema }}.tmp_nppes_base_labs_{{ sub_version }}
)
    CASE
WHEN (
{%- for canonical_name in national_providers.values() %}
tpns.canonical_provider_name = '{{ canonical_name }}'{{ " OR" if not loop.last else "" }}
{%- endfor %}
)
THEN tpns.canonical_provider_name
ELSE REGEXP_REPLACE(
LOWER(tpns.canonical_provider_name),
'(^|[^a-z])([a-z])',
x -> x[1] || UPPER(x[2])
)
END as provider_name,

Structure

  • Labs follows a similar structure to the existing spines table, by splitting the NPIs into two tables that can be traced through via provider_id.
  • The tables are tq_production.spines.spines_provider_labs and tq_production.spines.spines_provider_labs_additional_npis.
  • The primary NPI for a provider_id that will be used in the main table is determined by the lowest numerical provider_id value (created by hashing primary NPI). This is grouped by the canonical_provider_name (cleaned version using `national_providers) and the state. This means that there will be separate provider IDs for Quest NY vs Quest CA for example.
    ROW_NUMBER() OVER (PARTITION BY dp.canonical_provider_name, bp.provider_state ORDER BY bp.npi ASC) as primary_rank,
MIN(bp.provider_id) OVER (PARTITION BY dp.canonical_provider_name, bp.provider_state) as group_provider_id
    CAST(from_big_endian_64(xxhash64(CAST(
npi
AS varbinary
))) AS VARCHAR) as provider_id,
  • The provider_parent_system field will be populated for labs that have more than 5 registered NPIs.
WITH npi_counts AS (
-- Count total NPIs per canonical provider
SELECT
canonical_provider_name,
COUNT(DISTINCT npi) as npi_count
FROM {{ schema }}.tmp_nppes_primary_labs_{{ sub_version }}
GROUP BY canonical_provider_name
)
    CASE
WHEN (
{%- for canonical_name in national_providers.values() %}
tpns.canonical_provider_name = '{{ canonical_name }}'{{ " OR" if not loop.last else "" }}
{%- endfor %}
)
AND nc.npi_count > 5
THEN tpns.canonical_provider_name
ELSE NULL
END as provider_parent_system
On this page: