Health Data Interoperability — The Vocabulary

In the previous article in this series, we discussed why Health and Wellness (HnW) Data Interoperability (HDI) matters. We understood why it makes a lot of sense for health data to be interoperable.

Health Data Interoperability — The What And the Why.

But how do we achieve interoperability in health data?

In one word — “Standards”.

It could have been the end of the article here. But this is a proverbial and superlative example of things that are ‘easier said than done’.

“Standards” are important from both a technical and a regulatory point of view. When discussing interoperability in Health and Wellness, we must talk about standards pertinent to the following domains.

  • Vocabulary.
  • Data exchange.
  • User Consent.
  • Data Security.

For each of these broad domains, a great many standards have evolved, each uniquely suited to its purpose. There have also been many efforts to unify these standards into one all-purpose, universal standard for each domain. But that is a long and complex process.

So, while we patiently wait for the emergence of the One Ring (standard) to Rule Them All, let’s delve a little into the current state of the art, starting with standards for health data vocabulary.

Warning: Explaining vocabulary can become a little dry at times. But worry not! When the going gets tough, I’ll wrap it up in a code block like this:

Thee can skipeth this unless thou art very much keen! 
Or in non-Shakespearean terms, TL; DR

Vocabulary Standards

For HnW data to make sense, using a standard vocabulary is the first must, unless you value absolute simplicity even at the cost of accuracy …

Consider the case of the word Rubber. In India, the word means the eraser that children use to rub out pencil marks. In the USA, rubber is generally understood to mean the material, like the one in an automobile tire. These two objects not only have different uses but also cater to rather non-overlapping age groups. One can imagine how many embarrassing, if funny, confusions this can lead to. We don’t want that kind of confusion between our doctor, pharmacy and path lab about our diagnosis or the name of our prescription medicine.

Vocabulary standards ensure that the healthcare data is accurately and consistently represented and understood by different healthcare systems.

Over the years, many vocabulary standards have evolved to provide a standard terminology for health concepts. Each of them is designed for a specific use case, for example, classifying medicines or providing standard codes for medical procedures. In the coming sections, let’s discuss a few of the most widely used standards. The aim of this discussion is not to gain mastery over the lore of health data vocabulary but rather to develop an appreciation for the complex problem they are trying to solve and the beauty with which they are able to solve it.


International Classification of Diseases

The ICD standard is used worldwide as the international standard for reporting and monitoring diseases and health conditions. It is maintained by the World Health Organization (WHO) and is used by over 100 countries.

ICD provides a system of diagnostic codes for classifying diseases, including a variety of signs, symptoms, abnormal findings, social circumstances, external causes, etc. The system is organised hierarchically, with the highest level being chapters, followed by sections, categories, and subcategories. Each level provides increasing detail about a particular condition.

For example 

Here is how two kinds of headaches
are classified under ICD version 11.

Headache disorders - code 8A80 to 8A8Z
> Migraine - code 8A80
> Migraine with aura - code 8A80.1
> Migraine with aura, unspecified - code 8A80.1Z

Headache disorders - code 8A80 to 8A8Z
> Secondary headache - code 8A84
> Has causing condition: Mycobacterial diseases, unspecified - code 1B2Z
> Cluster code - 8A84/1B2Z
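
For the keen reader, the hierarchy and the cluster code above can be sketched in a few lines of Python. This is only an illustration: the helper names `split_cluster` and `parent_codes` are made up for this article, and trimming characters off a code is a simplification that happens to work for the examples shown here. The authoritative parent-child links live in the WHO classification itself.

```python
def split_cluster(cluster: str) -> list[str]:
    """Split a cluster code such as '8A84/1B2Z' into its individual codes."""
    return [part for part in cluster.split("/") if part]


def parent_codes(code: str) -> list[str]:
    """Return the broader codes found by trimming the part after the dot.

    Assumption (not an official rule): a code looks like 'XXXX' or 'XXXX.YY',
    and each character after the dot adds one level of detail.
    """
    stem, _, extension = code.partition(".")
    # Drop one trailing character at a time: '1Z' -> '1'.
    parents = [f"{stem}.{extension[:i]}" for i in range(len(extension) - 1, 0, -1)]
    if extension:
        parents.append(stem)  # finally, the category without any extension
    return parents


print(parent_codes("8A80.1Z"))     # ['8A80.1', '8A80']
print(split_cluster("8A84/1B2Z"))  # ['8A84', '1B2Z']
```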

ICD is used by a variety of stakeholders, including healthcare providers, researchers, and policymakers worldwide to collect, analyse, and share health data. It is also used for public health surveillance for disease outbreaks and the management of healthcare resources. The system is regularly updated by the WHO with new medical knowledge and changes in medical practices.


Systematised Nomenclature of Medicine Clinical Terms

SNOMED CT is a comprehensive clinical terminology system. It provides a standardised way of representing a hierarchy of clinical concepts and their relationships. For example, general concepts, such as “disorders”, are subdivided into more specific concepts, such as “cardiovascular disorders”. These are further subdivided into even more specific concepts, such as “myocardial infarction”. I don’t know what any of these disorders mean, but it illustrates the point.

One of the defining features of SNOMED CT is its ability to represent complex relationships between clinical concepts, like so

Concept_1 — Relation_x — Concept_2.

These relationships help us understand the concepts’ semantic meaning and their interrelatedness.

For example, let's take these three concepts. 

Concept 1: Common cold: code 82272006
Concept 2: Rhinovirus: code 415810000
Concept 3: Viral infectious disease: code 1400000

In SNOMED CT, there are relationships
between these three concepts that
describe the causative agent and
pathological process of the common cold.

Common cold: 82272006
> Has causative agent: 246075003
> Rhinovirus: 415810000

Rhinovirus: 415810000
> Is a: 116680003
> Viral infectious disease: 1400000

Common cold: 82272006
> Has pathological process: 370135005
> Viral infectious disease: 1400000

Note that even the relations like
'Is a' and 'Has causative agent'
have codes in SNOMED CT.
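
For the very keen, here is a minimal sketch of how such Concept, Relation, Concept triples could be held in memory and read back. It is not how SNOMED CT is actually distributed or queried. The codes are copied from the example above, while `NAMES`, `TRIPLES`, `build_index` and `describe` are names invented for this illustration.

```python
from collections import defaultdict

# Codes copied from the example above; both concepts and relations have codes.
NAMES = {
    "82272006": "Common cold",
    "415810000": "Rhinovirus",
    "1400000": "Viral infectious disease",
    "246075003": "Has causative agent",
    "116680003": "Is a",
}

# Each relationship is a (source, relation, destination) triple of codes.
TRIPLES = [
    ("82272006", "246075003", "415810000"),
    ("415810000", "116680003", "1400000"),
]


def build_index(triples):
    """Group outgoing relationships by their source concept."""
    index = defaultdict(list)
    for source, relation, destination in triples:
        index[source].append((relation, destination))
    return index


def describe(code, index):
    """Render every outgoing relationship of a concept as readable text."""
    return [
        f"{NAMES[code]} > {NAMES[relation]} > {NAMES[destination]}"
        for relation, destination in index.get(code, [])
    ]


index = build_index(TRIPLES)
for line in describe("82272006", index):
    print(line)  # Common cold > Has causative agent > Rhinovirus
```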

So… Why did the SNOMED CT concept go see a therapist? Because it had too many relationships to deal with!

Sorry, couldn’t resist this PJ.

SNOMED CT is considered the world’s most comprehensive, multilingual clinical healthcare terminology. The standard is available in more than 30 languages and used in over 70 countries. However, it is not free to use; it requires a license fee.


Logical Observation Identifiers Names and Codes

LOINC is a database and universal standard for identifying medical laboratory observations.

The LOINC system has two parts. The Laboratory LOINC codes are used to identify laboratory tests and results, such as blood tests and urine tests. The Clinical LOINC codes are used to identify clinical observations, measurements, and their interpretations, such as vital signs and physical measurements.

A fully specified name in LOINC includes the following six attributes:

  1. Component: The measured substance or property, such as glucose or haemoglobin.
  2. Property: The aspect of the component being measured such as the mass concentration.
  3. Time: Time interval for which the observation was made, such as fasting or post-prandial.
  4. System: The observed organ or system, such as respiratory or cardiovascular.
  5. Scale: The type of scale on which the result is expressed, such as quantitative, ordinal, or nominal.
  6. Method: The procedure used for the observation, such as immunoassay or electrochemistry.

A unique code of the format nnnnn-n
is assigned to each concept upon
registration. For example -

Haemoglobin Mass/Volume in
the Blood is codified as follows

LOINC code: 718-7
- Component: Hemoglobin
- Property: Mass concentration
- Time aspect: Pt (Taken at a single point in time)
- System: Bld (Blood)
- Scale: Qn (Quantitative Scale)
- Method: Spectrophotometry
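
For the keen reader, the six attributes and the code format can be sketched in Python. Treat this as a sketch under assumptions rather than an official LOINC tool: the class `LoincTerm` and the helper names are invented here, the pattern allows a few digits before the hyphen, and the last digit is assumed to be a standard mod-10 (Luhn) check digit, which does agree with the code 718-7 used above. `MCnc` is LOINC's abbreviation for mass concentration.

```python
import re
from dataclasses import dataclass

# Assumption: some digits, a hyphen, and a single check digit (nnnnn-n).
LOINC_PATTERN = re.compile(r"^(\d{1,7})-(\d)$")


def luhn_check_digit(payload: str) -> int:
    """Compute a mod-10 (Luhn) check digit for the digits before the hyphen."""
    total = 0
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:  # double every second digit, starting at the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid_loinc(code: str) -> bool:
    """Check the shape of the code and its check digit."""
    match = LOINC_PATTERN.match(code)
    if not match:
        return False
    payload, check = match.groups()
    return luhn_check_digit(payload) == int(check)


@dataclass(frozen=True)
class LoincTerm:
    code: str
    component: str
    prop: str
    time: str
    system: str
    scale: str
    method: str = ""

    def fully_specified_name(self) -> str:
        """Join the six attributes with colons."""
        parts = [self.component, self.prop, self.time,
                 self.system, self.scale, self.method]
        return ":".join(parts)


haemoglobin = LoincTerm("718-7", "Hemoglobin", "MCnc", "Pt", "Bld", "Qn",
                        "Spectrophotometry")
print(is_valid_loinc(haemoglobin.code))    # True
print(haemoglobin.fully_specified_name())  # Hemoglobin:MCnc:Pt:Bld:Qn:Spectrophotometry
```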

LOINC has been adopted in over 180 countries and is available in multiple languages. The LOINC International Advisory Committee (LIAC) works to ensure that LOINC is usable across different regions and health systems. The LOINC standard is free to use.


RxNorm

RxNorm provides a normalised naming and coding system for medicines and their clinical components for managing clinical drug information in electronic health records.

The structure of RxNorm includes three main components:

  • Concept: A unique identifier (RxCUI) representing a unique clinical drug or ingredient that can be prescribed, dispensed, or administered.
  • Term Type: The type of the term associated with a concept, for example, brand name, generic name, or ingredient.
  • Relationship: A relationship between concepts within RxNorm, like ingredient, strength, dose form, etc.

RxNorm concepts are connected by various relationships that indicate the associations between different medications. For example, a relationship can mean that a specific medicine contains a certain ingredient or that two medications have different strengths but are otherwise the same drug. Using these relationships, we can navigate the RxNorm graph from an ingredient to a fully specified drug, and so forth.
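
To make the idea of navigating the graph concrete, here is a small sketch in Python. Everything in it is illustrative: the identifiers `rx-1` to `rx-4` are placeholders and not real RxCUIs, the drug names are invented, and the relationship labels are only meant to resemble the kind of links RxNorm provides. The term types IN (ingredient), SCD (clinical drug) and SBD (branded drug) are the usual RxNorm abbreviations.

```python
from collections import deque

# Placeholder concepts: identifier -> (term type, name). NOT real RxCUIs.
CONCEPTS = {
    "rx-1": ("IN", "Ingredient X"),
    "rx-2": ("SCD", "Ingredient X 10 MG Oral Tablet"),
    "rx-3": ("SCD", "Ingredient X 20 MG Oral Tablet"),
    "rx-4": ("SBD", "BrandName 10 MG Oral Tablet"),
}

# Directed relationships: source -> [(relationship label, target)].
EDGES = {
    "rx-1": [("ingredient_of", "rx-2"), ("ingredient_of", "rx-3")],
    "rx-2": [("has_tradename", "rx-4")],
}


def reachable(start: str, term_type: str) -> list[str]:
    """Breadth-first walk from `start`, collecting concepts of one term type."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        current = queue.popleft()
        for _label, target in EDGES.get(current, []):
            if target in seen:
                continue
            seen.add(target)
            queue.append(target)
            if CONCEPTS[target][0] == term_type:
                found.append(target)
    return found


print(reachable("rx-1", "SCD"))  # ['rx-2', 'rx-3']
print(reachable("rx-1", "SBD"))  # ['rx-4']
```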

RxNorm was developed in 2001 by the National Library of Medicine (NLM), USA, in consultation with the FDA and the HL7 Vocabulary Technical Committee.

Although primarily focused on medication-related information used in the United States, efforts have been made to expand its coverage to include other countries. NLM also offers a comprehensive set of APIs for easy integration with EHRs and other HnW systems. It is available for free use.


Current Procedural Terminology

CPT was created by the American Medical Association (AMA) in 1966. The original intended purpose of CPT was to code diagnostic and therapeutic procedures. Later this was adopted by the US government for billing and reimbursements by healthcare payers, such as insurance companies and government programs like Medicare. The codes are used to report services and procedures performed by healthcare providers and to determine the appropriate payment amount for those services.

CPT codes are now also used for data collection and analysis, research, and quality measurement purposes. CPT is limited in scope and depth, but it is the most widely used standard in the USA to report physician procedures and services for insurance reimbursement.


Canadian Classification of Health Interventions

The CCI standard was developed by the Canadian Institute for Health Information, to serve a similar role in Canada as the CPT standard in the USA. It is a comprehensive classification system for coding health interventions, procedures, and treatments for various healthcare settings.

The CCI standard is organised as a hierarchy of Chapter > Section > Subsection > Intervention. The codes themselves are alphanumeric strings of the format C.XX.00.ZZ, where each of the four dot-separated parts corresponds to the respective level in the hierarchy.

For example 
Code: 1.HD.40.EG01 represents the following components

- Section: 1 (Medical, Surgical, Diagnostic, and Therapeutic Procedures)
- Body system: HD (Hemic and Lymphatic Systems)
- Root operation: 40 (Transfusion)
- Qualifier: EG01 (Blood, Platelets, or Plasma for Transfusion)
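
Splitting such a code into its four parts is straightforward. The sketch below assumes only what the example shows, namely four non-empty parts separated by dots. The names `CciCode` and `parse_cci` are invented for this article, and the field names simply mirror the labels used in the example.

```python
from typing import NamedTuple


class CciCode(NamedTuple):
    section: str
    body_system: str
    root_operation: str
    qualifier: str


def parse_cci(code: str) -> CciCode:
    """Split a dotted code such as '1.HD.40.EG01' into its four parts."""
    parts = code.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four dot-separated parts, got {code!r}")
    return CciCode(*parts)


parsed = parse_cci("1.HD.40.EG01")
print(parsed.body_system)  # HD
print(parsed.qualifier)    # EG01
```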

Unlike CPT, CCI is available as open source. Hence, it might be easier to extend to different geographical contexts.

Phew… so many standards. But there are many more, spanning the length and breadth of the health domains and geographies.

Here are a few more commonly used ones:

  • International Classification of Functioning, Disability and Health (ICF)
  • International Classification of Health Interventions (ICHI)
  • Healthcare Common Procedure Coding System (HCPCS)

And the list goes on.

It would be an exercise in futility to list all of them here. But I think by now, we already have a good appreciation for some of the most common ones. We already know a few different ways to equate the Tomey-tohs and Potey-tohs to Toma-toe and Pota-toe of Health and Wellness.

Unifying the standards

We now appreciate the complexity of the problem of standardising vocabulary in HnW. The more general the problem, the more complex it is. So, unifying these standards and building an all-encompassing one is probably still a long shot. However, it is not for the lack of trying.

For example, there have been intense efforts for sharing the ontology between SNOMED CT and ICD to harmonise the two. ICD-11, which was released in 2019, includes SNOMED CT codes for several disease entities and other health conditions. This effort is expected to continue and expand.

Many of these standards also provide code mappings to and from other standards for data portability. For example, RxNorm supports the mapping of other drug terminologies, such as NDC and SNOMED CT, to RxCUIs. This creates portability between different drug terminologies.

New systems are emerging that complement one vocabulary with another — like SNOMED for clinical terms with RxNorm for medications — to build a more comprehensive system for the vocabulary.


Observational Medical Outcomes Partnership

The Observational Health Data Sciences and Informatics — OHDSI, pronounced “Odyssey” — is a collaborative effort to provide a common framework for the analysis of healthcare data. OHDSI develops and promotes the use of the OMOP Common Data Model (CDM).

The OMOP CDM includes a common set of tables and fields that capture various aspects of healthcare data, such as patient demographics, clinical encounters, diagnoses, procedures, medications, laboratory results, etc.

The data model includes standardised codes for various concepts, such as diseases, procedures, and drugs, which enable standardised querying and analysis of the data. It also includes standardised terminologies, such as SNOMED CT, ICD, and RxNorm — Hey… we already know all about these fancy acronyms now! — to enable consistent data mapping and analysis across various sources.
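
To give a feel for how a source code is translated into a standard concept, here is a pared-down sketch using Python's built-in `sqlite3`. The table and column names follow the OMOP vocabulary tables (`concept`, `concept_relationship`, and the 'Maps to' relationship) as I understand them, with most columns left out. The concept IDs 1 and 2 are placeholders, and the single mapping from ICD-10 code J00 to the SNOMED CT common cold concept is included purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id    INTEGER PRIMARY KEY,
    concept_name  TEXT,
    vocabulary_id TEXT,
    concept_code  TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT
);
-- Placeholder concept IDs; the mapping itself is illustrative.
INSERT INTO concept VALUES
    (1, 'Acute nasopharyngitis [common cold]', 'ICD10', 'J00'),
    (2, 'Common cold', 'SNOMED', '82272006');
INSERT INTO concept_relationship VALUES (1, 2, 'Maps to');
""")


def to_standard(vocabulary_id: str, concept_code: str):
    """Follow the 'Maps to' link from a source code to its standard concept."""
    return conn.execute(
        """
        SELECT target.vocabulary_id, target.concept_code, target.concept_name
        FROM concept AS source
        JOIN concept_relationship AS rel
          ON rel.concept_id_1 = source.concept_id
         AND rel.relationship_id = 'Maps to'
        JOIN concept AS target
          ON target.concept_id = rel.concept_id_2
        WHERE source.vocabulary_id = ? AND source.concept_code = ?
        """,
        (vocabulary_id, concept_code),
    ).fetchone()


print(to_standard("ICD10", "J00"))  # ('SNOMED', '82272006', 'Common cold')
```

Once every source record is translated this way, the same query can run unchanged on data that originally arrived in different vocabularies.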

OHDSI has also developed a suite of open-source tools for working with observational data in the CDM format. These include software for data extraction, transformation, analysis, and visualisation.


Common Terminology Services 2

CTS2 is an open standard developed by the Object Management Group (OMG!). It is intended to improve interoperability between standards, such as SNOMED CT, LOINC, and RxNorm, by providing a standard interface.

It defines a set of common services for managing terminology content.

These services include:

  • Query Service: For searching and retrieving terminology content.
  • Mapping Service: For mapping concepts between different terminologies.
  • Code System Maintenance Service: To add, modify, and delete codes.
  • Value Set Definition Service: For defining and retrieving subsets of concepts from a terminology system.
  • Terminology Change Management Service.

CTS2 also defines a set of data models for code system, value set, and mapping, for representing terminology content.

CTS2 is designed to be an extensible standard that can be used in a variety of healthcare domains. It is a relatively new standard (the first version was released around 2010) and is still in the process of gaining traction.

National Health Stack

In India, the National Health Stack (NHS) — the Indian government’s flagship digital health initiative — aims to provide a common technology framework for the integration of various health information systems in the country.

The NHS is based on open standards and architecture to facilitate data exchange between various systems. This is an absolute must for a health ecosystem as complicated as India’s. The NHS draws from several established standards we previously discussed, like ICD, SNOMED CT, LOINC, etc. I am hoping to write more about the NHS in a separate article in this series.

That was a long read. I admit, it turned out to be longer than I originally intended. But now that we know so much about these standards, are we poised to ask our doctor to write her observations in SNOMED codes, and prescriptions in LOINC codes?

That may be more legible than the English written in a doctor’s handwriting.

Just a harmless joke. I have a lot of respect for doctors. I think they are the hardest-working professionals we have. (Image source: xkcd)

Well… While that may not happen any time soon, we can certainly expect that the encrypted versions of our prescriptions and diagnoses, residing somewhere in the cloud in an Electronic Health Record system, a Hospital Management System, or the National Health Stack, are using these coding systems that we now know so much about.

Isn’t that an empowering thought?

Now that we have established the vocabulary, the next challenge is to understand the standards of data exchange.


But enough already, let’s leave it for another article, another day.

I hope this article added a little bit to your repertoire of knowledge. Any suggestions will be highly appreciated.

Thanks for reading.

About me

I am a learner of architecture (not the buildings… the tech kind). In the past, I have worked with Semiconductor modelling, Digital circuit design, Electronic Interface modelling, and the Internet of Things. Currently, I am working with Data Interoperability and Data Warehouse architectures for Health and Wellness, at Walmart Health and Wellness.

Health Data Interoperability — The Vocabulary was originally published in Walmart Global Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Article Link: Health Data Interoperability — The Vocabulary | by Rahul | Walmart Global Tech Blog | Aug, 2023 | Medium