Automated Coding & Advanced Matching

Posted by Kim Rejndrup on Jul 10, 2018 11:00:00 AM

(This is the third of a three-part series* on medical coding during clinical trials.)

When you’re conducting a clinical trial, coding is one of the necessary steps. Coding is the process of mapping collected terms (verbatim terms), such as adverse events and drugs, to the terminologies contained in medical dictionaries such as MedDRA and WHODrug. If a coding solution can achieve this mapping without human intervention, it is called auto-coding. If the mapping is instead performed by a medical specialist called a “coder,” the process is called manual coding or just “coding.”

In this blog post, I explore how a modern coding system can provide better assistance during both the auto-coding and manual coding processes.

Learning From Past Decisions: Synonyms & History

When matching a verbatim term with the official terminology, the first approach is to look for an exact match. For example, the verbatim term Leg pain is an exact match with a lower level term (LLT) in MedDRA, so auto-coding in this case is simple. However, what if Mild leg pain or Bad leg pain are the verbatim terms? In those cases, a coder could manually match both verbatim terms to Leg pain. The coding system should remember those decisions, so that the next time a subject suffers from Mild leg pain, the verbatim term is auto-coded to Leg pain. If you have already constructed a mapping of verbatim terms to terminology terms (often called a “synonym list”), the coding system should allow you to load that mapping for use in the auto-coding process.
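The lookup order described above (exact match first, then past decisions and loaded synonyms) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name AutoCoder, its methods, the tiny two-term dictionary and the case-insensitive comparison are all hypothetical and do not describe any particular product.

```python
def _key(text: str) -> str:
    # Assumption: compare case-insensitively and ignore extra whitespace.
    return " ".join(text.lower().split())


class AutoCoder:
    """Hypothetical auto-coder: exact match first, then the synonym list."""

    def __init__(self, dictionary_terms, synonyms=None):
        # Dictionary terms, e.g. a (tiny, illustrative) subset of LLTs.
        self.terms = {_key(t): t for t in dictionary_terms}
        # Pre-built synonym list: verbatim term -> dictionary term.
        self.synonyms = {_key(v): t for v, t in (synonyms or {}).items()}

    def auto_code(self, verbatim):
        key = _key(verbatim)
        if key in self.terms:            # 1. exact match
            return self.terms[key]
        return self.synonyms.get(key)    # 2. synonym list, or None if unknown

    def record_manual_decision(self, verbatim, term):
        # A coder's manual decision becomes a synonym for next time.
        self.synonyms[_key(verbatim)] = term


coder = AutoCoder(["Leg pain", "Fever"], synonyms={"Bad leg pain": "Leg pain"})
first_try = coder.auto_code("Mild leg pain")    # not known yet
coder.record_manual_decision("Mild leg pain", "Leg pain")
second_try = coder.auto_code("Mild leg pain")   # now answered from history
```

Here the first attempt returns nothing and would go to a coder; once the decision is recorded, the same verbatim term is coded without human intervention.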

Advanced Matching: A Smarter Coding System

This is all well and good. The coding system learns from your decisions; over time, it will be able to auto-code more and more. So why does auto-coding not get anywhere close to 100%? Consider the following verbatim terms:

  • Example A (verbatim term: Fever – 39C): “Fever” is an LLT in MedDRA, but the hyphen (-) and the temperature specification make use of a synonym list impractical.
  • Example B (verbatim term: Fever of 99.9 F): The “of” (between Fever and 99.9) and the temperature specification make use of a synonym list impractical.
  • Example C (verbatim term: 4/4/18 Feverish): Same issue as above, with the added complication of a specific date, which will differ as time passes.
  • Example D (verbatim term: Leg FX): The use of an abbreviation (FX) in the verbatim term makes a direct match impossible and a synonym list impractical.

Observing the verbatim terms, it is clear that they will almost never exactly match the medical dictionary terms, but what if the coding system were smarter? What if, after the coding system failed to get an exact match, it was able to manipulate the verbatim term and then get a match? To achieve that goal, the coding system should be able to try the following manipulations:

  • Remove special characters, e.g. take out the hyphen (-) from the term Fever - 39C (Example A).
  • Remove temperatures and dates (Examples A, B and C).
  • Remove filler words, such as “of” (Example B).
  • Expand abbreviations, e.g. “FX” = Fracture (Example D).
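As a rough illustration, the four manipulations can be chained into a single clean-up step that runs after an exact match has failed. The Python sketch below is hypothetical: the regular expressions, the filler-word set and the abbreviation table are simplified assumptions that cover only the four example terms, not a complete rule set.

```python
import re

# Assumed, deliberately small rule data for illustration only.
ABBREVIATIONS = {"fx": "fracture"}
FILLER_WORDS = {"of"}

# Dates such as 4/4/18 or 04/04/2018.
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
# Temperatures such as 39C, 99.9 F or 39 °C.
TEMPERATURE = re.compile(r"\b\d+(?:\.\d+)?\s*°?\s*[CF]\b", re.IGNORECASE)
# Anything that is neither a word character nor whitespace (hyphens, dashes, ...).
SPECIAL = re.compile(r"[^\w\s]")


def normalize(verbatim: str) -> str:
    """Apply the manipulations in order and return the cleaned-up term."""
    text = DATE.sub(" ", verbatim)        # dates first: they contain slashes
    text = TEMPERATURE.sub(" ", text)     # temperatures next: they contain dots
    text = SPECIAL.sub(" ", text)         # then remaining special characters
    words = []
    for word in text.split():
        lower = word.lower()
        if lower in FILLER_WORDS:         # drop filler words
            continue
        words.append(ABBREVIATIONS.get(lower, word))  # expand abbreviations
    return " ".join(words)


examples = ["Fever – 39C", "Fever of 99.9 F", "4/4/18 Feverish", "Leg FX"]
cleaned = [normalize(term) for term in examples]
# cleaned == ["Fever", "Fever", "Feverish", "Leg fracture"]
```

The cleaned-up term is then handed back to the normal matching step; whether it codes depends on the dictionary and on the matching technique used.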

Given that these rules could change over time and could be language dependent, I think it is important that the coding system offers flexibility and expandability. In addition to the rules included in the product, the coding system should allow users to add their own rules and provide the capability to define for which terminologies and studies these rules are used.
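One way to picture this flexibility is a rule object that carries its own scope. The following Python sketch is an assumption about how such a configuration might look, not a description of an existing system; the Rule class, the run_rules function, the rule names and the study identifier STUDY-001 are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class Rule:
    name: str
    apply: Callable[[str], str]
    terminologies: Optional[Set[str]] = None  # None means "all terminologies"
    studies: Optional[Set[str]] = None        # None means "all studies"

    def applies_to(self, terminology: str, study: str) -> bool:
        return ((self.terminologies is None or terminology in self.terminologies)
                and (self.studies is None or study in self.studies))


def run_rules(rules, verbatim: str, terminology: str, study: str) -> str:
    """Apply, in order, every rule whose scope covers this terminology and study."""
    text = verbatim
    for rule in rules:
        if rule.applies_to(terminology, study):
            text = rule.apply(text)
    return " ".join(text.split())  # collapse whitespace left behind by rules


rules = [
    # A rule shipped with the product: applies everywhere.
    Rule("strip-hyphens", lambda t: re.sub(r"[-–]", " ", t)),
    # A user-added rule, limited to one terminology and one (hypothetical) study.
    Rule("expand-fx",
         lambda t: re.sub(r"\bFX\b", "fracture", t, flags=re.IGNORECASE),
         terminologies={"MedDRA"}, studies={"STUDY-001"}),
]

in_scope = run_rules(rules, "Leg FX", "MedDRA", "STUDY-001")       # "Leg fracture"
out_of_scope = run_rules(rules, "Leg FX", "WHODrug", "STUDY-001")  # "Leg FX"
```

Keeping the scope on the rule itself means a new language or a study-specific abbreviation can be added without touching the rules already in place.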

The final part is the matching between the verbatim term and the terminology. So far I have only mentioned exact matching, but with the power of text indexing, there are a lot more options available. A great coding solution should be able to utilize those options. The following table shows a few examples of what is possible with today’s technology.

Verbatim Term     Terminology Term     Notes
Pain leg          Leg pain             Should match independently of the word order
Swelling arm      Swollen arm          Language stemming matching
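Both rows can be approximated with a very small index keyed on the set of stemmed words in each term. The Python sketch below is a toy under stated assumptions: a production system would rely on a real text-indexing engine and a proper stemmer, and the suffix list, the irregular-forms table and the three dictionary terms here are invented for the example. Note that simple suffix stripping does not by itself relate “swollen” to “swelling”, which is why the sketch includes a small hand-written table of irregular forms.

```python
# Assumed, minimal stemming data for illustration only.
IRREGULAR_FORMS = {"swollen": "swell"}
SUFFIXES = ("ing", "ed", "s")


def stem(word: str) -> str:
    """Very crude stemmer: irregular-forms table, then suffix stripping."""
    word = word.lower()
    if word in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[word]
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def match_key(term: str) -> frozenset:
    # A set ignores word order; stemming ignores word endings.
    return frozenset(stem(word) for word in term.split())


# Tiny illustrative index: key -> terminology term.
TERMINOLOGY = ["Leg pain", "Swollen arm", "Fever"]
INDEX = {match_key(term): term for term in TERMINOLOGY}


def advanced_match(verbatim: str):
    """Return the terminology term with the same stemmed word set, if any."""
    return INDEX.get(match_key(verbatim))


word_order = advanced_match("Pain leg")      # "Leg pain"
stemming = advanced_match("Swelling arm")    # "Swollen arm"
```

Because this kind of matching is looser than an exact match, its results are probably best presented to the coder as suggestions rather than applied silently.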


In a modern clinical coding system, I would expect the solution to guide the coder, not just by using historical decisions, but also by automatically manipulating verbatim terms and using advanced matching to provide a series of suggested codings. Coders have been performing this kind of manipulation and searching manually, in multiple steps, for a long time. Isn’t it time that the coding solution did that automatically for you and saved you a lot of time?

About the Author

Kim Rejndrup recently joined OmniComm as senior vice president-Product Development. Mr. Rejndrup spent nearly 20 years at Oracle Corporation, where he was a vice president of development for many clinical research applications, including systems for electronic data capture (EDC), clinical trials management, data management, medical coding and data warehousing. His experience is driving OmniComm’s products to new levels of innovation.

Join Kim Rejndrup for a Free Webinar About Coding

Want to learn more about coding and the upcoming FDA mandate? Kim Rejndrup will take your questions during a live webinar: 11-July at 10 a.m. New York / 3 p.m. London. Register today for this free webinar: http://bit.ly/2rM1Wjw

* Related Posts

Part 1: Medical Coding and New FDA Guidelines for WHODrug B3/C3

Part 2: Automatic Coding & EDC Integration


Tags: Coding, AutoEncoder