• Single platform for all your clinical data understanding needs
  • Easily extract meaning from all kinds of unstructured healthcare data
  • REST APIs crafted to cater to EHR, Data Warehousing, Data Analytics, Coding and Clinical Workflows, or any other healthcare application
  • No Hardware. No Software. We’ve taken care of the infrastructure (AWS HIPAA Compliant Cloud), so you don’t have to.
Sentence Boundary Detection

The sentence boundary detector identifies where each sentence actually ends. In the example below, it recognizes that the period in “b.i.d.” belongs to the abbreviation, so the sentence does not end there but continues with “frequently”.

e.g. The patient takes aspirin 325 mg b.i.d. frequently since May.
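
As a rough illustration of the idea, the sketch below splits text on sentence-ending punctuation unless the period belongs to a known abbreviation. The abbreviation list and the function name `split_sentences` are assumptions made for this example; they are not part of the platform's API.

```python
import re

# Hypothetical abbreviation list for illustration; a real system would need a far larger one.
ABBREVIATIONS = {"b.i.d.", "t.i.d.", "q.i.d.", "p.o.", "dr.", "e.g."}

def split_sentences(text):
    """Split on ., ! or ? followed by whitespace or end of text,
    skipping periods that close a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?](?=\s|$)", text):
        end = match.end()
        # The whitespace-delimited word that ends at this punctuation mark.
        last_word = text[start:end].split()[-1].lower()
        if last_word in ABBREVIATIONS:
            continue  # The period is part of an abbreviation, not a boundary.
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

note = "The patient takes aspirin 325 mg b.i.d. frequently since May. He denies chest pain."
for sentence in split_sentences(note):
    print(sentence)
```

A lookup table like this is the simplest possible approach; it fails when an abbreviation genuinely ends a sentence, which is why production systems typically rely on trained models.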

Section Detection

Unstructured clinical data consists primarily of healthcare documents transcribed from physician dictations, in which the physician dictates the entire encounter with a patient. The elements of such a document are classified into sections such as History of Present Illness, Past Medical History, Past Surgical History, Current Medications, Allergies, Physical Examination, etc. The same section may be documented in different forms by different hospitals and physicians, and section detection recognizes each section regardless of the label used.

e.g. History of Present Illness may be documented as HPI, Present Illness, Brief History, etc.
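
One simple way to picture this is an alias table that maps each heading variant to a canonical section name. The sketch below is only an illustration: the table contents beyond the variants listed above, and the function name `normalize_section_heading`, are assumptions.

```python
# Illustrative alias table; the variants for History of Present Illness come from
# the example above, the remaining entries are the canonical names themselves.
SECTION_ALIASES = {
    "history of present illness": "History of Present Illness",
    "hpi": "History of Present Illness",
    "present illness": "History of Present Illness",
    "brief history": "History of Present Illness",
    "past medical history": "Past Medical History",
    "past surgical history": "Past Surgical History",
    "current medications": "Current Medications",
    "allergies": "Allergies",
    "physical examination": "Physical Examination",
}

def normalize_section_heading(line):
    """Return the canonical section name for a heading line, or None if unknown."""
    key = line.strip().rstrip(":").strip().lower()
    return SECTION_ALIASES.get(key)

print(normalize_section_heading("HPI:"))
print(normalize_section_heading("Brief History"))
```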



Tokenizer

This component breaks a sentence down into its constituent tokens so that the next component, the Part of Speech tagger, can assign a POS tag to each token.

e.g. The tokens are generated as:

Word Token 1: The
Word Token 2: patient
Word Token 3: takes
Word Token 4: aspirin
Number Token 1: 325
Word Token 5: mg
Word Token 6: b.i.d.
Word Token 7: frequently
Symbol Token 1: .
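
A tokenizer of this kind can be approximated with a single regular expression that distinguishes words, numbers and symbols. The sketch below is an assumption-laden illustration (the pattern and the function name `tokenize` are invented for this example); it treats dotted abbreviations such as “b.i.d.” as one word token.

```python
import re

# Hypothetical token pattern: dotted abbreviations or plain words, numbers, then any other symbol.
TOKEN_PATTERN = re.compile(
    r"(?P<word>(?:[A-Za-z]\.){2,}|[A-Za-z]+)"
    r"|(?P<number>\d+(?:\.\d+)?)"
    r"|(?P<symbol>[^\sA-Za-z0-9])"
)

def tokenize(sentence):
    """Return a list of (token type, token text) pairs in sentence order."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(sentence):
        # lastgroup is the name of the alternative that matched: word, number or symbol.
        tokens.append((match.lastgroup.capitalize(), match.group()))
    return tokens

for kind, text in tokenize("The patient takes aspirin 325 mg b.i.d. frequently."):
    print(kind, "Token:", text)
```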

Part of Speech (POS) Tagger

The POS tagger labels each token in a sentence with its part of speech, such as noun, verb, or adjective.

e.g. The patient takes aspirin 325 mg b.i.d. since May.

The/DT (Determiner)
patient/NN (Noun)
takes/VBZ (Verb)
aspirin/NN (Noun)
325/CD (Cardinal Number)
mg/NN (Noun)
b.i.d./NN (Noun)
since/IN (Preposition)
May/NN (Noun)


Chunker

The chunker breaks a sentence into phrases such as noun phrases (NP), verb phrases (VP), prepositional phrases (PP), and adjective phrases (ADJP), according to syntactic rules.

e.g. “Sensation is intact to light touch in both lower extremities”. The chunker output would be: Sensation (NP) > is (VP) > intact (ADJP) > to (PP) > light touch (NP) > in (PP) > both lower extremities (NP)


Parser

The parser establishes relationships between the phrases in a sentence, following the phrase structure rules of English syntax.

e.g. “Sensation is intact to light touch in both lower extremities”. The parser gives the output as shown in the visual.

Dependency Parsing

This component establishes relationships between the words in a sentence.

e.g. “Sensation is intact to light touch in both lower extremities”. As shown in the visual, the dependency parser relates intact ↔ sensation, touch ↔ light and so on.

Dictionary Lookup Process

The dictionary lookup component of the NLP engine maps the concepts identified in the document against concepts in the ontology, a comprehensive collection of medical concepts classified by type. Based on this lookup, it tags each concept as a disease (problem), procedure, anatomical structure, etc.

e.g. “The patient takes metformin for his diabetes”. Metformin is tagged as a medicine and diabetes as a disease or problem.
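
Conceptually, the lookup resembles matching words against a typed dictionary. The sketch below uses a tiny, made-up ontology and an invented function name `tag_concepts`; the real ontology is far larger and is not reproduced here.

```python
import re

# Tiny stand-in ontology for illustration only: concept -> type.
ONTOLOGY = {
    "metformin": "medicine",
    "aspirin": "medicine",
    "diabetes": "disease",
    "chest": "anatomical structure",
}

def tag_concepts(sentence):
    """Return (word, concept type) pairs for every word found in the ontology."""
    tags = []
    for word in re.findall(r"[A-Za-z]+", sentence):
        concept_type = ONTOLOGY.get(word.lower())
        if concept_type is not None:
            tags.append((word, concept_type))
    return tags

print(tag_concepts("The patient takes metformin for his diabetes"))
```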

Relationship Finder

The NLP engine has a built-in algorithm that establishes the primary level of relationships between concepts, such as anatomical structures and problems or diseases, procedures and anatomical structures, and procedures and medical devices.

e.g. “The patient has intermittent pain located on the left side of his chest”. The NLP engine relates “pain” to “chest” and identifies chest pain, even though the two words are not adjacent in the sentence.
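
The built-in algorithm itself is not described here. As a deliberately naive stand-in, the sketch below pairs every problem with every anatomical structure mentioned in the same sentence, regardless of how far apart they are. The word lists and the function name `relate_concepts` are assumptions for illustration.

```python
import re

# Hypothetical concept lists; a real system would draw these from the ontology.
PROBLEMS = {"pain", "swelling", "numbness"}
ANATOMY = {"chest", "leg", "knee", "abdomen"}

def relate_concepts(sentence):
    """Pair each anatomical structure with each problem found in the same sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    problems = [w for w in words if w in PROBLEMS]
    structures = [w for w in words if w in ANATOMY]
    return [(structure, problem) for problem in problems for structure in structures]

print(relate_concepts("The patient has intermittent pain located on the left side of his chest"))
```

Sentence-level co-occurrence over-generates when a sentence mentions several problems and body parts, which is one reason a dependency parse is useful as input to this step.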

UEI Detection

The UEI (Unique Entity Identifier) detection module uses the relationships identified between the words in a sentence to find the matching concept in the ontology or knowledge base and assign its unique identifier, the UEI.

e.g. “The patient complains of pain in the leg”. The ontology has a UEI for leg pain. The UEI detection module identifies that “pain in the leg” is the same as “leg pain” and assigns it the relevant UEI.
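
To show why “pain in the leg” and “leg pain” can resolve to the same entry, the sketch below swaps in a much simpler technique than relationship-based matching: it drops filler words and compares the remaining words as an unordered set. The identifiers `UEI-0001` and `UEI-0002`, the stopword list and the function name `detect_uei` are all hypothetical.

```python
# Filler words ignored when comparing phrases (illustrative list).
STOPWORDS = {"in", "the", "of", "on", "a", "an"}

# Hypothetical knowledge base: set of content words -> made-up UEI.
UEI_BY_CONCEPT = {
    frozenset({"leg", "pain"}): "UEI-0001",
    frozenset({"chest", "pain"}): "UEI-0002",
}

def detect_uei(phrase):
    """Return the UEI whose content words match the phrase, or None."""
    content_words = frozenset(
        word for word in phrase.lower().split() if word not in STOPWORDS
    )
    return UEI_BY_CONCEPT.get(content_words)

print(detect_uei("pain in the leg"))
print(detect_uei("leg pain"))
```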


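Negation Detection

Clinical text often states that a finding is absent, using indicator words such as “absence”. The NLP engine has a negation-detection algorithm that uses such indicators to identify negated concepts in a sentence.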

e.g. “There is absence of any cardiac enlargement.” Here the word “absence” indicates that cardiac enlargement is not present or is negated.
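
A minimal sketch of trigger-based negation detection follows. It checks whether a negation indicator appears within a few words before the concept. The trigger list, the window size and the function name `is_negated` are assumptions for this example and do not describe the platform's actual algorithm.

```python
import re

# Illustrative negation indicators; real trigger lists are much longer.
NEGATION_TRIGGERS = ("absence of", "no", "denies", "without", "negative for")

def is_negated(sentence, concept, window=5):
    """Return True if a negation trigger occurs within `window` words before the concept."""
    text = sentence.lower()
    position = text.find(concept.lower())
    if position == -1:
        return False  # Concept not mentioned at all.
    preceding_words = re.findall(r"[a-z]+", text[:position])[-window:]
    preceding_text = " ".join(preceding_words)
    return any(
        re.search(rf"\b{re.escape(trigger)}\b", preceding_text)
        for trigger in NEGATION_TRIGGERS
    )

print(is_negated("There is absence of any cardiac enlargement.", "cardiac enlargement"))
print(is_negated("The patient has cardiac enlargement.", "cardiac enlargement"))
```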

Temporal Status Detection

Temporal Status Detection is a component of the NLP engine that identifies the temporal status of each concept: present, past, future, etc.

e.g. “The patient has had an MI in the past”. The word “past” indicates a past event, and the component maps that status to the concept “MI”.
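
A cue-word approach gives a feel for how temporal status could be assigned. The sketch below works at sentence level only and defaults to “present” when no cue is found. The cue lists and the function name `temporal_status` are assumptions, not the component's real rules.

```python
import re

# Illustrative cue phrases per status, checked in this order.
TEMPORAL_CUES = {
    "past": ("in the past", "history of", "had", "previously"),
    "future": ("will", "scheduled", "planned"),
}

def temporal_status(sentence):
    """Return 'past' or 'future' if a cue phrase is present, otherwise 'present'."""
    text = sentence.lower()
    for status, cues in TEMPORAL_CUES.items():
        if any(re.search(rf"\b{re.escape(cue)}\b", text) for cue in cues):
            return status
    return "present"

print(temporal_status("The patient has had an MI in the past"))
```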

Modifier Detection

Modifiers are terms that further specify a concept in a medical document. This component identifies which concept each modifier relates to and groups the words together; in “intermittent pain”, for example, “intermittent” is marked as a modifier of “pain”.

e.g. “The patient has intermittent pain located on the left side of his chest”. The words “intermittent” and “pain” are grouped to specify the concept.
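
In its simplest form, modifier grouping can be pictured as attaching a known modifier to the concept that immediately follows it. The sketch below makes exactly that simplifying assumption; the word lists and the function name `group_modifiers` are invented for illustration.

```python
import re

# Hypothetical vocabularies for the example.
MODIFIERS = {"intermittent", "severe", "mild", "chronic", "acute"}
CONCEPTS = {"pain", "cough", "fever"}

def group_modifiers(sentence):
    """Return (modifier, concept) pairs where the modifier directly precedes the concept."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [
        (previous, current)
        for previous, current in zip(words, words[1:])
        if previous in MODIFIERS and current in CONCEPTS
    ]

print(group_modifiers("The patient has intermittent pain located on the left side of his chest"))
```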

Drug Mention Annotator

There are various attributes associated with a drug (medicine). The drug mention annotator identifies the various parameters and establishes a relationship between the drug and its parameters.

e.g. “The patient’s Lasix was changed from 20 mg to 40 mg tablets p.o. b.i.d.”

Drug: Lasix
Strength: 20 mg, 40 mg
Frequency: b.i.d. (twice a day)
Route: p.o. (by mouth)
Status: Change
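
The attributes above can be pulled out of a sentence with a few lookups and one regular expression. The sketch below is a simplified illustration; the drug list, the abbreviation tables and the function name `annotate_drug_mention` are assumptions rather than the annotator's actual rules.

```python
import re

# Illustrative lookup tables.
KNOWN_DRUGS = {"lasix", "aspirin", "metformin"}
FREQUENCIES = {"b.i.d.": "twice a day", "t.i.d.": "three times a day"}
ROUTES = {"p.o.": "by mouth"}

def annotate_drug_mention(sentence):
    """Extract drug, strength, frequency, route and status from one sentence."""
    text = sentence.lower()
    words = re.findall(r"[a-z]+", text)
    return {
        "drug": next((w for w in words if w in KNOWN_DRUGS), None),
        # Every number followed by a dose unit, in order of appearance.
        "strength": re.findall(r"\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml)\b", text),
        "frequency": next((f for f in FREQUENCIES if f in text), None),
        "route": next((r for r in ROUTES if r in text), None),
        "status": "change" if "changed" in words else None,
    }

mention = annotate_drug_mention(
    "The patient's Lasix was changed from 20 mg to 40 mg tablets p.o. b.i.d."
)
for attribute, value in mention.items():
    print(attribute, "=", value)
```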

Pay only for what you use.
