Unit 4 - Health Care and Deep Learning
Introduction on Deep Learning – DFF network CNN- RNN for Sequences – Biomedical Image
and Signal Analysis – Natural Language Processing and Data Mining for Clinical Data – Mobile
Imaging and Analytics – Clinical Decision Support System.
In traditional machine learning, the effectiveness of an algorithm very much depends on how insightful the programmer is.
Deep learning attempts to mimic the human brain (albeit far from matching its ability), enabling
systems to cluster data and make predictions with incredible accuracy.
Deep learning is a subset of machine learning; it is essentially a neural network with three or
more layers. These neural networks attempt to simulate the behaviour of the human brain,
allowing the system to “learn” from large amounts of data.
Deep learning models are capable of learning to focus on the right features by themselves,
requiring little guidance from the programmer.
Basically, deep learning mimics the way our brain functions, i.e., it learns from experience. Our
brain is made up of billions of neurons that allow us to do amazing things. Even the brain of a
one-year-old child can solve complex problems that are very difficult to solve even using
supercomputers; for example, it can:
➢ Recognize the faces of its parents as well as different objects.
➢ Discriminate between different voices and even recognize a particular person based on his/her voice.
➢ Draw inferences from the facial gestures of other persons, and many more.
Deep learning uses the concept of artificial neurons that functions in a similar manner as the
biological neurons present in our brain. Therefore, we can say that Deep Learning is a subfield
of machine learning concerned with algorithms inspired by the structure and function of the brain
called artificial neural networks. Now, let us take an example to understand it. Suppose we want
to make a system that can recognize faces of different people in an image. If we solve this as a
typical machine learning problem, we will define facial features such as eyes, nose, ears etc.
and then, the system will identify which features are more important for which person on its own.
Now, deep learning takes this one step ahead. Deep learning automatically finds out the features
which are important for classification because of deep neural networks, whereas in case of
Machine Learning we had to manually define these features.
The inspiration for deep learning is the way that the human brain filters the information. Its main
motive is to simulate human-like decision making. Neurons in the brain pass the signals to perform
the actions. Similarly, artificial neurons connect in a neural network to perform tasks such as clustering,
classification, or regression. The neural network sorts the unlabeled data according to the
similarities of the data. That’s the idea behind a deep learning algorithm.
A neural network consists of three types of layers:
a) Input layer
b) Hidden layer
c) Output layer
Input Layer
• It receives the input data from the observation. This information is broken
down into numbers and bits of binary data that a computer can understand.
Variables need to be either standardized or normalized to be within the
same range.
Hidden Layer
• The “deep” in Deep Learning refers to having more than one hidden layer.
Output Layer:
• It produces the final output of the network, for example a predicted class or value, based on the activations of the last hidden layer.
Weight:
The connection between neurons carries a weight, which is a numerical value. The weights
between neurons determine the learning ability of the neural network. During the learning of
an artificial neural network, the weights between the neurons change. Initial weights are set randomly.
Transfer Function
The transfer function translates the input signals to output signals. Four types of transfer
functions are commonly used: unit step (threshold), sigmoid, piecewise linear, and Gaussian.
Sigmoid
The sigmoid function is an S-shaped curve that maps any input value to an output between 0 and 1.
Piecewise Linear
The piecewise linear function is linear over a limited input range and saturates at a minimum or maximum value outside that range.
Gaussian
Gaussian functions are bell-shaped curves that are continuous. The node output (high/low) is
interpreted in terms of class membership (1/0), depending on how close the net input is to a chosen
value of average.
Linear
Like a linear regression, a linear activation function transforms the weighted sum of the inputs of the
neuron to an output using a linear function.
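The transfer functions above can be illustrated with a short Python sketch (NumPy assumed available); the thresholds, ranges, and Gaussian parameters chosen here are illustrative assumptions, not values prescribed by the text.

import numpy as np

def unit_step(x, threshold=0.0):
    # Threshold function: outputs 1 once the input reaches the threshold, else 0.
    return np.where(x >= threshold, 1.0, 0.0)

def sigmoid(x):
    # S-shaped curve mapping any real input to the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def piecewise_linear(x, lo=-1.0, hi=1.0):
    # Linear in the middle, saturated (clipped) outside [lo, hi].
    return np.clip(x, lo, hi)

def gaussian(x, mean=0.0, sigma=1.0):
    # Bell-shaped response that is highest when the input is close to the chosen mean.
    return np.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def linear(x, slope=1.0):
    # Linear activation: output is proportional to the weighted sum of the inputs.
    return slope * x

z = np.linspace(-3.0, 3.0, 7)
for f in (unit_step, sigmoid, piecewise_linear, gaussian, linear):
    print(f.__name__, f(z))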
Activation Function
The activation function decides whether a neuron should be activated or not by calculating the weighted
sum of the inputs and further adding a bias to it.
Here, z(1) = W(1)X + b(1) is the vectorized output of layer 1, where W(1) is the matrix of weights assigned to the neurons
of the hidden layer (i.e., w1, w2, w3, and w4), X is the vector of input features (i.e., i1 and i2), and b(1) is the
vector of biases assigned to the neurons in the hidden layer (i.e., b1 and b2). a(1) is the vectorized output of the
activation, here simply a linear function of z(1).
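A minimal NumPy sketch of this weighted sum and bias is shown below; the layer sizes, the random weights, and the use of a sigmoid activation are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([0.5, -1.2])        # input features i1 and i2
W1 = rng.normal(size=(4, 2))     # weights of the four hidden neurons (w1..w4 as rows)
b1 = rng.normal(size=4)          # one bias per hidden neuron

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z1 = W1 @ X + b1                 # vectorized weighted sum: z(1) = W(1)X + b(1)
a1 = sigmoid(z1)                 # activation of the hidden layer: a(1) = f(z(1))
print(z1)
print(a1)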
• a(2) = z(2)
• In an FFNN we obtain the input for the hidden layer by applying the activation function; for
this, we only need the input vector and the weight matrix.
• The unfolded architecture of an RNN can be altered as per the requirement: for example, for a
sentiment classification task we can have multiple inputs and a single output,
• while in the case of language generation models we need multiple inputs and
multiple outputs; RNNs can also be stacked together for some special use cases.
Types of Recurrent Neural Networks
There are four types of Recurrent Neural Networks:
➢ One to One
➢ One to Many
➢ Many to One
➢ Many to Many
One to One RNN
This type of neural network is known as the Vanilla Neural Network. It is used for general
machine learning problems that have a single input and a single output.
One to Many RNN
This type of neural network has a single input and multiple outputs. An example of this is
image captioning, where a single image is described by a sequence of words.
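The unfolding of an RNN over a sequence can be sketched with a small NumPy loop. The example below is a many-to-one configuration (as in the sentiment classification case mentioned above): the hidden state is updated at every time step and a single output is produced at the end. All sizes and weight values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size, output_size = 3, 5, 1

Wxh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden (recurrent) weights
Why = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden-to-output weights
bh = np.zeros(hidden_size)
by = np.zeros(output_size)

sequence = rng.normal(size=(4, input_size))   # a placeholder sequence of 4 time steps
h = np.zeros(hidden_size)                     # initial hidden state

for x_t in sequence:                          # unfold the network over time
    h = np.tanh(Wxh @ x_t + Whh @ h + bh)     # recurrent state update

y = Why @ h + by                              # single output from the final hidden state
print(y)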
Natural language processing (NLP) is the ability of a computer program to understand human
language as it is spoken and written -- referred to as natural language. It is a component of artificial
intelligence (AI).
Electronic health records (EHR) of patients are major sources of clinical information that are
critical to the improvement of health care processes. An automated approach for retrieving
information from these records is highly challenging due to the complexity involved in converting
clinical text that is available as free text into a structured format. Natural language processing (NLP)
and data mining techniques are capable of processing a large volume of clinical text (textual patient
reports).
The input for an NLP system is the unstructured natural text that is extracted from the patient’s
medical record and sent to the report analyzer.
Report Analyzer:
The clinical text differs from biomedical text in its possible use of pseudo-tables, i.e., natural
text formatted to appear as tables, medical abbreviations, and punctuation in addition to the natural
language. The text is normally dictated and transcribed by a person or by speech recognition software
and is usually available in free-text format. Some clinical texts are even available in image or
graph format, which is unstructured.
As a result, NLP processing techniques are applied to convert the unstructured free-text into a
structured format.
The first and foremost task of report analyzer is to preprocess the clinical input text by applying
NLP methodologies. The major preprocessing tasks in clinical NLP include text segmentation,
handling of text irregularities, domain-specific abbreviations, and missing punctuation.
Text Analyzer
The text analyzer is the most important module in clinical text processing; it extracts the clinical
information from free text and makes it compatible with database storage. The syntactic and semantic
interpreter component of the text analyzer generates a deeper structure, such as constituent or
dependency tree structures, to capture the clinical information present in the text. The conversion
rules or ML algorithms encode the clinical information from the deep tree structures. An advantage
of the rule-based approach is that the predefined patterns are expert-curated and highly
specific. The database handler and inference rules component generates a processed form of data
from the database storage.
MORPHOLOGICAL ANALYSIS:
➢ Stop word removal – it removes unwanted words like punctuation, articles, etc.
➢ Stemming – it is the process of reducing a word to its base form. Example: the base form of
"took" is "take", i.e., the word "took" is derived from "take".
o Bigram – it processes two words at a time (and so on for longer n-grams). With this we can
estimate the probability of a word, as sketched below.
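A minimal sketch of these preprocessing steps, assuming the NLTK library is installed and its "stopwords" corpus has been downloaded (nltk.download("stopwords")); the sample sentence is purely illustrative.

import string
from collections import Counter

from nltk import bigrams
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

text = "The patient took the prescribed antibiotics and took complete rest."

# Tokenize crudely, then remove punctuation and stop words (articles etc.).
tokens = [w.strip(string.punctuation).lower() for w in text.split()]
stop_words = set(stopwords.words("english"))
content_words = [w for w in tokens if w and w not in stop_words]

# Stemming: reduce each word to its base form (e.g. "prescribed" -> "prescrib").
stemmer = PorterStemmer()
stems = [stemmer.stem(w) for w in content_words]

# Bigrams: process two words at a time to count how often word pairs occur.
pair_counts = Counter(bigrams(stems))
print(content_words)
print(stems)
print(pair_counts)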
Research in NLP for clinical domain makes the computers understand the free-form clinical text
for automatic extraction of clinical information. The general aims of clinical NLP
include the theoretical investigation of human language to explore the details of language from
computer implementation point of view and the more natural man-machine communications that
aims at producing a practical automated system. Due to the complex nature of the clinical text, the
analysis is carried out in many phases such as morphological analysis, lexical analysis, syntactic
analysis, semantic analysis, and data encoding.
LEXICAL ANALYSIS:
The words or phrases in the text are mapped to the relevant linguistic information such as syntactic
information, i.e., noun, verb, adverb, etc., and semantic information i.e., disease, procedure, body
part, etc. Lexical analysis is achieved with a special dictionary called a lexicon, which provides
the necessary rules and data for carrying out the linguistic mapping. The development and
maintenance of a lexicon requires extensive knowledge engineering effort. The National
Library of Medicine (NLM) maintains the Specialist Lexicon with
comprehensive syntactic information associated with both medical and English terms.
Semantic Analysis
It is used to check whether the sentence is meaningful or not. It finds important tokens and
their base words, and the part of speech of each word (this is done in lexical analysis). It also needs to
check whether two words that come together in a sentence make sense. This is done by mapping the
syntactic structure to objects in the domain.
It determines the words or phrases in the text that are clinically relevant, and extracts their semantic
relations. The natural language semantics consists of two major features:
➢ The representation of the meanings of a sentence, which can allow the possible
manipulations (particularly inference)
➢ Relating these representations to the part of the linguistic model that deals with the
structure (grammar or syntax).
The semantic analysis uses the semantic model of the domain, or ontology, to structure and
encode the information from the clinical text. The semantic model is either frame-oriented or
based on conceptual graphs. The generated structured output of the semantic analysis is subsequently
used by other automated processes.
SYNTACTIC ANALYSIS:
The word “syntax” refers to the study of formal relationships between words in the text. The
grammatical knowledge and parsing techniques are the major key elements to perform syntactic
analysis. The context free grammar (CFG) is the most common grammar used for syntactic
analysis. CFG is also known by various other terms including phrase structure grammar (PSG) and
definite clause grammar (DCG). The syntactic analysis is done by using two basic parsing
techniques called top-down parsing and bottom-up parsing to assign POS tags (e.g., noun, verb,
adjective, etc.) to the sequence of tokens that form a sentence and to determine the structure of the
sentence through parsing tools.
DATA ENCODING:
The process of mining information from EHR requires coding of data that is achieved either
manually or by using NLP techniques to map free-text entries with an appropriate code. The coded
data is classified and standardized for storage and retrieval purposes in clinical research. Manual
coding is normally facilitated with search engines or pick-up list.
The use of data mining in healthcare is being adopted by organizations with a focus on optimizing
the efficiency and quality of their predictive analytics.
In the healthcare industry specifically, data mining can be used to decrease costs by increasing
efficiencies, improve patient quality of life, and perhaps most importantly, save the lives of more
patients.
Text mining in clinical domain is usually more difficult than general domains (e.g. newswire
reports and scientific literature) because of the high level of noise in both the corpus and training
data for machine learning (ML). Healthcare systems and specifically health record systems contain
both structured and unstructured information as text.
It is a subfield of biomedical NLP to determine classes of information found in clinical text that
are useful for basic biological scientists and clinicians for providing better health care.
More specifically, it is estimated that over 40% of the data in healthcare record systems contains
text, so-called clinical text, sometimes also called electronic patient record text.
Clinical text contains valuable information about symptoms, diagnoses, treatments, drug use and
adverse (drug) events for the patient that can be utilized to improve healthcare for other patients.
However, clinical text also contains sensitive information such as personal names, telephone
numbers and addresses of the patient and relatives. This information needs to be pseudonymized
before the clinical text can be utilized for secondary use.
Text mining and data mining techniques uncover information on health, disease, and
treatment response from the electronically stored details of patients’ health records. A
significant chunk of the information in EHR and CDA documents is text, and extraction of such information by
conventional data mining methods is not possible. The semi-structured and unstructured data in
the clinical text and even certain categories of test results such as echocardiograms and radiology
reports can be mined for information by utilizing both data mining and text mining techniques.
Information extraction
Information extraction (IE) is a specialized field of NLP for extracting predefined types of
information from the natural text. It is defined as the process of discovering and extracting
knowledge from the unstructured text.
IE differs from information retrieval (IR) that is meant to be for identifying and retrieving relevant
documents. In general, IR returns documents and IE returns information or facts.
A typical IE system for the clinical domain is a combination of components such as tokenizer,
sentence boundary detector, POS tagger, morphological analyzer, shallow parser, deep parser
(optional), gazetteer, named entity recognizer, discourse module, template extractor, and template
combiner.
A careful modeling of relevant attributes with templates is required for the performance of high
level components such as discourse module, template extractor, and template combiner. The high
level components always depend on the performance of the low level modules such as POS tagger,
named entity recognizer, etc.
IE for clinical domain is meant for the extraction of information present in the clinical text. The
Linguistic String Project–Medical Language Processor (LSP–MLP), and Medical Language
Extraction and Encoding system (MedLEE) are the commonly adopted systems to extract
UMLS concepts from clinical text.
Preprocessing
The primary source of information in the clinical domain is the clinical text written in natural
language. However, the rich contents of the clinical text are not immediately accessible by the
clinical application systems that require input in a more structured form. An initial module adopted
by various clinical NLP systems to extract information is the preliminary preprocessing of the
unstructured text to make it available for further processing. The most commonly used
preprocessing techniques in clinical NLP are spell checking, word sense disambiguation, POS
tagging, and shallow and deep parsing.
Spell Checking
The rate of misspelling in clinical text is reported to be much higher than in other types of text. In
addition to the traditional spell checker, various research groups have come out with a variety of
methods for spell checking in the clinical domain: UMLS-based spell-checking error correction
tool and morpho-syntactic disambiguation tools.
Word Sense Disambiguation
The process of understanding the sense of a word in a specific context is termed word sense
disambiguation. The supervised ML classifiers and the unsupervised approaches automatically
perform the word sense disambiguation for biomedical terms.
POS Tagging
An important preprocessing step adopted by most NLP systems is POS tagging, which reads
the text and assigns the parts of speech tag to each word or token of the text. POS tagging is the
annotation of words in the text to their appropriate POS tags by considering the related and
adjacent words in a phrase, sentence, and paragraph. POS tagging is the first step in syntactic
analysis and finds its application in IR, IE, word sense disambiguation, etc. POS tags are a set of
word categories based on the role that words may play in the sentence in which they appear. The
most common set contains seven different tags: Article, Noun, Verb, Adjective, Preposition,
Number, and Proper Noun.
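A short illustration of POS tagging, assuming NLTK is installed and its "punkt" and "averaged_perceptron_tagger" resources have been downloaded; the sentence is an invented clinical-style example.

import nltk

sentence = "The patient reports mild fever but denies chest pain."
tokens = nltk.word_tokenize(sentence)   # split the sentence into word tokens
tags = nltk.pos_tag(tokens)             # assign a POS tag (noun, verb, adjective, ...) to each token
print(tags)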
Parsing
Parsing is the process of determining the complete syntactic structure of a sentence or a string of
symbols in a language. A parser is a tool that converts an input sentence into an abstract syntax tree,
such as a constituent tree or dependency tree, whose leaves correspond to the words of the given
sentence and the internal nodes represent the grammatical tags such as noun, verb, noun phrase,
verb phrase, etc. Most of the parsers apply ML approaches such as PCFGs (probabilistic context-
free grammars) as in the Stanford lexical parser [50] and even maximum entropy and neural
network.
A few parsers even use lexical statistics by considering the words and their POS tags. Such parsers
are well known for overfitting problems that require additional smoothing. An alternative to the
overfitting problem is to apply shallow parsing, which splits the text into nonoverlapping word
sequences or phrases, such that syntactically related words are grouped together. The word phrase
represents the predefined grammatical tags such as noun phrase, verb phrase,
prepositional phrase, adverb phrase, subordinated clause, adjective phrase, conjunction phrase, and
list marker. The benefits of shallow parsing are the speed and robustness of processing. Parsing is
generally useful as a preprocessing step in extracting information from the natural text.
Context-Based Extraction
The fundamental step for a clinical NLP system is the recognition of medical words and phrases
because these terms represent the concepts specific to the domain of study and make it possible
to understand the relations between the identified concepts. Even highly sophisticated systems of
clinical NLP include the initial processing of recognizing medical words and phrases prior to the
extraction of information of interest. While IE from the medical and clinical text can be carried
out in many ways, this section explains the five main modules of IE.
Concept Extraction
Extracting concepts (such as drugs, symptoms, and diagnoses) from clinical narratives constitutes
a basic enabling technology to unlock the knowledge within and support more advanced reasoning
applications such as diagnosis explanation, disease progression modeling, and intelligent analysis
of the effectiveness of treatment. The first and foremost module in clinical NLP following the
initial text preprocessing phase is the identification of the boundaries of the medical terms/phrases
and understanding the meaning by mapping the identified term/phrase to a unique concept
identifier in an appropriate ontology. The recognition of clinical entities can be achieved by a
dictionary-based method using the UMLS Metathesaurus, rule-based approaches, statistical
methods, and hybrid approaches. The identification and extraction of entities present in the clinical
text largely depends on the understanding of the context. For example, the recognition of diagnosis
and treatment procedures in the clinical text requires the recognition and understanding of the
clinical condition as well as the determination of its presence or absence. The contextual features
related to clinical NLP are negation (absence of a clinical condition), historicity (the condition had
occurred in the recent past and might occur in the future), and experiencer (the condition related
to the patient). While many algorithms are available for context identification and extraction, it is
recommended to detect the degree of certainty in the context.
Association Extraction
Clinical text is a rich source of information on patients’ conditions and their treatments, with
additional information on potential medication allergies, side effects, and even adverse effects.
Information contained in clinical records is of value for both clinical practice and research;
however, text mining from clinical records, particularly from narrative-style fields (such as
discharge summaries and progress reports), has proven to be an elusive target for clinical Natural
Language Processing (clinical NLP), due in part to the lack of availability of annotated corpora
specific to the task. Yet, the extraction of concepts (such as mentions of problems, treatments, and
tests) and the association between them from clinical narratives constitutes the basic
enabling technology that will unlock the knowledge contained in them and drive more advanced
reasoning applications such as diagnosis explanation, disease progression modeling, and
intelligent analysis of the effectiveness of treatment.
Negation
“Negation” is an important context that plays a critical role in extracting information from the
clinical text. Many NLP systems incorporate a separate module for negation analysis in text
preprocessing. However, the importance of negation identification has gained much of its interest
among the NLP research community in recent years. As a result, explicit negation detection
systems such as NegExpander, Negfinder, and a specific system for extracting SNOMED-CT
concepts as well as negation identification algorithms such as NegEx that uses regular expression
for identifying negation and a hybrid approach based on regular expressions and grammatical
parsing are developed by a few of the dedicated research community. While the NegExpander
program identifies the negation terms and then expands to the related concepts, Negfinder is a
more complex system that uses indexed concepts from UMLS and regular expressions along
with a parser using LALR (look-ahead left-recursive) grammar to identify the negations.
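The general idea behind regular-expression negation detection can be sketched in a heavily simplified, NegEx-style form; the trigger list, concept list, and word window below are illustrative assumptions and not the actual NegEx lexicon or algorithm.

import re

# Hypothetical negation triggers and clinical concepts for illustration only.
NEGATION_TRIGGERS = r"\b(no|denies|without|negative for|absence of)"
CONCEPTS = ["chest pain", "fever", "pneumonia"]

def find_negated_concepts(sentence):
    negated = []
    for concept in CONCEPTS:
        # A concept within a few words after a negation trigger is treated as negated.
        pattern = NEGATION_TRIGGERS + r"\s+(\w+\s+){0,4}?" + re.escape(concept)
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            negated.append(concept)
    return negated

print(find_negated_concepts("Patient denies chest pain and fever; pneumonia noted on imaging."))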
Extracting Codes
Extracting codes is a popular approach that uses NLP techniques to extract the codes mapped to
controlled sources from clinical text. The most common codes dealing with diagnoses are the
International Classification of Diseases (ICD) versions 9 and 10 codes. The ICD is designed to
promote international comparability in the collection, processing, classification and presentation
of mortality statistics.
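As a toy illustration of code extraction, a small lookup table can map free-text diagnosis mentions to ICD-10 codes; the phrase-to-code mappings and the sample note are illustrative only, and real systems rely on full terminologies and NLP rather than a hand-made dictionary.

import re

# Illustrative phrase-to-code mappings (not a complete or authoritative code set).
ICD10_LOOKUP = {
    "essential hypertension": "I10",
    "type 2 diabetes": "E11",
    "pneumonia": "J18.9",
}

def extract_codes(text):
    found = []
    for phrase, code in ICD10_LOOKUP.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE):
            found.append((phrase, code))
    return found

note = "Assessment: type 2 diabetes and essential hypertension, well controlled."
print(extract_codes(note))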
• Preprocessing of texts such as tokenisation and text segmentation.
• Stemming: Stemming is a natural language processing technique that lowers inflection in words
to their root forms, hence aiding in the preprocessing of text, words, and documents for text
normalization.
• Compound splitting: Dealing with word compounding in statistical machine translation (SMT)
is essential to mitigate the sparse data problems that productive word generation causes. There
are several issues that need to be addressed: splitting compound words into their correct
components (i.e. disambiguating between split points), deciding whether to split a compound word
at all, and, if translating into a compounding language, merging components into a compound
word.
Generally, the same building blocks used for regular texts can also be utilised for clinical text
processing. However, clinical texts contain more noise in the form of incomplete sentences,
misspelled words and non-standard abbreviations that can make the natural language processing
cumbersome.
Applications:
Healthcare associated infections are also called hospital associated infections or nosocomial
infections. An important goal in defeating HAIs is to collect statistics by detecting and measuring
the prevalence of HAIs, but also to predict and warn if a particular patient has a high risk of
acquiring an HAI. HAIs can encompass, for example, pneumonia, urinary tract infection, sepsis or
various wound infections, but also norovirus (winter vomiting disease). Two machine learning
algorithms, Support Vector Machine (SVM) and Random Forest (RF), from the Weka toolkit were
applied to the annotated Stockholm EPR Detect-HAI Corpus.
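The corpus named above is not generally available, so the following is only a scikit-learn sketch (used here instead of Weka) of how SVM and Random Forest classifiers could be trained on bag-of-words features from clinical-style text; the documents and labels are invented placeholders.

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny placeholder documents and labels purely to show the training API.
docs = ["fever after catheter insertion",
        "routine follow up, no infection",
        "wound infection with positive culture",
        "normal postoperative course"]
labels = [1, 0, 1, 0]   # 1 = HAI suspected, 0 = no HAI (illustrative)

for clf in (SVC(kernel="linear"), RandomForestClassifier(n_estimators=100)):
    model = make_pipeline(CountVectorizer(), clf)   # bag-of-words features + classifier
    model.fit(docs, labels)
    print(type(clf).__name__, model.predict(["possible wound infection"]))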
Adverse drug events (ADEs) are a major public health problem, around 5% of all hospital
admissions in the world are due to ADEs.
All drugs are poisonous in some sense but given in the correct amount they may cure a disease.
Some adverse drug events are time-related and become apparent some time after the use of the drug.
First of all, ICD-10 diagnosis codes related to adverse drug events that are assigned to the patient
records need to be studied.
Medical terminologies, classification systems and available controlled vocabularies are used in
healthcare to report, administer, classify and explain diseases and treatment, including medication.
Mobile imaging is the technique of creating visual representations of the interior of a body for
clinical analysis and medical intervention, as well as visual representation of the function of some
organs or tissues.
Introduction:
Mobile technology and smart devices, especially smartphones, allow new and easier ways of imaging
at the patient’s bedside and have the potential to be made into a diagnostic tool that can be
used by both professionals as well as lay people. Smartphones usually contain at least one high-
resolution camera that can be used for image formation. However, careful consideration has to be
taken when dealing with cameras in general, and with nonscientific cameras specifically. Many
camera parameters are usually reported in public advertisements, but not all of them are useful.
In particular, pixel resolution can be misleading, as the number of pixels itself is not a measure of
quality. Quality is usually measured by the signal-to-noise ratio (SNR). The main sources of image noise include:
• Shot noise, which is dependent on the quality of the sensor and the discretization of
different number of photons. This noise mostly occurs when only a few photons hit the
sensor.
• Transfer noise, which is introduced by connectivity in the sensor. This is usually static
for all images and can be reduced using background subtraction with an image acquired
in complete darkness.
In the case of a camera, the signal is the amount of light captured by the sensor; the more photons
are available, the more the image noise is reduced. The most important parameter for the quality of
an optical system is the amount of light accumulated on each pixel. This parameter is
determined by the physical size of a pixel (or chip size in relation to the number of pixels), as a
larger pixel acquires more light, and by the diameter of the entry lens, which regulates the amount
of light. The size of the entry lens is usually given in f-stop k (written as 1:k or f/k), the ratio
of distance from sensor to entry lens to diameter of entry lens, the lower, the better. Most
modern smartphones have similar optical parameters as regular consumer cameras, while being
built at a far smaller scale.
First integrations of these cameras into clinical routine and research have already shown manifold
applications for mobile technology in medicine. One example is the usage of the smartphone
camera to take pictures of test strips for automatic analysis.
Another example is the use of smartphone cameras to document necrotic skin lesions caused by
the rare disease calciphylaxis in a multicenter clinical registry. Here, special care must be taken
when dealing with multiple different smartphones or lighting conditions due to different
efficiencies in capturing colors.
A color reference has to be used to calibrate the camera colors in a later step. To control
illumination, zoom, and distance, the German company FotoFinder has developed an integrated
lens system that is easily attached to and powered by an iPhone transforming it into a
dermatoscope.
Besides the integrated camera, additional image formation methods can also be used on smart
devices by either incorporating special sensors (like ultrasound or ECG) or by connecting
them, wired or wirelessly, to more powerful imaging machines like micro nuclear magnetic resonance
(micro-NMR) for bedside diagnostics.
Data Visualization
The task of transforming an acquired image dataset into a perceptible form is called
visualization. This is rather simple for most 2D methods like digital photographs, but it becomes more
challenging for 3D volumes, in particular if voxels are annotated with several features or monitored over time (3D+t).
In general, all data is displayed by transforming it into a colored 2D representation. Hence, we
need to consider the output devices as well as the definition and value ranges of the initial data.
Visualization Basics
The human eye is capable of detecting light between 390 and 700 nm wavelengths. Images that
are recorded and displayed within this so-called visible spectrum show the data in “true color.”
But because many modalities like X-ray, ultraviolet, or infrared imaging capture wavelengths
outside the visible spectrum, a modification of the recorded data has to be performed. The resulting
image (e.g., a grayscale image for X-ray) is displayed in “false color.” A special case of this is the
so-called “pseudo color,” which means that the color of an image has been artificially modified to enhance
certain features. Here, a single channel image and a so-called color map are used to convert each
value of the single channel into a corresponding color.
As an example, the Doppler signal contains information on direction of movement for each
position. This movement can be either positive (towards the detector), negative (away from the
detector), or zero (no movement). To superimpose this information to morphologic image data (B
mode), a different color scheme is applied. The zero level would be encoded in black, negative
values in blue, and positive values in red. Larger absolute value of the signal results in brighter
color.
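A small NumPy sketch of this kind of pseudo-color mapping is given below, following the scheme just described (black for zero, blue for negative, red for positive, brighter for larger magnitude); the velocity values are illustrative.

import numpy as np

def doppler_colormap(velocity):
    # velocity: array of signed values normalized to [-1, 1]
    v = np.clip(velocity, -1.0, 1.0)
    rgb = np.zeros(v.shape + (3,))
    rgb[..., 0] = np.where(v > 0, v, 0.0)    # red channel: movement towards the detector
    rgb[..., 2] = np.where(v < 0, -v, 0.0)   # blue channel: movement away from the detector
    return (rgb * 255).astype(np.uint8)      # zero stays black; larger magnitude is brighter

velocity_field = np.array([[-1.0, -0.5, 0.0, 0.5, 1.0]])
print(doppler_colormap(velocity_field))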
Output Devices
All data is displayed on a computer screen, where colors are mixed from three basic channels: red,
green, and blue (RGB). This results in a cubic color space. Setting all three colors to the same value
creates different shades of gray. Each color is usually scaled from 0 (dark) to 255 (bright). This
equals a bit depth of 8, meaning that 8 bits in memory are allocated for each color channel, yielding
in total 256^3 ≈ 16 million possible values. Higher bit-depth color or gray values are also
possible but rarely used, as they are not well supported by computer screens and file formats.
However, in some cases a higher contrast or distribution of color or gray values is needed, e.g., for
diagnostics in radiology. Therefore, computer screens in diagnostic radiology support higher bit
depth (e.g., grayscale bit depth of 10), and have a better contrast (e.g., 1400:1 compared to 1000:1
regular) and brightness (e.g., 400 cd/m2 brightness compared to 200 cd/m2 regular) than regular
computer screens.
Printers differ from screens in that the background color of a screen (no color turned on) is black,
while the background color of a printout (paper) is white. Thus, higher values in color for screens
result in brighter colors, while higher amounts of color from a printer result in darker colors.
Therefore, printers usually use cyan, magenta, yellow, and black (CMYK) color space to
compensate for the nonblack background. Black is used as a key ingredient when mixing the colors
to minimize the fluid on the paper.
Mobile Visualization
Recently, visualization and display technology has been dominated by trends in mobile
computing. For example, prior to the introduction of the first retina display with the iPhone 4 in
2010, almost all computer and smartphone displays had a pixel density of about 70–100 pixels per
inch (ppi). Increase in resolution was mostly achieved through larger monitor screens.
However, the introduction of the retina display increased the pixel density above 300 ppi,
improving perceived contrast and also outperforming radiology displays in many other aspects
(e.g., iPhone 4 brightness: 500 cd/m2). Thereby, these new types of screens show great potential
for radiologists.
Additionally, modern smartphones and tablet computers provide a high amount of processing
power (e.g., a 64-bit dual-core 1.3 GHz processor in the iPhone 5s) that can be used for image visualization.
Almost all 2D and surface-rendering visualization techniques can be employed in real time. Real
time means that the result is delivered fast enough to make an impact on the current situation, or,
in terms of visualization of data, so that no delay between action (e.g., zooming) and result
(zoomed image) is perceived. Usually, this requires 15 to 20 frames per second (fps); the frame rate
is the speed at which successive images are shown.
Volume rendering
When rendering is performed on a remote server, the rendered view can be streamed to the mobile device as video,
for example using H.264 video compression, which is standard in mobile communication. On the other
hand, the client captures touches, swipes, and other interactions of the user and sends these to the
server to update the live view. Streaming of video data has the benefit of allowing the user to use
a mobile device while having the computational power of a workstation. The drawback of this
approach is the needed bandwidth to stream images in real time from the server to the mobile
device.
For example, a video with 30 frames per second (fps) and a resolution of 1920 by 1080 pixels
(FullHD/1080 p) requires about 1 Mb/sec bandwidth. This is not possible through most current
wireless networks like 3G, which is limited between 350 and 2000 kilobits per second
(kbit/s), depending on country and reception.
Calibration
Important for distributed visualization on a range of different devices is calibration. This means
that the same image is displayed in the exact same way on all devices, even if background
illumination differs between these devices. For this, an application has been developed that allows
users to calibrate their devices visually on their own. In this application, the user is guided through
8 steps, each showing a visual pattern. In each step, the user has to adjust a slider to change the
visibility of the pattern.
One concern that is often raised when visualizing biomedical images on mobile devices is the
appropriateness for diagnostics. For example, software that displays medical images might have
to undergo investigation by the Food and Drug Administration (FDA) or other local legal
authorities to be cleared for commercial marketing. Smartphones and tablet computers do not
necessarily meet the requirements to undergo these studies. Therefore, the appropriateness and
legitimacy of the device chosen for visualization should always be taken into account when
considering the use of a mobile device for diagnostic or visualization of medical images.
Image Analysis
Image analysis is the task of extracting abstract information or semantics and knowledge from the
raw pixels of image and signal data.
This is the most challenging task in biomedical imaging as it supports researchers and clinicians
in finding clues for disease or certain phenotypes (diagnostics), supports novices and experts in
performing procedures (therapy) and follow-up to the outcome, and allows scientist to gain
knowledge from imaging data.
With the growing number of digital imaging devices, automated knowledge extraction becomes
more and more important. The new trend towards mobile and personalized health data additionally
drives the need for automation. For example, many applications for the smartphone-
based investigation of skin cancers already exist, but only a few are actually accurate. Pulse
frequency can be determined accurately and contactlessly by any smartphone device simply by filming the
face and determining the very slight periodic changes in skin color, which are usually not observed
by humans.
Basically all images from biomedical imaging modalities and especially those from smart phone
cameras are noisy and contain artifacts. Therefore, preprocessing is required before the data can
be used for analysis. Additional preprocessing can also help to prepare the image for certain
analysis tasks, such as edge detection. Most of the preprocessing algorithms are low in
computation time and memory requirements and hence suitable for mobile devices.
Gaussian filter
A Gaussian filter is commonly used to remove noise and recording artifacts from an image by
blurring. The filter consists of a multidimensional Gaussian distribution that is convolved with the
image. For convolution, the center value is replaced with the accumulated weighted values
according to the mask. High frequency noise in the image is thereby reduced.
Convolution of the local region with the Gaussian kernel gives the highest intensity value to
the center part of the local region, and the remaining pixels have less intensity as the
distance from the center increases. The results are summed up and stored in the current pixel.
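A brief sketch of Gaussian smoothing, assuming SciPy is available; the synthetic image, noise level, and sigma are illustrative.

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
noisy = np.full((64, 64), 100.0) + rng.normal(scale=20.0, size=(64, 64))  # flat image plus noise

smoothed = gaussian_filter(noisy, sigma=1.5)   # convolve with a Gaussian kernel; sigma sets the blur
print(noisy.std(), smoothed.std())             # the noise level drops after filtering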
Median filter
The median filter is also used to reduce noise. For this filter, a sliding window with a fixed size
(here 3x3 pixels) is moved across the image. The center point of the window is replaced by the
median value within the window. For median computation, the image pixel values at the current mask
position (A to I) are sorted, and the center is replaced by the fifth value in the sorted row. This
removes outliers in an otherwise smooth area while maintaining the value of the majority of the
pixels.
Sobel filter
The Sobel filter is used to enhance edges in the image. For this, an asymmetric filter is convolved
with the image. The mask that is visualized is sensitive to vertical edges, in particular to vertical
edges from black to white. Usually, this mask is turned by 90◦ and the signs are changed ending
up with a set of eight different masks. All eight masks are applied individually and, for instance,
the maximum is used as a replacement for the center pixel to obtain an edge map.
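Both the median and Sobel filters described above can be sketched with SciPy (assumed available); the tiny test image is illustrative.

import numpy as np
from scipy.ndimage import median_filter, sobel

image = np.zeros((8, 8))
image[:, 4:] = 1.0          # a simple vertical black-to-white edge
image[2, 2] = 5.0           # one outlier pixel ("salt" noise)

denoised = median_filter(image, size=3)    # 3x3 window: the outlier is replaced by the local median
edges = sobel(denoised, axis=1)            # derivative along x responds strongly at the vertical edge
print(denoised[2, 2], np.abs(edges).max())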
Feature Extraction
Features are simplified descriptors of an image or part of an image. Features are used to compare
two images, or find similarities or shared objects between multiple images. Image features can be
either global (describing the image as a whole) or local (describing a part of any size of the image).
A very basic global image feature is the image histogram. A histogram is a probability
distribution of the pixel/voxel values in the image. For each possible value, the number of
occurrences is counted in the image. This results in a very simplified representation as information
on the intensity is maintained, but all spatial information is lost. Global features, such as the
shape of the histogram, can be used, for instance, to distinguish between classes of images, e.g.,
hand and skull radiographs.
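A minimal sketch of the histogram as a global image feature, computed with NumPy on a random placeholder image:

import numpy as np

image = np.random.default_rng(0).integers(0, 256, size=(128, 128), dtype=np.uint8)

counts, _ = np.histogram(image, bins=256, range=(0, 256))  # occurrences of each possible value
histogram = counts / counts.sum()                          # normalize to a probability distribution
print(histogram.shape, histogram.sum())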
Local features describe only a part of the image at a certain spatial position. Most are created in
two separate steps. The first one is feature detection, in which points of interest (POIs) are
localized. The second step is feature description. For each of the detected points, a description of
this position (possibly including some surrounding areas) is created. Since images can be acquired
under different conditions like scale and rotation, certain invariance against these changes is needed
for both detector and descriptor.
Recognizing objects in images is one of the most important problems in computer vision. A
common approach is to first extract the feature descriptions of the objects to be recognized from
reference images, and store such descriptions in a database. When there is a new image, its feature
descriptions are extracted and compared to the object descriptions in the database to see if the
image contains any object we are looking for. In real-life applications, the objects in the images to
be processed can differ from the reference images in many ways:
➢ Orientation
➢ Viewpoint
➢ Illumination
➢ Partially covered
Scale-invariant feature transform (SIFT) is an algorithm for extracting stable feature descriptions
of objects, called keypoints, that are robust to changes in scale, orientation, shear, position, and
illumination.
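A short OpenCV sketch of SIFT keypoint extraction, assuming an OpenCV build in which cv2.SIFT_create is available (version 4.4 or later); the random test image is only a placeholder for a real photograph.

import cv2
import numpy as np

image = np.random.default_rng(0).integers(0, 256, size=(256, 256), dtype=np.uint8)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)  # detect POIs and describe them
print(len(keypoints), None if descriptors is None else descriptors.shape)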
BioMedical Image Analysis
Introduction
In its broadest sense, an image is a spatial map of one or more physical
properties of a subject where the pixel intensity represents the value of a physical
property of the subject at that point. Imaging the subject is a way to record spatial
information, structure, and context information. In this context, the subject could be
almost anything: your family sitting for a family photo taken with your smartphone,
the constellation of Orion’s Belt viewed from a telescope, the roads of your
neighborhood imaged from a satellite, a child growing inside of its mother viewed
using an ultrasound probe. The list of possible subjects is endless, and the list of
possible imaging methods is long and ever-expanding. But the idea of imaging is
simple and straightforward: convert some scene of the world into some sort of array
of pixels that represents that scene and that can be stored on a computer.
Naturally, if we wanted to describe all of the possible subjects and modalities,
that would be an entire book of its own. But, for our purposes, we are interested
in biomedical images, which are a subset of images that pertain to some form of
biological specimen, which is generally some part of human or animal anatomy.
The imaging modality used to acquire an image of that specimen generally falls into
one of the categories of magnetic resonance imaging (MRI), computed tomography
(CT), positron emission tomography (PET), ultrasound (U/S), or a wide range
of microscopy modalities such as fluorescence, brightfield, and electron microscopy.
Such modalities have various purposes: to image inside of the body without harming
the body or to image specimens that are too small to be viewed with the naked eye.
These modalities enable us to image biological structure, function, and processes.
While we often think of images as 2D arrays of pixels, this is an overly
restrictive conception, especially as it pertains to biomedical images. For example,
if you broke a bone in your leg, you might get a 3D MRI scan of the region, which
would be stored as a three-dimensional array of pixel values on a disk. If that leg
needed to be observed over time, there might be multiple MRI scans at different
time intervals, thus leading to the fourth dimension of time. A fifth dimension of
modality would be added if different types of MRI scans were used or if CT, PET, U/S,
or biological images were added. When all of these time-lapse datasets of different
modalities are registered to each other, a rich set of five-dimensional information
becomes available for every pixel representing a physical region in the real world.
Such information can lead to deeper insight into the problem and could help
physicians figure out how to heal your leg faster.
Another multidimensional example is common in the area of microscopy. To
visualize cellular dynamics and reactions to drugs (for example, for the purpose of
discovering targets for treating cancer), a group of cells could be imaged in their
3D context using confocal microscopy, which enables optical sectioning of a region
without harming the structure. This region could have multiple markers for different
regions of the cell such as the nucleus, cytoplasm, membrane, mitochondria,
endoplasmic reticulum, and so forth. If these are live cells moving over time, they
can be imaged every few seconds, minutes, hours, or days, leading to time-lapse
datasets. Such five-dimensional datasets are common and can elucidate structure-
structure relationships of intracellular or extracellular phenomena over time in
their natural 3D environment.
If we were to stop at this point in the description, we would be left in a rather
frustrating position: having the ability to image complex structures and processes,
to store them on a computer, and to visualize them but without any ability to
generate any real quantitative information. Indeed, as the number of imaging
modalities increases and the use of such modalities becomes ubiquitous coupled
with increasing data size and complexity, it is becoming impossible for all such
datasets to be
carefully viewed to find structures or functions of interest. How is a physician
supposed to find every single cancerous lesion in the CT scans of hundreds of
patients every day? How is a biologist supposed to identify the one cell acting
unusually in a field of thousands of cells moving around randomly? At the same
time, would you want such events to be missed if you are the patient?
Being able to look inside of the body without hurting the subject and being able
to view biological objects that are normally too small to see has tremendous
implications on human health. These capabilities mean that there is no longer a
need to cut open a patient in order to figure out the cause of an illness and that we
can view the mechanisms of the building block of our system, the cell. But being
able to view these phenomena is not sufficient, and generating quantitative
information through image analysis has the capability of providing far more insight into
large-scale and time-lapse studies. With these concepts in mind, the need for
computationally efficient quantitative measurements becomes clear.
Biomedical image analysis is the solution to this problem of too much data. Such
analysis methods enable the extraction of quantitative measurements and
inferences from images. Hence, it is possible to detect and monitor certain
biological processes and extract information about them. As one example, more
than 50 years after the discovery of DNA, we have access to the comprehensive
sequence of the human genome. But, while the chemical structure of DNA is now
well understood, much work remains to understand its function. We need to
understand how genome-encoded components function in an integrated manner to
perform cellular and organismal functions. For example, much can be learned by
understanding the function of mitosis in generating cellular hierarchies and its
reaction to drugs: Can we arrest a cancer cell as it tries to replicate?
Such analysis has major societal significance since it is the key to understanding
biological systems and solving health problems. At the same time, it includes many
challenges since the images are varied, complex, and can contain irregular shapes.
Furthermore, the analysis techniques need to account for multidimensional datasets
I(x, y, z, λ, t), and imaging conditions (e.g., illumination)
cannot always be optimized.
In this chapter, we will provide a definition for biomedical image analysis and
explore a range of analysis approaches and demonstrate how they have been and
continue to be applied to a range of health-related applications. We will provide a
broad overview of the main medical imaging modalities (Section 3.2) and a
number of general categories for analyzing images including object detection,
image segmentation, image registration, and feature extraction. Algorithms that fall
in the category of object detection are used to detect objects of interest in images
by designing a model for the object and then searching for regions of the image that
fit that model (Section 3.3). The output of this step provides probable locations for
the detected objects although it doesn’t necessarily provide the segmented outline
of the objects themselves. Such an output feeds directly into segmentation
algorithms (Section 3.4), which often require some seeding from which to grow
and segment the object borders. While some segmentation algorithms do not
require seeding, accurate locations of the objects provide useful information for
removing segmented regions that may be artifacts. Whereas detection and
segmentation provide detailed information about individual objects, image
registration (Section 3.5) provides the alignment of two or more images of either
similar or different modalities. In this way, image registration enables information
from different modalities to be combined together or the time-lapse monitoring of
objects imaged using the same modality (such as monitoring tumor size over time).
Feature extraction combines object detection, image segmentation, and image
registration together by extracting meaningful quantitative measurements from the
output of those steps (Section 3.6). Taken as a whole, these approaches enable the
genera- tion of meaningful analytic measurements that can serve as inputs to other
areas of healthcare data analytics.
FIGURE 3.1: Representative images from various medical modalities: chest and abdomen CT, whole-body FDG-PET, and T1-weighted brain MRI.
Computed Tomography
Computed Tomography (CT) creates 2D axial cross-section images of the body
by collecting several 1D projections of conventional X-ray data using an X-ray
source on one side and a detector on the other side. The 1D projection data are
then reconstructed into a 2D image. Modern CT systems are capable of acquiring
a large volume of data extremely fast by increasing the axial coverage. A CT image
displays a quantitative CT number usually reported in Hounsfield units, which is a
measure of the attenuation property of the underlying material at that image
location. This makes CT inherently amenable to quantification. CT has become the
mainstay of diagnostic imaging due to the very large number of conditions that are
visible on CT images. A recent development has been the advent of so-called Dual
Energy CT systems, where CT images are acquired at two different energy levels.
This makes it possible to do a very rich characterization of material composition
using differential attenuation of materials at two different energy levels. The
simplest form of CT image reconstruction algorithms use variations of the filtered
back-projection method, but modern iterative model-based methods are able to
achieve excellent reconstruction while limiting doses to a patient. Common artifacts
associated with CT images include aliasing, streaking, and beam hardening.
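Because CT values are quantitative Hounsfield units, display software typically maps them to gray levels with a window center and width before visualization; the sketch below uses common soft-tissue window settings as an illustrative assumption.

import numpy as np

def apply_window(hu_image, center=40.0, width=400.0):
    # Map Hounsfield units inside the window to 0..255 gray levels; clip values outside it.
    lo, hi = center - width / 2.0, center + width / 2.0
    clipped = np.clip(hu_image, lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

hu = np.array([[-1000.0, 0.0, 40.0, 300.0, 1000.0]])  # roughly: air, water, soft tissue, bone-like
print(apply_window(hu))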
Positron Emission Tomography
Positron Emission Tomography (PET) is a nuclear imaging modality that uses
radioactively labeled tracers to create activity maps inside the body based on uptake
of a compound based on metabolic function. PET measures the location of a line
on which a positron annihilation event occurs and as a result two simultaneous 511
keV photons are produced and detected co-linearly using co-incidence detection.
PET allows assessment of important physiological and biochemical processes in
vivo. Before meaningful and quantitatively accurate activity uptake images can be
generated, corrections for scatter and attenuation must be applied to the data. Newer
iterative reconstruction methods model attenuation, scatter, and blur and have
sophisticated methods of dealing with motion that may take place during the image
acquisition window.
Ultrasound
Ultrasound is one of the most ubiquitous imaging modalities due in large part to
its low cost and completely noninvasive nature. Ultrasound imaging transmits high
frequency sound waves using specialized ultrasound transducers, and then collects
the reflected ultrasound waves from the body using specialized probes. The variable
reflectance of the sound waves by different body tissues forms the basis of an
ultrasound image. Ultrasound can also depict velocities of moving structures such
as blood using Doppler imaging. Imaging a growing fetus in the womb and
cardiovascular imaging are two of the most common ultrasound imaging
procedures. Due to very fast acquisition times, it is possible to get excellent
real-time images using ultrasound to see functioning organs such as the beating heart.
Modern ultrasound systems employ sophisticated electronics for beam forming and
beam steering, and have algorithms for pre-processing the received signals to help
mitigate noise and speckle artifacts.
Microscopy
In addition to in vivo radiological imaging, clinical diagnosis as well as research
frequently makes use of in vitro imaging of biological samples such as tissues
obtained from biopsy specimens. These samples are typically examined under a
microscope for evidence of pathology. Traditional brightfield microscopy imaging
systems utilize staining with markers that highlight individual cells or cellular
compartments or metabolic processes in live or fixed cells. More rich proteomics
can be captured by techniques such as fluorescence-based immunohistochemistry
and images can be acquired that show expression of desired proteins in the sample.
Images from such microscopy systems are traditionally read visually and scored
manually. However, newer digital pathology platforms
are emerging and new methods of automated analysis and analytics of
microscopy data are enabling more high-content, high-throughput applications.
Using image analysis algorithms, a multitude of features can be quantified and
automatically extracted and can be used in data-analytic pipelines for clinical
decision making and biomarker discovery.
Introduction
Clinical Decision Support Systems (CDSS) are computer systems designed to
assist clinicians with patient-related decision making, such as diagnosis and
treatment. Ever since the seminal To Err Is Human [1] was published in 2000, CDSS
(along with Computer-Based Physician Order Entry systems) have become a
crucial component in the evaluation and improvement of patient treatment. CDSS
have been shown to improve both patient outcomes and cost of care. They have been
demonstrated to minimize analytical errors by notifying the physician of potentially
harmful drug interactions, and their diagnostic procedures have been shown to
enable more accurate diagnoses. There are a wide variety of uses for CDSS in
clinical practice. Some of the main uses include:
• Assisting with patient-related decision making.
• Determining optimal treatment strategies for individual patients.
• Aiding general health policies by estimating the clinical and economic outcomes of different
treatment methods.
• Estimating treatment outcomes under circumstances where methods like randomized trials
are either impossible or infeasible.
In 2005, Garg et al. [2] conducted a review of 100 patient studies and concluded
that CDSS improved diagnosis in 64% and patient outcomes in 13% of the studies
tested. That same year, Duke University conducted a systematic review of 70
different cases and concluded that decision support systems significantly improved
clinical practice in 68% of all trials. The CDSS features attributed to the analysis’
success included:
• natural integration with clinical workflow.
• electronic nature.
• providing decision support at the time/location of care rather than before or after the patient
encounter.
Two particular fields of healthcare where CDSS have been hugely influential
are the pharmacy and billing. Pharmacies now use batch-based order checking
systems that look for negative drug interactions and then report them to the
corresponding patient’s ordering professional. Meanwhile,
in terms of billing, CDSS have been used to examine both potential courses of
treatment and conventional Medicare conditions in order to devise treatment plans
that provide an optimal balance of patient care and financial expense.
In this chapter, we will provide a survey of different aspects of CDSS along with
various challenges associated with their usage in clinical practice. This chapter is
organized as follows: Section 19.2 provides a brief historical perspective
including the current generation CDSS. Various types of CDSS will be described
in Section 19.3. Decision support during care provider order entry is described in
19.4 while the diagnostic decision support is given in 19.5. Description of the
human-intensive techniques that can be used to build the knowledge base is given
in Section 19.6. The primary challenges with the usage of CDSS are studied in
Section 19.7 while the legal and ethical issues concerned are discussed in Section
19.8. Section 19.9 concludes our discussion.
Historical Perspective
In this section, we provide a historical perspective on the development of CDSS.
We will first describe the most popular early CDSS that were developed several
decades ago, and then we will discuss the current generation of CDSS. For each
system, we will give a high-level idea of its functioning and also mention its
primary drawbacks.
Early CDSS
Ever since the birth of the medical industry, health scientists have recognized the
importance of informed clinical decision making. Unfortunately, for a long time,
efficient methods for researching and evaluating such decisions were quite rare.
Clinicians often relied on extensive research and handwritten records to establish
the necessary knowledge for a well-informed decision. Naturally, this proved to be
both error prone and very time consuming. Fortunately, the evolution of
business-related computing in the 1970s and 1980s gave clinicians an easy mechanism
for analyzing patient data and recommending potential courses of treatment, and thus
CDSS were born.
Early systems rigidly decided on a course of action, based on the user’s input
[3]. The user would input any necessary information, and the CDSS would output
a final decision, which in turn would be the user’s course of action:
• Caduceus (aka The Internist) [4]: This system was developed in the 1970s as a means of
implementing an artificial intelligence model for use in CDSS, with the central goal of
supporting the physician in a “hypothetico-deductive” approach to medical diagnosis. One of
the system’s unique features was its use of a probabilistic method for ranking diagnoses. It
evaluated patient symptoms and then searched its knowledge base for the most likely disease,
based on the statistics of existing patients with the specified symptoms. Unfortunately,
Caduceus’ diagnostic accuracy was poor. For instance, in 1981, a study using pre-existing
clinicopathological conference cases was conducted and then published in The New England
Journal of Medicine. Caduceus was unable to match the diagnostic accuracy of real-life experts
in this study, due to its limited knowledge base and small number of diagnostic algorithms. Thus,
the system was unable to gain widespread acceptance with the medical community.
In the mid 1980s, Caduceus evolved into QMR (Quick Medical Reference).
QMR differed significantly from Caduceus in that, while Caduceus was used
mainly for diagnostic consultation (i.e., suggesting rigid courses of treatment
to clinicians), QMR was more flexible. It allowed clinicians to modify and
manipulate its suggested diagnoses/treatments in whichever
way they wished, while allowing them to utilize its knowledge base to establish
their own hypotheses with regard to the treatment of more complex and
difficult cases [4]. While QMR contained an extensive medical database
(approximately 570 diseases in all), it had the major disadvantage of requiring
frequent updates whenever new diseases were discovered. Furthermore,
according to a 1994 study comparing QMR with three other clinical decision
support systems, the system gave considerably fewer “correct” patient
diagnoses (by the standards of a group of physicians) than the three competing
systems [5]. Thus, by 2001, QMR was largely abandoned in favor of less
cumbersome and more accurate CDSS.
• MYCIN [6]: This was originally developed in the 1970s as a means for identifying
infectious diseases and recommending antibiotics for treatment. A unique aspect of MYCIN was
its emphasis on artificial intelligence (AI). Its AI model was constructed through a rule-based
system, in which roughly 200 decision rules (and counting) were implemented into the system,
forming the knowledge base. To determine possible patient diagnoses, MYCIN’s internal
decision tree was consulted, and diagnostic options were reached by running through its
various branches. The rule-based system was very flexible in that it allowed clinicians to either
modify existing rules or devise new ones as they saw fit, making MYCIN adaptable to changing
medical trends and discoveries. Therefore, it was considered an expert system, since its
AI component allowed for results that were theoretically similar to those of a real-life expert.
Unfortunately, there were many significant problems with MYCIN. First, it
worked very slowly, with a typical analysis requiring upwards of 30 minutes.
Second, there was concern over whether physicians ran the risk of putting too
much trust in computerized results at the expense of their own judgment and
inquiry. Third, there was the issue of accountability: Who would be held liable
if the machine made an error in patient diagnosis? Perhaps the most important
problem was how ahead of its time MYCIN was. It was developed before
desktop computing and the Internet existed, so the system was based on a rather
dated model for computer interaction [7]. Nonetheless, its influence was far
reaching and is still felt to this day, with many systems either combining it
with other expert systems (Shyster-MYCIN [8]) or using it as an influence on
the development of new systems (GUIDON [9]).
• Iliad [10]: Iliad is another “expert” CDSS. It contains three modes of usage: Consultation,
Simulation, and Simulation-Test. In Consultation mode, users enter real-life patient findings
into the system. Iliad then analyzes these findings and compiles a list of possible diagnoses,
with each diagnosis ranked in terms of its likelihood of correctness. A unique feature of Iliad
is its handling of “gaps” in patient information. If the patient data appears incomplete, Iliad
will suggest methods of completion and/or compromise, so that the clinician may continue
working on a possible diagnosis. In Simulation mode, Iliad assumes the role of a complaining
patient. It offers a typical real-life complaint and then demands input, testing, etc., from the
clinician. The clinician’s questions, responses, and diagnostic decisions are evaluated by Iliad,
with feedback provided once analysis is complete. Finally, in Simulation-Test mode, Iliad runs
a similar real-life patient simulation, except that feedback is not given to the clinician. Instead,
Iliad silently evaluates his/her performance and then sends it to another user. Needless to say,
because of its highly scholastic focus, Iliad is often used for educational purposes. In fact,
studies have shown that it is very effective in training aspiring medical professionals for
real-life practice [10].
Unlike many other systems, which use knowledge-frame implementations, Iliad
uses a framed version of the Bayes model for its analysis [11]. This makes it
much easier for the system to recognize multiple diseases in a single patient
(further information on Bayes classification can be found in Section 19.3.1.2).
For diseases that are mutually dependent, a form of cluster analysis is included.
This groups the diseases into independent categories, based not only on
the disease type, but also on clinician-specified factors such as their specific point
of infection. This is so that the diseases may be efficiently analyzed and a more
effective Bayesian classifier may be devised.
The 1980s saw tremendous growth and development in the field of clinical
decision support. Greater involvement from the Association of American Medical
Colleges in clinical library practice provided the necessary funding and resources
for developing functional computerized information systems. Such systems
included everything from electronic health records to financial management
systems. Furthermore, PDAs (personal digital assistants) aided the development of
CDSS by giving them portability. Patient data and clinical decision-making
software could now be carried in the clinician’s pocket, allowing him/her to easily
reach informed decisions without cutting into their time with the patient. Although
PDAs were more akin to basic information systems than CDSS, they were major
stepping-stones in the development of CDSS that would allow clinicians to make
diagnostic and treatment decisions while remaining physically close to their
patients.
CDSS Today
Today’s CDSS have much broader and more flexible methods for making
clinical decisions, using both clinician and machine knowledge to give a series of
potential “suggestions,” with the clinician deciding on the suggestion that is most
appropriate to her specific needs [3].
• VisualDx [12]: This is a Java-based clinical decision support system that, as the name
suggests, is often used as a visual aid in assisting healthcare providers with diagnosis. This is
useful in instances where surface-level diseases (such as those of the skin) are present, and
doctors need visual representations of these diseases to aid with diagnosis. A unique feature
of VisualDx is that, rather than being organized by a specific diagnosis, it is organized by symptoms and
other visual clues. It uses a sophisticated matching process that visually matches images of
the specific patient’s abnormalities with pre-existing images within a built-in database of
more than 6,000 illnesses. It then uses the results of these comparisons to recommend courses
of treatment.
VisualDx has significant limitations. In addition to a vast image database, the
system contains a written summary of each image. Unfortunately, these
summaries are relatively brief and are, therefore, prone to overgeneralization.
For example, skin biopsies are often recommended for “sicker” patients.
However, it is unclear what is actually meant by “sicker.” This is especially
problematic when we consider that skin biopsies are rarely performed unless
standard skin therapy has proven ineffective. Nevertheless, VisualDx has
been demonstrated to be quite useful when diagnosing surface-level illness.
The system is operational to this day, with a significant update in 2010
enabling it to be used as a companion to a similar product called UpToDate [3].
• DXplain [13]: This is a web-based diagnosis system developed in the late 1980s by the
American Medical Association. A unique feature of this system is its simplicity: Clinicians enter
patient information using nothing but their own medical vocabulary, and the system outputs a
list of potential diagnoses from a knowledge base consisting of thousands of diseases (with up
to ten different references each), along with the potential relevance of its choices. Therefore, it
functions as a clinical decision support system for physicians with little computer experience.
DXplain has been demonstrated to be both reliable and cost efficient,
especially in academic environments [3]. For example, in a 2010 study, more than
500 different diagnostic cases were assigned to various Massachusetts General
Medicine residents. The study concluded that medical charges, Medicare Part A
charges, and service costs significantly decreased when using DXplain for
diagnostic recommendation [14]. DXplain has also been frequently demonstrated
to give very accurate diagnoses. For example, in a
2012 study conducted by Lehigh University, the system was compared with
four other CDSS. The conclusion drawn was that it was second only to Isabel
(discussed below) in terms of accuracy [15].
• Isabel [16]: This is one of the most comprehensive CDSS available. Like DXplain, it is a
web-based system designed with physician usability in mind. Originally, it focused mainly on
pediatrics, but it was soon expanded to cover adult symptoms. Isabel contains two subsystems:
a diagnostic checklist utility and a knowledge mobilizing utility. The diagnosis checklist tool
enables physicians to enter patient demographics and clinical features into the system, which
then returns a set of recommended diagnoses. The knowledge mobilizing utility may then be
used to research additional information about the recommended diagnoses [3].
Isabel has been demonstrated to give exceptionally accurate diagnoses of most
patient cases. In the Lehigh University study, for example, it was shown to be
the most accurate of the five systems tested. Other studies, such as a 2003 study
conducted by the Imperial College School of Medicine, have also
demonstrated this system to be very accurate [17]. Unfortunately, Isabel is a
relatively new CDSS and, thus, more extensive testing must be performed in
order to give a firm assessment of its overall reliability.
Knowledge-Based CDSS
Contemporary CDSS are rooted in early expert systems. These systems
attempted to replicate the logic and reasoning of a human decision maker, reaching
firm decisions based on existing knowledge. Knowledge-based CDSS rose out of
the intuitive realization that medicine was a good field for applying such
knowledge. A computer could (theoretically) mimic the thought processes of a
real-life clinician and then give a finalized diagnosis based on the information at hand
(Figure 19.1). During the 1990s and 2000s, however, CDSS moved away from
attempting to make rigorous clinical decisions in favor of offering a variety of
possible diagnostic/treatment options and then allowing the clinician herself to
make a finalized decision [7]. There are multiple reasons for this change in focus.
These include an underlying fear of computers being inherently prone to errors, the
realization that artificial intelligence still had a long way to go before it could
successfully mimic the knowledge and reasoning skills of real-life clinicians, the
infringement computerized decision making placed on physician/patient relations,
etc. Thus, today’s CDSS present a variety of diagnostic/treatment options to
clinicians, allowing them to evaluate first-hand the patient’s symptoms and
personal testimonies while utilizing the systems as reference points for possible
diagnoses.
Knowledge-based CDSS are those with a built-in reference table containing
embedded information about different diseases, treatments, etc. They use traditional AI
methods (such as conditional logic) to reach decisions on courses of treatment. There
are three main parts to a knowledge-based CDSS. They are the knowledge base, the
inference engine, and the user communication method.
The knowledge base is essentially a compiled information set, with each piece
of information structured in the form of IF-THEN rules. For example, IF a new
order is placed for a slowly-changing blood test, AND IF the blood test was ordered
within the past 48 hours, THEN we alert the physician to the possibility of duplicate
test ordering. The knowledge base functions in conjunction with whichever algorithmic
structure the system uses for its analysis. To put it simply, the user inputs patient
information, and then the system searches through its knowledge base for matching
diseases or treatment possibilities [2].
FIGURE 19.1: A general knowledge-based clinical decision support system.
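As a purely illustrative sketch of how such a rule might be encoded (the test names, field names, and 48-hour threshold mirror the example above, but the schema itself is an assumption, not a real CDSS format):

    from datetime import timedelta
    from typing import Optional

    # Hypothetical "slowly-changing" tests; a real knowledge base would list many more.
    SLOWLY_CHANGING_TESTS = {"HbA1c", "TSH"}

    def duplicate_test_alert(order: dict, last_ordered: dict) -> Optional[str]:
        """IF a slowly-changing test is reordered within 48 hours, THEN return an alert."""
        test = order["test_name"]
        if test in SLOWLY_CHANGING_TESTS and \
                last_ordered.get(test, timedelta.max) < timedelta(hours=48):
            return f"ALERT: {test} was already ordered within the past 48 hours"
        return None

    # Example: an HbA1c reordered 20 hours after the previous order triggers the alert.
    print(duplicate_test_alert({"test_name": "HbA1c"}, {"HbA1c": timedelta(hours=20)}))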
The inference engine applies a system of logic to the knowledge base, allowing
it to “become smarter” by establishing new and/or updated knowledge. It contains
the necessary formulae for combining the rules in the knowledge base with any
available patient data, allowing the system to create patient-specific rules and
conditions based on its knowledge of both the patient’s medical history and the
severity of his/her current condition. A particularly important aspect of the inference
engine is its separation from the knowledge base. Because CDSS
development is very time consuming, reusability is key: it should be possible to
construct a new CDSS on top of an existing inference engine.
Unfortunately, most real-life systems are developed with a specific goal in mind
(for example, diagnosing breast cancer). Thus, it is either difficult or impossible to
use them beyond their intended purpose.
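To make the separation between inference engine and knowledge base concrete, here is a minimal hypothetical sketch: the engine below knows nothing about medicine and simply evaluates whatever IF-THEN rules it is handed against a patient record (all rule contents, thresholds, and field names are invented for illustration):

    from typing import Any, Callable, Dict, List

    Rule = Dict[str, Any]  # a rule pairs a condition (IF) with an alert message (THEN)

    def run_inference(rules: List[Rule], patient: Dict[str, Any]) -> List[str]:
        """Generic inference step: fire every rule whose condition holds for this patient."""
        alerts = []
        for rule in rules:
            condition: Callable[[Dict[str, Any]], bool] = rule["if"]
            if condition(patient):
                alerts.append(rule["then"])
        return alerts

    # The knowledge base is kept entirely separate from the engine, so either can be reused.
    knowledge_base: List[Rule] = [
        {"if": lambda p: p.get("on_warfarin") and p.get("new_order") == "aspirin",
         "then": "ALERT: potential drug interaction (warfarin + aspirin)"},
        {"if": lambda p: p.get("creatinine", 0) > 2.0 and p.get("new_order") == "contrast CT",
         "then": "ALERT: elevated creatinine; review contrast use"},
    ]

    patient = {"on_warfarin": True, "new_order": "aspirin", "creatinine": 1.1}
    print(run_inference(knowledge_base, patient))
    # -> ['ALERT: potential drug interaction (warfarin + aspirin)']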
Finally, the user communication method is where the clinician herself inputs
the patient’s relevant data and then receives the corresponding results. In some
CDSS, the patient data must be
manually entered. Most of the time, however, patient data is provided through a
computer-based record. The record is entered either by the clinician or an external
lab or pharmacy and is, thus, already available electronically. It is the clinician’s job
to properly manipulate the system to obtain the outcome she wishes. Diagnostic
and treatment outcomes are generally represented as either recommendations or
alerts. Occasionally, if an alert has been generated after an initial order was placed,
automated emails and wireless notifications will be sent.
The usual format for a knowledge-based CDSS is that the clinician is asked to
supply a certain amount of input, which is then processed through both the system’s
knowledge base and inference engine. It then outputs a series of possible diagnostic
or treatment options for her.
Input
While there is substantial variance in the manner in which clinical information
is entered into a CDSS, most systems require the user to choose keywords from
his/her organization’s word dictionary. The challenge clinicians typically face
with this requirement is that different CDSS have different word vocabularies. The
quality of output in a CDSS depends on how well its vocabulary
matches the clinician’s keywords. In general, however, items related to the patient’s
medical history and current symptoms are going to be the suggested input.
One potentially effective method of giving detailed input is to use an explicitly
defined time model, in which the user specifies various time intervals and the
events that occurred within them. Unfortunately, this complicates user input and
would, thus, likely prove too cumbersome for the average clinician. A simpler
solution would be to use an implicit time model, in which broad temporal
information is part of the specified user input (for example, “history of recent
exposure to strep”) [7]. While this simplified approach has the disadvantage of
temporal ambiguity (does “recent” mean “just last week” or “last year”?), it has
proven to be a viable method of measuring time in a CDSS.
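As a rough sketch of the two input styles described above (the field names, dates, and findings are assumptions made for illustration, not part of any particular CDSS):

    from datetime import date

    # Explicit time model: the user spells out intervals and the events inside them.
    explicit_input = {
        "events": [
            {"finding": "exposure to strep", "start": date(2024, 3, 1), "end": date(2024, 3, 7)},
            {"finding": "sore throat",       "start": date(2024, 3, 5), "end": date(2024, 3, 10)},
        ]
    }

    # Implicit time model: broad temporal qualifiers are folded into the keyword itself,
    # which is simpler to enter but temporally ambiguous ("recent" is left undefined).
    implicit_input = {
        "findings": ["history of recent exposure to strep", "sore throat"]
    }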
Inference Engine
The inference engine is the part of the CDSS that combines user input with all
other necessary data to devise a final list of “decisions.” To avoid confusion, this
process is usually hidden from the user. There are many different methods of
analyzing user input and devising results from it. One popular method is the
utilization of production rules. These are logical IF-THEN statements that, when
combined, form concrete solutions to problems. MYCIN is an example of a popular
CDSS that uses production rules. However, the most popular method of
probabilistic estimation in an inference engine is Bayes’ Rule, which computes
conditional probabilities [7]. In mathematical terms, suppose we wish to compute
the probability of event A given event B (or Pr(A|B)). As long as we already have
Pr(B|A), along with “prior probabilities” (Pr(A) and Pr(B)) at our disposal, we may
use Bayes’ Rule to compute Pr(A|B) as follows:
Pr(A|B) = Pr(A) · Pr(B|A) / Pr(B)                                        (19.1)
For example, if A is the event that the patient has hepatitis and B is the presence
of jaundice, the result is an estimate of the patient’s likelihood of having hepatitis,
given the presence of jaundice.
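As a minimal numerical sketch of Equation (19.1), with probabilities that are invented for illustration rather than real clinical estimates:

    # Hypothetical figures only; not real epidemiological values.
    p_hepatitis = 0.01            # Pr(A): prior probability of hepatitis
    p_jaundice = 0.05             # Pr(B): prior probability of observing jaundice
    p_jaundice_given_hep = 0.70   # Pr(B|A): probability of jaundice given hepatitis

    p_hep_given_jaundice = p_hepatitis * p_jaundice_given_hep / p_jaundice
    print(f"Pr(hepatitis | jaundice) = {p_hep_given_jaundice:.2f}")  # 0.14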
In medicine, there is also the challenge of computing the likelihood of two distinct
yet potentially related events happening simultaneously in a patient [7]. For
example, suppose we wish to compute the probability of a patient having both
pneumonia and an abnormal chest radiograph:
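Assuming the standard product rule for joint probabilities applies here, the computation takes the form

    Pr(pneumonia and abnormal radiograph) = Pr(pneumonia) · Pr(abnormal radiograph | pneumonia),

where the conditional term can itself be obtained from Bayes’ Rule (19.1) when only the reverse conditional probability is known.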
Nonknowledge-Based CDSS
Nonknowledge-based CDSS differ from knowledge-based ones in that, rather
than a user-defined knowledge base, they implement a form of artificial
intelligence called Machine Learning. This is a process by which a system, rather
than consulting a precomposed encyclopedia, simply “learns” from past
experiences and then incorporates these “lessons” into its knowledge base. There are
two popular types of nonknowledge-based CDSS: Artificial Neural Networks and
Genetic Algorithms [7].
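As a toy sketch of this “learning from past experiences” idea (the features, labels, and data below are synthetically generated for illustration and do not correspond to any real clinical model), a small neural-network classifier could be fit to historical cases and then used to score a new patient:

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Invented feature columns: [age, temperature_C, white_cell_count]; label 1 = disease present.
    rng = np.random.default_rng(0)
    X_history = rng.normal(loc=[55, 37.0, 8.0], scale=[15, 1.0, 3.0], size=(200, 3))
    y_history = (X_history[:, 1] > 37.8).astype(int)  # toy rule standing in for recorded outcomes

    # The "knowledge" is learned from past cases rather than hand-authored as IF-THEN rules.
    model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    model.fit(X_history, y_history)

    new_patient = np.array([[62, 38.4, 12.5]])
    print("Estimated probability of disease:", model.predict_proba(new_patient)[0, 1])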
3. Optimizing clinical care: As clinicians become accustomed to a CPOE system, they consider
ways of customizing it so that their work becomes easier and more effective. Not only
does this tailor the system to the user’s liking, but it could reduce the potential for violations
such as inappropriate testing. For example, at Vanderbilt University, users of a system called
WizOrder were encouraged to modify the program so that they could create Registry Orders
where billing information would be more easily transferred. The challenge, in this case, comes
from the need to improve the effectiveness of the system while maintaining usability. Thus, it
is generally left up to the user to design a system that is able to successfully balance these two
issues.
4. Providing just-in-time focused education relevant to patient care: Most CPOE systems
provide useful educational prompts and links to more detailed descriptions of their material,
with the interface designed in a manner that encourages their use. These can be used in
treatment summaries or through a corresponding web browser. Such links have the benefit of
assisting the clinician with more complex orders.
Benefits and Challenges—The benefits of CPOE systems are that they can improve
clinical productivity, provide solid educational support, and positively impact
how patient care is given. They also make order entry much easier for both the
clinician and the user, providing a computerized framework for placing orders.
Thus, issues such as sloppy handwriting are nonexistent, while typos may be
corrected through a built-in autocorrect feature. On the other hand, the manner in
which error checking is handled may result in orders being placed that contain
unidentified errors. This could be especially dangerous if the order happens to be
costly and critical to the patient’s survival. If there is an error in it, then whatever
money was spent on the order may be wasted. Worse yet, the patient’s life may be
in danger. Computerized order entry systems also have the disadvantage of relying
on an Internet-based framework, meaning occasional bad transmissions and server
problems are inevitable.
FIGURE 19.2: The scoring criteria for Mayo Clinic’s depression test. It explicitly states that it is not
meant to be used as a diagnostic tool.
An organization known as the Foundation for Informed Medical Decision Making (FIMDM) has
worked to expand upon the traditional diagnostic decision support process by focusing primarily on
treatment decisions that take into account the patient’s personal preferences in terms of health outcomes.
Specifically, they use video clips to depict the possible outcomes of each treatment, giving the patient
an idea of what the experiences relating to these outcomes will be like and better preparing the patient
for the clinical decision-making process. FIMDM provides tools for many diseases, ranging from breast
cancer to coronary artery disease. Offline CD-ROM-based software also exists for diagnostic decision
support. Interestingly, in some instances, such software actually provides deeper and more detailed
diagnostic information than what is available on the World Wide Web. For example, the American
Medical Association has the “Family Medical Guide.” This is a multilevel software package consisting
of seven different modules:
1. A listing of possible diseases, disorders, and conditions.
2. A map of the human body.