Introduction
Hepatic encephalopathy is one of the major complications of liver cirrhosis. It causes a continuum of brain and behavioral changes, which the two most widely recommended classifications, that of the International Society for Hepatic Encephalopathy and Nitrogen Metabolism (ISHEN) and the West Haven classification (WHC), divide into grades for practical purposes. ISHEN divides encephalopathy into overt and covert [1], while the WHC subdivides covert encephalopathy into minimal and stage 1, and overt encephalopathy into stages 2 to 4 [2] (Table 1). The mildest form, minimal hepatic encephalopathy (MHE), is defined by the 2014 guideline of the European Association for the Study of the Liver and the American Association for the Study of Liver Diseases (EASL/AASLD) as psychometric and neuropsychological alterations of tests exploring psychomotor speed, executive functions, or neurophysiological alterations without clinical evidence of mental change [3]. These “test-only” alterations are important because they predict the risk of overt hepatic encephalopathy [4], determine the prognosis (survival) [5], and affect the quality of life and health costs of the affected patient [5]. Therefore, the 2022 EASL guideline suggests screening all cirrhotic patients without signs of overt hepatic encephalopathy. For those with MHE, therapy with a non-absorbable disaccharide should be started [6].
Table 1
Staging of hepatic encephalopathy (with some changes from [3])
A large variety of tests are used to detect MHE. They are subdivided into two categories: neurophysiological and neuropsychological. The main neurophysiological tests are electroencephalography (EEG) and critical flicker frequency (CFF). The more widely used neuropsychological tests are the Psychometric Hepatic Encephalopathy Score (PHES), Continuous Reaction Time (CRT), the Inhibitory Control Test (ICT), the Stroop test, and the most recent one, the Animal Naming Test (ANT). Artificial intelligence (AI) is also finding its way into the diagnostics of hepatic encephalopathy, including MHE [7]. Some tests are rarely used nowadays, such as the SCAN test, RBANS (the Repeatable Battery for the Assessment of Neuropsychological Status), the Vienna Test System, and the Cogstate Research test battery.
The 2014 EASL/AASLD Guideline suggested using at least two tests to detect MHE: a combination of PHES and a computerized test system or a neurophysiological test [3]. The up-to-date EASL Guideline from 2022 recommends a single test for which experience and local norms are available, while suggesting further study of the ANT [6].
In brief, MHE is by definition determined only by test changes, yet there is no gold standard test. Current guidance recommends diagnosis with a single “appropriate” test, and the need for wider MHE screening and prophylaxis is growing. Consequently, the characteristics and comparison of the available tests are of utmost interest and importance.
Characteristics of the tests used to diagnose minimal hepatic encephalopathy
As there is no gold standard for the diagnosis of MHE, many authors have compared available tests, highlighting one or the other’s methodological advantages or disadvantages. The characteristics of the different tests are the place to start when choosing what to use. While some differences appear minor, others may be of importance.
Creation specificity. Some tests (PHES and ANT) have been created specifically for MHE detection. Other tests focus on mental health and are used in Alzheimer’s disease and other psychiatric or neurologic disorders (RBANS, Vienna Test System). In the case of the neurophysiological tests (EEG and CFF), a link is created between directly detected brain function and cognition deficits.
Neurocognitive domains. When comparing neuropsychological tests, one must keep in mind that the different tests focus on different neurocognitive domains: processing speed, working memory, cognitive attention, fine motor function, reaction time, visual or verbal anterograde memory and visuospatial ability. Many studies have been performed to determine what neurocognitive changes occur exactly in MHE. Attempts were made to rate the importance of the different neurocognitive alterations [8]. Nevertheless, all tests are considered equal if used appropriately, with the best possible experience, tools and local norms [6].
Computerized vs. pencil and paper. This characteristic applies only when comparing psychometric tests. The 2009 ISHEN guideline states that a pencil-and-paper test is more appropriate [8]. Tests of this type may be easier to administer, as the patient needs no computer or device skills. On the other hand, electronic devices are now widespread, and the ability to use them is almost universal. Computerized tests usually score results automatically, making them less time-consuming. Moreover, the data are captured electronically, eliminating the need for manual data entry before statistical analysis. The portability of mobile devices has made computerized tests suitable for bedside use. In computerized tests, alternate forms of a battery are easily loaded and even generated.
Translatability. Paper and pencil tests can be easily translated into various languages, needing nothing more than language knowledge. On the other hand, computerized tests require additional software knowledge, to translate the application itself.
Time to perform. Some tests are very time consuming. This makes them useful only in scientific research and not as a general screening tool. Such is the case with the Vienna test system and EEG [9].
Retest reliability/learning effect. The best test should be available for retesting in order to evaluate therapy or for follow-up. Paper and pencil tests, such as the PHES [10] and RBANS, have demonstrated a learning effect and therefore need several alternate forms [11].
Affecting factors: age, gender, education (illiteracy). Some tests require some level of education on the part of the test subject, be it knowledge of the alphabet, numbers, or other skills. Unfortunately, this renders such tests inappropriate for illiterate subjects. In some areas, this severely reduces the population that can be tested and, therefore, the results that can be acquired. In other cases, age or gender has to be taken into consideration.
Local norms. All the psychometric tests show slight differences when different populations are being analyzed. Therefore, in order to use them properly, normative reference data are needed [6]. Such validations on a national basis are greatly needed and, thankfully, are becoming more and more available. Revalidation in the same population after a certain amount of time (different generations) is also appropriate, as changes in educational and lifestyle variables will surely impact performance [12]. One must also keep in mind that, in a highly mobile modern society, population characteristics may change even faster.
Copyright. Almost all tests that do not use specific equipment are copyrighted. Thankfully, the copyright holders (the authors of the test) are usually keen on sharing the test. Nevertheless, in computerized tests, the specific computer application is also copyrighted. The patent for the app is usually the property of IT companies and, in some cases, a fee is charged.
Medical equipment. Neurophysiological tests such as CFF and EEG require specific medical equipment. While EEG machines are widely available and used, CFF devices are bought for the sole purpose of CFF testing.
Need for specific training. Neuropsychological tests can be administered by nurses or students after brief, appropriate training. Meanwhile, an EEG test and interpreting the results are highly specific tasks, requiring deep knowledge and commitment.
Description of the available minimal hepatic encephalopathy tests (Table 2)
Psychometric Hepatic Encephalopathy Score (PHES)
Psychometric Hepatic Encephalopathy Score is a battery that consists of several tests: Digit Symbol Test (DST), Number Connection Test A (NCT A), Number Connection Test B (NCT B), Serial Dotting Test (SDT), and Line Tracing Test (LTT). It was developed in 1989 specifically for the detection of MHE, after a rigorous selection among many psychometric tests for those best at identifying it [13]. The test focuses on multiple domains: processing speed, working memory, cognitive attention, visual scanning efficacy and perception, fine motor function and motor speed. It is a paper and pencil test that is easily translatable and requires around 15 minutes to perform. A learning effect has been found [10], and several alternate forms of the test have been developed to eliminate it [14]. Illiteracy is a barrier for the test, as it requires both numerical and alphabetic knowledge. Variations have been created by replacing letters with symbols [15], but this also changes the neurocognitive domains that are tested [12]. The result of the test is also affected by the patient’s age, education and social background. Therefore, local norms have a pivotal role. As it is one of the first tests for MHE and is easily translatable, local norms have been developed in many countries, including most of western Europe, the USA, Korea and more. The test is copyrighted and is the property of Hannover Medical University. It is distributed through one of its authors, Prof. Karin Weissenborn. PHES does not require any sophisticated equipment, but training and knowledge are needed to administer it properly.
Continuous Reaction Time (CRT)
Continuous Reaction Time is a computerized test that measures the motor reaction to auditory stimuli (500 Hz, 90 dB). Both the reaction time and its intra-individual variability are recorded. The 150 stimuli yield 150 reaction times, and the CRT index is calculated from their variation (a variation coefficient of the test reaction times). With overt HE, the reaction time increases. Subtle MHE changes, however, are better detected by an increase in reaction time variability, which the CRT index captures [16]. A CRT index of < 1.9 is diagnostic for MHE [9]. The CRT was developed as a test for encephalopathy (metabolic or organic) in the 1960s by Renzi, Bruhn and Parsons. Its use in hepatology started in the 1980s [16, 17]. Shortly thereafter, the loss of stability in the reaction time (measured by the CRT index) was noted, as well as its ability to detect subtle changes such as MHE. The test itself focuses on sustained attention, attention stability, motor reaction speed and inhibitory control [18]. It does not require any translation and is administered in 10 minutes [19]. CRT has no learning effect and can be administered to illiterate subjects. The CRT reaction time is affected by age, gender, and psychoactive drugs. The more important CRT index remains unaffected by these variables. Education is not a factor as long as subjects with severely low IQ are excluded [18]. CRT cannot be administered to patients with hearing disturbances. Local norms are available in Denmark, where the test was developed and is most widely used. The test is copyrighted. It requires a laptop (computer), simple software, headphones, and a handheld trigger button (EKHO, www.Bitmatic.com, Aarhus, Denmark). No sophisticated training is needed.
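The text describes the CRT index only as a variation coefficient of the 150 reaction times. The following is a minimal sketch of how such an index might be computed, assuming a commonly cited definition (the 50th percentile divided by the spread between the 10th and 90th percentiles), under which greater variability gives a lower index. The function names, the percentile method and the sample data are illustrative assumptions and do not reproduce the EKHO software.

```python
import statistics

CRT_INDEX_CUTOFF = 1.9  # cut-off quoted in the text [9]


def crt_index(reaction_times_ms):
    """ASSUMED definition: 50th percentile / (90th - 10th percentile)."""
    if len(reaction_times_ms) < 10:
        raise ValueError("need at least 10 reaction times")
    # quantiles(n=10) returns the nine cut points at 10%, 20%, ..., 90%
    deciles = statistics.quantiles(reaction_times_ms, n=10, method="inclusive")
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    spread = p90 - p10
    if spread == 0:
        raise ValueError("no variability in reaction times")
    return p50 / spread


def crt_is_abnormal(reaction_times_ms, cutoff=CRT_INDEX_CUTOFF):
    """True when the index falls below the cut-off (less stable reactions)."""
    return crt_index(reaction_times_ms) < cutoff


# Illustrative data only: 150 stable and 150 widely scattered reaction times
stable = [200 + (i % 5) * 5 for i in range(150)]
unstable = [150 + (i % 5) * 100 for i in range(150)]
print(crt_index(stable), crt_is_abnormal(stable))
print(crt_index(unstable), crt_is_abnormal(unstable))
```

Under this assumed definition, a subject with a normal median reaction time can still score below 1.9 if the reaction times are widely scattered, which is the pattern the text associates with MHE.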
Inhibitory Control Test (ICT)
Inhibitory Control Test is a computerized test, in which patients are shown random letters at a time interval of 500 milliseconds. The task is to respond when “X” is followed by “Y” or “Y” is followed by “X” (these sequences are called targets). At the same time, evaluated subjects should not react to “lures” such as “X” following “X” or “Y” following “Y”. A lure threshold of 5 was identified as diagnostic for MHE [20]. In India, ICT was compared to PHES as the gold standard. A cut-off value for diagnosing MHE was set at 14 lures [21]. Later, not only the lure threshold but also the number of correct responses (target accuracy) was suggested as a factor. It correlated even better with other tests [14, 22]. The new combined parameter was named Weighted Lures [22]. The test was initially created to assess cognitive deficits in patients with schizophrenia, ADHD (attention-deficit/hyperactivity disorder), and traumatic brain injury [23-25]. In 2007, it was introduced as a tool to detect MHE [20]. It focuses on cognitive domains such as attention, response inhibition and working memory [11]. It is computerized and easily translatable (as far as changes to software are easy) and takes around 15 minutes to perform. The test has excellent retest reliability [20, 21], but some authors suggest a learning effect [14, 22]. Education/illiteracy and age seem to have an impact [14, 22]. ICT has been validated in the USA, Italy and India [21, 22, 26]. The test itself is not copyrighted and was previously available at www.hecme.tv. The webpage is no longer functioning. Nevertheless, the software tool needed is distributed by the author who introduced ICT for MHE – Prof. Jasmohan Bajaj. No specific equipment (except a computer) or sophisticated training is needed, so medical assistants can easily perform the test [26].
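The target and lure rules described above can be sketched as a small scoring routine. This is an illustration only, assuming that "followed by" means the immediately preceding letter and that responses are recorded as the positions at which the subject reacted. The function name and data layout are hypothetical and are not taken from the distributed ICT software.

```python
def score_ict(letters, responded):
    """Count targets, lures and the subject's responses to each.

    letters   -- sequence of single-letter stimuli, in presentation order
    responded -- set of positions (0-based) at which the subject reacted
    """
    xy = {"X", "Y"}
    targets = lures = target_hits = lure_responses = 0
    for i in range(1, len(letters)):
        prev, cur = letters[i - 1], letters[i]
        if prev not in xy or cur not in xy:
            continue  # neither a target nor a lure
        if prev != cur:
            # "X" followed by "Y" or "Y" followed by "X": a target
            targets += 1
            if i in responded:
                target_hits += 1
        else:
            # "X" following "X" or "Y" following "Y": a lure
            lures += 1
            if i in responded:
                lure_responses += 1
    accuracy = target_hits / targets if targets else None
    return {
        "targets": targets,
        "lures": lures,
        "lure_responses": lure_responses,
        "target_accuracy": accuracy,
    }


# Illustrative run: the subject reacts at positions 2, 3 and 4
result = score_ict(list("AXYXXBYYX"), {2, 3, 4})
print(result)
```

The `lure_responses` count is the quantity compared against a lure cut-off such as those cited above, and `target_accuracy` is the second parameter that later work combined with it.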
Stroop test
The Stroop test (nowadays also computerized) is based on the Stroop effect. The latter is demonstrated by showing test subjects names of colors printed in ink of a different color, e.g., the word “blue” printed in green ink rather than blue ink. The subject must name the color of the ink and not the meaning of the word. Usually, whenever the word and ink do not correspond, more time is needed to perform the task. Whenever a mistake is made, the run starts again. The time needed to complete five correct runs with the Stroop effect on and off, as well as the number of tries, is registered. The Stroop test has been used to validate brain dysfunction in cirrhotics since 2005 [27]. Later, a smartphone application (EncephalApp) was created and has been more extensively used [28-30]. The Stroop test focuses on neurocognitive domains such as attention, processing speed and cognitive flexibility [9]. As it is a computerized test, the resultant parameters are collected directly into data sheets. Translating the test itself is not a problem, but translating a smartphone application demands software capabilities. The test takes 5-6 minutes to perform, depending on the speed of the tested subject. Retest reliability is high, although a learning effect has been found. The latter is more demonstrable in cirrhotic patients with MHE who have never experienced overt hepatic encephalopathy [28]. The test is affected by multiple factors: age, education and gender [28, 31]. It has different cut-offs for test subjects below and above 45 years of age. Local norms are available in multiple countries, including the USA, Korea, Brazil, and Germany [29, 32-34]. The test itself is not copyrighted, and the smartphone application is free to download. The EncephalApp, however, is patented, and changes to the app itself (e.g., adding a language) are charged for. The only equipment needed is a computer or a smartphone (either an Android or iOS device). A smartphone or a tablet can be used interchangeably, without affecting the results [29]. No sophisticated training is required, and medical assistants can easily perform the test. New personnel are able to quickly achieve high interobserver reliability compared to experienced personnel [28].
Animal Naming Test (ANT)
In the ANT, the test subject must name as many animals as they can in one minute. All repetitions and mistakes are eliminated. After an age and education adjustment, the simplified Animal Naming Test (s-ANT) was introduced [35]. This test is suggested as a fast screening method for identifying cirrhotic patients who require further MHE testing [35, 36]. ANT has been used to standardize semantic verbal fluency since the 1990s. In 2017, it was introduced as a tool to detect MHE. The semantic fluency that ANT evaluates relies, at least partly, on neurocognitive domains connected to hepatic encephalopathy. Such domains are working memory, sustained attention and partly inhibition [35]. It is neither a computerized nor a paper and pencil test. It is the only test that can be performed via a phone call [11]. Translation is not needed, as long as the tested subject and the tester speak the same language. ANT, by default, takes exactly 1 minute to perform, making it the fastest of all test modalities. It has a slight learning effect that peaks after its second use. Age and education are affecting factors, hence the corrections introduced in the s-ANT. Those are made for people with less than 8 years of education or over 80 years of age [35]. Local norms are available in a growing number of countries: Italy, Germany, India, China and Taiwan [35-39]. The test is not copyrighted, and no medical equipment or previous training is required to perform it properly.
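The scoring rule described above (count the animals named, after eliminating repetitions and mistakes) can be sketched as follows. This is a minimal illustration: in practice the assessor judges which answers are valid, so the `valid_animals` set and the function name are hypothetical. The s-ANT corrections for age and education are deliberately left out because the text does not state their size.

```python
def ant_score(responses, valid_animals):
    """Count distinct valid animal names, ignoring case and repetitions."""
    seen = set()
    for word in responses:
        name = word.strip().lower()
        if name in valid_animals:
            seen.add(name)  # a set keeps each animal only once
    return len(seen)


# Illustrative answers: one repetition ("dog") and one mistake ("table")
answers = ["Dog", "cat", "dog", "table", " Horse "]
print(ant_score(answers, {"dog", "cat", "horse", "cow"}))
```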
Critical flicker frequency (CFF)
In CFF, the patient sits in a dark, quiet room and is shown light stimuli of decreasing frequency. The task is to press a button when the light changes from constant to blinking (flickering). The frequency at which the patient perceives the change is recorded. After a test run, the test is performed eight times, and the average result is used. A threshold of 39 or 38 Hz is used for the detection of overt HE [40, 41]. For MHE, however, sensitivity and specificity decline. Therefore, some authors suggest using it not as a first-line diagnostic tool [42] but as an additional one. In a meta-analysis, the sensitivity was 61% and the specificity was 79% [43]. The test was initially used in psychophysiology to assess the effect of drugs on the central nervous system [42] and in ophthalmology to diagnose changes in the optic nerve [11]. In 2002, it was introduced as a tool for detecting MHE [40]. CFF evaluates two neurocognitive domains: visual discrimination ability and general arousal [40]. The test is fully computerized and requires no translation. It takes 10 to 20 minutes [11, 44] and has no learning effect [14]. CFF is not affected by variables such as gender, level of education or time of the day [40], but to some extent changes with age [14, 15]. Nevertheless, patients with very low IQ are incapable of completing the test and producing accurate results. The patient needs to be highly functioning and have intact binocular vision. Some of the machines use red light, so they cannot be used by individuals who are red-green colorblind. Patients with cirrhosis of alcoholic etiology also had lower results than those with cirrhosis of other etiologies [14]. Local norms are available in Germany, India, Mexico, and Spain [40, 41, 45, 46]. The test is not copyrighted. It requires specific, expensive medical equipment, the most popular being the different versions of the HEPAtonorm Analyzer.
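Two pieces of arithmetic in the paragraph above can be made concrete: averaging the eight measurements against a threshold, and what a sensitivity of 61% and a specificity of 79% imply for an individual result. The sketch below is illustrative only. It assumes that a mean below the threshold counts as abnormal, and the 30% prevalence in the usage lines is a purely hypothetical figure, not a value from the cited studies.

```python
from statistics import mean


def cff_mean_hz(runs_hz, expected_runs=8):
    """Average the recorded flicker frequencies (eight runs per the text)."""
    if len(runs_hz) != expected_runs:
        raise ValueError(f"expected {expected_runs} runs, got {len(runs_hz)}")
    return mean(runs_hz)


def cff_is_abnormal(runs_hz, threshold_hz=39.0):
    """ASSUMPTION: a mean below the threshold is read as abnormal."""
    return cff_mean_hz(runs_hz) < threshold_hz


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


runs = [38.0, 38.5, 37.5, 39.0, 38.0, 38.5, 37.5, 39.0]  # illustrative values
print(cff_mean_hz(runs), cff_is_abnormal(runs))

# Hypothetical prevalence of 30%, chosen only to show the calculation
ppv, npv = predictive_values(0.61, 0.79, 0.30)
print(round(ppv, 3), round(npv, 3))
```

At any realistic prevalence, the modest sensitivity leaves a sizeable share of MHE cases undetected, which is consistent with the suggestion to use CFF as an additional rather than a first-line tool.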
Electroencephalography (EEG)
Electroencephalography provides a way of exploring a patient’s brain activity without any need for cooperation or patient engagement. In the 1950s, slow waves were observed in the frontal regions of patients with overt hepatic encephalopathy [47]. Later, in patients with hepatic encephalopathy progressing to coma, a generalized slowing of background EEG mean dominant frequency was described, with alpha rhythm being replaced by theta waves and then by delta waves [48]. In fact, such changes (to a mild extent) are also seen with the mild cognitive impairment of MHE: the percentage of theta activity in a posterior lead is in excess of 35% [49]. Thresholds for the detection of any degree of hepatic encephalopathy using the relative power of delta and theta waves were developed [48] and later improved [50]. EEG does not evaluate neurocognitive domains but rather cortical brain activity. The test does not require patient cooperation and is therefore not affected by factors such as the need for translation, age, education, learning effect or local norms. EEG testing is a time-consuming procedure, even with the use of computer-assisted analysis. There is no copyright, but the procedure and interpretation are not easy to master. Therefore, it is hardly accessible to clinicians without a specific interest. The helping hand of an experienced physician is usually needed and recommended [51]. Professional, high-quality EEG equipment is a heavy financial burden, but newer, low-cost wireless headsets are becoming available.
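The theta criterion mentioned above (theta activity in a posterior lead exceeding 35%) amounts to a relative band power calculation. The sketch below assumes that a power spectrum for the lead has already been computed, that theta spans the conventional 4 to 8 Hz, and that the denominator is the total power of all supplied frequency bins. These choices, the function names and the sample spectrum are illustrative assumptions, and the spectral thresholds developed in [48, 50] may be defined differently.

```python
THETA_BAND_HZ = (4.0, 8.0)  # ASSUMED conventional theta band


def relative_band_power(spectrum, low_hz, high_hz):
    """Fraction of total power in bins with low_hz <= frequency < high_hz.

    spectrum -- iterable of (frequency_hz, power) pairs for one EEG lead
    """
    bins = list(spectrum)
    total = sum(power for _, power in bins)
    if total <= 0:
        raise ValueError("spectrum has no power")
    band = sum(power for freq, power in bins if low_hz <= freq < high_hz)
    return band / total


def theta_exceeds(spectrum, limit=0.35):
    """True when relative theta power is above the limit quoted in the text."""
    return relative_band_power(spectrum, *THETA_BAND_HZ) > limit


# Illustrative spectrum for a posterior lead: (frequency in Hz, power)
posterior = [(2, 10.0), (5, 30.0), (7, 10.0), (10, 40.0), (20, 10.0)]
print(relative_band_power(posterior, *THETA_BAND_HZ), theta_exceeds(posterior))
```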
Artificial intelligence (AI) and hepatic encephalopathy
Artificial intelligence has proved useful in many medical specialties, including hepatology. In the case of hepatic encephalopathy, AI is applied in several ways: 1) it can learn to detect subtle changes in brain imaging (especially MRI and functional MRI). Differences in the MRI of cirrhotic patients with and without MHE have been described in several studies, and AI can assist in the discrimination [52, 53]; 2) it can learn to detect EEG changes; 3) it can be used to evaluate a patient’s speech and handwriting. AI-powered natural language processing has revealed that severely ill cirrhotic individuals tend to use shortened sentences with fewer words [54]. AI evaluation of speech and handwriting is already used in the diagnosis of neurological disorders such as Alzheimer’s disease and Parkinson’s disease, and it may prove useful for hepatic encephalopathy as well [7].
Comparing the tests for minimal hepatic encephalopathy
“A remarkable finding is that, as yet there is not a standardized, universally accepted diagnostic tool for subclinical hepatic encephalopathy” – a quote from an article by Prof. Amodio in 1996 [49] that remains absolutely true today. In fact, over the years, there has been a constant dispute as to which test is “better” at diagnosing MHE. All this started with the paper and pencil tests being compared with one another (initially the PHES and RBANS) and the EEG. Later, CFF, computerized tests, and all the other modalities appeared. The last is ANT, recognized as a tool in 2017. Each test has its advantages and disadvantages, rigorously cited by both creators/supporters and sceptics when comparing characteristics and results (Table 3) [10, 14, 15, 19, 21, 22, 28, 30, 32, 34-38, 41, 44-46, 55-61].
Table 3
Comparing MHE tests
Test | PHES | CRT | ICT | Stroop | ANT | EEG | CFF |
---|---|---|---|---|---|---|---|
PHES | | [19, 56] | [14, 21, 22, 55, 57, 58] | [28, 30, 32, 34, 55] | [35-38, 59] | [10, 60, 61] | [14, 15, 41, 45, 46, 58, 61] |
CRT | [19, 56] | | | | | | [44] |
ICT | [14, 21, 22, 55, 57, 58] | | | [28, 30, 55] | | [22] | [14, 58] |
Stroop | [28, 30, 32, 34, 55] | | [28, 30, 55] | | | | [34] |
ANT | [35-38, 59] | | | | | [35] | [36] |
EEG | [10, 60, 61] | | [22] | | [35] | | [45, 61] |
CFF | [14, 15, 41, 45, 46, 58, 61] | [44] | [14, 58] | [34] | [36] | [45, 61] | |
[i] Numbers correspond to references comparing the two tests. Font type corresponds to approximate accordance between the two tests: underlined font for high accordance (above 80%), bold font for moderate accordance (60-80%), and italic font for low accordance (below 60%); normal font indicates insufficient data.
The previous EASL/AASLD guideline from 2014 [3] suggested that a two-test combination should be used to diagnose MHE. It supported PHES as one of the tests, with the other being either a computerized or a neurophysiological one. This combined approach aimed at higher specificity for MHE diagnosis, but published reports point to a different outcome [55]. Agreement between the tests is not high, so requiring at least two abnormal results excludes many patients who have a single abnormal test. The ultimate aim, however, is not just cognition evaluation but also outcome stratification: overt HE risk, quality of life, risk of rehospitalization, risk of death, economic burden. When single and combined testing were compared for the prediction of outcomes, single tests proved noninferior to combined testing [55]. Other authors suggest that a single test is sufficient to diagnose MHE, but that combined testing can stratify the group even further: MHE patients with one abnormal test vs. MHE patients with two abnormal tests. Such stratification proved useful because, while both groups were at risk of negative outcomes, the two-test group had a higher risk [62].
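The difference between the two-test rule and single-test diagnosis with stratification can be illustrated with a short sketch. The function names and group labels are hypothetical, and the example only restates the logic described above rather than any validated algorithm.

```python
def two_test_rule(abnormal_flags):
    """2014-style rule: MHE only when at least two tests are abnormal."""
    return sum(bool(flag) for flag in abnormal_flags) >= 2


def stratify(abnormal_flags):
    """Single-test diagnosis, with combined testing used to stratify risk."""
    abnormal = sum(bool(flag) for flag in abnormal_flags)
    if abnormal == 0:
        return "no MHE detected"
    if abnormal == 1:
        return "MHE, one abnormal test"
    return "MHE, two abnormal tests (higher risk)"


# Illustrative patient: first test abnormal, second test normal
patient = [True, False]
print(two_test_rule(patient))  # not counted as MHE under the two-test rule
print(stratify(patient))       # still identified as MHE, lower-risk group
```

A patient with a single abnormal test is missed by the first function but retained by the second, which is the group the text describes as being dropped under the two-test approach.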
The 2022 EASL Guideline takes a step back to single testing [6], with an emphasis on available local norms and experience. In fact, all the tests yield valid results. MHE affects many cognitive domains, and there is no single test available to detect every single cognitive change. Different tests detect different parts of the whole brain-alteration picture, and all are valid in diagnosing MHE. One must also keep in mind that the “norms” for the different tests were standardized in different ways. For some (most often PHES), alterations are found by comparison with an appropriate group of healthy individuals. By contrast, the cut-offs for other tests were chosen by comparison with PHES or another modality.
Conclusions
There is no gold standard for diagnosing MHE. All the tests are valid, as they evaluate different cognitive domains. By analogy with music: a distorted musical composition can have alterations in rhythm, pitch, tempo, dynamics, or timbre, with any one of these signifying a problem. Nowadays, it is more important to start testing and to test everyone in need.