

Construct Validity of Six Sentiment Analysis Methods in the Text of Encounter Notes of Patients with Critical Illness.

Weissman GE, Ungar LH, Harhay MO, Courtright KR, Halpern SD


December 13th, 2018

Appears In

Journal of Biomedical Informatics



Sentiment analysis may offer insights into patient outcomes through the subjective expressions made by clinicians in the text of encounter notes. We analyzed the predictive, concurrent, convergent, and content validity of six sentiment methods in a sample of 793,725 multidisciplinary clinical notes from 41,283 hospitalizations associated with an intensive care unit stay. None of these approaches improved early prediction of in-hospital mortality when added to logistic regression models, but they did improve both discrimination and calibration when added to random forest models. Additionally, positive sentiment measured by the CoreNLP (OR 0.04, 95% CI 0.002 - 0.55), Pattern (OR 0.09, 95% CI 0.04 - 0.17), sentimentr (OR 0.37, 95% CI 0.25 - 0.63), and Opinion (OR 0.25, 95% CI 0.07 - 0.89) methods was inversely associated with death on the concurrent day after adjustment for demographic characteristics and illness severity. Median daily lexical coverage ranged from 5.4% to 20.1%. Although sentiment scores from all methods were positively correlated, agreement between methods was weak. Sentiment analysis holds promise for clinical applications but will require a novel domain-specific method applicable to clinical text.
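
To make the lexical coverage figure concrete, the sketch below computes coverage as the share of word tokens in a note that appear in a sentiment lexicon. This definition, the tokenizer, the function name `lexical_coverage`, and the toy lexicon and note are all illustrative assumptions; they are not taken from the study and do not reproduce its lexicons or preprocessing.

```python
import re


def lexical_coverage(text, lexicon):
    """Return the fraction of word tokens in `text` found in `lexicon`.

    Assumed definition for illustration: matched tokens / all tokens.
    """
    # Simple lowercase word tokenizer (an assumption, not the study's method).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    matched = sum(1 for tok in tokens if tok in lexicon)
    return matched / len(tokens)


# Hypothetical toy lexicon and note, for illustration only.
toy_lexicon = {"stable", "improving", "worse", "poor", "comfortable"}
note = "Patient stable overnight, breathing comfortable, poor appetite."

coverage = lexical_coverage(note, toy_lexicon)
print(f"Lexical coverage: {coverage:.1%}")
```

Under this assumed definition, low coverage means that most words in a note carry no sentiment score in the lexicon, which is consistent with the abstract's conclusion that general-purpose methods fit clinical text poorly.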

