Detection of medications associated with Alzheimer's disease using ensemble methods and cooperative game theory
Citation: Braithwaite, B., Paananen, J., Taipale, H., Tanskanen, A., Tiihonen, J., Hartikainen, S., Tolppanen, A-M. (2020). Detection of medications associated with Alzheimer's disease using ensemble methods and cooperative game theory. International Journal of Medical Informatics, 141, 104142. doi:10.1016/j.ijmedinf.2020.104142.
To study the feasibility of evaluating feature importance with Shapley Values and ensemble methods in the context of pharmacoepidemiology and medication safety.
We detected medications associated with Alzheimer's disease (AD) by examining additive feature attribution with a combined approach of Gradient Boosting and Shapley Values in the Medication use and Alzheimer's disease (MEDALZ) study, a nested case-control study of 70,719 verified AD cases in Finland. Our methodological approach was to perform supervised binary classification using Gradient Boosting (an ensemble of weak classifiers) and then apply Shapley Values (from cooperative game theory) to analyze how feature combinations affect the classification result. Medication use in a time window from five years to one year before AD diagnosis was ascertained from the Prescription register.
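To illustrate the idea of Shapley Values as an additive feature attribution, the following is a minimal sketch that computes exact Shapley Values for a toy scoring function. It is not the implementation used in the study: the function `toy_score`, the feature names, and the numeric weights are hypothetical and chosen only so the result can be checked by hand. In a real analysis, the output of a fitted gradient boosting classifier would take the place of `toy_score`, and an approximation would usually be needed because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley Values for a set function `value` over `features`.

    Each feature's value is its marginal contribution to `value`,
    averaged over all subsets of the other features with the
    standard Shapley weights.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the subset that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical model output for a set of 'present' features.

    The weights are illustrative assumptions, not study results.
    """
    score = 0.0
    if "antipsychotic" in present:
        score += 0.30
    if "antidepressant" in present:
        score += 0.20
    # Interaction term: extra score only when both are present.
    if "antipsychotic" in present and "antidepressant" in present:
        score += 0.10
    return score


# Hypothetical feature names; "statin" has no effect in this toy model.
FEATURES = ["antipsychotic", "antidepressant", "statin"]
phi = shapley_values(FEATURES, toy_score)

for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy example the interaction term is split equally between the two interacting features, the feature with no effect receives zero, and the attributions sum to the difference between the score with all features present and the score with none. That additivity is what makes the attribution "additive".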
Low- or medium-dose antipsychotics, medium- to high-dose antidepressants, and medium- to high-dose cardiovascular medications were identified as contributing features for separating AD cases from controls. A medium to high amount of irregularity in the purchase pattern was also an indicative feature for separating AD cases from controls. The similarity of medication purchases between AD cases and controls made the feature evaluation challenging.
The combined approach of Gradient Boosting and feature evaluation with Shapley Values identified features that were consistent with findings from previous hypothesis-driven studies. Additionally, the results from the additive feature attribution identified new candidates for future studies on AD risk factors. Our approach also shows promise for observational studies in which feature identification and feature interactions in populations are of interest, and it demonstrates the applicability of Shapley Values for evaluating feature relevance in pattern recognition tasks.