Friday, August 21, 2020

Generating biomedical scientific hypotheses with AGATHA

 Accepted paper in the 29TH ACM International Conference on Information and Knowledge Management (CIKM)

Sybrandt, Tyagin, Shtutman, Safro "AGATHA: Automatic Graph-mining and Transformer based Hypothesis Generation Approach", preprint at http://arxiv.org/pdf/2002.05635.pdf

Medical research is risky and expensive. Drug discovery requires researchers to efficiently winnow thousands of potential targets to a small candidate set. However, scientists spend significant time and money long before seeing the intermediate results that ultimately determine this smaller set. Hypothesis generation systems address this challenge by mining the wealth of publicly available scientific information to predict plausible research directions. We present AGATHA, a deep-learning hypothesis generation system that learns a data-driven ranking criteria to recommend new biomedical connections. We massively validate our system with a temporal holdout wherein we predict connections first introduced after 2015 using data published beforehand. We additionally explore biomedical sub-domains, and demonstrate AGATHA's predictive capacity across the twenty most popular relationship types. Furthermore, we perform an ablation study to examine the aspects of our semantic network that most contribute to recommendation quality. Overall, AGATHA achieves best-in-class recommendation quality when compared to other hypothesis generation systems built to predict across all available biomedical literature. Reproducibility: All code, experimental data, and pre-trained models are available online: sybrandt.com/2020/agatha.

No comments:

Post a Comment

What we do/Team/In news

Quantum Computing     Quantum computers are expected to accelerate scientific discovery spanning many different areas such as medicine, AI, ...