Application of interpretable group-embedded graph neural networks for pure compound properties

Published in Computers & Chemical Engineering, 2023

Quantitative structure–property relationships (QSPRs) are important tools for facilitating and accelerating the discovery of compounds with desired properties. While many QSPRs have been developed, they suffer from shortcomings such as limited generalizability and modest accuracy. Although various machine-learning and deep-learning techniques have been integrated into such models, this has introduced a further shortcoming: a lack of transparency and interpretability. In this work, two interpretable graph neural network (GNN) models, attentive group-contribution (AGC) and group-contribution-based graph attention (GroupGAT), are developed by building the concept of group contributions (GC) into the model. Interpretability is obtained through the attention mechanism, which highlights the substructures that receive the highest attention weights in the latent representation of a molecule. The proposed models outperformed classical group-contribution models, as well as various other GNN models, in describing the aqueous solubility, melting point, and enthalpies of formation, combustion, and fusion of organic compounds. The insights provided are consistent with those obtained from semiempirical GC models, confirming that the proposed framework can highlight the substructures of a molecule that are important for a specific property.
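To make the attention-based interpretability concrete, the following is a minimal sketch, not the AGC or GroupGAT implementation from the paper. It assumes a molecule has already been fragmented into groups, that each group has an embedding vector, and that a learned scoring vector is available. The function names, group embeddings, and scoring vector are all hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(group_embeddings, scoring_vector):
    """Attention-weighted sum of group embeddings.

    Each group is scored by a dot product with a (hypothetical) learned
    scoring vector; the softmax of the scores gives the attention weights.
    """
    scores = [
        sum(h * w for h, w in zip(emb, scoring_vector))
        for emb in group_embeddings
    ]
    weights = softmax(scores)
    dim = len(scoring_vector)
    molecule = [
        sum(a * emb[d] for a, emb in zip(weights, group_embeddings))
        for d in range(dim)
    ]
    return weights, molecule


def most_attended_group(group_names, weights):
    """Return the name of the group with the largest attention weight."""
    return max(zip(group_names, weights), key=lambda pair: pair[1])[0]


# Toy example: ethanol split into three groups. The embedding values are
# made up for illustration; they are not taken from the paper.
groups = ["CH3", "CH2", "OH"]
embeddings = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8]]
scoring = [1.0, 1.0]

weights, molecule = attention_readout(embeddings, scoring)
top = most_attended_group(groups, weights)
```

In this toy setting the attention weights sum to one, and the group with the largest weight is the one reported as most important for the predicted property. In a trained model, the embeddings and the scoring parameters would be learned from data rather than set by hand.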

Recommended citation: Aouichaoui, A. R., Fan, F., Abildskov, J., & Sin, G. (2023). Application of interpretable group-embedded graph neural networks for pure compound properties. Computers & Chemical Engineering, 176, 108291. https://doi.org/10.1016/j.compchemeng.2023.108291