Comparison of group-contribution and machine learning-based property prediction models with uncertainty quantification
Published in Computer Aided Chemical Engineering, 2021
This study demonstrates the development of three modeling approaches for predicting thermophysical properties with the ability to quantify the uncertainty in the prediction. The modeling approaches consist of a classical non-linear group-contribution (GC) model (GCM), Gaussian process regression (GPR), and a deep neural network (DNN), all applied to the first-order groups defined by Marrero and Gani as the molecular descriptor. The uncertainty was quantified using a different method for each model: linear error propagation using the parameter covariance matrix for the GCM, the inherent uncertainty quantification of GPR models, and a probabilistic layer able to learn the distribution of model outputs in the DNN. The models have been applied to the lower flammability limit (LFL) at 298 K. The model performance was evaluated using 5-fold cross-validation, a procedure frequently used within machine learning, to ensure the models were exposed to all data and to detect potential overfitting. The models obtained produce a good fit to the experimental data when applied to all available data, with a coefficient of determination (R²) above 0.9 for all models, a maximum mean absolute error of 0.39 [%-vol], and a maximum mean squared error of 0.51.
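As a rough illustration of the linear error propagation mentioned above, the sketch below propagates a parameter covariance matrix through a non-linear model to obtain a standard deviation for each prediction, using Var(y) ≈ J Σ Jᵀ with J the Jacobian of the model output with respect to the parameters. Everything specific here is an assumption made for illustration: the exponential model form `gc_model`, the group-count matrix `X`, the parameter vector `theta`, and the covariance `cov_theta` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gc_model(theta, X):
    """Hypothetical non-linear GC model: property = exp(sum of group contributions).

    X holds group occurrence counts (one row per molecule), theta the group
    contributions. The functional form is an assumption, not the paper's model.
    """
    return np.exp(X @ theta)


def jacobian(theta, X, eps=1e-6):
    """Central finite-difference Jacobian of the model output w.r.t. theta."""
    J = np.empty((X.shape[0], theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (gc_model(theta + step, X) - gc_model(theta - step, X)) / (2 * eps)
    return J


def prediction_std(theta, cov_theta, X):
    """First-order (linear) error propagation: var_i = J_i @ cov_theta @ J_i^T."""
    J = jacobian(theta, X)
    var = np.einsum("ij,jk,ik->i", J, cov_theta, J)
    return np.sqrt(var)


# Illustrative numbers only (two molecules, three first-order groups).
X = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])
theta = np.array([0.10, -0.20, 0.05])
cov_theta = np.diag([1e-4, 4e-4, 1e-4])

y_hat = gc_model(theta, X)
y_std = prediction_std(theta, cov_theta, X)
print(y_hat, y_std)
```

In practice the covariance matrix would come from the parameter regression itself, and a confidence interval would typically be formed by scaling `y_std` with a Student t quantile; both steps are omitted in this sketch.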
Recommended citation: Aouichaoui, A. R., Al, R., Abildskov, J., & Sin, G. (2021). Comparison of group-contribution and machine learning-based property prediction models with uncertainty quantification. In Computer Aided Chemical Engineering (Vol. 50, pp. 755-760). Elsevier. https://doi.org/10.1016/B978-0-323-88506-5.50118-2