Application of Outlier Treatment Towards Improved Property Prediction Models

Published in Computer Aided Chemical Engineering, 2022

Property prediction models based on quantitative structure-property relationships (QSPR), such as group contribution models, are important tools that provide quick, simple, and low-cost estimates of the thermophysical properties of chemicals for applications such as P-V-T calculations and product design. These models depend heavily on the interplay between the chosen descriptor (molecular information), the chosen mathematical formulation (which relates the descriptor to the target property), and the data used to develop the model. Consequently, model performance suffers if the experimental data are of low quality (inaccurate) or if there are discrepancies in the descriptors or the mathematical representation. In this work, we apply a systematic methodology to detect and treat outliers for 18 thermophysical properties and showcase the resulting model improvements across several statistical metrics. The treatment yields significant improvements across all property models, illustrated by an increase in the coefficient of determination (R²) and decreases in the standard deviation (σ) and the mean absolute error (MAE).
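
As a rough illustration of the three metrics named above, the following minimal Python sketch computes R², σ, and MAE before and after discarding flagged points. The abstract does not state which detection rule the authors use, so the robust z-score on residuals (median and median absolute deviation, threshold 3.5) is only a stand-in assumption. The function names `metrics` and `flag_outliers` and the data are hypothetical, and σ is assumed here to be the sample standard deviation of the residuals. The sketch also re-evaluates fixed predictions rather than refitting a model on the cleaned data.

```python
import math
import statistics


def metrics(y_true, y_pred):
    """Return R^2, residual standard deviation (sigma), and MAE."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mean_res = sum(residuals) / n
    # Assumption: sigma is the sample standard deviation of the residuals.
    sigma = math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (n - 1))
    mae = sum(abs(r) for r in residuals) / n
    return {"r2": 1 - ss_res / ss_tot, "sigma": sigma, "mae": mae}


def flag_outliers(y_true, y_pred, threshold=3.5):
    """Stand-in rule (not the paper's): robust z-score on residuals."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    if mad == 0:
        return []  # no spread to scale by, so nothing is flagged
    return [i for i, r in enumerate(residuals)
            if 0.6745 * abs(r - med) / mad > threshold]


# Hypothetical data: one prediction (index 5) is far from its measurement.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.1, 9.0, 7.1, 7.9]

before = metrics(y_true, y_pred)
flagged = set(flag_outliers(y_true, y_pred))
kept_true = [t for i, t in enumerate(y_true) if i not in flagged]
kept_pred = [p for i, p in enumerate(y_pred) if i not in flagged]
after = metrics(kept_true, kept_pred)

print("before:", before)
print("after: ", after)
```

In an actual study the model would typically be refitted after treatment, so the change in the metrics reflects both the cleaner evaluation set and the re-estimated parameters.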

Recommended citation: Aouichaoui, A. R., Mansouri, S. S., Abildskov, J., & Sin, G. (2022). Application of Outlier Treatment Towards Improved Property Prediction Models. In Computer Aided Chemical Engineering (Vol. 51, pp. 1357-1362). Elsevier. https://doi.org/10.1016/B978-0-323-95879-0.50227-7