Evaluating the performances of quantitative structure-retention relationship models with different sets of molecular descriptors and databases for high-performance liquid chromatography predictions

Article

cited authors

  • Wang, C; Skibic, MJ; Higgs, RE; Watson, IA; Bui, H; Wang, J; Cintron, JM

abstract

  • Quantitative structure-retention relationship (QSRR) models were studied for two databases: one with 151 compounds and the other with 1719 compounds. In both cases, the three modeling methods employed (multiple linear regression, partial least squares, and random forests) provided similar prediction results with regard to root-mean-square error of prediction. The seven molecular descriptors related to reversed-phase retention provided better models for the smaller dataset, while the use of over 2000 molecular descriptors generated better models for the larger dataset. The QSRR models were then validated with a mixture of an active pharmaceutical ingredient and its four process/degradation impurities. Finally, classification of compounds based on similar log D profiles before QSRR modeling improved chromatographic predictability for the models used. The results showed that database composition had a desirable effect on prediction accuracy for certain input molecules. © 2009 Elsevier B.V. All rights reserved.
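
The abstract compares models by root-mean-square error of prediction (RMSEP). As a minimal sketch of that metric only, and not the authors' code, the snippet below computes RMSEP for two sets of predictions. The function name `rmsep`, the variable names, and the retention times are all hypothetical and chosen purely for illustration.

```python
import math


def rmsep(observed, predicted):
    """Root-mean-square error of prediction over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_error / len(observed))


# Hypothetical retention times in minutes (illustrative, not from the paper).
observed = [2.0, 4.0, 6.0, 8.0]
model_a = [2.5, 3.5, 6.5, 7.5]   # every prediction off by 0.5 min
model_b = [2.0, 4.0, 6.0, 10.0]  # one prediction off by 2.0 min

print(rmsep(observed, model_a))
print(rmsep(observed, model_b))
```

Because the errors are squared before averaging, a single large miss (as in `model_b`) raises RMSEP more than several small ones, which is worth keeping in mind when comparing models on this metric.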

publication date

  • June 19, 2009

Digital Object Identifier (DOI)

start page

  • 5030

end page

  • 5038


volume

  • 1216


issue

  • 25