Cluster Analysis as a Strategy of Grouping to Construct Goodness-of-Fit Tests when Continuous Covariates Are Present in the Logistic Regression Model
Jassim N. Hussain *
Department of Statistics, Faculty of Administration and Economics, University of Karbala, Karbala, Iraq.
Atheer J. Nassir
Department of Clinical Pharmacy, Faculty of Pharmacy, Pharmacy Building (A15), Camperdown Campus, University of Sydney, Sydney, NSW, 2006, Australia.
*Author to whom correspondence should be addressed.
Abstract
When continuous covariates are present, the classical Pearson and deviance goodness-of-fit tests for assessing logistic model fit break down. Several goodness-of-fit (GOF) tests, such as the Hosmer–Lemeshow test, can be used in these situations. Although the Hosmer–Lemeshow test is simple to perform and widely used, it lacks desirable power in many cases and provides no further information on the source of any detected lack of fit. We propose a new strategy of grouping, based on a very general partitional clustering in the covariate space, to construct two goodness-of-fit test statistics. Many simulation studies are conducted and a clinical data set is analyzed to examine the performance of the proposed strategy of grouping and the resulting GOF test statistics. The results show that the proposed strategy of grouping and the GOF test statistics based on it have potential for use in practice to assess the adequacy of the logistic regression model.
Keywords: Continuous covariates, cluster analysis, goodness-of-fit test, logistic regression, strategy of grouping.
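To make the grouping idea in the abstract concrete, the following is a minimal sketch rather than the authors' procedure. It assumes plain k-means as the partitional clustering method and a Hosmer–Lemeshow-type Pearson statistic computed over the resulting clusters; the two test statistics proposed in the paper are not specified in the abstract, so this statistic is only a stand-in. The function names kmeans_labels and grouped_gof_statistic, and the deterministic choice of starting centers, are hypothetical. The fitted probabilities p_hat are assumed to come from a logistic regression model fitted elsewhere.

```python
def kmeans_labels(X, k, iters=100):
    """Plain Lloyd's k-means on a list of covariate vectors.

    Returns one cluster label (0..k-1) per observation.
    Assumption: a simple deterministic start is used instead of random
    initialization, so that the sketch is reproducible.
    """
    n = len(X)
    # Starting centers: k rows evenly spaced through the lexicographically sorted data.
    order = sorted(range(n), key=lambda i: X[i])
    centers = [list(X[order[(2 * j + 1) * n // (2 * k)]]) for j in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
            for x in X
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members (empty clusters keep their center).
        for c in range(k):
            members = [X[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def grouped_gof_statistic(y, p_hat, labels):
    """Hosmer-Lemeshow-type statistic over arbitrary groups.

    Sums (O_g - E_g)^2 / (n_g * pbar_g * (1 - pbar_g)) over groups g, where
    O_g is the observed number of events, E_g the sum of fitted probabilities,
    and pbar_g = E_g / n_g. Assumes 0 < pbar_g < 1 in every group.
    """
    stat = 0.0
    for g in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == g]
        n_g = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p_hat[i] for i in idx)
        p_bar = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))
    return stat


# Illustrative usage with made-up numbers (not data from the paper).
X_demo = [[0.0, 1.0], [0.2, 0.9], [5.0, 4.8], [5.1, 5.2]]
y_demo = [0, 1, 1, 1]
p_demo = [0.3, 0.4, 0.8, 0.9]
groups = kmeans_labels(X_demo, k=2)
statistic = grouped_gof_statistic(y_demo, p_demo, groups)
```

Because k-means depends on the scale of the covariates, standardizing them before clustering would be a natural precaution in practice. The reference distribution of such a statistic under cluster-based grouping is not given in the abstract, so no p-value is computed here.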