What Do We Choose When We Err? Model Selection and Testing for Misspecified Logistic Regression Revisited

Jan Mielniczuk, Paweł Roman Teisseyre


The problem of fitting a logistic regression to a binary model, allowing for misspecification of the response function, is reconsidered. We introduce a two-stage procedure which consists first in ordering predictors with respect to the deviances of the models with the predictor in question omitted, and then in choosing the minimizer of the Generalized Information Criterion in the resulting nested family of models. This allows a large number of potential predictors to be considered, in contrast to an exhaustive method. We prove that the procedure consistently chooses the model t∗ which is closest in the averaged Kullback-Leibler sense to the true binary model t. We then consider the interplay between t and t∗ and prove that, for a monotone response function, when there is genuine dependence of the response on the predictors, t∗ is necessarily nonempty. This implies consistency of a deviance test of significance under misspecification. For a class of distributions of predictors, including the normal family, Rudd’s result asserts that t∗ = t. Numerical experiments reveal that, for normally distributed predictors, the probability of correct selection and the power of the deviance test depend monotonically on Rudd’s proportionality constant η.
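The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes that a predictor whose omission gives a larger deviance is ranked as more important, and it uses a BIC-type penalty a_n = log n in the criterion GIC = deviance + a_n · (number of predictors). The function names `fit_logistic_deviance` and `two_stage_select` and the simulated data are hypothetical.

```python
import numpy as np

def fit_logistic_deviance(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic model with intercept by Newton-Raphson; return its deviance."""
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z @ beta, -30, 30)          # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = Z.T @ (y - p)                      # score vector
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    eta = np.clip(Z @ beta, -30, 30)
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return -2.0 * loglik                          # deviance = -2 * log-likelihood

def two_stage_select(X, y, penalty=None):
    """Order predictors by leave-one-out deviance, then minimize GIC over nested models."""
    n, p = X.shape
    if penalty is None:
        penalty = np.log(n)                       # assumed BIC-type penalty a_n
    # Stage 1: deviance of the full model with predictor j omitted.
    drop_dev = [fit_logistic_deviance(np.delete(X, j, axis=1), y) for j in range(p)]
    order = np.argsort(drop_dev)[::-1]            # largest deviance first (assumed ranking)
    # Stage 2: GIC over the nested family {}, {o1}, {o1, o2}, ...
    best_k, best_gic = 0, np.inf
    for k in range(p + 1):
        gic = fit_logistic_deviance(X[:, order[:k]], y) + penalty * k
        if gic < best_gic:
            best_k, best_gic = k, gic
    return sorted(int(j) for j in order[:best_k])

# Usage on simulated data (hypothetical): only predictors 0 and 2 are active.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
prob = 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 2.0 * X[:, 2])))
y = (rng.uniform(size=2000) < prob).astype(float)
selected = two_stage_select(X, y)
print(selected)
```

Stage 1 fits p models and Stage 2 fits p + 1, so the total cost grows linearly in the number of predictors, whereas an exhaustive search would need 2^p fits.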
Authors:
Jan Mielniczuk (FMIS / DSPFM)
- Department of Stochastic Processes and Financial Mathematics
- Instytut Podstaw Informatyki Polskiej Akademii Nauk (IPI PAN) [Polish Academy of Sciences (PAN)]
Paweł Roman Teisseyre (FMIS)
- Faculty of Mathematics and Information Science
Publication size in sheets: 1.25
Book: Matwin Stan, Mielniczuk Jan (eds.): Challenges in Computational Statistics and Data Mining, Studies in Computational Intelligence, vol. 605, 2016, Heidelberg New York Dordrecht London, Springer International Publishing, ISBN 978-3-319-18780-8, [978-3-319-18781-5], 399 p., DOI:10.1007/978-3-319-18781-5
Keywords in English: Incorrect model specification; Variable selection; Logistic regression
ASJC Classification: 1702 Artificial Intelligence
URL: https://link.springer.com/chapter/10.1007%2F978-3-319-18781-5_15
Language: en (English)
Score (nominal): 5
Score: Ministerial score = 5.0, 09-01-2020, MonographChapterAuthor
Publication indicators: WoS Citations = 0; Scopus SNIP (Source Normalised Impact per Paper): 2016 = 0.39