Projections of a general binary model on a logistic regression

Mariusz Kubkowski, Jan Mielniczuk

Abstract

We consider a general binary model for which the conditional probability of success given the vector of predictors $X$ equals $q(\beta_1^T X, \ldots, \beta_k^T X)$, and a family of possibly misspecified logistic regressions fitted to it. In the case when $X$ satisfies the linearity condition, we show that their algebraic structure is uniquely determined and that the vector $\beta^*$ corresponding to the Kullback–Leibler projection on this family is a linear combination of $\beta_1, \ldots, \beta_k$. This generalizes the known result proved by P. Ruud for $k = 1$, which says that the true and projected vectors are collinear. It also follows that the projected vector has the same direction as the first canonical vector, which justifies the frequent observation that a logistic fit yields well-performing classifiers even if misspecification is expected. In the special case of an additive binary model with multivariate normal predictors, and when the response function $q$ is a convex combination of univariate responses, we show that the variance of $\beta^{*T} X$ is not larger than the maximal variance of the projected linear combinations for the corresponding univariate problems. In the case of a balanced additive logistic model, it follows that the contribution of $\beta_i$ to $\beta^*$ is bounded by the corresponding coefficient in the convex representation of the response function $q$.
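The linear-combination claim can be checked numerically. The following is a minimal simulation sketch, not the authors' procedure: it assumes standard normal predictors (which satisfy the linearity condition), $k = 2$, and a hypothetical additive response function `q` chosen only for illustration; the vectors `beta1`, `beta2`, the sample size and the Newton-Raphson fitter are likewise assumptions. The fitted coefficient vector only estimates $\beta^*$, so the residual outside the span of `beta1` and `beta2` is small but not exactly zero.

```python
import numpy as np

def sigmoid(u):
    # Numerically stable form of the logistic function 1 / (1 + exp(-u))
    return 0.5 * (1.0 + np.tanh(0.5 * u))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression with intercept, fitted by Newton-Raphson."""
    Z = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Z @ w)
        grad = Z.T @ (y - p)                         # score vector
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z  # observed information
        w = w + np.linalg.solve(hess, grad)
    return w[0], w[1:]

rng = np.random.default_rng(0)
n, d = 200_000, 5                                    # illustrative sizes
X = rng.standard_normal((n, d))                      # assumed N(0, I) predictors

# Hypothetical true directions (k = 2); the last two predictors are irrelevant
beta1 = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
beta2 = np.array([0.0, 1.0, -1.0, 0.0, 0.0])

# Hypothetical response: convex combination of two univariate responses,
# so a logistic regression in X is misspecified
s, t = X @ beta1, X @ beta2
prob = 0.6 * sigmoid(s) + 0.4 * sigmoid(t ** 3)
y = (rng.random(n) < prob).astype(float)

intercept, beta_hat = fit_logistic(X, y)

# Least-squares projection of the fitted vector onto span{beta1, beta2}
B = np.column_stack([beta1, beta2])
coef, *_ = np.linalg.lstsq(B, beta_hat, rcond=None)
rel_resid = np.linalg.norm(beta_hat - B @ coef) / np.linalg.norm(beta_hat)

print("fitted coefficients:", beta_hat)
print("weights on beta1, beta2:", coef)
print("relative residual outside the span:", rel_resid)
```

Replacing `X` with strongly non-elliptical predictors (for example, independent exponential variables) would violate the assumed linearity condition, and the relative residual would then no longer be expected to vanish as `n` grows.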
Journal series: Linear Algebra and Its Applications, ISSN 0024-3795 (A, 30 points)
Issue year: 2018
Vol: 536
Pages: 152-173
Publication size in sheets: 1.05
Keywords in Polish (translated): first canonical vector, general binary model, additive binary model, misspecification
Keywords in English: first canonical vector, general binary model, additive binary model, logistic model, misspecification
Abstract in Polish (translated): We study properties of projections of a general binary model in which the conditional probability of success given the vector $X$ equals $q(\beta_1^T X, \ldots, \beta_k^T X)$, and we prove that when the vector of predictors $X$ satisfies the linear regressions condition, the projected vector is a linear combination of the vectors $\beta_1, \ldots, \beta_k$. The result generalizes the result of P. Ruud for $k=1$.
DOI: 10.1016/j.laa.2017.09.013
URL: https://www.sciencedirect.com/science/article/pii/S0024379517305372
Language: English
Score (nominal): 30
Score: Ministerial score = 30.0, 12-07-2018, ArticleFromJournal; Ministerial score (2013-2016) = 30.0, 12-07-2018, ArticleFromJournal
Publication indicators: WoS Impact Factor: 2016 = 0.973 (2), 2016 = 1.076 (5)