
Statistical inference with F-statistics when fitting simple models to high-dimensional data

Author(s)
Hannes Leeb, Lukas Steinberger
Abstract

We study linear subset regression in the context of the high-dimensional overall model y = ϑ + θ′z + ε with univariate response y and a d-vector of random regressors z, independent of ε. Here, “high-dimensional” means that the number d of available explanatory variables is much larger than the number n of observations. We consider simple linear submodels where y is regressed on a set of p regressors given by x = M′z, for some d × p matrix M of full rank p < n. The corresponding simple model, that is, y = α + β′x + e, is usually justified by imposing appropriate restrictions on the unknown parameter θ in the overall model; otherwise, this simple model can be grossly misspecified in the sense that relevant variables may have been omitted. In this paper, we establish asymptotic validity of the standard F-test on the surrogate parameter β, in an appropriate sense, even when the simple model is misspecified, that is, without any restrictions on θ whatsoever and without assuming Gaussian data.
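
For orientation, the textbook form of the F-statistic in the simple model can be written out as below. This is only a sketch under assumptions not stated in the abstract: it takes the null hypothesis to be β = 0, and the symbols RSS_0, RSS_1, the fitted coefficients, and the sample index i are introduced here for illustration. The paper itself may treat more general hypotheses on β.

```latex
% Sketch, assuming H_0: \beta = 0 in the simple model y = \alpha + \beta' x + e,
% fitted by ordinary least squares to n observations (y_i, x_i), with p < n - 1.
\[
  \mathrm{RSS}_0 = \sum_{i=1}^{n} (y_i - \bar{y})^2,
  \qquad
  \mathrm{RSS}_1 = \sum_{i=1}^{n} \bigl(y_i - \hat{\alpha} - \hat{\beta}' x_i\bigr)^2,
\]
\[
  F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/p}{\mathrm{RSS}_1/(n - p - 1)} .
\]
% Under the classical Gaussian, correctly specified linear model,
% F follows an F-distribution with (p, n - p - 1) degrees of freedom under H_0.
```

The classical justification for comparing F with quantiles of the F-distribution with p and n − p − 1 degrees of freedom relies on a correctly specified Gaussian model. The abstract's claim is that this comparison remains asymptotically valid, in an appropriate sense, without those assumptions.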

Organisation(s)
Department of Statistics and Operations Research, Research Network Data Science
Journal
Econometric Theory
Volume
39
Pages
1249-1272
No. of pages
24
ISSN
0266-4666
Publication date
2021
Peer reviewed
Yes
Austrian Fields of Science 2012
101029 Mathematical statistics
Keywords
ASJC Scopus subject areas
Economics and Econometrics, Social Sciences (miscellaneous)
Portal url
https://ucris.univie.ac.at/portal/en/publications/statistical-inference-with-fstatistics-when-fitting-simple-models-to-highdimensional-data(283570bc-fd9b-482f-be0c-5b0aa2acbe9b).html