Detail
Statistical inference with F-statistics when fitting simple models to high-dimensional data
- Author(s)
- Hannes Leeb, Lukas Steinberger
- Abstract
We study linear subset regression in the context of the high-dimensional overall model y = ϑ + θ′z + ε with univariate response y and a d-vector of random regressors z, independent of ε. Here, “high-dimensional” means that the number d of available explanatory variables is much larger than the number n of observations. We consider simple linear submodels where y is regressed on a set of p regressors given by x = M′z, for some d × p matrix M of full rank p < n. The corresponding simple model, that is, y = α + β′x + e, is usually justified by imposing appropriate restrictions on the unknown parameter θ in the overall model; otherwise, this simple model can be grossly misspecified in the sense that relevant variables may have been omitted. In this paper, we establish asymptotic validity of the standard F-test on the surrogate parameter β, in an appropriate sense, even when the simple model is misspecified, that is, without any restrictions on θ whatsoever and without assuming Gaussian data.
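The setup in the abstract can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' procedure: it takes p = 1, draws z and the error from standard Gaussians purely for convenience (the paper does not require Gaussian data), uses a hypothetical M that selects the first coordinate of z, and picks arbitrary values for the intercept and θ. It computes the textbook F-statistic for the hypothesis β = 0 in the fitted submodel; the function names `f_statistic_simple` and `simulate` are illustrative only.

```python
import random


def f_statistic_simple(x, y):
    """Textbook F-statistic for H0: beta = 0 in y = alpha + beta*x + e (p = 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    beta_hat = sxy / sxx
    rss_null = syy                    # residual sum of squares, intercept only
    rss_full = syy - beta_hat * sxy   # residual sum of squares, fitted submodel
    # One restriction in the numerator, n - 2 residual degrees of freedom.
    return (rss_null - rss_full) / (rss_full / (n - 2))


def simulate(n=50, d=500, seed=0):
    """Draw n observations from an overall model with d > n regressors,
    then fit the one-regressor submodel x = M'z (assumed setup)."""
    rng = random.Random(seed)
    # Assumption: an arbitrary dense theta, so the submodel omits relevant variables.
    theta = [rng.gauss(0, 1) / d ** 0.5 for _ in range(d)]
    # Assumption: hypothetical M (d x 1) that selects the first coordinate of z.
    m = [1.0] + [0.0] * (d - 1)
    xs, ys = [], []
    for _ in range(n):
        z = [rng.gauss(0, 1) for _ in range(d)]
        eps = rng.gauss(0, 1)
        ys.append(1.0 + sum(t * zj for t, zj in zip(theta, z)) + eps)
        xs.append(sum(mj * zj for mj, zj in zip(m, z)))
    return f_statistic_simple(xs, ys)


if __name__ == "__main__":
    print(simulate())
```

Calling `simulate()` with different seeds gives repeated draws of the statistic under one fixed design. Whether and in what sense the usual F reference distribution remains valid for such draws is the subject of the paper itself; this sketch makes no claim about it.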
- Organisation(s)
- Department of Statistics and Operations Research, Research Network Data Science
- Journal
- Econometric Theory
- Volume
- 39
- Pages
- 1249-1272
- No. of pages
- 24
- ISSN
- 0266-4666
- Publication date
- 2021
- Peer reviewed
- Yes
- Austrian Fields of Science 2012
- 101029 Mathematical statistics
- Keywords
- ASJC Scopus subject areas
- Economics and Econometrics, Social Sciences (miscellaneous)
- Portal url
- https://ucrisportal.univie.ac.at/en/publications/283570bc-fd9b-482f-be0c-5b0aa2acbe9b