I'm doing a logistic regression, which I understand I can do by simply saying
$$ \operatorname{logit}(Y)=\beta_0+\beta_1 x+\varepsilon $$
where $\varepsilon$ is normally distributed around $0$. Then we can use the usual OLS methodology to fit the $\beta$s, and when we set $\varepsilon = 0$, this gives us our best estimate $\widehat{\operatorname{logit}(Y)}$.
My question is, how can we find $\hat Y$ from here? I think that it isn't as simple as $\hat Y=\operatorname{logit}^{-1}\left(\widehat{\operatorname{logit}(Y)}\right)$, because I know that in the analogous case of a log-transformed response, $\hat Y=\exp\left(\widehat{\log(Y)}+\frac{1}{2}\sigma^2\right)$.
I looked up the logit-normal distribution (https://en.wikipedia.org/wiki/Logit-normal_distribution), but it says that there's no analytical solution for the mean of such a distribution. I think I must be missing something, though, because what good is logistic regression if not to estimate $Y$?
- It might help to review the basic concepts; in logistic regression the logit transform is of the mean, rather than of the data (which means it works on data consisting only of 0 and 1, for example). See also the sections on the generalized linear model relating to intuition and the following overview section. There are many useful posts on site relating to logistic regression. – Glen_b, May 29, 2017 at 2:51
- Possible duplicates: How to specify a logistic regression as a transformed linear regression and logit link in glm and inverse logit. – Glen_b, May 29, 2017 at 3:05
1 Answer
Your understanding of logistic regression has some errors.
The logistic regression equation is
$$ \operatorname{logit}(E(Y))=\beta_0+\beta_1 x $$
Notice that there is no random term on the right-hand side of the model. The linear predictor gives the logit of the expected value of $Y$ exactly.
The randomness comes from how $Y$ disperses around its expectation. To write the model explicitly in your style, you would have to write something like
$$ Y \mid x \sim \operatorname{Bernoulli}\left(p = \operatorname{logit}^{-1}(\beta_0+\beta_1 x) \right) $$
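To make the Bernoulli formulation concrete, here is a small simulation sketch. The coefficient values, the value of $x$, the sample size and the seed are arbitrary choices for illustration, not anything from the question. It draws many values of $Y$ at a fixed $x$: every draw is either 0 or 1, yet their average settles near $\operatorname{logit}^{-1}(\beta_0+\beta_1 x)$.

```python
import numpy as np

def inv_logit(z):
    """Inverse of logit(p) = log(p / (1 - p))."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
beta0, beta1 = -1.0, 2.0         # hypothetical coefficients
x = 0.75                         # hypothetical covariate value

# E(Y | x) under the model: the inverse logit of the linear predictor.
p = inv_logit(beta0 + beta1 * x)

# Bernoulli draws at this x. Each draw is 0 or 1, so logit(y) itself
# would be infinite; only the mean of Y lives on the logit scale.
y = rng.binomial(1, p, size=100_000)

sample_mean = y.mean()           # should be close to p
print(f"p = {p:.4f}, sample mean of Y = {sample_mean:.4f}")
```

This is why applying the logit to the data and then running a linear regression does not work here: the transform belongs to the mean, not to the observations.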
As a consequence, you cannot use OLS to fit a logistic regression. Logistic regressions are fit by maximum likelihood using iterative optimization, usually based on Newton's method.
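As a rough illustration of what such an iterative fit looks like, here is a minimal NumPy sketch of Newton's method for the one-covariate model above. The function name `fit_logistic_newton`, the simulated data, the "true" coefficients and the stopping rule are all assumptions of mine for the example; real software adds safeguards this sketch omits. It also shows the fitted value the question asks about: because the logit is applied to $E(Y)$, the estimate of $E(Y \mid x)$ is simply the inverse logit of the fitted linear predictor, with no correction term like the $\frac{1}{2}\sigma^2$ in the log-normal case.

```python
import numpy as np

def inv_logit(z):
    """Inverse of logit(p) = log(p / (1 - p))."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_newton(x, y, n_iter=25, tol=1e-10):
    """Maximum-likelihood fit of logit(E(Y)) = b0 + b1 * x by Newton's method.

    Illustrative only: no step halving and no handling of perfectly
    separated data, both of which production code would need.
    """
    X = np.column_stack([np.ones_like(x, dtype=float), x])  # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = inv_logit(X @ beta)
        grad = X.T @ (y - p)                    # gradient of the log-likelihood
        w = p * (1.0 - p)                       # Bernoulli variances
        neg_hess = X.T @ (X * w[:, None])       # negative Hessian, X' W X
        step = np.linalg.solve(neg_hess, grad)  # Newton step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data with hypothetical true coefficients (-1, 2).
rng = np.random.default_rng(1)
true_beta = np.array([-1.0, 2.0])
x = rng.normal(size=5000)
y = rng.binomial(1, inv_logit(true_beta[0] + true_beta[1] * x))

beta_hat = fit_logistic_newton(x, y)

# Fitted values: estimated E(Y | x) = P(Y = 1 | x), no back-transform correction.
p_hat = inv_logit(beta_hat[0] + beta_hat[1] * x)
print("beta_hat:", beta_hat)
```

One check that the fit behaves as expected: with an intercept in the model, the average of the fitted probabilities `p_hat` matches the observed proportion of ones in `y` at the maximum-likelihood solution.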