Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for change of how predicted values are defined #144

Open
langbe opened this issue Apr 8, 2020 · 2 comments
Open

Suggestion for change of how predicted values are defined #144

langbe opened this issue Apr 8, 2020 · 2 comments
Labels

Comments

@langbe
Copy link

@langbe langbe commented Apr 8, 2020

  • a possible bug
  • a question about package functionality
  • a suggested code or documentation change, improvement to the code, or feature request

When calculating predictions with cplot() what = "prediction" results in predictions holding values of the data at their mean or mode. Although the question what we are trying to estimate is not unique, in the context of AME it seems more reasonable to me to produce different predictions. My proposal is based on the work from Bartus
https://journals.sagepub.com/doi/pdf/10.1177/1536867X0500500303

Let us assume we have a binary variable x_i. According to equation (5) the AME estimator can be written as
AME_i = 1/n sum_{k = 1}^n [F(beta * x^k | x_i^k = 1) - F(beta * x^k | x_i^k = 0)]

or equivalently

AME_i = 1/n sum_{k = 1}^n F(beta * x^k | x_i^k = 1) - 1/n sum_{k = 1}^n F(beta * x^k | x_i^k = 0).

In the latter representation we may consider each of the two terms as (conditional) predicted value. The first is for x_i = 1 and the second for x_i = 0. Those individual predicted values are consistent with the corresponding AME estimate in the sense that the difference exactly equals the AME estimate. Currently, the difference of the two predicted values is not the same as the AME estimate.

For example

library("margins")
fm <- glm(vs ~ hp + wt + factor(am), data = mtcars, family = binomial)
ans <- cplot(fm, x = "am", draw = FALSE, type = "response")
ans$yvals[[1]] - ans$yvals[[2]]  # not equal to margins(fm, variables = "am")

Applying the above mentioned formula we get:

am_1 <- mean(predict(fm, newdata = within(mtcars, am <- 1), type = "response"))
am_0 <- mean(predict(fm, newdata = within(mtcars, am <- 0), type = "response"))
am_1 - am_0 # is identical to the AME produced for am
# -0.2386167

I realize that am_1 and am_0 are quite different than the current individual predictions (yvals), but I wanted to nevertheless propose to use those instead. Happy to hear your thoughts on that.

Remarks:

  • Note that I was not sure whether the cplot call as above but with what = "effect" is what I was looking for. However, the result has zero observations in it, so I assume it is not.

  • This should analogously generalize to categorical variables with more than 2 levels (and to continuous variables)

@langbe
Copy link
Author

@langbe langbe commented Jun 17, 2020

I realised that what I have proposed above is exactly the predictive margins concept from stata. Although this is not ported into the margins package I consider those predictions more consistent with the whole package. So, in the example above my suggestion is to still use prediction(fm, at = list(am = c(0, 1))) (and I would actually reconsider adding the predictive margins option as part of this package directly).

@leeper
Copy link
Owner

@leeper leeper commented Jun 20, 2020

Hi @langbe it's not going to be part of marigns, unfortunately, as I want the concepts to be in different packages for clarity. But you're right that this is what should be happening. cplot() came before the prediction package supported Stata-style predictive margins. My plan is to deprecate cplot() as it currently exists and replace it with a ggplot2-based plotting approach. As part of that, the new plotting interface will do as you suggest.

I've unfortunately had negative bandwidth for open source work due to the pandemic.

@leeper leeper added the question label Jun 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.