update documentation (closes #119)

leeper committed Dec 25, 2019
1 parent fd0d2b9 commit 9426bf796b6a1d5407964ed67e5a5ac95b8dadd8
Showing with 875 additions and 378 deletions.
  1. +1 −1 DESCRIPTION
  2. +54 −53 NEWS.md
  3. +2 −0 R/margins.R
  4. +3 −1 README.Rmd
  5. +46 −31 README.md
  6. +298 −118 man/cplot.Rd
  7. +39 −12 man/dydx.Rd
  8. +109 −39 man/marginal_effects.Rd
  9. +158 −62 man/margins.Rd
  10. +140 −53 man/persp.Rd
  11. +19 −6 man/plot.margins.Rd
  12. +1 −1 man/reexports.Rd
  13. +5 −1 vignettes/Introduction.Rmd
@@ -48,4 +48,4 @@ Enhances:
survey
ByteCompile: true
VignetteBuilder: knitr
RoxygenNote: 6.1.1
RoxygenNote: 7.0.2
107 NEWS.md

Large diffs are not rendered by default.

@@ -5,6 +5,8 @@
#' @title Marginal Effects Estimation
#' @description This package is an R port of Stata's \samp{margins} command, implemented as an S3 generic \code{margins()} for model objects, like those of class \dQuote{lm} and \dQuote{glm}. \code{margins()} is an S3 generic function for building a \dQuote{margins} object from a model object. Methods are currently implemented for several model classes (see Details, below).
#'
#' margins provides \dQuote{marginal effects} summaries of models. Marginal effects are partial derivatives of the regression equation with respect to each variable in the model for each unit in the data; average marginal effects are simply the mean of these unit-specific partial derivatives over some sample. In ordinary least squares regression with no interactions or higher-order terms, the estimated slope coefficients are marginal effects. In other cases and for generalized linear models, the coefficients are not marginal effects, at least not on the scale of the response variable. margins therefore provides ways of calculating the marginal effects of variables to make these models more interpretable.
#'
#' The package also provides a low-level function, \code{\link{marginal_effects}}, to estimate those quantities and return a data frame of unit-specific effects, and another even lower-level function, \code{\link{dydx}}, to provide variable-specific derivatives from models. Some of the underlying architecture for the package is provided by the low-level function \code{\link[prediction]{prediction}}, which provides a consistent data frame interface to \code{\link[stats]{predict}} for a large number of model types. If a \code{prediction} method exists for a model class, \code{margins} should work for that model class, but only those classes listed here have been tested and specifically supported.
#' @param model A model object. See Details for supported model classes.
#' @param data A data frame containing the data at which to evaluate the marginal effects, as in \code{\link[stats]{predict}}. This is optional, but may be required when the underlying modelling function sets \code{model = FALSE}.
@@ -5,7 +5,9 @@ output: github_document

<img src="man/figures/logo.png" align="right" />

The **margins** and **prediction** packages are a combined effort to port the functionality of Stata's (closed source) [`margins`](http://www.stata.com/help.cgi?margins) command to (open source) R. The major functionality of `margins` - namely the estimation of marginal (or partial) effects - is provided through a single function, `margins()`. This is an S3 generic method for calculating the marginal effects of covariates included in model objects (like those of classes "lm" and "glm"). Users interested in generating predicted (fitted) values, such as the "predictive margins" generated by Stata's `margins` command, should consider using `prediction()` from the sibling project, [**prediction**](https://cran.r-project.org/package=prediction).
The **margins** and **prediction** packages are a combined effort to port the functionality of Stata's (closed source) [`margins`](http://www.stata.com/help.cgi?margins) command to (open source) R. These tools provide ways of obtaining common quantities of interest from regression-type models. **margins** provides "marginal effects" summaries of models and **prediction** provides unit-specific and sample average predictions from models. Marginal effects are partial derivatives of the regression equation with respect to each variable in the model for each unit in the data; average marginal effects are simply the mean of these unit-specific partial derivatives over some sample. In ordinary least squares regression with no interactions or higher-order terms, the estimated slope coefficients are marginal effects. In other cases and for generalized linear models, the coefficients are not marginal effects, at least not on the scale of the response variable. **margins** therefore provides ways of calculating the marginal effects of variables to make these models more interpretable.
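
The distinction between coefficients and marginal effects can be illustrated with a minimal base-R sketch. It does not use **margins** itself: `ame_numeric()` is a hypothetical helper written only for this illustration, and it approximates the unit-specific partial derivatives by central finite differences on `predict()`. The models and the step size `h` are likewise assumptions chosen for the example.

```r
# Hypothetical helper (not part of margins): average marginal effect of one
# numeric variable, approximated by central finite differences on predict().
ame_numeric <- function(model, data, variable, h = 1e-5) {
  up <- data
  down <- data
  up[[variable]] <- up[[variable]] + h
  down[[variable]] <- down[[variable]] - h
  # Unit-specific derivatives on the response scale
  unit_effects <- (predict(model, newdata = up, type = "response") -
                   predict(model, newdata = down, type = "response")) / (2 * h)
  # Average over the sample
  mean(unit_effects)
}

# OLS without interactions: the slope coefficient is the marginal effect.
ols <- lm(mpg ~ hp + wt, data = mtcars)
ols_ame <- ame_numeric(ols, mtcars, "wt")
ols_coef <- unname(coef(ols)["wt"])

# Logit: the coefficient is on the link scale, so it differs from the
# average marginal effect on the probability scale.
logit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
logit_ame <- ame_numeric(logit, mtcars, "wt")
logit_coef <- unname(coef(logit)["wt"])

# Analytic check for the logit case: d p / d x = beta * p * (1 - p)
p <- fitted(logit)
logit_ame_analytic <- mean(logit_coef * p * (1 - p))
```

Under these assumptions, `ols_ame` should agree with `ols_coef` up to numerical error, while `logit_ame` should match `logit_ame_analytic` and be smaller in magnitude than `logit_coef`.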

The major functionality of Stata's `margins` command - namely the estimation of marginal (or partial) effects - is provided here through a single function, `margins()`. This is an S3 generic method for calculating the marginal effects of covariates included in model objects (like those of classes "lm" and "glm"). Users interested in generating predicted (fitted) values, such as the "predictive margins" generated by Stata's `margins` command, should consider using `prediction()` from the sibling project, [**prediction**](https://cran.r-project.org/package=prediction).

## Motivation

@@ -5,7 +5,9 @@ output: github_document

<img src="man/figures/logo.png" align="right" />

The **margins** and **prediction** packages are a combined effort to port the functionality of Stata's (closed source) [`margins`](http://www.stata.com/help.cgi?margins) command to (open source) R. The major functionality of `margins` - namely the estimation of marginal (or partial) effects - is provided through a single function, `margins()`. This is an S3 generic method for calculating the marginal effects of covariates included in model objects (like those of classes "lm" and "glm"). Users interested in generating predicted (fitted) values, such as the "predictive margins" generated by Stata's `margins` command, should consider using `prediction()` from the sibling project, [**prediction**](https://cran.r-project.org/package=prediction).
The **margins** and **prediction** packages are a combined effort to port the functionality of Stata's (closed source) [`margins`](http://www.stata.com/help.cgi?margins) command to (open source) R. These tools provide ways of obtaining common quantities of interest from regression-type models. **margins** provides "marginal effects" summaries of models and **prediction** provides unit-specific and sample average predictions from models. Marginal effects are partial derivatives of the regression equation with respect to each variable in the model for each unit in the data; average marginal effects are simply the mean of these unit-specific partial derivatives over some sample. In ordinary least squares regression with no interactions or higher-order terms, the estimated slope coefficients are marginal effects. In other cases and for generalized linear models, the coefficients are not marginal effects, at least not on the scale of the response variable. **margins** therefore provides ways of calculating the marginal effects of variables to make these models more interpretable.

The major functionality of Stata's `margins` command - namely the estimation of marginal (or partial) effects - is provided here through a single function, `margins()`. This is an S3 generic method for calculating the marginal effects of covariates included in model objects (like those of classes "lm" and "glm"). Users interested in generating predicted (fitted) values, such as the "predictive margins" generated by Stata's `margins` command, should consider using `prediction()` from the sibling project, [**prediction**](https://cran.r-project.org/package=prediction).

## Motivation

@@ -81,10 +83,7 @@ margins_summary(mod1)
```

```
## factor AME SE z p lower upper
## cyl 0.0381 0.5999 0.0636 0.9493 -1.1376 1.2139
## hp -0.0463 0.0145 -3.1909 0.0014 -0.0748 -0.0179
## wt -3.1198 0.6613 -4.7176 0.0000 -4.4160 -1.8236
## Error in margins_summary(mod1): could not find function "margins_summary"
```

If you are only interested in obtaining the marginal effects (without corresponding variances or the overhead of creating a "margins" object), you can call `marginal_effects(x)` directly. Furthermore, the `dydx()` function enables the calculation of the marginal effect of a single named variable:
@@ -185,11 +184,6 @@ If one desires *subgroup* effects, simply pass a subset of data to the `data` ar
summary(margins(mod2, data = subset(margex, sex == 0)))
```

```
## Warning in model$family$mu.eta(predictions_link) * model_mat: longer object length is not a multiple
## of shorter object length
```

```
## factor AME SE z p lower upper
## age 0.0043 0.0007 5.7723 0.0000 0.0028 0.0057
@@ -202,11 +196,6 @@ summary(margins(mod2, data = subset(margex, sex == 0)))
summary(margins(mod2, data = subset(margex, sex == 1)))
```

```
## Warning in model$family$mu.eta(predictions_link) * model_mat: longer object length is not a multiple
## of shorter object length
```

```
## factor AME SE z p lower upper
## age 0.0150 0.0013 11.5578 0.0000 0.0125 0.0176
@@ -245,7 +234,7 @@ summary(marg3 <- margins(mod3))
plot(marg3)
```

![plot of chunk marginsplot](https://i.imgur.com/mT2PVnA.png)
![plot of chunk marginsplot](https://i.imgur.com/yiCBeSu.png)

In addition to the estimation procedures and `plot()` generic, **margins** offers several plotting methods for model objects. First, there is a new generic `cplot()` that displays predictions or marginal effects (from an "lm" or "glm" model) of a variable conditional across values of a third variable (or itself). For example, here is a graph of predicted probabilities from a logit model:

@@ -256,11 +245,30 @@ cplot(mod4, x = "wt", se.type = "shade")
```

```
## Warning in model$family$mu.eta(predictions_link) * model_mat: longer object length is not a multiple
## of shorter object length
```

![plot of chunk cplot1](https://i.imgur.com/bAjidqq.png)
## xvals yvals upper lower
## 1 1.513000 0.927274748 1.25767803 0.59687146
## 2 1.675958 0.896156250 1.31282164 0.47949086
## 3 1.838917 0.853821492 1.36083558 0.34680740
## 4 2.001875 0.798115859 1.38729030 0.20894142
## 5 2.164833 0.727945940 1.37431347 0.08157841
## 6 2.327792 0.644257693 1.30643930 -0.01792391
## 7 2.490750 0.550714595 1.17940279 -0.07797360
## 8 2.653708 0.453441410 1.00638808 -0.09950526
## 9 2.816667 0.359598025 0.81514131 -0.09594526
## 10 2.979625 0.275390447 0.63577343 -0.08499254
## 11 3.142583 0.204601856 0.48756886 -0.07836515
## 12 3.305542 0.148285654 0.37415646 -0.07758515
## 13 3.468500 0.105415989 0.28892829 -0.07809631
## 14 3.631458 0.073865178 0.22356331 -0.07583296
## 15 3.794417 0.051216829 0.17224934 -0.06981569
## 16 3.957375 0.035248556 0.13162443 -0.06112732
## 17 4.120333 0.024132208 0.09961556 -0.05135115
## 18 4.283292 0.016461806 0.07467832 -0.04175471
## 19 4.446250 0.011201450 0.05550126 -0.03309836
## 20 4.609208 0.007609032 0.04093572 -0.02571766
```

![plot of chunk cplot1](https://i.imgur.com/0z3dRyy.png)

And fitted values with a factor independent variable:

@@ -269,7 +277,14 @@ And fitted values with a factor independent variable:
cplot(lm(Sepal.Length ~ Species, data = iris))
```

![plot of chunk cplot2](https://i.imgur.com/OVVrJF2.png)
```
## xvals yvals upper lower
## 1 setosa 5.006 5.14869 4.86331
## 2 versicolor 5.936 6.07869 5.79331
## 3 virginica 6.588 6.73069 6.44531
```

![plot of chunk cplot2](https://i.imgur.com/LcsN0OC.png)

and a graph of the effect of `drat` across levels of `wt`:

@@ -278,7 +293,7 @@ and a graph of the effect of `drat` across levels of `wt`:
cplot(mod4, x = "wt", dx = "drat", what = "effect", se.type = "shade")
```

![plot of chunk cplot3](https://i.imgur.com/QmZMJk9.png)
![plot of chunk cplot3](https://i.imgur.com/YbQqjSi.png)

`cplot()` also returns a data frame of values, so that it can be used just for calculating quantities of interest before plotting them with another graphics package, such as **ggplot2**:

@@ -309,7 +324,7 @@ ggplot(dat, aes(x = xvals)) +
theme_bw()
```

![plot of chunk cplot_ggplot2](https://i.imgur.com/DRDiScf.png)
![plot of chunk cplot_ggplot2](https://i.imgur.com/9dRY6Q2.png)

Second, the package implements methods for "lm" and "glm" class objects for the `persp()` generic plotting function. This enables three-dimensional representations of predicted outcomes:

@@ -318,7 +333,7 @@ Second, the package implements methods for "lm" and "glm" class objects for the
persp(mod1, xvar = "cyl", yvar = "hp")
```

![plot of chunk persp1](https://i.imgur.com/YTunIDC.png)
![plot of chunk persp1](https://i.imgur.com/lJbez6g.png)

and marginal effects:

@@ -327,7 +342,7 @@ and marginal effects:
persp(mod1, xvar = "cyl", yvar = "hp", what = "effect", nx = 10)
```

![plot of chunk persp2](https://i.imgur.com/r6sJge4.png)
![plot of chunk persp2](https://i.imgur.com/lkQ2ydu.png)

And if three-dimensional plots aren't your thing, there are also analogous methods for the `image()` generic, to produce heatmap-style representations:

@@ -336,7 +351,7 @@ And if three-dimensional plots aren't your thing, there are also analogous metho
image(mod1, xvar = "cyl", yvar = "hp", main = "Predicted Fuel Efficiency,\nby Cylinders and Horsepower")
```

![plot of chunk image11](https://i.imgur.com/aAGh9JA.png)
![plot of chunk image11](https://i.imgur.com/FC7oX52.png)

The numerous package vignettes and help files contain extensive documentation and examples of all package functionality.

@@ -352,8 +367,8 @@ microbenchmark(marginal_effects(mod1))

```
## Unit: milliseconds
## expr min lq mean median uq max neval
## marginal_effects(mod1) 3.39837 3.57076 4.318865 3.878095 4.440118 9.669389 100
## expr min lq mean median uq max neval
## marginal_effects(mod1) 3.272914 3.543947 4.32065 3.979385 4.717786 8.887426 100
```

```r
@@ -362,8 +377,8 @@ microbenchmark(margins(mod1))

```
## Unit: milliseconds
## expr min lq mean median uq max neval
## margins(mod1) 25.28022 27.49335 41.94639 32.4953 50.5766 146.9396 100
## expr min lq mean median uq max neval
## margins(mod1) 24.15647 27.09981 31.29218 29.34067 33.82544 64.94929 100
```

The most computationally expensive part of `margins()` is variance estimation. If you don't need variances, use `marginal_effects()` directly or specify `margins(..., vce = "none")`.
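
As a rough sketch of the two cheaper routes just mentioned: the model `fit` below is an illustrative stand-in rather than the README's own object, and the `requireNamespace()` guard is only there so the snippet runs whether or not **margins** is installed.

```r
# Sketch only: `fit` is an illustrative model, not the README's own object.
fit <- lm(mpg ~ cyl * hp + wt, data = mtcars)

if (requireNamespace("margins", quietly = TRUE)) {
  # Unit-level marginal effects only; no variances are computed.
  effects <- margins::marginal_effects(fit)
  # Average marginal effect per variable (columns are named dydx_<variable>).
  print(colMeans(effects))

  # A "margins" object built without the variance-estimation step.
  fast <- margins::margins(fit, vce = "none")
}
```

Either route skips the variance step, so standard errors and confidence intervals are not available from the result.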
