Permalink
Browse files

make pkgdown site

  • Loading branch information...
leeper committed Mar 17, 2018
1 parent f26914a commit e819db8a350fa133c7e3b2672a25585d77ebc6a6
@@ -1,22 +1,31 @@
.github/*
.travis.yml
appveyor.yml
travis-tool.sh
README.Rmd
README.html
CONTRIBUTING.md
LICENSE.md
Makefile
/inst/standarderrors.pdf
/figure/*
/cache/*
/inst/doc/*.log
/inst/doc/*.Rmd
/vignettes/mtcars.csv
/vignettes/Cigar.csv
/vignettes/*.log
/vignettes/*.aux
/vignettes/*.pdf
/vignettes/*.sty
/vignettes/*.tex
^\.travis\.yml$
^appveyor\.yml$
^travis-tool\.sh$
^Makefile$
^README\.Rmd$
^README\.html$
^README_files$
^README_files/.+$
^CONTRIBUTING\.md$
^inst/standarderrors\.pdf$
^figure$
^figure/.+$
^cache/.+$
^docs$
^docs/.+$
^ignore$
^inst/doc/.+\.log$
^inst/doc/.+\.Rmd$
^vignettes/figure$
^vignettes/figure/.+$
^vignettes/.+\.aux$
^vignettes/.+\.bbl$
^vignettes/.+\.blg$
^vignettes/.+\.dvi$
^vignettes/.+\.log$
^vignettes/.+\.out$
^vignettes/.+\.pdf$
^vignettes/.+\.sty$
^vignettes/.+\.tex$
^data-raw$
@@ -11,7 +11,7 @@ README.md: README.Rmd
README.html: README.md
pandoc -o README.html README.md

../$(pkg)*.tar.gz: DESCRIPTION NAMESPACE
../$(pkg)*.tar.gz: DESCRIPTION NAMESPACE README.md
cd ../ && R CMD build $(pkg)

build: ../$(pkg)*.tar.gz
@@ -23,3 +23,6 @@ check: ../$(pkg)*.tar.gz
install: ../$(pkg)*.tar.gz
cd ../ && R CMD INSTALL $(pkg)*.tar.gz
rm ../$(pkg)*.tar.gz

website: R/* README.md DESCRIPTION
Rscript -e "pkgdown::build_site()"
190 README.md
@@ -1,157 +1,143 @@
# Tidy, Type-Safe 'prediction()' Methods

Tidy, Type-Safe 'prediction()' Methods
======================================

The **prediction** and **margins** packages are a combined effort to port the functionality of Stata's (closed source) [`margins`](http://www.stata.com/help.cgi?margins) command to (open source) R. **prediction** is focused on one function - `prediction()` - that provides type-safe methods for generating predictions from fitted regression models. `prediction()` is an S3 generic, which always return a `"data.frame"` class object rather than the mix of vectors, lists, etc. that are returned by the `predict()` methods for various model types. It provides a key piece of underlying infrastructure for the **margins** package. Users interested in generating marginal (partial) effects, like those generated by Stata's `margins, dydx(*)` command, should consider using `margins()` from the sibling project, [**margins**](https://cran.r-project.org/package=margins).

In addition to `prediction()`, this package provides a number of utility functions for generating useful predictions:

- `find_data()`, an S3 generic with methods that find the data frame used to estimate a regression model. This is a wrapper around `get_all_vars()` that attempts to locate data as well as modify it according to `subset` and `na.action` arguments used in the original modelling call.
- `mean_or_mode()` and `median_or_mode()`, which provide a convenient way to compute the data needed for predicted values *at means* (or *at medians*), respecting the differences between factor and numeric variables.
- `seq_range()`, which generates a vector of *n* values based upon the range of values in a variable
- `build_datalist()`, which generates a list of data frames from an input data frame and a specified set of replacement `at` values (mimicking the `atlist` option of Stata's `margins` command)

## Simple code examples

- `find_data()`, an S3 generic with methods that find the data frame used to estimate a regression model. This is a wrapper around `get_all_vars()` that attempts to locate data as well as modify it according to `subset` and `na.action` arguments used in the original modelling call.
- `mean_or_mode()` and `median_or_mode()`, which provide a convenient way to compute the data needed for predicted values *at means* (or *at medians*), respecting the differences between factor and numeric variables.
- `seq_range()`, which generates a vector of *n* values based upon the range of values in a variable
- `build_datalist()`, which generates a list of data frames from an input data frame and a specified set of replacement `at` values (mimicking the `atlist` option of Stata's `margins` command)

Simple code examples
--------------------

A major downside of the `predict()` methods for common modelling classes is that the result is not type-safe. Consider the following simple example:


```r
``` r
library("stats")
library("datasets")
x <- lm(mpg ~ cyl * hp + wt, data = mtcars)
class(predict(x))
```

```
## [1] "numeric"
```
## [1] "numeric"

```r
``` r
class(predict(x, se.fit = TRUE))
```

```
## [1] "list"
```
## [1] "list"

**prediction** solves this issue by providing a wrapper around `predict()`, called `prediction()`, that always returns a tidy data frame with a very simple `print()` method:


```r
``` r
library("prediction")
(p <- prediction(x))
```

```
## Average prediction for 32 observations: 20.0906
```
## Average prediction for 32 observations: 20.0906

```r
``` r
class(p)
```

```
## [1] "prediction" "data.frame"
```
## [1] "prediction" "data.frame"

```r
``` r
head(p)
```

```
## mpg cyl disp hp drat wt qsec vs am gear carb fitted se.fitted
## 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 21.90488 0.6927034
## 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 21.10933 0.6266557
## 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 25.64753 0.6652076
## 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 20.04859 0.6041400
## 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 17.25445 0.7436172
## 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.53360 0.6436862
```
## mpg cyl disp hp drat wt qsec vs am gear carb fitted se.fitted
## 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 21.90488 0.6927034
## 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 21.10933 0.6266557
## 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 25.64753 0.6652076
## 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 20.04859 0.6041400
## 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 17.25445 0.7436172
## 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.53360 0.6436862

The output always contains the original data (i.e., either data found using the `find_data()` function or passed to the `data` argument to `prediction()`). This makes it much simpler to pass predictions to, e.g., further summary or plotting functions.

Additionally the vast majority of methods allow the passing of an `at` argument, which can be used to obtain predicted values using modified version of `data` held to specific values:


```r
``` r
prediction(x, at = list(hp = seq_range(mtcars$hp, 5)))
```

```
## Average predictions for 32 observations:
```
## Average predictions for 32 observations:

```
## at(hp) value
## 52.0 22.605
## 122.8 19.328
## 193.5 16.051
## 264.2 12.774
## 335.0 9.497
```
## at(hp) value
## 52.0 22.605
## 122.8 19.328
## 193.5 16.051
## 264.2 12.774
## 335.0 9.497

This more or less serves as a direct R port of (the subset of functionality of) Stata's `margins` command that calculates predictive marginal means, etc. For calculation of marginal or partial effects, see the [**margins**](https://cran.r-project.org/package=margins) package.

## Supported model classes
Supported model classes
-----------------------

The currently supported model classes are:

- "lm" from `stats::lm()`
- "glm" from `stats::glm()`, `MASS::glm.nb()`, `glmx::glmx()`, `glmx::hetglm()`, `brglm::brglm()`
- "ar" from `stats::ar()`
- "Arima" from `stats::arima()`
- "arima0" from `stats::arima0()`
- "betareg" from `betareg::betareg()`
- "clm" from `ordinal::clm()`
- "coxph" from `survival::coxph()`
- "crch" from `crch::crch()`
- "gam" from `gam::gam()`
- "gee" from `gee::gee()`
- "gls" from `nlme::gls()`
- "hurdle" from `pscl::hurdle()`
- "hxlr" from `crch::hxlr()`
- "ivreg" from `AER::ivreg()`
- "lda" from `MASS:lda()`
- "lme" from `nlme::lme()`
- "loess" from `stats::loess()`
- "lqs" from `MASS::lqs()`
- "mca" from `MASS::mca()`
- "mclogit" from `mclogit::mclogit()`
- "merMod" from `lme4::lmer()` and `lme4::glmer()`
- "mnlogit" from `mnlogit::mnlogit()`
- "mnp" from `MNP::mnp()`
- "naiveBayes" from `e1071::naiveBayes()`
- "nlme" from `nlme::nlme()`
- "nls" from `stats::nls()`
- "nnet" from `nnet::nnet()`, `nnet::multinom()`
- "plm" from `plm::plm()`
- "polr" from `MASS::polr()`
- "ppr" from `stats::ppr()`
- "princomp" from `stats::princomp()`
- "qda" from `MASS:qda()`
- "rlm" from `MASS::rlm()`
- "rpart" from `rpart::rpart()`
- "rq" from `quantreg::rq()`
- "selection" from `sampleSelection::selection()`
- "survreg" from `survival::survreg()`
- "svm" from `e1071::svm()`
- "svyglm" from `survey::svyglm()`
- "train" from `caret::train()`
- "zeroinfl" from `pscl::zeroinfl()`

## Requirements and Installation

[![CRAN](http://www.r-pkg.org/badges/version/prediction)](https://cran.r-project.org/package=prediction)
![Downloads](http://cranlogs.r-pkg.org/badges/prediction)
[![Build Status](https://travis-ci.org/leeper/prediction.svg?branch=master)](https://travis-ci.org/leeper/prediction)
[![Build status](https://ci.appveyor.com/api/projects/status/a4tebeoa98cq07gy/branch/master?svg=true)](https://ci.appveyor.com/project/leeper/prediction/branch/master)
[![codecov.io](http://codecov.io/github/leeper/prediction/coverage.svg?branch=master)](http://codecov.io/github/leeper/prediction?branch=master)
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
- "lm" from `stats::lm()`
- "glm" from `stats::glm()`, `MASS::glm.nb()`, `glmx::glmx()`, `glmx::hetglm()`, `brglm::brglm()`
- "ar" from `stats::ar()`
- "Arima" from `stats::arima()`
- "arima0" from `stats::arima0()`
- "betareg" from `betareg::betareg()`
- "clm" from `ordinal::clm()`
- "coxph" from `survival::coxph()`
- "crch" from `crch::crch()`
- "earth" from `earth::earth()`
- "Gam" from `gam::gam()`
- "gausspr" from `kernlab::gausspr()`
- "gee" from `gee::gee()`
- "gls" from `nlme::gls()`
- "hurdle" from `pscl::hurdle()`
- "hxlr" from `crch::hxlr()`
- "ivreg" from `AER::ivreg()`
- "knnreg" from `caret::knnreg()`
- "kqr" from `kernlab::kqr()`
- "ksvm" from `kernlab::ksvm()`
- "lda" from `MASS:lda()`
- "lme" from `nlme::lme()`
- "loess" from `stats::loess()`
- "lqs" from `MASS::lqs()`
- "mca" from `MASS::mca()`
- "mclogit" from `mclogit::mclogit()`
- "merMod" from `lme4::lmer()` and `lme4::glmer()`
- "mnlogit" from `mnlogit::mnlogit()`
- "mnp" from `MNP::mnp()`
- "naiveBayes" from `e1071::naiveBayes()`
- "nlme" from `nlme::nlme()`
- "nls" from `stats::nls()`
- "nnet" from `nnet::nnet()`, `nnet::multinom()`
- "plm" from `plm::plm()`
- "polr" from `MASS::polr()`
- "ppr" from `stats::ppr()`
- "princomp" from `stats::princomp()`
- "qda" from `MASS:qda()`
- "rlm" from `MASS::rlm()`
- "rpart" from `rpart::rpart()`
- "rq" from `quantreg::rq()`
- "selection" from `sampleSelection::selection()`
- "survreg" from `survival::survreg()`
- "svm" from `e1071::svm()`
- "svyglm" from `survey::svyglm()`
- "train" from `caret::train()`
- "zeroinfl" from `pscl::zeroinfl()`

Requirements and Installation
-----------------------------

[![CRAN](http://www.r-pkg.org/badges/version/prediction)](https://cran.r-project.org/package=prediction) ![Downloads](http://cranlogs.r-pkg.org/badges/prediction) [![Build Status](https://travis-ci.org/leeper/prediction.svg?branch=master)](https://travis-ci.org/leeper/prediction) [![Build status](https://ci.appveyor.com/api/projects/status/a4tebeoa98cq07gy/branch/master?svg=true)](https://ci.appveyor.com/project/leeper/prediction/branch/master) [![codecov.io](http://codecov.io/github/leeper/prediction/coverage.svg?branch=master)](http://codecov.io/github/leeper/prediction?branch=master) [![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)

The development version of this package can be installed directly from GitHub using `ghit`:

```R
``` r
if (!require("ghit")) {
install.packages("ghit")
library("ghit")
Oops, something went wrong.

0 comments on commit e819db8

Please sign in to comment.