Suggestions for "margins" of ridge regression? #140

Open · dfrankow opened this issue Mar 25, 2020 · 3 comments

dfrankow commented Mar 25, 2020

This is a request for debugging assistance for a new feature, margins of ridge regression. If I can debug this, I'd be happy if you'd accept a pull request with the feature.

I am trying to modify your "predictions" and "margins" libraries to find the marginal effects of a ridge regression.

  • I modified the "ridge" library to work with factors and have vcov. (See version 2.5).
  • I modified your "prediction" library to use ridgeLinear predictions from that library. (See these diffs.)
  • I modified your "margins" library to test that it worked. It does not. See "Currently fails" in these diffs.

The test and output look like this:

> model1 <- lm(mpg ~ wt + cyl, data = mtcars)
> model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
> expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")
> m1 <- margins(model1)
> s1 <- summary(m1)
> s2 <- summary(m2)
> # Currently fails:
>     expect_equal(s1, s2)
Error: `s1` not equal to `s2`.
Attributes: < Component “call”: target, current do not match when deparsed >
Component “SE”: Mean relative difference: 1
Component “z”: Mean relative difference: Inf
Component “p”: Mean relative difference: 1
Component “lower”: Mean relative difference: 0.3282719
Component “upper”: Mean relative difference: 0.9557901
> s1
 factor     AME     SE       z      p   lower   upper
    cyl -1.5078 0.4147 -3.6360 0.0003 -2.3206 -0.6950
     wt -3.1910 0.7569 -4.2158 0.0000 -4.6745 -1.7075
> s2
 factor     AME     SE    z      p   lower   upper
    cyl -1.5078 0.0000 -Inf 0.0000 -1.5078 -1.5078
     wt -3.1910 0.0000 -Inf 0.0000 -3.1910 -3.1910

Clearly computing the standard error is not going well for my ridgeLinear object.

It is entirely possible that I've introduced a bug in my ridgeLinear changes, or that I've incorrectly implemented the interface for it in the "prediction" library. The place I've traced to so far (the "FUN" gradient function in margins) seems to use the parent environment for some of its values, so I may have a scoping problem. I'm looking for suggestions of things to try.

Within my "margins", I traced it with this code in the RStudio debugger:

source('R/jacobian.R')
source('R/gradient_factory.R')
source('R/find_terms_in_model.R')
source('R/get_effect_variances.R')
source('R/reset_coefs.R')
vars1 <- get_effect_variances(mtcars, model1)
vars2 <- get_effect_variances(mtcars, model2)

The jacobian is always all 0s for model2 (the linearRidge model), so perhaps the world looks flat and it doesn't find the variances properly.
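For context on why an all-zero jacobian produces zero standard errors: margins uses a delta-method variance, roughly Var(AME) = J V J', where J is built by forward finite differences over the coefficients. A simplified restatement of that logic (not the package's exact code):

```r
# Simplified sketch of the delta-method variance margins computes.
# J[k, i] ~ (AME_k(beta + eps*e_i) - AME_k(beta)) / eps, then Var = J V J'.
delta_variance <- function(FUN, coefficients, V, eps = 1e-7) {
  F0 <- FUN(coefficients)
  J <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
  for (i in seq_along(coefficients)) {
    coeftemp <- coefficients
    coeftemp[i] <- coeftemp[i] + eps
    J[, i] <- (FUN(coeftemp) - F0) / eps
  }
  J %*% V %*% t(J)  # an all-zero J makes every variance (and SE) zero
}
```

So an all-zero jacobian mechanically yields the SE = 0, z = -Inf rows in s2, regardless of what vcov returns.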

I can keep dropping into the debugging rathole by diving into the marginal_effects within the gradient function FUN ("me_tmp <- gr.." near line 20 of gradient_factory.R), but I wondered if you had any ideas.

Thanks for your attention.

Output of sessionInfo:

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] margins_0.3.23 testthat_2.3.2 ridge_2.5     

loaded via a namespace (and not attached):
[1] MASS_7.3-51.5     compiler_3.6.3    magrittr_1.5      R6_2.4.1          tools_3.6.3      
[6] yaml_2.2.1        data.table_1.12.8 rlang_0.4.5       prediction_0.3.16
dfrankow (Author) commented Mar 25, 2020

I'll attach my logs so far; they're a bit opaque, though.

Below are the log statements in diff form, and I attach two files:

  • 1.log from get_effect_variances(mtcars, model1)
  • 2.log from get_effect_variances(mtcars, model2)

The first obvious difference is at i=2.

1.log has numer=9.9e-08 and out=0.99 (long line goes off the screen):

*** i 2 coeftemp 39.6862614802529 FUN -3.1909720389839 F0 -3.1909721389834 numer 9.99994953509997e-08 out[, i] 0.999994953509997

2.log has numer=0 and out=0:

*** i 2 coeftemp 39.686261480253 FUN -3.19097213898376 F0 -3.19097213898376 numer 0 out[, i] 0

It's as if FUN=F0 for some reason I can't quite see in 2.log.
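A toy illustration (hypothetical names, not the package's code) of how a coefficient perturbation that never reaches the prediction step makes FUN(coeftemp) identical to F0:

```r
# If the perturbed coefficients are silently dropped before prediction,
# the finite-difference numerator is exactly zero.
model <- list(coefficients = c(`(Intercept)` = 1, x = 2))
reset_coefs_noop <- function(m, coefs) m        # perturbation never lands
predict_toy <- function(m, x) m$coefficients[1] + m$coefficients[2] * x
FUN <- function(coefs) {
  m2 <- reset_coefs_noop(model, coefs)
  mean(predict_toy(m2, 1:5))                    # stand-in for a marginal effect
}
F0 <- FUN(model$coefficients)
FUN(model$coefficients + c(0, 1e-7)) - F0       # 0, like numer=0 in 2.log
```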

Log statements:

diff --git a/R/gradient_factory.R b/R/gradient_factory.R
index 4238150..c601512 100644
--- a/R/gradient_factory.R
+++ b/R/gradient_factory.R
@@ -33,6 +34,7 @@ gradient_factory.default <- function(data, model, variables = NULL, type = "resp
             # apply colMeans to get average marginal effects
             means <- apply(me_tmp, 2L, stats::weighted.mean, w = weights, na.rm = TRUE)
         }
+        write(paste("** FUN me_tmp", me_tmp, "means", means), file="the.log", append=TRUE)
         means
     }
     return(FUN)
diff --git a/R/jacobian.R b/R/jacobian.R
index 2ac79aa..1298cb9 100644
--- a/R/jacobian.R
+++ b/R/jacobian.R
@@ -1,5 +1,7 @@
 # function returns a Jacobian matrix (a term-by-beta matrix)
 jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
+  write(paste("** jacob coeffs", coefficients, "weights", weights, "eps", eps),
+        file="the.log", append=TRUE)
     F0 <- FUN(coefficients, weights = weights)
     out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
     colnames(out) <- names(coefficients)
@@ -8,6 +10,12 @@ jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
         coeftemp <- coefficients
         coeftemp[i] <- coeftemp[i] + eps
         out[, i] <- (FUN(coeftemp, weights = weights) - F0) / eps
+        write(paste("*** i", i, "coeftemp", coeftemp,
+                    "FUN", FUN(coeftemp, weights = weights),
+                    "F0", F0,
+                    "numer", (FUN(coeftemp, weights = weights) - F0),
+                    "out[, i]", out[, i]),
+              file="the.log", append=TRUE)
     }
     out
 }
diff --git a/tests/testthat/tests-core.R b/tests/testthat/tests-core.R
index a58f26f..997fba2 100644
@@ -65,6 +65,16 @@ if (requireNamespace("ridge")) {
     model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
     expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")
 
+    # TODO: remove
+    source('R/jacobian.R')
+    source('R/gradient_factory.R')
+    source('R/find_terms_in_model.R')
+    source('R/get_effect_variances.R')
+    source('R/reset_coefs.R')
+    vars1 <- get_effect_variances(mtcars, model1)
+    vars2 <- get_effect_variances(mtcars, model2)
+
+    # margins(model2)
     m1 <- margins(model1)
     s1 <- summary(m1)
     s2 <- summary(m2)

Attachments: 1.log, 2.log

dfrankow (Author) commented May 12, 2020

I found a possible factor.

reset_coefs changes the model coefficients.

For lm, this is simply setting a named vector.

For ridgeLinear, it's not at all obvious how to do it; the object isn't set up for it. I implemented no method, so the default was used, which doesn't look right. The coef.ridgeLinear function returns a vector whose intercept is constructed from three different pieces:

inter <- object$ym - scaledcoef %*% object$xm

So perhaps the margins code is changing the model values in a way that currently depends on information carried around with the model. This would have to be adapted to ridgeLinear, but exactly what is required here is not obvious to me.
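If that is the cause, a ridgeLinear-specific reset_coefs would need to invert the transformations coef.ridgeLinear applies. A rough, untested sketch; the field names ($coef, $scales, $ym, $xm) are guesses at the ridge package's internals and may not match exactly:

```r
# Untested sketch only: push new coefficients back into the scaled/centered
# parameterization ridgeLinear stores internally, so that a subsequent
# coef.ridgeLinear() call round-trips to the requested values.
reset_coefs.ridgeLinear <- function(object, coefs) {
  slopes <- coefs[setdiff(names(coefs), "(Intercept)")]
  object$coef[] <- slopes * object$scales   # undo the descaling coef() does
  # coef.ridgeLinear rebuilds the intercept as ym - scaledcoef %*% xm,
  # so adjust ym to make that reconstruction equal coefs["(Intercept)"]
  object$ym <- coefs["(Intercept)"] + sum(slopes * object$xm)
  object
}
```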

leeper (Owner) commented Jun 21, 2020

Thanks for all your work - I'll try to figure out what needs to happen.
