Questions: missing data behavior, general advice #28

alexpghayes · Aug 7, 2018

Missing data behavior

Does prediction provide any guarantees about predictions on missing data? I've been playing around some and it seems like most methods behave like predict() with na.action = na.pass. Is this tested/guaranteed?

I'm currently reworking the augment() methods for broom and am considering moving to prediction as a backend, but need to guarantee that data doesn't get silently dropped.
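For concreteness, here is a minimal sketch of the behavior in question, using base R's lm() and predict() rather than the prediction package itself. The toy data and variable names are made up for illustration. It assumes predict.lm(), whose na.action argument defaults to na.pass; other model classes may not behave the same way.

```r
# Toy data (illustrative only); the second new row has a missing predictor
train <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2.1, 3.9, 6.2, 7.8, 10.1))
new_data <- data.frame(x = c(1.5, NA, 4.5))

fit <- lm(y ~ x, data = train)

# na.pass keeps every row; a row with a missing predictor gets an NA prediction
pred_pass <- predict(fit, newdata = new_data, na.action = na.pass)

# na.omit drops the incomplete row, so the result is shorter than the input
pred_omit <- predict(fit, newdata = new_data, na.action = na.omit)

# The check a caller such as augment() would need: one prediction per input row
stopifnot(length(pred_pass) == nrow(new_data))

length(pred_omit) < nrow(new_data)
```

The stopifnot() line is the kind of assertion that would catch silently dropped rows, whichever backend produces the predictions.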

General advice for predictions

Do you have any general advice for developers implementing predict() methods?

I'm gathering suggestions for R model package development at https://github.com/alexpghayes/principles (currently just rough notes), and if you'd be willing to share your thoughts on what makes a predict() method easy to work with, I'd love to write that up.

leeper · Aug 7, 2018

prediction() doesn't per se do anything. The na.action is actually handled by find_data() (in the default case) here: https://github.com/leeper/prediction/blob/master/R/find_data.R#L48-L51. I really haven't tested that much.

On the second point, I think we need standardization of model objects so that predict()'s default method just works. Lacking that, type safety seems the most critical (like the main example in the README here).
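To illustrate the type-safety point, here is a rough sketch using base R's predict.lm() and the built-in mtcars data. The wrapper safe_predict() is hypothetical: it is not the README example and not part of prediction or any other package. It only shows one way a method could always return the same shape.

```r
fit <- lm(mpg ~ wt, data = mtcars)
new_data <- data.frame(wt = c(2.5, NA, 3.5))

# The same predict() method returns different types depending on its arguments
p_vec <- predict(fit, new_data)                           # numeric vector
p_mat <- predict(fit, new_data, interval = "confidence")  # matrix
p_lst <- predict(fit, new_data, se.fit = TRUE)            # list

# Hypothetical wrapper: always a data frame, always one row per input row
safe_predict <- function(model, new_data) {
  fitted <- predict(model, newdata = new_data, na.action = na.pass)
  out <- data.frame(fitted = as.numeric(fitted))
  # Fail loudly instead of silently dropping rows
  stopifnot(nrow(out) == nrow(new_data))
  out
}

res <- safe_predict(fit, new_data)
```

Because the return value is always a data frame with nrow(new_data) rows, it can be bound column-wise to the input without checking what type came back.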

alexpghayes · Aug 7, 2018

Gotcha.

Do you have any particular pain points in terms of standardization? We're brainstorming some at topepo/parsnip#41.

leeper added the question label Aug 7, 2018

leeper added a commit that referenced this issue Aug 8, 2018

add tests for missing data handling (#28)

19ffff1

leeper/prediction
