
[REVIEW]: ReinforcementLearning: A Package to Perform Model-Free Reinforcement Learning in R #1087

Open · whedon opened this issue Nov 19, 2018 · 23 comments · 6 participants

@whedon commented Nov 19, 2018

Submitting author: @nproellochs (Nicolas Proellochs)
Repository: https://github.com/nproellochs/ReinforcementLearning
Version: v1.0.2
Editor: @arfon
Reviewer: @terrytangyuan, @kbenoit
Archive: Pending


Status badge code:

HTML: <a href="http://joss.theoj.org/papers/b63a22db024a9f3fea13caed66569912"><img src="http://joss.theoj.org/papers/b63a22db024a9f3fea13caed66569912/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/b63a22db024a9f3fea13caed66569912/status.svg)](http://joss.theoj.org/papers/b63a22db024a9f3fea13caed66569912)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@terrytangyuan & @kbenoit, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @arfon know.

Please try to complete your review within the next two weeks.

Review checklist for @terrytangyuan

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (v1.0.2)?
  • Authorship: Has the submitting author (@nproellochs) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @kbenoit

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (v1.0.2)?
  • Authorship: Has the submitting author (@nproellochs) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
@whedon commented Nov 19, 2018

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @terrytangyuan, it looks like you're currently assigned as the reviewer for this paper 🎉.

⭐️ Important ⭐️

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this repository (https://github.com/openjournals/joss-reviews). As a reviewer, you're probably currently watching this repository, which means that under GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

  2. You may also like to change your default settings for watching repositories in your GitHub profile: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands
@whedon commented Nov 19, 2018

Attempting PDF compilation. Reticulating splines etc...

@terrytangyuan commented Dec 1, 2018

@nproellochs Could you clarify the differences between your package and reinforcelearn? Also, have you done any benchmarking, or used the package in real applications? I doubt that a package implemented purely in R is suitable and performant enough for practical use cases.

@nproellochs commented Dec 1, 2018

@terrytangyuan The key differences between our package and the reinforcelearn package are as follows:

  1. Unlike the reinforcelearn package, our package performs batch reinforcement learning: it learns the optimal behavior of an agent from sample sequences of states, actions, and rewards. Batch reinforcement learning drastically reduces computation time in most settings.
  2. Our package is shipped with the built-in capability to sample experience from a function that defines the dynamics of the environment.
  3. Our package incorporates a variety of learning algorithms to improve the performance of (batch) reinforcement learning.
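The batch workflow in point 1 can be sketched as follows. This is a minimal sketch based on the package's documented interface; the column names and argument names (`s`, `a`, `r`, `s_new`) follow the package README, and the data values are purely illustrative:

```r
library(ReinforcementLearning)

# A batch of pre-collected state transitions: each row is one
# (state, action, reward, next state) tuple.
data <- data.frame(
  State     = c("s1", "s2", "s1"),
  Action    = c("up", "left", "down"),
  Reward    = c(-1, 5, 0),
  NextState = c("s2", "s1", "s1"),
  stringsAsFactors = FALSE
)

# Learn a policy offline from the fixed batch of samples --
# no interaction with a live environment is required.
model <- ReinforcementLearning(data, s = "State", a = "Action",
                               r = "Reward", s_new = "NextState")

policy(model)  # best learned action per observed state
```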

We also tested the suitability of the package in multiple application scenarios. For example, our paper "Negation scope detection in sentiment analysis: Decision support for news-driven trading" makes use of the package to learn negations based on reinforcement learning (the paper is available from https://www.sciencedirect.com/science/article/pii/S0167923616300823).

@terrytangyuan commented Dec 3, 2018

@nproellochs By "performance", you meant the statistical performance, right? What about computational performance and benchmarks? RL is mostly popular in large scale applications but your package doesn't seem to support that.

@nproellochs commented Dec 3, 2018

@terrytangyuan In typical application scenarios, batch reinforcement learning also improves computational performance, as it mitigates the ‘exploration overhead’ problem of pure online learning. In combination with experience replay, it speeds up convergence by repeatedly replaying observed state transitions to the agent as if they were new observations collected while interacting with the system. An overview of the method can be found in our paper (https://arxiv.org/pdf/1810.00240.pdf). Nonetheless, because the package is written purely in R, I agree that its applicability to very large-scale problems (such as applications in computer vision) is limited.
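The replay-and-update loop described here can be sketched with the package's documented interface. This is a hedged sketch: `sampleExperience`, `gridworldEnvironment`, and the `model` argument for updating an existing policy follow the package README, while the parameter values and sample sizes are illustrative:

```r
library(ReinforcementLearning)

# Learning parameters (illustrative values)
control <- list(alpha = 0.1, gamma = 0.5, epsilon = 0.1)

# Initial batch sampled from the built-in 2x2 gridworld environment
states  <- c("s1", "s2", "s3", "s4")
actions <- c("up", "down", "left", "right")
data <- sampleExperience(N = 1000, env = gridworldEnvironment,
                         states = states, actions = actions)
model <- ReinforcementLearning(data, s = "State", a = "Action",
                               r = "Reward", s_new = "NextState",
                               control = control)

# Experience replay: sample new transitions using the learned policy
# (epsilon-greedy), then update the existing model rather than
# learning from scratch.
data_new <- sampleExperience(N = 1000, env = gridworldEnvironment,
                             states = states, actions = actions,
                             actionSelection = "epsilon-greedy",
                             model = model, control = control)
model <- ReinforcementLearning(data_new, s = "State", a = "Action",
                               r = "Reward", s_new = "NextState",
                               control = control, model = model)
```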

@arfon commented Jan 13, 2019

👋 @terrytangyuan @kbenoit - happy new year! How are you getting along with your reviews?

@terrytangyuan commented Jan 14, 2019

@nproellochs Thanks for the clarification. I understand your point, but I don't see how useful and practical this software can be if it cannot be applied to the most popular RL scenarios (mostly due to computational performance overhead). I am uncertain how valuable this is to publish in JOSS. @arfon Where do we go from here? Any thoughts, @kbenoit?

@kbenoit commented Jan 15, 2019

Hi - Sorry for the delay. I'm currently on holidays but back to work on Jan 23 and happy to complete my review then.

In the meantime: On the performance issue, this is really a question for the package and its authors to demonstrate. If there is a vignette or demonstration in the JOSS paper showing that, despite being all R, it can be useful for some applications, then I consider this a satisfactory answer to the question. Alternatively, the case could be made that it's a demonstration or learning package for RL for R users. Only if it's impractical for nearly any application because of performance issues would it be inappropriate as an R package or for JOSS.

@arfon commented Jan 15, 2019

> @nproellochs Thanks for the clarification. I understand your point but I don't see how useful and practical this software can be if it cannot be applied to the most popular RL scenarios (mostly due to computational performance overhead). I am uncertain how valuable this is to publish in JOSS. @arfon How do we go from here? Any thoughts @kbenoit?

@terrytangyuan - this is a fair point. We have a sister journal JOSE which may be a better fit for this package if @kbenoit is of the same opinion as @terrytangyuan here.

> Hi - Sorry for the delay. I'm currently on holidays but back to work on Jan 23 and happy to complete my review then.

That would be great. Thanks!

@arfon commented Feb 4, 2019

Friendly reminder to pick up this review sometime soon, please, @kbenoit.

@kbenoit commented Feb 7, 2019

Sorry for the delay, here are my comments. All of my suggestions are below, but on the JOSE option raised above: I think it comes down to the purpose of the package and the need it addresses, which needs additional clarification before we can answer this definitively. Do I think the package should be rejected from JOSS because it cannot solve large-scale computer vision problems? No, not unless that is what it claims to solve. So for me, a key unanswered question is exactly what need it claims to meet. See more below.

Checklist issues

  • Does the release version given match the GitHub release (v1.0.2)?

Not exactly, as the GitHub repo has not tagged releases. Suggest you use this very useful facility. https://help.github.com/articles/creating-releases/. I can verify that the current master branch on the GitHub repo matches the submitted and the CRAN version.

  • Functionality: Have the functional claims of the software been confirmed?

I find this uncertain, since the vignette contains no actual execution of the code, and no interpretation or discussion of the results or the format in which they are delivered. How am I to understand the output of policy(model), which is a named character vector such as:

> head(policy(model))
.XXBB..XB XXBB.B.X. .XBB..BXX BXX...B.. ..XB..... XBXBXB... 
     "c1"      "c5"      "c5"      "c4"      "c5"      "c9" 

  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

None exist in the repo or in the README.md, but these would be easy to add.

  • Software paper

None of the three checklist items are met in the current version of the paper. See below for my comments on the statement of need and the purpose of the package.

Core issues relating to package's purpose

It's not entirely clear what the core purpose of the package is, given the performance limitations acknowledged above. Either this is a practical solution for applying reinforcement learning to problems that people need to solve, in which case there needs to be a discussion of its performance limitations and some comparison to other possible solutions - in the paper and, in my opinion, in the package vignette and/or repository README. The current statement of purpose (in the paper) suggests that the package exists to provide an R package for model-free learning, implying that its purpose is to provide a practical tool for applying this model to a broad class of problems. If so, then some discussion of performance is needed.

Or, this is a purely R-based implementation of reinforcement learning that offers a way to test or take apart this class of machine learning model for primarily educational purposes. If this is the case, then JOSE is more appropriate.

Either way, but especially in the second scenario, more complete documentation is needed, since any user is going to need to:

  • understand the required input format for supplying their own data; and
  • understand the output format and how to interpret it.

The man pages cite Sutton and Barto (1998). Why not implement one of the examples from that text using the package, as a vignette? Especially if this is primarily a learning tool, this would be really useful, but even if designed as a problem-solving tool, it will greatly help users connect the tool to a textbook that more fully explains parameter choices and provides further examples.

Some options are never explained at all, such as iter, which defaults to 1. When would higher values be useful?

Minor issues

Not every manual entry has examples, and many have very little discussion. Why not flesh this out, especially if the purpose is educational?

Admittedly, I am a bit obsessed with naming, but I find the function names unintuitive and inconsistent: either non-descriptive (e.g. policy) or inconsistently styled - compound nouns (e.g. epsilonGreedyActionSelection) versus verb-nouns (e.g. lookupLearningRule). It's also not clear which are data objects and which are functions.

The R code is nicely written and very readable - so if the purpose is educational then it will serve that well, for those looking at source code. Although: please add a space after if.

R CMD CHECK issues

On a full check, I am seeing:

  • NOTE: Non-standard files/directories found at top level: ‘JOSS’ (add to .Rbuildignore)

  • WARNING: roxygen2 requires Encoding: UTF-8 (add this to DESCRIPTION)
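For reference, the two fixes above amount to small additions to two files (a sketch; the regex excludes the top-level JOSS directory named in the NOTE):

```
# .Rbuildignore -- exclude the non-standard top-level directory
^JOSS$

# DESCRIPTION -- declare the encoding roxygen2 expects
Encoding: UTF-8
```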

@arfon commented Mar 4, 2019

👋 @nproellochs - please let me know when you've incorporated @kbenoit's feedback.

@arfon commented Apr 9, 2019

@nproellochs - could you give us an update on your progress?

@kbenoit commented Apr 11, 2019

Sorry to be slow on this, review by early next week at the latest. Been snowed under with something else but it's done now.

@nproellochs commented Apr 11, 2019

@arfon Sorry for the delay. I will give an update on this ASAP. Thank you @kbenoit & @terrytangyuan for the review and the valuable feedback.

@arfon commented May 1, 2019

> @arfon Sorry for the delay. I will give an update on this ASAP. Thank you @kbenoit & @terrytangyuan for the review and the valuable feedback.

👋 @nproellochs - friendly reminder to get back to this soon.

@terrytangyuan commented May 1, 2019

@arfon My feedback and opinion remain unchanged, for the reasons I mentioned earlier. I'd recommend that @nproellochs submit this article to another journal, e.g. JOSE as mentioned above.

@arfon commented May 1, 2019

> @arfon My feedback and opinion still remain unchanged for the reasons I mentioned earlier. I'd recommend @nproellochs submitting this article to other journals, e.g. JOSE mentioned above.

Understood. Thanks @terrytangyuan.

@danielskatz commented May 20, 2019

👋 @arfon - this review seems to have stalled...

@nproellochs commented May 20, 2019

@arfon I am planning to incorporate the feedback next weekend. Is there a deadline for the revision?

@arfon commented May 20, 2019

> @arfon I am planning to incorporate the feedback next weekend. Is there a deadline for the revision?

No, the timing is up to you.

Please make sure though in your revisions you directly address the feedback from @kbenoit. Specifically:

> Or, this is a purely R-based implementation of reinforcement learning that offers a way to test or take apart this class of machine learning model for primarily educational purposes. If this is the case, then JOSE is more appropriate.

In your response here I think you need to make a strong case for this being a JOSS submission rather than a JOSE one, especially given both reviewers have the same concern here.
