Support for R / renv as a language#1799
Conversation
| # Alternatively, we require the renv | ||
| # package to be installed already, then we can | ||
| # omit that. | ||
| Rscript -e 'dir.create(Sys.getenv("R_LIBS_USER"), recursive = TRUE)' |
There was a problem hiding this comment.
This won't work if a non-root user has never installed any R package (quite unlikely at that stage but...). Maybe catch the installation error and fail (more) informatively?
ca4a52b to
a72bde1
Compare
pre_commit/languages/renv.py
Outdated
| # expose entry to the user for maximal control | ||
| return helpers.run_xargs( | ||
| hook, hook.cmd, file_args, | ||
| color=color, cwd=hook.prefix.prefix_dir, |
There was a problem hiding this comment.
the current expectation for hook authors is that hooks will be executed from the root of the repository, so cwd=... won't be workable here
I think that necessarily changes how the environment setup is working now -- the ~normalish way of setting this up is by pointing some set of environment variables at the ENVIRONMENT_DIR (and subdirs)
note also that things must be installed entirely inside the ENVIRONMENT_DIR or they potentially won't be persisted by caching
There was a problem hiding this comment.
the current expectation for hook authors is that hooks will be executed from the root of the repository, so cwd=... won't be workable here.
of course 🤦.
the ~normalish way of setting this up is by pointing some set of environment variables at the ENVIRONMENT_DIR (and subdirs).
That's what I saw. Problem is (and that's a peculiarity of R) the most operations happen in an R session, not the terminal session. renv can only be activated properly from inside R. For that reason, my current suggestion is to place a renv::activate() call in a file, then point the R_PROFILE_USER environment variable to that file so the code will run at R startup. The idea is now to allow two versions, inline code expressions and file in entry::
Rscript -e expr # like Rscript -e "1+1"
Rscript path/to/file An inject the options --no-save, --no-restore, --no-site-file and --no-environ after Rscript (to basically have --vanilla start, but run this one renv::activate() statement) and expand the path/to/file similar to language: script since we have to run the hooks in the working directory of the repo. The files and the hook args are then passed after the above, i.e.
Rscript --no-save --no-restore --no-site-file --no-environ /abs/path/ [files] [hook args]
Rscript --no-save --no-restore --no-site-file --no-environ -e expr [files] [hook args]note also that things must be installed entirely inside the ENVIRONMENT_DIR or they potentially won't be persisted by caching.
This should be solved now. Just placing the whole env in a subdirectory now.
| @@ -0,0 +1,18 @@ | |||
| Package: gli.clu | |||
There was a problem hiding this comment.
The DESCRIPTION file is basically the equivalent of setup.py, renv.lock is the equivalent of the conda environment.yml. Not sure if one or both should be listed in LOCAL_RESOURCES as this was introduced here:
https://github.com/pre-commit/pre-commit/pull/1232/files#diff-a52ef0cf627ce783529c259e63d4ea82b77520936a2455cbf4f90ad2fb36d27dR176
There was a problem hiding this comment.
ah LOCAL_RESOURCES is to allow an additional_dependencies-only repo: local hook -- it seems like additional_dependencies is supported for R so we should probably make an empty one?
There was a problem hiding this comment.
Ok, thanks for the explanation. I added a template for renv.lock in cf88089. However, there is an R version hard-coded in it which will get outdated. As far as I can tell from a quick experiment, the docs and issues in https://github.com/rstudio/renv, there is no indication that the R version recorded in the lock file will be used at renv::restore(), which we use in renv.install_environment(). Even removing seems fine (I left it there anyways). But that might still be a bit brittle.
There was a problem hiding this comment.
Also, should we add a unit test for a local hook with additional dependencies to check if things work as expected? I don't think that's currently covered.
There was a problem hiding this comment.
yeah it would probably be good to have one unit test demoing a local hook via additional_dependencies
tests/repository_test.py
Outdated
| test_run_a_node_hook(tempdir_factory, store) | ||
|
|
||
|
|
||
| def _test_renv_parsing( |
There was a problem hiding this comment.
let me know if these are too many tests (because it slows down testing). I think we can get 100% coverage with fewer tests too (but then not cover as many combinations).
asottile
left a comment
There was a problem hiding this comment.
overall looks great, really happy you took this on and I'm excited to add this support!
pre_commit/languages/all.py
Outdated
| from pre_commit.languages import perl | ||
| from pre_commit.languages import pygrep | ||
| from pre_commit.languages import python | ||
| from pre_commit.languages import renv |
There was a problem hiding this comment.
I think this should maybe just be called r?
There was a problem hiding this comment.
Thought about that also. The thing is that language: conda can both host R and Python hooks and renv can manage python dependencies, so I thought I go for a more specific name. On the other hand, language: python uses venv (and is still called python, not venv). Since working with virtual environments is still not as common in R compared to Python and not every R user knows renv, using r as a name might also make it easier for people to find what they are looking for. So I am happy to switch to r, please let me know what you prefer.
There was a problem hiding this comment.
yeah I think r makes the most sense, conda is kinda the special case right now and I think if people google "pre-commit r" they'll be much more likely to hit this than if it were named renv
we can always add other names if new ways of installing isolated r environments happen in the future and/or shuffle them around (there used to be a python_venv language which was merged into python as virtualenv makes venv-style environments anyway nowadays)
pre_commit/languages/renv.py
Outdated
| with in_env(hook.prefix, hook.language_version): | ||
|
|
||
| return helpers.run_xargs( | ||
| hook, cmd_from_hook(hook), file_args, color=color, |
There was a problem hiding this comment.
I usually prefer to read a file from the bottom upwards -- also helper functions that live only in this module should be underscore-prefixed to make it more clear what is/isn't part of the api (the api specification is in the languages/all.py file)
There was a problem hiding this comment.
Sure, I can adapt the order. I can also prefix the helper functions not part of the API. But why is in_env and get_env_patch not prefixed in other modules although not listed in languages/all.py?
There was a problem hiding this comment.
hmmmm yeah good point, those should probably be underscore-prefixed everywhere -- probably just legacy and copy paste 😅
pre_commit/languages/renv.py
Outdated
| # TODO what if expr is x;2? not captured by shlex.split() | ||
| """ | ||
| if entry[0] != 'Rscript': | ||
| raise SyntaxError('entry must start with `Rscript`.') |
There was a problem hiding this comment.
SyntaxError is for python syntax being incorrect -- these should be ValueError probably
pre_commit/languages/renv.py
Outdated
| ' '.join([ | ||
| 'The only valid syntax is `Rscript -e {expr}`', | ||
| 'or `Rscript path/to/hook/script`', | ||
| ]), |
There was a problem hiding this comment.
this can be written as an implicitly joined string:
raise TypeError(
'first line here '
'second line here'
)
pre_commit/languages/renv.py
Outdated
| ]), | ||
| ) | ||
|
|
||
| None |
There was a problem hiding this comment.
this line does nothing and can be removed, note that python implicitly returns None for "void" functions and doesn't support implicit-return-without-return-keyword
| @@ -0,0 +1,18 @@ | |||
| Package: gli.clu | |||
There was a problem hiding this comment.
ah LOCAL_RESOURCES is to allow an additional_dependencies-only repo: local hook -- it seems like additional_dependencies is supported for R so we should probably make an empty one?
tests/repository_test.py
Outdated
| from pre_commit.languages import helpers | ||
| from pre_commit.languages import node | ||
| from pre_commit.languages import python | ||
| from pre_commit.languages import renv |
There was a problem hiding this comment.
the tests specifically for the renv helper functions should live in tests/languages/renv_test.py (or r_test.py after renaming)
tests/repository_test.py
Outdated
|
|
||
|
|
||
| def test_renv_parsing_expr_non_Rscirpt(tempdir_factory, store): | ||
|
|
There was a problem hiding this comment.
(throughout the review, nbd I can fix this myself later -- I don't put blank lines at the beginning or the ending of blocks)
tests/repository_test.py
Outdated
|
|
||
| def test_renv_parsing_expr_non_Rscirpt(tempdir_factory, store): | ||
|
|
||
| with pytest.raises(SyntaxError, match='entry must start with'): |
There was a problem hiding this comment.
I really don't like the match= arg and prefer to test the exact error message, you can follow this pattern:
with pytest.raises(TypeHere) as excinfo:
...
msg, = excinfo.value.args
assert msg == ...
tests/repository_test.py
Outdated
|
|
||
|
|
||
| @xfailif_windows # pragma: win32 no cover | ||
| @ xfailif_windows # pragma: win32 no cover |
There was a problem hiding this comment.
decorators shouldn't have spaces after them -- not sure what happened here
There was a problem hiding this comment.
Also wondered, but one of the pre-commit hook changed it, maybe autopep? I even rebased my PR master and tried to get rid of this. Can revert but then not sure if the pre-commit.ci passes.
There was a problem hiding this comment.
ah my guess is there was a syntax error earlier in the file and then autopep8 freaked -- I've been meaning to fix that in autopep8 but it's tricky (basically sometimes it will try and rewrite a file even if it has a syntax error 😱)
e07dfae to
809c79e
Compare
|
added a stub for this in the docs -- pre-commit/pre-commit.com#473 |

This PR aims to add 2nd level support for R to pre-commit via the virtual environment manager renv, motivated by isolated dependency management and (hopefully) compatibility with pre-commit.ci for 2nd level languages as outlined in lorenzwalthert/precommit#215. The implementation follows the recently contributed support for conda (#1232) for source code and tests as well as the implementation of the python virtualenv language.
Further considerations:
Initial tests only run on Linux. I have to add other platforms as we go if this PR is generally approved.I'd appreciate feedback to the questions in the source code comments, in particular on whether or not to removeAlso, let me know if you want certain comments that explain stuff to be removed. Was not sure how much.in_env()as it seems not needed or whether it should be kept for consistency with other languages.Before this PR, I was not familiar with the pre-commit code base nor a very avid python user, hence sorry if there are obvious flaws in this PR.