Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update agent methods to use the selected R interpreter consistently #2501

Merged
merged 29 commits into from
Dec 21, 2024

Conversation

sagerb
Copy link
Collaborator

@sagerb sagerb commented Dec 17, 2024

Intent

The changes within this PR are designed to implement a consistent usage of the active version of R, when running within Positron. The changes apply to VS Code as well, however, until we implement some of the other issues within the associated epic to gather configured versions, Positron will be the place which exposes the functionality the cleanest.

The associated issue outlines this logic change for determining the active R version to use:

  1. Use the passed-in path for the R interpreter if supplied. It must exist and be validated)
    • If it doesn't exist, fall through to the next option
    • If it does exist but is not a valid R interpreter (use --version output to validate), then return an error
  2. Use the first R executable found on PATH; it must exist and be validated.
    • If not valid or doesn't exist, return an error

Once the interpreter is identified, it should be used for creating/retrieving the lock file, as well as when running commands.

Resolves #2500

Type of Change

    • Bug Fix
    • New Feature
    • Breaking Change
    • Documentation
    • Refactor
    • Tooling

Approach

The existing codebase needed to be more consistent in determining which path was to be used when invoking the R interpreter. There was no clear, specific location where the logic associated with the determination algorithm was isolated. As this task (which is part of the epic hUse the active versions of interpreters instead of the first version on PATH) required changes to this logic, a refactor seemed the best approach to get consistent behavior in place and have a logical location to modify it if we needed to.

The main refactoring involved with this change involved removing the R interpreter-related functionality from the inspect package and adding it to a newly created package, interpreters. I intend to do the same type of refactoring for the other interpreters within this epic (Python and Quato), so this package will be expanded upon soon.

Once the existing code was refactored, it was updated to implement the logic listed in the issue (#2500).

In addition, the functionality of determining if there is a dependency on R was moved within the interpreters package. It had previously lived within the initialization code, but that appeared to have been implemented as a "guard" before usage of the inspect package. Once the changes were made, I moved the requiresR method into the inspect package.

With this in place, all other usages within the code base for an R interpreter were refactored to use the new interpreters package. The need for these other areas of the codebase to access this functionality was a major factor in why I split the functionality out into a separate package. It just didn't seem "right" to have the other areas depend upon the inspect package when they simply wanted an interpreter.

Many unit tests were either moved or updated to work with the new package. I've been careful to maintain the intention of the previous tests and add new ones.

User Impact

The concept we're trying to maintain here is that when a user has set up their environment within an IDE for a particular version of R, they expect that the same version will be used for all operations of the publisher extension. While we have not yet implemented all of the hooks to gather the "active" version of R, the API requests are in place so that we can set that value from the extension code.

While many of the other changes will not be visible to the user, a user will see the consistent behavior when creating a deployment for a project; the configuration file created will have an R section which properly reflects the new approach.

Automated Tests

Directions for Reviewers

Unfortunately, this PR covers more changes than we usually try to have within a single PR. Unfortunately, the impact to the unit tests greatly exploded the changes required. I'd suggest looking at the changes by first examining the main functionality which was refactored. This can be seen by viewing the changes to inspect/r.go as functionality was moved into interpreters/r.go. Changes to the other files fall into two buckets:

  1. calling this new functionality to get the active R interpreter
  2. unit test updates

I recommend using Positron for the validation process, with a configuration that includes two or more versions of R on the system. With an existing R project open, you should be able to create a deployment for a project and have its configuration file populated with an R section reflecting the values of the selected version. Switching the version via Positron and then creating a new deployment should produce a configuration that contains an R section reflecting the active version selected.

Checklist

@sagerb sagerb marked this pull request as ready for review December 19, 2024 01:38
@sagerb
Copy link
Collaborator Author

sagerb commented Dec 19, 2024

Unit tests have been added and CI has been fixed. Ready for further review and testing.

Copy link
Collaborator

@marcosnav marcosnav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still reviewing, great work so far!

I have a couple suggestions that I figured to share those first since that could require some adjustments.

Comment on lines 52 to 56
var _ RInterpreter = &defaultRInterpreter{}

var NewRInterpreter = RInterpreterFactory

func RInterpreterFactory(base util.AbsolutePath,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have these three instances? Also, having an exported factory-constructor introduces the risk of overriding it and seems unnecessary.

Suggestion, only have a single contructor NewRInterpreter...

func NewRInterpreter(base util.AbsolutePath,
  rExecutableParam util.Path, log logging.Logger) RInterpreter {

I understand that the current implementation helps on testing and setting temporary mocks, but we should avoid having global instances that can be overriden so easily.

In this case, consider dependency injection, that can be done in constructors too. For example, in the case of PostPackagesRScanHandler, it is possible to have an RInterpreter factory as dependency and property of it, instead of calling it directly, and it still can be mocked on tests.

type PostPackagesRScanHandler struct {
  base                util.AbsolutePath
  log                 logging.Logger
  interpreterFactory  func(base util.AbsolutePath, rExecutableParam util.Path, log logging.Logger) RInterpreter
}

func NewPostPackagesRScanHandler(base util.AbsolutePath, log logging.Logger) *PostPackagesRScanHandler {
	return &PostPackagesRScanHandler{
		base: base,
		log:  log,
                interpreterFactory: func(dir util.AbsolutePath, rPath util.Path, log logging.Logger) RInterpreter {
                    return interpreters.NewRInterpreter(projectDir, rPath, log)
                },
	}
}

func (h *PostPackagesRScanHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
  // ...
  rInterpreter := h.interpreterFactory(projectDir, rPath, h.log)
  // ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I tripped across the wrong pattern in Mike's code (internal/initialize/initialize.go). I didn't much care for it either, but I was trying to be consistent with the prior art. Looking through the rest of the code, it looks like this was the only place he used this technique.

I'll start refactoring this into dependency injection and getting everything to work using it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. The initialize.go file was set up using exported methods, which did not use the builder pattern. That's why Mike used the global variable approach.

//
// Errors are taken care of internally and determine the setting or non-setting
// of the attributes.
func (i *defaultRInterpreter) Init() error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this should be part of the constructor, it can still be it's own method, but by running this logic in the constructor you'll get a couple goodies:

  1. You won't have to check for i.initialized in every method.
  2. Won't need to remember to run Init() after every new instance.

E.g:

func NewRInterpreter(base util.AbsolutePath,
  rExecutableParam util.Path, log logging.Logger) (RInterpreter, error) {

  interpreter := &defaultRInterpreter{
    executor:   executor.NewExecutor(),
    // ...
  }
  err := interpreter.init()
  if err != nil {
    return nil, err
  }
  return interpreter, nil
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had this in the original approach. Something that I can't recall right now drove me in this direction, but I'm not sure it is applicable anymore. I'll take another look and see if I have any concerns.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, figured out that the reason I did this was so that we could update the structure assignment values ahead of the init function being run. This was allowing the unit testing to call the factory function, modify whatever they needed to do and then invoke init.

But, I've gone ahead and modified it to now use additional dependency injections for the functionality used internally. This caused some complications, but I do allow nil to be passed in and provide default providers for the functionality. So while there may be a few nils added to the inspect and interpreter factory methods (and perhaps a few more dependent factories), I think it is consistent.

Copy link
Collaborator

@marcosnav marcosnav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finished reviewing, no more observations aside from what was already mentioned.

This is great work and I really like the approach you took on this one! 🏅

Copy link
Collaborator

@marcosnav marcosnav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! There seem to be something to tweak for windows tests, feel free to merge once that passes.

Validated with Positron.

Started with R 4.3, configuration reflected the selected interpreter

Screen Shot 2024-12-20 at 23 53 40

Changed to 4.4, new deployment configuration reflected the change

Screen Shot 2024-12-20 at 23 55 54

@sagerb
Copy link
Collaborator Author

sagerb commented Dec 21, 2024

The last commit places an await when calling the click action, where we were seeing the race condition and lack of selection for the deployment.

The runs within https://github.com/posit-dev/publisher/actions/runs/12440697322 show a 24 out of 25 success rate of running the PR's codebase through CI. The one failure does not look like the same kind of failure we were diagnosing, and at initial examination, it looks more like a failure within the test environment itself.

I'm merging this PR with confidence.

@sagerb sagerb merged commit 69048f2 into main Dec 21, 2024
14 checks passed
@sagerb sagerb deleted the sagerb-extract-r-interpreter-with-new-active-as-priority branch December 21, 2024 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement rExecutable discovery to prefer the active version of R within the IDE rather than path
2 participants