Global option to use Internet explorer proxy settings #417

Closed
juancentro opened this issue Dec 24, 2016 · 18 comments
Labels
documentation feature a feature request or enhancement

Comments

@juancentro

In corporate environments there is usually a proxy, and it is integrated with Windows security (usually NTLM using Active Directory credentials).
The curl package knows how to create handles that use the Internet Explorer proxy settings. It would be very useful to have a global option that applies those settings every time a handle is created. Right now you can only specify the user and password globally. Besides not working every time (especially when a PAC file is involved), this means you have to change the credentials every time the Windows password changes (usually once a month).

@hadley
Member

hadley commented Jul 24, 2017

Do you have any sample code?

@hadley hadley added the feature a feature request or enhancement label Jul 24, 2017
@juancentro
Author

Not right now, but it uses ie_proxy_info and ie_get_proxy_for_url from said package. It should probably behave like base R does now on Windows, where it uses the old internet2 (IE internet settings) by default.

@hadley
Member

hadley commented Jul 25, 2017

I have no idea what that code should look like, and I don't have a Windows machine that requires a proxy to test it on. So unless someone can provide a start at the code, it's unlikely I'll be able to do anything.

@cderv
Contributor

cderv commented Jul 25, 2017

I have a Windows machine at work that requires a proxy. I even have a PAC file. I could test something but don't know where to start.

Proxy settings have always been painful. Currently, the solution in my company is to set the proxy environment variables on the system so that curl and httr find the config.

I was not able to use the curl IE helper functions for anything beyond getting the proxy URL.

I will try something when I have time. I still have another issue here to deal with first.

@juancentro
Author

juancentro commented Jul 25, 2017 via email

@hadley
Member

hadley commented Jul 27, 2017

The best way to help would be to provide me with a snippet of code that works for you, connecting to an external website using your company's proxy server.

@juancentro
Author

I tried today. I also use a PAC file, and ie_proxy_info returns the PAC file address (among other things). But curl (the program, not the package) doesn't support PAC files because they are JavaScript (at least that is what the website says). But Jeroen says that ie_get_proxy_for_url downloads the PAC file and executes it, so maybe that is something he added. For me, the latter function works and returns the address of the proxy needed for a URL.
The thing is, my proxy uses authentication (integrated Windows), so if I don't specify my credentials it fails.
Maybe try without credentials and (optionally) show a pop-up if that fails? And how does R use the IE settings through the internet2 DLL? I don't need to specify anything there, it just works (e.g. for download.file)!

@cderv
Contributor

cderv commented Aug 1, 2017

R uses the IE settings when you use the download method wininet, which is now the default on Windows. download.file has a method argument for changing this behaviour. httr does not use wininet; it is based on curl, so it is equivalent to the libcurl method that is the default on non-Windows machines.
You can find some explanation in the Details section of the ?download.file help page.
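
A minimal sketch of choosing the method explicitly, as described above. `pick_download_method` is a hypothetical helper, not part of R or httr. As far as I know, newer R releases (4.2.0 onwards) default to "libcurl" on Windows and deprecate "wininet" for http/https URLs, so check ?download.file for your R version before relying on this.

```r
# Hypothetical helper: choose the download.file() method by platform.
# "wininet" (Windows only) follows the IE proxy settings;
# "libcurl" reads proxy environment variables instead.
pick_download_method <- function(use_ie_settings = TRUE) {
  on_windows <- identical(.Platform$OS.type, "windows")
  if (use_ie_settings && on_windows) "wininet" else "libcurl"
}

method <- pick_download_method()

# Not run here because it needs network access:
# download.file("https://cran.r-project.org/Rlogo.svg", "Rlogo.svg",
#               method = method, mode = "wb")
```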

I think ie_proxy_info and ie_get_proxy_for_url are there to help find the proxy URL but do not help with authentication. I think it is possible to build a function that asks for the username and password with a pop-up, but that would not work in non-interactive use.

I will continue looking for a solution. In my company, I already have to deal with AD and Kerberos for authentication. Not the easiest.

@cderv
Contributor

cderv commented Aug 1, 2017

I began to work on something, digging into curl options to see what libcurl can do.

First, remember that libcurl respects the environment variables http_proxy, ftp_proxy, sftp_proxy, no_proxy, etc., so if you configure your system with them it will always work.
Example:
http_proxy=http://<user>:<password>@<url>:<port> will be found by curl and used. no_proxy is read too, to know when not to use the proxy.
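
For a single R session, the same variables can also be set from R before the first request. This is a minimal sketch: the proxy host, port and no_proxy entries are hypothetical placeholders, not values from this thread.

```r
# Hypothetical values: replace host, port and exclusions with your own.
Sys.setenv(
  http_proxy  = "http://proxy.example.com:8080",
  https_proxy = "http://proxy.example.com:8080",
  no_proxy    = "localhost,127.0.0.1,.example.internal"
)

# Check what this R session (and so curl / httr) will see.
proxy_env <- Sys.getenv(c("http_proxy", "https_proxy", "no_proxy"))
print(proxy_env)
```

Putting the same name=value pairs in ~/.Renviron should make them available in every session without any code.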

Now, if you do not want to do that, what are the options? httr works with curl, which works with the libcurl C library. So we must configure curl.

In httr, you can set the proxy for curl with use_proxy. Since you can get the proxy URL with curl::ie_get_proxy_for_url, we can build the proxy configuration for a request with something like this:

config_proxy <- use_proxy(url = curl::ie_get_proxy_for_url(url), username = <user>, password = <pwd>)
# apply the config this way
GET(url, config_proxy) # or with set_config or with_config

By default, use_proxy uses the basic authentication method, which is why you have to give a username and password. curl works with other authentication methods such as ntlm or gssnegotiate. Sometimes I need them for some internal websites. Maybe you need them for your proxy configuration - for me it works as shown, with auth = "basic". If you need an example, tell me.

Why is this not perfect?
The drawback is that the username and password have to be written in clear text in the code.
I think we can find some solutions to this:

  • environment variables,
  • packages like secret or keyring,
  • asking for the password.

One other issue is what to do for an internal website that does not need a proxy. We don't want to configure a proxy in that case.
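
For the environment-variable option in the list above, a minimal base-R sketch could look like this. `proxy_credentials`, `PROXY_USER` and `PROXY_PWD` are hypothetical names chosen for illustration; nothing in httr or curl looks for them.

```r
# Read proxy credentials from environment variables instead of hard-coding them.
# The variable names are assumptions made for this sketch.
proxy_credentials <- function(user_var = "PROXY_USER", pwd_var = "PROXY_PWD") {
  user <- Sys.getenv(user_var, unset = NA)
  pwd  <- Sys.getenv(pwd_var, unset = NA)
  if (is.na(user) || is.na(pwd)) {
    stop("Set ", user_var, " and ", pwd_var, " (for example in ~/.Renviron)")
  }
  list(user = user, pwd = pwd)
}

# Demo with throwaway values set for this session only.
Sys.setenv(PROXY_USER = "jdoe", PROXY_PWD = "not-a-real-password")
creds <- proxy_credentials()
names(creds)
```

The result can then be passed on as use_proxy(..., username = creds$user, password = creds$pwd), which keeps the credentials out of the script itself.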

For dealing with all this, I came up quickly with this function, aiming to do a wrapper for httr::config:

get_config_proxy <- function(url, user = NULL, pwd = NULL, verbose = FALSE, auth = "basic") {
  # curl::ie_get_proxy_for_url needs a scheme; without one it doesn't work
  if (is.null(httr::parse_url(url)$scheme)) {
    if (verbose) message("No scheme provided, assuming HTTP")
    url <- httr::modify_url(url, scheme = "http")
  }
  # get the proxy url if needed
  proxy_url <- curl::ie_get_proxy_for_url(url)
  # return a blank config if no proxy is required
  if (is.null(proxy_url)) {
    if (verbose) message("No proxy")
    return(httr::config())
  }
  if (verbose) message("Proxy found")
  # otherwise configure the proxy; user and password are needed
  if (is.null(user) || is.null(pwd)) {
    if (!interactive()) {
      stop("user and pwd must be provided for authentication")
    } else {
      user <- rstudioapi::askForPassword("Username for proxy")
      pwd <- rstudioapi::askForPassword("Password for proxy")
    }
  }
  # return (visibly) a config object for use with httr
  httr::use_proxy(url = proxy_url, username = user, password = pwd, auth = auth)
}

This function executes the PAC file to find out whether a proxy is required and, if so, its URL and port. If no proxy is required, it returns an empty config. If one is required, it returns the right config to use with httr.
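
The scheme check at the top of the function can also be done in base R, which avoids depending on how httr::parse_url treats a bare host name. A sketch only; `ensure_scheme` is a hypothetical helper name, not part of httr or curl.

```r
# Prepend a default scheme when the URL has none.
ensure_scheme <- function(url, default = "http") {
  # A scheme is a letter followed by letters, digits, "+", "." or "-", then "://"
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*://", url)
  ifelse(has_scheme, url, paste0(default, "://", url))
}

ensure_scheme("www.google.fr")               # "http://www.google.fr"
ensure_scheme("https://cran.r-project.org")  # left unchanged
```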

Example :

url <- "http://www.google.fr"
# No proxy configured so we get a timeout from curl fetch
# req <- GET(url) 
# we configure the proxy giving interactively user and password
config_proxy <- get_config_proxy(url)
# we make a GET using the config
req <- with_config(config_proxy, GET(url))
status_code(req) # we get 200. it works! 

There is still a drawback: print(config_proxy) will print the password in clear text in the console. However, there is no reason to print it, so maybe this is not an issue.

Initially, I wanted to make a dedicated package with some functions to deal with authentication in my company, using an RStudio addin or something similar. It may be smart to add this functionality to httr.

@juancentro, can you try these small pieces of code to see which auth method your company is using?
@hadley, if you want, I can help work on a PR to deal with all that, with some examples or a vignette to explain how proxy configuration works with httr.
I think there are several ways to include something like this in httr: keeping the proxy configuration external, like I did, or trying to find a more integrated way. Using keyring, secret or something like httr's token could be a smarter way to deal with the password and username.

I came up with this to open the discussion on a solution. Waiting for your thoughts.

@juancentro
Author

juancentro commented Aug 1, 2017 via email

@cderv
Contributor

cderv commented Aug 1, 2017

To use NTLM auth, you just have to pass auth="ntlm" to the function.
You could then try with user="" and pwd="", or with user=<user> and pwd="". This kind of auth, if well configured, can find your credentials automatically from your OS config. Maybe you could find an IT guy to help you. If someone finds a solution with the curl command line and its options, we can translate it to R.

I think you have to play around with the different ways of using NTLM auth with username and password in curl, if that is what you have. I can't test it for you because I don't have that setup. My proxy works with basic auth. Can you try?

I am not sure it is possible to make curl use the IE settings. It is wininet that can do that, and httr does not work with it. I will keep searching the web about curl and IE settings, but there does not seem to be anything useful.

Note that you can use download.file() and url() with "wininet", so you don't always have to use httr, even if it is super useful and easy. Sometimes not using it can make things easier! :) (Maybe I should not say that...)

@mungojam

mungojam commented Nov 17, 2017

This worked well for me, hope it can be made more of a standard request though:

    config_proxy <- use_proxy(
                      url = curl::ie_get_proxy_for_url(apiUrl), 
                      auth = "ntlm", 
                      username = ""
                    )
    
    GET(apiUrl, config_proxy)

@cderv
Contributor

cderv commented Nov 17, 2017

This does not work for me, as my proxy does not seem to use NTLM credentials. I am currently still using environment variables, but I am working on a package to simplify their use.
However, nice to know it worked for you. Curious whether it will work for @juancentro... we will see!

@mungojam

mungojam commented Sep 13, 2018

@cderv you could try "gssnegotiate" instead. "ntlm" doesn't work for us anymore, presumably due to some proxy change. "gssnegotiate" does:

config_proxy <- use_proxy(
                      url = curl::ie_get_proxy_for_url(apiUrl), 
                      auth = "gssnegotiate", 
                      username = ""
                    )
    
GET(apiUrl, config_proxy)

@hadley
Member

hadley commented Nov 22, 2018

I think this would make a great blog post, and then if that helps many people, it could be turned into a httr vignette. Does that sound like a reasonable place to start?

@juancentro
Author

juancentro commented Nov 22, 2018 via email

@ugzvot

ugzvot commented Aug 28, 2019

Digging up an old issue, but I found a working solution for my corporate environment (man-in-the-middle proxy using Windows logins (SSPI) and the NTLM authentication protocol).

For downloads in RStudio / R (download.file(), install.packages()), setting the environment variables HTTP_PROXY and HTTPS_PROXY to the proxy URL is enough (in ~/.Renviron in my case).

For httr I found the following solution:

library(httr)
GET("https://cran.r-project.org/Rlogo.svg")
#> Error in curl::curl_fetch_memory(url, handle = handle): Received HTTP code 407 from proxy after CONNECT

proxy_config <- config(
  proxy = curl::ie_get_proxy_for_url(),
  proxyauth = 8,
  proxyuserpwd = ":",
  ssl_verifypeer = 0
)

GET("https://cran.r-project.org/Rlogo.svg", config = proxy_config)
#> Response [https://cran.r-project.org/Rlogo.svg]
#>   Date: 2019-08-28 12:44
#>   Status: 200
#>   Content-Type: image/svg+xml
#>   Size: 2.69 kB
#> <BINARY BODY>

Created on 2019-08-28 by the reprex package (v0.3.0)
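
For readers wondering about proxyauth = 8 in the config above: it is libcurl's CURLAUTH_NTLM bitmask value. The sketch below builds the same options from a readable name. `curl_auth` and `sspi_proxy_options` are hypothetical names, and the proxy URL is a placeholder. ssl_verifypeer = 0 is left out on purpose, since it switches off certificate verification and is probably only needed for an intercepting proxy like the one described.

```r
# libcurl CURLAUTH_* bitmask values (from curl.h)
curl_auth <- c(basic = 1L, digest = 2L, gssnegotiate = 4L, ntlm = 8L, digest_ie = 16L)

# Hypothetical helper: options for a proxy that authenticates the Windows login.
sspi_proxy_options <- function(proxy_url, auth = "ntlm") {
  stopifnot(auth %in% names(curl_auth))
  list(
    proxy        = proxy_url,
    proxyauth    = curl_auth[[auth]],
    proxyuserpwd = ":"  # empty user and password, as in the config above
  )
}

opts <- sspi_proxy_options("http://proxy.example.com:8080")
str(opts)

# With httr installed, the list can be turned into a config object:
# proxy_config <- do.call(httr::config, opts)
```

Passing auth = "gssnegotiate" gives proxyauth = 4, which matches the method that worked in the earlier comment.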

I had the same challenge getting renv working with curl: rstudio/renv#146 (contains some additional information about proxies)

@hadley
Member

hadley commented Oct 31, 2023

httr has been superseded in favour of httr2, so is no longer under active development. If this problem is still important to you in httr2, I'd suggest filing an issue over there 😄, but I think this is no longer so important due to changes in the way that Windows manages proxies. Thanks for using httr!

@hadley hadley closed this as completed Oct 31, 2023