
async connection API does not properly handle timeouts with multiple hosts #618

Open
ajmaidak opened this issue Dec 20, 2024 · 2 comments · May be fixed by #619

@ajmaidak

If 10.0.0.1 does not respond via TCP, the following connection will time out even if 10.0.0.2 is online and functioning:

require 'pg'
conn = PG.connect(
  host: '10.0.0.1,10.0.0.2',
  port: 5432,
  dbname: 'postgres',
  user: 'user',
  password: 'password',
  connect_timeout: 5,
  target_session_attrs: 'read-write',
)

result = conn.exec('SELECT version()')
puts result[0]['version']

The issue is that libpq's async API ignores the "connect_timeout" connection parameter when using PQconnectStart/PQconnectPoll. From the libpq documentation:

The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed.

You can see discussion of this on the PostgreSQL hackers mailing list, specifically around this same issue with psycopg3.

Browsing the code a bit, I believe that when the connect_timeout expires here, rather than failing, there needs to be a call to PQreset or PQresetStart/PQresetPoll to force libpq to reset the connection and try the next host?
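For illustration, here is a minimal sketch of what the application-side timeout handling described in the documentation quoted above looks like with pg's nonblocking primitives (PG::Connection.connect_start / connect_poll). The host and the 5-second budget are placeholder assumptions, and this is not pg's actual implementation:

require 'pg'

# Sketch: drive PQconnectStart/PQconnectPoll ourselves and enforce our own
# deadline, since libpq does not honor connect_timeout in this mode.
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 5

conn = PG::Connection.connect_start(host: '10.0.0.2', dbname: 'postgres')
status = PG::PGRES_POLLING_WRITING  # per the libpq docs, start by waiting for writability

while status != PG::PGRES_POLLING_OK && status != PG::PGRES_POLLING_FAILED
  remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
  raise PG::ConnectionBad, "timeout expired" if remaining <= 0

  # Re-fetch the socket each iteration: libpq may open a new socket when it
  # moves on to another host.
  io = conn.socket_io
  case status
  when PG::PGRES_POLLING_READING then IO.select([io], nil, nil, remaining)
  when PG::PGRES_POLLING_WRITING then IO.select(nil, [io], nil, remaining)
  end

  status = conn.connect_poll
end

raise PG::ConnectionBad, conn.error_message if status == PG::PGRES_POLLING_FAILED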

@larskanis
Collaborator

pg-1.5.x versions should do proper connection timeout handling. The details are described in this commit: 40b2ad5

In your case it should wait 5 seconds per connection and abort after 10 seconds if no connection can be established.

Can you describe how it behaves in your setup? Can you do a bit of debugging in the code path of pg that you referenced, to investigate how it behaves exactly?

@ajmaidak
Author

In your case it should wait 5 seconds per connection and abort after 10 seconds if no connection can be established.

It's not. Here's an example:

Run a Postgres Docker container:

docker run -d \
    -p 5432:5432 \
    --name some-postgres \
    -e POSTGRES_PASSWORD=password \
    -e POSTGRES_HOST_AUTH_METHOD=scram-sha-256 \
    postgres

Then run this test program:

require 'pg'

puts "pg version is: #{PG::VERSION}"

begin
  conn = PG.connect(
    host: ENV["PGHOST"],
    port: 5432,
    dbname: 'postgres',
    user: 'postgres',
    password: 'password',
    connect_timeout: 5,
    target_session_attrs: 'read-write',
  )

  result = conn.exec('SELECT version()')
  puts result[0]['version']

ensure
  conn&.close
end

Set PGHOST (10.0.0.1 is just some address to which a connection will time out):

$ export PGHOST="10.0.0.1,127.0.0.1"
$ time ruby test.rb 
pg version is: 1.5.9
/Users/amaidak/.gem/ruby/3.2.4/gems/pg-1.5.9/lib/pg/connection.rb:699:in `async_connect_or_reset': connection to server at "10.0.0.1" (10.0.0.1), port 5432 failed: timeout expired (PG::ConnectionBad)
        from /Users/amaidak/.gem/ruby/3.2.4/gems/pg-1.5.9/lib/pg/connection.rb:844:in `connect_to_hosts'
        from /Users/amaidak/.gem/ruby/3.2.4/gems/pg-1.5.9/lib/pg/connection.rb:772:in `new'
        from /Users/amaidak/.gem/ruby/3.2.4/gems/pg-1.5.9/lib/pg.rb:63:in `connect'
        from test.rb:6:in `<main>'

real    0m5.082s
user    0m0.046s
sys     0m0.026s

The expected behavior is that the connection attempt to 10.0.0.1 times out after 5 seconds and the client then successfully connects to 127.0.0.1. Instead, as you can see, the whole connection attempt simply fails after 5 seconds.

Here is a PR with what I think the correct behavior should be: #619
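In the meantime, a workaround at the application level is possible: split the host list and try each host with its own connect_timeout. A minimal sketch, where connect_with_fallback is a hypothetical helper (not part of pg) that trades libpq's built-in multi-host handling for an explicit loop:

require 'pg'

# Hypothetical helper: try each host in turn, each with its own
# connect_timeout, instead of handing libpq the comma-separated list.
def connect_with_fallback(hosts, **opts)
  last_error = nil
  hosts.split(',').each do |host|
    begin
      return PG.connect(host: host, **opts)
    rescue PG::ConnectionBad => e
      last_error = e  # remember the failure and try the next host
    end
  end
  raise last_error
end

conn = connect_with_fallback('10.0.0.1,127.0.0.1',
                             port: 5432, dbname: 'postgres',
                             user: 'postgres', password: 'password',
                             connect_timeout: 5,
                             target_session_attrs: 'read-write')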

ajmaidak linked a pull request Dec 23, 2024 that will close this issue