Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Known_hosts file loses content when adding new host #1209

Open
pikeas opened this issue Sep 19, 2024 · 9 comments
Open

Known_hosts file loses content when adding new host #1209

pikeas opened this issue Sep 19, 2024 · 9 comments

Comments

@pikeas
Copy link

pikeas commented Sep 19, 2024

Describe the bug

Pyinfra deletes lines from ~/.ssh/known_hosts.

To Reproduce

https://github.com/pyinfra-dev/pyinfra/blob/3.x/pyinfra/connectors/sshuserclient/client.py#L43-L49:

with HOST_KEYS_LOCK:
    host_keys = client.get_host_keys()
    host_keys.add(hostname, key.get_name(), key)
    # The paramiko client saves host keys incorrectly whereas the host keys object does
    # this correctly, so use that with the client filename variable.
    # See: https://github.com/paramiko/paramiko/pull/1989
    host_keys.save(client._host_keys_filename)

This happens because of the way Paramiko parses the file - it throws away comment lines, newlines, and lines it doesn't understand.

https://github.com/paramiko/paramiko/blob/main/paramiko/hostkeys.py#L89-L97:

with open(filename, "r") as f:
    for lineno, line in enumerate(f, 1):
        line = line.strip()
        if (len(line) == 0) or (line[0] == "#"):
            continue
        try:
            entry = HostKeyEntry.from_line(line, lineno)
        except SSHException:
            continue

So, when Pyinfra calls hosts_keys.save(...), this doesn't just add a new entry, it actually overwrites the file with the contents as parsed by Paramiko. This is unexpected - I've been fighting disappearing contents in known_hosts for years and had no idea it was caused by PyInfra.

Expected behavior

Pyinfra should either never modify known_hosts, or only add new hosts.

I understand the intent here, to make running PyInfra as convenient as possible. But IMO this behavior is a foot-gun for infrastructure maintainers and should be changed. It's frustrating to lose formatting, painful to lose comments, and breaks general SSH usage when @cert-authority (not yet support by Paramiko) silently disappears.

evoldstad added a commit to evoldstad/pyinfra that referenced this issue Oct 1, 2024
Instead of relying on paramiko to correctly parse the full
hostfile, this fix makes it so that we only append new keys to
the hostfile rather than overwriting it completely.

The old behaviour results in unwanted alterations to the hostfile,
such as hostfile entries not parseable by paramiko disappearing
without users consent, and comments being removed.

I also split out the logic for appending hostkeys into a new function
instead of having the code be unnecessarily repeated in multiple places.
This makes the MissingHostKeyPolicy children a bit more readable IMO.
@evoldstad
Copy link
Contributor

Hi,

I created a possible fix for this, making it so that pyinfra will only append to the known_hosts file. Do you have the possibility of testing it with a known_hosts file that is known to break? Alternatively providing an example that demonstrates the behavior. I would like to add a test for this to ensure we don't repeat the behavior, as I agree that this has a very negative impact for certain users.

@pikeas
Copy link
Author

pikeas commented Oct 2, 2024

Here's a reproduction:

$ mv ~/.ssh/known_hosts ~/.ssh/know_hosts.bak
$ cd (mktemp -d)
$ rye init # or other project setup, such as pip
$ rye add pyinfra

$ cat <<EOF > ~/.ssh/known_hosts
# this is an important comment

# another comment after the newline

@cert-authority *.my-other-domain.com <key type> <key>
EOF

$ rye run pyinfra server.example.com exec -- echo "hello world"
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    No host key for server.example.com found in known_hosts, accepting & adding to host keys file
    [server.example.com] Connected

<work happens>

--> Disconnecting from hosts...

$ cat ~/.ssh/known_hosts
server.example.com <key type> <key>

The known_hosts file is overwritten, which silently removes the comments, newlines, and cert-authority lines.

@pikeas
Copy link
Author

pikeas commented Oct 2, 2024

3.x...evoldstad:pyinfra:known_hosts_fix#diff-e12fd39e569809c21098f098715232f682ab3fd99ed42a24b266c75f8c5552acR65-R76

In the fix under AskPolicy, the host addition should likely be moved inside the "y" confirmation.

@evoldstad
Copy link
Contributor

(Sorry for the late response, got more busy than expected)

Do you mean this part?

        if should_continue.lower() != "y":
            raise SSHException(
                "AskPolicy: No host key for {0} found in known_hosts".format(hostname),
            )

As it will raise an exception if the answer is not "y" I think it is correct (unless I'm missing something?)

(I'm currently finishing writing the test-case so that future commits won't accidentally reintroduce the bug, thanks for your feedback!)

@pikeas
Copy link
Author

pikeas commented Oct 8, 2024

(Sorry for the late response, got more busy than expected)

No apology needed, your hard work as an OSS maintainer is appreciated!

As it will raise an exception if the answer is not "y" I think it is correct (unless I'm missing something?)

You're right, this appears to be correct. Thanks again!

evoldstad pushed a commit to evoldstad/pyinfra that referenced this issue Dec 2, 2024
…a-dev#1209)

This fixes issue pyinfra-dev#1209, making it so that we append new keys to the
users known_hosts file instead of overwriting it.

Additionally:

    Added a testcase that should discover this breaking in the future.
    Broke out the append functionality into a "append_hostkey" function,
making it so we don't needlessly reuse code for AskPolicy and
AcceptNewPolicy.
    Linting actually correct this time.

Previous behaviour when adding a new key:

    Create a paramiko.HostKeys object
    Read the users known_hosts file
    Add the new key to the object
    Save the object, overwriting the users host_keys file.

New behaviour:

    Create a paramiko.HostKeyEntry object using the new hostname and
corresponding key.
    Append this key to the existing known_hosts file.
@evoldstad
Copy link
Contributor

After some (dumb) mistakes with linting and commits, I finally have a working pull request submitted.
It seems that there is an issue with the spell-check linter though, as it thinks a random part of the example ssh-key I generated is misspelled... Hopefully someone can make an exception or show how to work around this, because I'm not sure I can do much about it except generate a new random key and hope that one doesn't get flagged by the spell-checker?

evoldstad pushed a commit to evoldstad/pyinfra that referenced this issue Dec 5, 2024
…a-dev#1209)

This fixes issue pyinfra-dev#1209, making it so that we append new keys to the
users known_hosts file instead of overwriting it.

Additionally:

    Added a testcase that should discover this breaking in the future.
    Broke out the append functionality into a "append_hostkey" function,
making it so we don't needlessly reuse code for AskPolicy and
AcceptNewPolicy.
    Linting actually correct this time.

Previous behaviour when adding a new key:

    Create a paramiko.HostKeys object
    Read the users known_hosts file
    Add the new key to the object
    Save the object, overwriting the users host_keys file.

New behaviour:

    Create a paramiko.HostKeyEntry object using the new hostname and
corresponding key.
    Append this key to the existing known_hosts file.
evoldstad pushed a commit to evoldstad/pyinfra that referenced this issue Dec 5, 2024
…a-dev#1209)

This fixes issue pyinfra-dev#1209, making it so that we append new keys to the
users known_hosts file instead of overwriting it.

Additionally:

    Added a testcase that should discover this breaking in the future.
    Broke out the append functionality into a "append_hostkey" function,
making it so we don't needlessly reuse code for AskPolicy and
AcceptNewPolicy.
    Linting actually correct this time.

Previous behaviour when adding a new key:

    Create a paramiko.HostKeys object
    Read the users known_hosts file
    Add the new key to the object
    Save the object, overwriting the users host_keys file.

New behaviour:

    Create a paramiko.HostKeyEntry object using the new hostname and
corresponding key.
    Append this key to the existing known_hosts file.
evoldstad pushed a commit to evoldstad/pyinfra that referenced this issue Dec 8, 2024
…a-dev#1209)

This fixes issue pyinfra-dev#1209, making it so that we append new keys to the
users known_hosts file instead of overwriting it.

Additionally:

- Added a testcase that should discover this breaking in the future.

- Broke out the append functionality into a "append_hostkey" function,
    making it so we don't needlessly reuse code for AskPolicy and

- AcceptNewPolicy.

- Linting actually correct this time.

This commit changes the behaviour when adding a new key to known_hosts:

Previous behaviour when adding a new key:

- Create a paramiko.HostKeys object

- Read the users known_hosts file

- Add the new key to the object

- Save the object, overwriting the users host_keys file.

New behaviour:

- Create a paramiko.HostKeyEntry object using the new hostname and
corresponding key.

- Append this key to the existing known_hosts file.
@evoldstad
Copy link
Contributor

Issues with pull-request was finished yesterday and the fix is now waiting to be merged :)
Request: #1252

Fizzadar pushed a commit that referenced this issue Dec 11, 2024
This fixes issue #1209, making it so that we append new keys to the
users known_hosts file instead of overwriting it.

Additionally:

- Added a testcase that should discover this breaking in the future.

- Broke out the append functionality into a "append_hostkey" function,
    making it so we don't needlessly reuse code for AskPolicy and

- AcceptNewPolicy.

- Linting actually correct this time.

This commit changes the behaviour when adding a new key to known_hosts:

Previous behaviour when adding a new key:

- Create a paramiko.HostKeys object

- Read the users known_hosts file

- Add the new key to the object

- Save the object, overwriting the users host_keys file.

New behaviour:

- Create a paramiko.HostKeyEntry object using the new hostname and
corresponding key.

- Append this key to the existing known_hosts file.
@evoldstad
Copy link
Contributor

@pikeas , the request was successfully merged and this issue should now be fixed :)

@pikeas
Copy link
Author

pikeas commented Dec 16, 2024

Thanks for fixing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants