Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Signatures with Associated Data #4313

Closed

Conversation

reneme
Copy link
Collaborator

@reneme reneme commented Aug 16, 2024

Pull Request Dependency

Description

This is a new concept introduced by FIPS 204 and 205 (ML-DSA, SLH-DSA) where applications get the chance to provide some "context" that is incorporated into their signatures. Technically, it is just hashed into the XOF before the actual message. Note that this is not the specified pre-hash variant we had discussed before.

Exposing this to the user is fairly straight forward, and we find that this is conceptually similar to the associated data in an AEAD. Therefore, we propose to introduce set_associated_data() methods in PK_Verifier and PK_Signer that behave similarly to the same method in AEAD_Mode. Most notably, the AD is kept when signing multiple messages with a single PK_Signer/PK_Verifier. We envision this to be used like that:

const std::vector<uint8_t> signature = /* some sig */
const std::vector<uint8_t> message = /* some message */
const std::vector<uint8_t> ad = /* some application-defined associated data */
std::unique_ptr<Public_Key> pk = /* some pubkey that supports AD - e.g. ML-DSA */

PK_Verifier v(pk, "" /* no padding, meh! */);
v.set_associated_data(ad); // must happen before any call to update()
const bool ok = v.verify_message(message, signature);
const bool also_ok = v.verify_message(some_other_message, some_other_signature); // <- uses the same AD

Signature schemes that don't support this, should throw immediately and there's an additional predicate method is_valid_associated_data_length() to generically check for support.

reneme and others added 2 commits August 16, 2024 11:51
This is a new concept introduced by FIPS 204 and 205 (ML-DSA, SLH-DSA)
where applications get the chance to provide some context that is
incorporated into their signatures. It is conceptually similar to the
associated data in an AEAD, therefore it behaves similarly in the
Signer/Verifier.

Note that algorithms that don't support AD, are supposed to always throw
when an application calls set_associated_data() on them. There is also a
predicate function is_valid_associated_data_length() for applications to
generically check for the support of it.

Co-Authored-By: Fabian Albert <[email protected]>
@reneme reneme added the enhancement Enhancement or new feature label Aug 16, 2024
@reneme reneme added this to the Botan 3.6.0 milestone Aug 16, 2024
@reneme reneme self-assigned this Aug 16, 2024
@reneme reneme changed the title Feature/signature with label Support Signatures with Associated Data Aug 16, 2024
@reneme reneme requested a review from randombit August 16, 2024 10:13
@coveralls
Copy link

Coverage Status

coverage: 91.25% (-0.02%) from 91.27%
when pulling a3d56a4 on Rohde-Schwarz:feature/signature_with_label
into c2491c7 on randombit:master.

* @returns true if the associated data length is valid for this signature scheme.
* Schemes that don't support associated data always return false.
*/
virtual bool is_valid_associated_data_length(size_t length) const {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For schemes that support associated data, is there any (practical) limit? It seems more sensible to me to have

bool supports_associated_data() const;

With the assumption that the signature scheme, if it supports AD, supports effectively arbitrary lengths. Plus maybe a throw Invalid_Argument to handle the unlikely cases where say the scheme supports any inputs up to 4G and you provided 5 GB.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PQ algorithms it is (arbitrarily) limited to 255bytes (because they defined a fixed length-tag of one byte for it). See FIPS 204 Section 5.2 and Algorithm 2.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to characterize this as a context rather than an associated data. That could also be applied for Ed25519ctx/Ed448ctx

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For usability it might be better to bind the context to the PK_Signer and PK_Verifier itself, via a constructor arg, and not allow changing at runtime. I think that would (IIUC) match better with the expected usage, where some protocol Foo uses "Foo-v1" context consistently for all signatures it generates. It also avoids PK_Signer having a uninspectable, mutable state.

Quick proposal:

      // If nullopt is provided, all good
      // If a value is provided, then context must be supported
      PK_Signer(const Private_Key& key,
                RandomNumberGenerator& rng,
                std::string_view padding,
                std::optional<std::span<const uint8_t>> context);

     // Returns whatever context was provided during construction, if any
     std::optional<std::span<const uint8_t>> context();

Assuming the context is a string is probably the common case, and more ergonomic, but maybe someone will want to use a hash or whatever.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the context is a string is probably the common case, and more ergonomic, but maybe someone will want to use a hash or whatever.

I'm really not a big fan of defining these contexts and labels as strings in general. Take the TCG's TPM2 Spec (Annex B.4 "OAEP"):

For RSA keys protecting a secret value [...], the L parameter is a null-terminated string indicating the intended use of the encrypted value. [...] The RSA key’s Name algorithm is used to compute lhash := H(L), and the null termination octet is included in the digest

... our OAEP implementation insists on this value "P" being a std::string_view. And worse, if I wanted to use PK_Encryptor_EME, I'd currently have to marshal this value into an algorithm descriptor string, like: OAEP(SHA-256,MGF1,SOMEVALUE\0). It works, because we're using C++ rather than C. But, as a user, it leaves me quite uneasy.

Instead, I'd always go for a byte-oriented container and maybe provide a free-standing helper like std::span<const uint8_t> from_string(std::string_view).

Copy link
Collaborator Author

@reneme reneme Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For usability it might be better to bind the context to the PK_Signer and PK_Verifier itself, via a constructor arg, and not allow changing at runtime.

That works, too. And, I agree that it probably makes conceptual sense to fix the context at construction of PK_Signer/PK_Verifier. For now, I think, whatever works best.

As rationale for our suggestion: we found that the constructor's parameters are already fairly cumbersome. With padding sometimes actually being a general purpose stringified param (see Dilithium/SPHINCS+, or even worse: SM2, where this even packs two comma-separated parameters 😨). Also the optional params format (being useful only for a few schemes, and providing a setter in PK_Signer) and provider (not really doing anything anymore).

All that to say: Perhaps for Botan 4 we should rethink this API (along with the other pubkey.h utilities: see also #4076 (comment)) altogether. 🤔

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no need to wait to Botan4. But I have a lot of concerns about Any_Map or any similar proposal. There have been a lot of things that were done as "string->string maps" and each time I've regretted it.

I am working on another approach that handles all of this in (IMO) a much cleaner way. WIP coming soon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just wanted the link for future reference. I do agree (and hope) that there must be a better way to solve that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reneme POC in #4318

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not a big fan of defining these contexts and labels as strings in general.

Regarding this I do understand your point. In #4318 I provided interfaces to set the context as either a bytestring or a string as a convenience, we then convert to a bytestring for internal processing.

@randombit
Copy link
Owner

My feeling is (while there are a lot of details still to be worked out) #4318 is the way to proceed towards supporting this, concur?

@reneme
Copy link
Collaborator Author

reneme commented Aug 21, 2024

My feeling is (while there are a lot of details still to be worked out) #4318 is the way to proceed towards supporting this, concur?

Absolutely! Hence:

image

@reneme
Copy link
Collaborator Author

reneme commented Aug 21, 2024

Mhh, thinking more about it: This does make the entire patch obsolete, as it just introduces API that will be replaced by the builder in #4318. Probably that was your point, no?

Closing.

@reneme reneme closed this Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants