gh-128150: improve performances of `uuid.uuid*` constructor functions. #128151

picnixz · 2024-12-21T10:12:01Z

There are some points that can be addressed:

We can drop some micro-optimizations to reduce the diff. Most of the time is taken by function calls and loading integers.
HACL* MD5 is faster than OpenSSL MD5, so it's better to use the former. However, using _md5.md5 or from _md5 import md5 is a micro-optimization that can be dropped without affecting performance too much.
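For context on where MD5 enters the picture, here is a minimal sketch of how a version 3 UUID is derived from an MD5 digest. It is an illustration using the public hashlib API, not the code in this PR; the name uuid3_sketch is hypothetical.

```python
import hashlib
import uuid


def uuid3_sketch(namespace, name):
    """Illustrative re-derivation of uuid.uuid3 (hypothetical helper)."""
    if isinstance(name, str):
        name = name.encode("utf-8")
    # Hash the namespace bytes followed by the name; MD5 yields 16 bytes.
    digest = hashlib.md5(namespace.bytes + name, usedforsecurity=False).digest()
    value = int.from_bytes(digest, "big")
    # Set the variant bits to the RFC 4122 variant.
    value &= ~(0xC000 << 48)
    value |= 0x8000 << 48
    # Set the version nibble to 3.
    value &= ~(0xF000 << 64)
    value |= 3 << 76
    return uuid.UUID(int=value)


# The sketch should agree with the standard library on the same input.
assert uuid3_sketch(uuid.NAMESPACE_DNS, "python.org") == uuid.uuid3(
    uuid.NAMESPACE_DNS, "python.org"
)
```

Since the hash call is a fixed cost per UUID, which MD5 backend is used matters more for short names than the import style does.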

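The rationale for expanding not 0 <= x < 1 << 128 into x < 0 or x > 0xffff_ffff_ffff_ffff_ffff_ffff_ffff_ffff is that the two forms compile to non-equivalent bytecode: the chained comparison needs extra stack shuffling and computes the shift, while the expanded form loads a single constant.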
Similar arguments apply to expanding not 0 <= x < (1 << C) into x < 0 or x > B where B is the hardcoded hexadecimal value of (1 << C) - 1.

Bytecode comparisons

   1           LOAD_SMALL_INT           0
               LOAD_NAME                0 (x)
               SWAP                     2
               COPY                     2
               COMPARE_OP              42 (<=)
               COPY                     1
               TO_BOOL
               POP_JUMP_IF_FALSE        9 (to L1)
               NOT_TAKEN
               POP_TOP
               LOAD_SMALL_INT           1
               LOAD_SMALL_INT         128
               BINARY_OP                3 (<<)
               COMPARE_OP               2 (<)
               JUMP_FORWARD             2 (to L2)
       L1:     SWAP                     2
               POP_TOP
       L2:     TO_BOOL
               UNARY_NOT
               POP_TOP
               LOAD_CONST               0 (None)
               RETURN_VALUE

versus

   1           LOAD_NAME                0 (x)
               LOAD_SMALL_INT           0
               COMPARE_OP               2 (<)
               COPY                     1
               TO_BOOL
               POP_JUMP_IF_TRUE         8 (to L1)
               POP_TOP
               LOAD_NAME                0 (x)
               LOAD_CONST               0 (340282366920938463463374607431768211455)
               COMPARE_OP             132 (>)
               POP_TOP
               LOAD_CONST               1 (None)
               RETURN_VALUE
       L1:     POP_TOP
               LOAD_CONST               1 (None)
               RETURN_VALUE
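The comparison above can be reproduced with a small sketch like the following. The function names are illustrative, and because the expressions live inside functions the output will show LOAD_FAST rather than LOAD_NAME; exact opcodes also vary between Python versions.

```python
import dis
import timeit


def out_of_range_chained(x):
    # Chained comparison, as originally written.
    return not 0 <= x < 1 << 128


def out_of_range_expanded(x):
    # Expanded form with the hardcoded value of (1 << 128) - 1.
    return x < 0 or x > 0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF


# Both forms must agree, including at the boundaries.
for value in (-1, 0, 1, (1 << 128) - 1, 1 << 128):
    assert out_of_range_chained(value) == out_of_range_expanded(value)

# Show the bytecode of each form.
for func in (out_of_range_chained, out_of_range_expanded):
    print(func.__name__)
    dis.dis(func)

# Rough timing; absolute numbers depend on the machine and build.
for func in (out_of_range_chained, out_of_range_expanded):
    elapsed = timeit.timeit(lambda: func(123), number=100_000)
    print(f"{func.__name__}: {elapsed:.4f} s for 100,000 calls")
```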

Issue: Improve performances of uuid.* functions #128150

📚 Documentation preview 📚: https://cpython-previews--128151.org.readthedocs.build/

eendebakpt · 2024-12-21T13:16:55Z

The changes themselves look good at first glance. On the other hand: if performance is really important, there are dedicated packages to calculate uuids (binding to rust or C) that are much faster.

One more idea to improve performance: add a dedicated constructor that skips the checks. For example add to UUID:

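    @classmethod
    def _from_int(cls, int, is_safe=SafeUUID.unknown):
        # Bypass __init__ (and its argument checks) entirely.
        v = cls.__new__(cls)
        # UUID instances are immutable, so set attributes via object.__setattr__.
        object.__setattr__(v, 'int', int)
        object.__setattr__(v, 'is_safe', is_safe)
        return v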

Results in

%timeit UUID._from_int(123)
%timeit UUID(int=123, version=None)
451 ns ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
767 ns ± 41.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

(UUID._from_int could then be used inside uuid4, for example.)
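To try the idea without patching the standard library, the same technique can be sketched as a standalone function. The name fast_uuid_from_int is hypothetical, and the caller is assumed to pass an integer that is already a valid 128-bit value, since no checks are performed.

```python
import timeit
import uuid


def fast_uuid_from_int(value, is_safe=uuid.SafeUUID.unknown):
    """Hypothetical helper: build a UUID while skipping __init__ checks."""
    obj = uuid.UUID.__new__(uuid.UUID)
    # UUID overrides __setattr__ to stay immutable, so go through object.
    object.__setattr__(obj, "int", value)
    object.__setattr__(obj, "is_safe", is_safe)
    return obj


# The result compares equal to a normally constructed UUID.
assert fast_uuid_from_int(123) == uuid.UUID(int=123)

# Rough timing of both construction paths; numbers vary by machine.
fast = timeit.timeit(lambda: fast_uuid_from_int(123), number=100_000)
slow = timeit.timeit(lambda: uuid.UUID(int=123), number=100_000)
print(f"fast path: {fast:.4f} s, regular constructor: {slow:.4f} s")
```

Because validation is skipped, this kind of constructor only makes sense for internal callers such as uuid4 that already produce well-formed integers.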

picnixz · 2024-12-21T13:53:31Z

On the other hand: if performance is really important, there are dedicated packages to calculate uuids (binding to rust or C) that are much faster.

I also thought about expanding the C interface for the module but it would have been too complex as a first iteration. As for third-party packages, I do know about them but there might be slight differences in which methods they use for the UUID (and this could be a blocker for existing code, namely switching to another implementation).

One more idea to improve performance: add a dedicated constructor that skips the checks

I also had this idea but haven't tested it as a first iteration. I wanted to get some feedback (I feel that performance gains are fine but OTOH, the code is a bit uglier =/)

picnixz · 2024-12-21T14:37:41Z

Ok, the benchmarks are not always very stable but I do see improvements with the dedicated constructor. I need to go now but I'll try to see which version is the best and the most stable.

picnixz · 2024-12-22T10:16:58Z

So, we're now stable and consistent:

+----------------------------------------+---------+-----------------------+-----------------------+
| Benchmark                              | ref     | new                   | opt                   |
+========================================+=========+=======================+=======================+
| uuid3(NAMESPACE_DNS, os.urandom(16))   | 1.13 us | 767 ns: 1.47x faster  | 767 ns: 1.47x faster  |
+----------------------------------------+---------+-----------------------+-----------------------+
| uuid3(NAMESPACE_DNS, os.urandom(1024)) | 2.05 us | 1.82 us: 1.13x faster | 1.78 us: 1.15x faster |
+----------------------------------------+---------+-----------------------+-----------------------+
| uuid4()                                | 1.15 us | 867 ns: 1.33x faster  | 860 ns: 1.34x faster  |
+----------------------------------------+---------+-----------------------+-----------------------+
| uuid5(NAMESPACE_DNS, os.urandom(16))   | 1.10 us | 810 ns: 1.35x faster  | 778 ns: 1.41x faster  |
+----------------------------------------+---------+-----------------------+-----------------------+
| uuid5(NAMESPACE_DNS, os.urandom(1024)) | 1.52 us | 1.22 us: 1.24x faster | 1.19 us: 1.27x faster |
+----------------------------------------+---------+-----------------------+-----------------------+
| uuid8()                                | 926 ns  | 673 ns: 1.38x faster  | 671 ns: 1.38x faster  |
+----------------------------------------+---------+-----------------------+-----------------------+
| Geometric mean                         | (ref)   | 1.21x faster          | 1.22x faster          |
+----------------------------------------+---------+-----------------------+-----------------------+

Benchmark hidden because not significant (3): uuid1(), uuid1(node, None), uuid1(None, clock_seq)

Strictly speaking, the uuid1() benchmarks can be considered significant but only if you consider a 4% improvement as significant, which I did not. I only kept improvements over 10%. The last column is the same as the second one (PGO, no LTO) but using python -OO (namely assertions are removed).
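For readers who want a rough local comparison, the sketch below times a few constructors with timeit and shows how the "N.NNx faster" ratios in the table are computed (reference time divided by new time). It is not the script that produced the table above; the helper names and the fixed string name are assumptions made for illustration.

```python
import timeit
import uuid


def best_ns_per_call(func, number=20_000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number * 1e9


def speedup(ref_ns, new_ns):
    """Ratio shown in the table as 'N.NNx faster'."""
    return ref_ns / new_ns


# Example from the table: 1.13 us versus 767 ns.
print(f"{speedup(1130, 767):.2f}x faster")

# Time a few constructors on the current interpreter.
cases = {
    "uuid3(NAMESPACE_DNS, 'python.org')": lambda: uuid.uuid3(uuid.NAMESPACE_DNS, "python.org"),
    "uuid4()": uuid.uuid4,
    "uuid5(NAMESPACE_DNS, 'python.org')": lambda: uuid.uuid5(uuid.NAMESPACE_DNS, "python.org"),
}
for label, func in cases.items():
    print(f"{label}: {best_ns_per_call(func):.0f} ns")
```

Running the same script under a reference build and a patched build gives the two timings needed for each ratio.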

picnixz added 2 commits December 19, 2024 21:43

improve performance of UUIDs creation

0d49ccb

add What's New entry

603335f

bedevere-app bot mentioned this pull request Dec 21, 2024

Improve performances of uuid.* functions #128150

Open

bedevere-app bot added the awaiting review label Dec 21, 2024

picnixz added 5 commits December 21, 2024 11:13

blurb

154ff8b

fix issue number

b965887

fix typos

a8a1894

ensure 14-bit clock sequence

c8aa752

Merge branch 'main' into perf/uuid/init-128150

8c9d5cf

picnixz added 2 commits December 21, 2024 15:18

add dedicated private fast constructor

a2278b8

revert UUIDv1 construction

0710549

picnixz force-pushed the perf/uuid/init-128150 branch from 4f2744a to 0710549 Compare December 21, 2024 14:25

change eager check into an assertion check for internal constructor

5b6922f

update performance results

e631593
