Optimise `load_policy_line` to avoid quadratic individual-character loop #355

huonw · 2024-08-27T00:32:15Z

This dramatically improves the performance of the `casbin.persist.adapter.load_policy_line` function:

- before: iterate over characters individually and build each token string by `+=`'ing those characters. The latter in particular leads to quadratic performance, based on the length of the string (e.g. https://stackoverflow.com/a/34008199/1256624), which is noticeable with "long" lines, such as those that contain UUIDs.
- after: use a regex to find the start/end index of each token and slice out each token string in one go.
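To illustrate the difference between the two approaches, here is a minimal sketch. It is not the code from this PR: the names `extract_tokens_slow`, `extract_tokens_fast` and `_DELIMITERS` are hypothetical, and it assumes tokens are separated by commas that sit outside any `[...]` or `(...)` nesting.

```python
import re

# Hypothetical pattern: the only characters that affect tokenisation are
# the comma separator and the bracket/parenthesis delimiters.
_DELIMITERS = re.compile(r"[,\[\]()]")


def extract_tokens_slow(line):
    """Character-by-character version: each += may copy the token built so far."""
    tokens = [""]
    depth = 0
    for c in line:
        if c in "[(":
            depth += 1
            tokens[-1] += c
        elif c in "])":
            depth -= 1
            tokens[-1] += c
        elif c == "," and depth == 0:
            # A top-level comma starts a new token.
            tokens.append("")
        else:
            tokens[-1] += c
    return [t.strip() for t in tokens]


def extract_tokens_fast(line):
    """Regex version: visit only the delimiters, then slice each token once."""
    tokens = []
    depth = 0
    start = 0
    for match in _DELIMITERS.finditer(line):
        c = match.group()
        if c in "[(":
            depth += 1
        elif c in "])":
            depth -= 1
        elif depth == 0:
            # c is a top-level comma: slice out the token that ends here.
            tokens.append(line[start:match.start()].strip())
            start = match.end()
    # Whatever follows the last top-level comma is the final token.
    tokens.append(line[start:].strip())
    return tokens


line = "p, alice, [data1, data2], (read, write)"
print(extract_tokens_fast(line))
```

Both functions return the same tokens for the sample line. The fast version only does Python-level work per delimiter rather than per character, and copies each token's characters once when slicing.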

This function was appearing very high in our profiles of initialising our adapters. We have two UUIDs appear in most of our lines.

To make this a safe refactoring, I've pulled out an internal function that does the string manipulation in isolation and tested it.

I've also added a benchmark that demonstrates the performance improvements. Here are the median times that pytest-benchmark reports before and after this change, in microseconds:

| Benchmark | Before (µs) | After (µs) | Change |
| --- | --- | --- | --- |
| `test_benchmark_extract_tokens_long_nested` | 27.0 | 2.46 | -91% |
| `test_benchmark_extract_tokens_long_simple` | 13.6 | 1.42 | -90% |
| `test_benchmark_extract_tokens_short_nested` | 3.25 | 1.63 | -50% |
| `test_benchmark_extract_tokens_short_simple` | 1.58 | 0.917 | -42% |

That is, the "long" tests get 10× faster. Even longer lines will likely see a greater speed-up too.
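As a quick check of the arithmetic, the "Change" figures and the roughly 10× speed-up can be recomputed from the medians quoted above. This is only a sketch; the dictionary and variable names are illustrative.

```python
# Median times in microseconds, copied from the summary table: (before, after).
medians = {
    "long_nested": (27.0, 2.46),
    "long_simple": (13.6, 1.42),
    "short_nested": (3.25, 1.63),
    "short_simple": (1.58, 0.917),
}

changes = {}
speedups = {}
for name, (before, after) in medians.items():
    # Relative change in percent; negative means the new code is faster.
    changes[name] = round((after - before) / before * 100)
    # Speed-up factor: how many times faster the new code is.
    speedups[name] = before / after
    print(f"{name}: {changes[name]}% ({speedups[name]:.1f}x faster)")
```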

Thanks for casbin!

benchmark: 4 tests (time in us)

| Name | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS (Kops/s) | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_benchmark_extract_tokens_short_simple | 1.4580 (1.0) | 58.3340 (1.0) | 1.5731 (1.0) | 0.5280 (1.0) | 1.5420 (1.0) | 0.0411 (1.0) | 53;455 | 635.6819 (1.0) | 22305 | 1 |
| test_benchmark_extract_tokens_short_nested | 3.0829 (2.11) | 13,417.2500 (230.01) | 3.8216 (2.43) | 49.5247 (93.80) | 3.2500 (2.11) | 0.2091 (5.09) | 60;3604 | 261.6736 (0.41) | 116509 | 1 |
| test_benchmark_extract_tokens_long_simple | 13.5830 (9.32) | 95.4170 (1.64) | 14.0610 (8.94) | 1.0361 (1.96) | 13.7920 (8.94) | 0.2080 (5.06) | 3089;10664 | 71.1189 (0.11) | 57555 | 1 |
| test_benchmark_extract_tokens_long_nested | 26.5830 (18.23) | 71.4579 (1.22) | 27.2121 (17.30) | 0.8780 (1.66) | 27.1250 (17.59) | 0.0841 (2.05) | 566;2007 | 36.7484 (0.06) | 30457 | 1 |

benchmark: 4 tests (time in ns)

| Name | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS (Kops/s) | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_benchmark_extract_tokens_short_simple | 999.8912 (1.0) | 18,417.0203 (1.0) | 1,123.3050 (1.0) | 207.9197 (1.0) | 1,124.9213 (1.0) | 42.0259 (1.0) | 97;670 | 890.2302 (1.0) | 26667 | 1 |
| test_benchmark_extract_tokens_long_simple | 1,415.9596 (1.42) | 45,624.9109 (2.48) | 1,562.9986 (1.39) | 223.0803 (1.07) | 1,542.0374 (1.37) | 42.0259 (1.0) | 620;6094 | 639.7958 (0.72) | 196733 | 1 |
| test_benchmark_extract_tokens_short_nested | 1,749.9551 (1.75) | 31,417.0029 (1.71) | 1,940.4944 (1.73) | 295.1896 (1.42) | 1,917.0111 (1.70) | 42.0259 (1.0) | 348;4660 | 515.3326 (0.58) | 115943 | 1 |
| test_benchmark_extract_tokens_long_nested | 2,583.9545 (2.58) | 33,583.0264 (1.82) | 2,777.7344 (2.47) | 224.8759 (1.08) | 2,750.0791 (2.44) | 42.0259 (1.0) | 374;5402 | 360.0056 (0.40) | 117082 | 1 |

benchmark: 4 tests (time in ns)

| Name | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS (Kops/s) | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_benchmark_extract_tokens_short_simple | 957.9817 (1.0) | 20,832.9875 (1.0) | 1,106.7959 (1.0) | 167.6173 (1.0) | 1,084.0595 (1.0) | 42.0259 (1.0) | 100;508 | 903.5089 (1.0) | 29412 | 1 |
| test_benchmark_extract_tokens_long_simple | 1,374.9814 (1.44) | 54,500.0657 (2.62) | 1,526.7156 (1.38) | 224.1555 (1.34) | 1,500.0114 (1.38) | 42.0259 (1.0) | 511;7264 | 655.0008 (0.72) | 187513 | 1 |
| test_benchmark_extract_tokens_short_nested | 1,707.9292 (1.78) | 119,416.9745 (5.73) | 1,897.0402 (1.71) | 592.5052 (3.53) | 1,874.9852 (1.73) | 83.9354 (2.00) | 458;6029 | 527.1370 (0.58) | 166666 | 1 |
| test_benchmark_extract_tokens_long_nested | 2,500.0190 (2.61) | 197,916.0588 (9.50) | 2,773.7646 (2.51) | 1,057.4522 (6.31) | 2,707.9368 (2.50) | 166.9396 (3.97) | 509;2386 | 360.5209 (0.40) | 116511 | 1 |

benchmark: 4 tests (time in ns)

| Name | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS (Kops/s) | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_benchmark_extract_tokens_short_simple | 917.0035 (1.0) | 30,542.0253 (1.0) | 1,116.1783 (1.0) | 703.1331 (3.24) | 1,083.0117 (1.0) | 41.9095 (1.0) | 228;817 | 895.9142 (1.0) | 35346 | 1 |
| test_benchmark_extract_tokens_long_simple | 1,332.9554 (1.45) | 46,208.0352 (1.51) | 1,483.3709 (1.33) | 216.9831 (1.0) | 1,459.0332 (1.35) | 42.0259 (1.00) | 820;5565 | 674.1402 (0.75) | 183217 | 1 |
| test_benchmark_extract_tokens_short_nested | 1,624.9251 (1.77) | 185,417.0114 (6.07) | 1,924.2939 (1.72) | 1,860.9511 (8.58) | 1,792.0975 (1.65) | 167.0560 (3.99) | 722;1138 | 519.6711 (0.58) | 89214 | 1 |
| test_benchmark_extract_tokens_long_nested | 2,417.0149 (2.64) | 16,447,208.9382 (538.51) | 3,852.3560 (3.45) | 100,246.2894 (462.00) | 2,707.9368 (2.50) | 208.0342 (4.96) | 20;836 | 259.5814 (0.29) | 51283 | 1 |

benchmark: 4 tests (time in ns)

| Name | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS (Kops/s) | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_benchmark_extract_tokens_short_simple | 790.9257 (1.0) | 25,041.9835 (1.00) | 965.2633 (1.0) | 690.2278 (5.36) | 917.0035 (1.0) | 42.0259 (1.02) | 261;825 | 1,035.9868 (1.0) | 55814 | 1 |
| test_benchmark_extract_tokens_long_simple | 1,207.9254 (1.53) | 568,790.9434 (22.75) | 1,424.0680 (1.48) | 1,607.7592 (12.49) | 1,415.9596 (1.54) | 83.0041 (2.02) | 544;6540 | 702.2137 (0.68) | 179118 | 1 |
| test_benchmark_extract_tokens_short_nested | 1,499.8950 (1.90) | 24,999.9575 (1.0) | 1,632.3808 (1.69) | 136.4377 (1.06) | 1,625.0415 (1.77) | 41.0946 (1.0) | 1290;4671 | 612.6022 (0.59) | 146349 | 1 |
| test_benchmark_extract_tokens_long_nested | 2,291.0535 (2.90) | 28,708.0184 (1.15) | 2,454.3364 (2.54) | 128.6998 (1.0) | 2,457.9931 (2.68) | 42.0259 (1.02) | 1624;8542 | 407.4421 (0.39) | 146349 | 1 |

casbin-bot · 2024-08-27T00:32:19Z

@Nekotoxin please review

huonw added 6 commits August 27, 2024 09:35

Pull out new _extract_tokens for easier testing

5e47dcb

casbin-bot requested a review from Nekotoxin August 27, 2024 00:32
