From 08e0f144ae4341f7187b07829de829d9a38251c8 Mon Sep 17 00:00:00 2001
From: Dhruv Arya
Date: Thu, 12 Dec 2024 11:20:50 +0530
Subject: [PATCH] update upper bound of last bucket

---
 PROTOCOL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PROTOCOL.md b/PROTOCOL.md
index 5b990b7973..a1702d77b2 100644
--- a/PROTOCOL.md
+++ b/PROTOCOL.md
@@ -387,7 +387,7 @@ The histogram bins correspond to the following ranges:
 - Bin 6: [100000, 999999] (files with 100,000-999,999 deleted records)
 - Bin 7: [1000000, 9999999] (files with 1,000,000-9,999,999 deleted records)
 - Bin 8: [10000000, 2147483646] (files with 10,000,000 to 2147483646 (i.e. Int.MaxValue-1 in Java) deleted records)
-- Bin 9: [2147483647, 9,223,372,036,854,775,807] (files with 2147483647 or more deleted records)
+- Bin 9: [2147483647, ∞) (files with 2147483647 or more deleted records)

 This histogram allows analyzing the distribution of deleted records across files in a Delta table, which can be useful for monitoring and optimizing deletion patterns.
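
Reviewer note: to sanity-check the bin boundaries touched by this change, here is a minimal Python sketch that maps a per-file deleted-record count to its histogram bin. It is an illustration, not the Delta implementation. The names `BIN_LOWER_BOUNDS` and `deleted_record_bin` are hypothetical, and the lower bounds for bins 0 through 5 are an assumption extrapolated from the power-of-ten pattern of bins 6 through 8, since those bins are not visible in this hunk. The last bin is treated as open-ended, matching the `[2147483647, ∞)` wording.

```python
import bisect

INT_MAX = 2**31 - 1  # 2147483647, the lower bound of bin 9

# Inclusive lower bound of each bin, indexed by bin number.
# Bins 0-5 are ASSUMED to follow the decade pattern ([0, 0], [1, 9], [10, 99], ...);
# only bins 6-9 appear in the hunk above.
BIN_LOWER_BOUNDS = [
    0, 1, 10, 100, 1_000, 10_000,   # bins 0-5 (assumed)
    100_000, 1_000_000, 10_000_000,  # bins 6-8 (from the hunk)
    INT_MAX,                         # bin 9: [2147483647, infinity)
]


def deleted_record_bin(deleted_count: int) -> int:
    """Return the index of the bin whose range contains deleted_count."""
    if deleted_count < 0:
        raise ValueError("deleted_count must be non-negative")
    # bisect_right counts how many lower bounds are <= deleted_count;
    # subtracting 1 gives the index of the last such bound.
    return bisect.bisect_right(BIN_LOWER_BOUNDS, deleted_count) - 1
```

Because the last bin has no upper bound, any count at or above 2147483647 lands in bin 9, including values past the previous 64-bit limit, which is the behavior the updated wording describes.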