Update Mlucas.c, mi64.c for correct modular division (small scalars) (#…

…23) * Update Mlucas.c, mi64.c Small fix, for PRP bases other than 3, to ensure the modular division by the square of the base at the end of a PRP test is not subject to a rounding error (bases 7, 27, 28, and 29 are susceptible among the possible PRP bases less than 32). * Update mi64.c Make a tiny addition to `fquo` in the second limb of the modular division, in the case where a rounding error from the division results in the quotient lying between 0.999...999 (14 nines) and 1.0. * Update Mlucas.c Tweaking the previous assertion for bases other than 3 to apply only to Suyama tests. * Update config-fermat.sh A few tweaks to the comments * Update mi64.c Added warning to stderr if round-off error close to an integer encountered. * Create Fermat-testing.md Extensive documentation on testing Fermat numbers with Mlucas (this pulls together a number of documented and undocumented information about Mlucas from various sources). * Update docs/Fermat-testing.md Link within document as per suggestion Co-authored-by: Teal Dulcet <[email protected]> * Update Fermat-testing.md A few tweaks as suggested by Teal. * Update README.md One small amendment mprime actually does support Pépin tests (though it is handled somewhat awkwardly as it does not generate a certificate proof of the computation; but at least you do get a Res64). Added a link to docs/Fermat-testing.md * Update Fermat-testing.md Explicitly replacing K with kilobytes (FFT length) * Update Mlucas.c Update to Gerbicz’s nsquares integrity check, which assumes the PRP base is always 3. * Update Mlucas.c Fix an ASSERT statement I mangled in the previous commit (that I mistakenly thought behaved like printf) * Update Mlucas.c Successfully tested F13 with the 2K FFT, 8 8 16 radset * Update config-fermat.sh Eliminated 1K FFT, as I am now convinced 2K is the smallest possible workable FFT, which can support F13, F14, and F15. However, 2K requires fiddly tweaking of the Mlucas -f command that makes it a “one-off”. * Update Fermat-testing.md Updated to add 2K FFT * Update config-fermat.sh if statement for F15 / 2K FFT * Update Mlucas.c Consistency with assert around line 4100 * Update Fermat-testing.md Error in 2K FFT documentation * Update Fermat-testing.md Tweak to last paragraph * Update Fermat-testing.md Another 2K FFT tweak * Simplified config-fermat.sh script. * Update Mlucas.c Substitute printing PRP_base in ASSERT with actual value by using cbuf. * Update README.md [skip ci] Tweak to help.txt info * Update Mlucas.c The if condition needed to be inverted, we want to possibly invoke an ASSERT if we do not have equality. * Update Mlucas.c [skip ci] Adding Teal's suggestions. * Update Mlucas.c (SH residue fix) Length of the residue in the arrtmp array is j, not i, so the SH 2^35-1 and 2^36-1 residues are not being calculated from the full residue. * Update Mlucas.c (Suyama (A) bug for Mersenne PRP-CF tests) We refresh the 2^35-1 and 2^36-1 residues for Mersenne exponents (these values were not passed in by the Suyama_CF_PRP call). * Update Mlucas.c (modifying ASSERT) Assert condition covered by if loop. * Update Mlucas.c (JSON timestamp field) [skip ci] UTC not needed in JSON string; see https://www.mersenneforum.org/showthread.php?p=653672#post653672 --------- Co-authored-by: Teal Dulcet <[email protected]> Co-authored-by: Teal Dulcet <[email protected]>
primesearch · Jun 5, 2024 · eec1757 · eec1757
1 parent f760217
commit eec1757
Show file tree

Hide file tree

Showing 5 changed files with 224 additions and 23 deletions.
diff --git a/README.md b/README.md
@@ -19,7 +19,7 @@ Feature | | Mlucas | Prime95/MPrime
 \- | P-1 | ✔️ | ✔️
 \- | P+1 | | ✔️
 \- | ECM | | ✔️
-\- | Pépin | ✔️ | 
+\- | Pépin | ✔️ | ✔️
 **PRP** | Proofs | | ✔️
 \- | Certs | | ✔️
 **Error Checking** | Jacobi | | ✔️
@@ -93,7 +93,7 @@ This README is still in progress. For now, see the original [Mlucas README](http
 
 ## Help
 
-See the [help.txt](help.txt) file for usage information.
+The [help.txt](help.txt) file includes a variety of usage information not covered in the original [README](https://mersenneforum.org/mayer/README.html), concentrating largely on the Mlucas command line options. A separate documentation page covers [Fermat numbers](docs/Fermat-testing.md).
 
 ## Contributing
 

diff --git a/config-fermat.sh b/config-fermat.sh
@@ -27,26 +27,24 @@
 # Mlucas
 MLUCAS=./Mlucas
 
-# Number of iterations
-# use 100, 1000, or 10000 to match pre-computed values
+# Number of iterations (use 100, 1000, or 10000 to match pre-computed values)
 ITERS=100
 
-# Minimum Fermat number (14 or greater)
-MIN=14
+# Minimum Fermat number (15 or greater)
+MIN=15
 
 # Maximum Fermat number (33 or less)
 MAX=29
 
 # Mlucas arguments
 ARGS=(
 	"$@"
-	# Add desired -cpu or -core settings here, or as following arguments, e.g. ../config-fermat.sh -cpu 0:3
-
+	# Add desired -cpu or -core settings here, or as following arguments, e.g. bash ../config-fermat.sh -cpu 0:3
 )
 
-# First, tiny FFT lengths for F14 to F17;
-FFTS=([1]=14 [2]=15 [4]=16 [7]=17 [8]=17)
-# Then, from small up to egregious FFTs for F18 to F33.
+# First, tiny FFT lengths for F15 to F17 (note 4K is the smallest workable length without fiddly radix settings);
+FFTS=([2]=15 [4]=16 [7]=17 [8]=17)
+# Then, from small up to egregiously large FFTs for F18 to F33.
 # The largest FFT reached is 512M, if MAX is set to 33.
 # Note that large FFTs require considerable runtime at 10000 iterations.
 for ((n = 0; n < 16; ++n)); do
@@ -60,12 +58,11 @@ for ((n = 0; n < 16; ++n)); do
 		# k = 15, 16 should both be supported up to at least F32.
 		FFTS[k * m]=$f
 		if [[ $k -eq 15 && $n -gt 5 ]]; then
-			# k = 63 is supported for F24 and above (1008K).
+			# k = 63 is mostly supported for F24 (1008K) and above.
 			FFTS[63 * m >> 2]=$f
 		fi
 	done
 done
-
 for fft in "${!FFTS[@]}"; do
 	f=${FFTS[fft]}
 	if [[ -n $MIN && $f -lt $MIN ]]; then
@@ -75,6 +72,10 @@ for fft in "${!FFTS[@]}"; do
 	fi
 	printf '\n\tTesting F%s (2^%s + 1),\tFFT length: %sK\n\n' "$f" $((1 << f)) "$fft"
 	args=("${ARGS[@]}")
+	# First we test the very fiddly F15 and then loop over F16 up to maximum
+	if [[ $f -eq 15 ]]; then
+		args+=(-radset 8,8,16)
+	fi
 	if [[ $f -le 17 || $f -ge 32 ]]; then
 		args+=(-shift 0)
 	fi

diff --git a/docs/Fermat-testing.md b/docs/Fermat-testing.md
@@ -0,0 +1,189 @@
+# Introduction to Fermat number testing with Mlucas
+
+Mlucas is a powerful Mersenne primality testing and factoring package which supports Lucas–Lehmer 
+testing, Fermat probable primality testing, and Pollard _p_–1 factorisation; justly, most applications of the 
+software have concentrated on examination of the Mersenne numbers, however Mlucas is also capable of 
+testing the primality of Fermat numbers, testing the compositeness of Fermat cofactors, and running _p_–1
+factorisation on Fermat numbers. This document is aimed at filling in several lacunae in the Mlucas 
+documentation with respect to Fermat number testing.
+
+## Mlucas build notes
+
+The Fermat modular squaring code is architecture-specific, ideally requiring Mlucas to be built with one of 
+the SIMD modes (AVX, AVX2, SSE2, etc). This should be handled automatically by the `makemake.sh` script 
+for most hardware, and with the exception of 64-bit ARM (for which workarounds may be available) the default
+build should be capable of testing Fermat numbers.
+
+For example, Apple Silicon is ARM which supports ASIMD, meaning a native Mlucas build cannot test Fermat numbers. One 
+possible answer would be to build an SSE2 Mlucas on an Intel-based Mac, which should be capable of running under 
+Rosetta emulation on Apple Silicon, at the penalty of a significant performance hit for larger exponents.
+
+(When compiling on Intel with the intention of building a compatible version for Apple Silicon, the Makefile 
+requires static linking of the GMP library, by the following alteration to the linker flags:
+`LDLIBS = -Bstatic ${LD_ARGS[@]} -Bdynamic`
+Static linking of libraries is usually not recommended, and this may not be a solution in all possible cases.)
+
+## Fermat self-testing and populating `fermat.cfg`
+
+Assuming you have a working build, Mlucas allows any Fermat number _F<sub>m</sub>_ = 2<sup>2<sup>_m_</sup></sup> + 1 with exponent _m_ in 
+the range [13,63] to be entered as an argument for self-testing, by using the `-f` flag like so:
+
+`./Mlucas -f <exponent> -iters <number> [-fft <fft_length>]`
+
+However, the software only provides fast Fourier transforms (FFTs) up to 512M in length, which further 
+limits the largest Fermat number that may be tested, up to _F_<sub>33</sub>.
+
+The `-iters` flag is mandatory and should be followed by a number less than a million; the recommended 
+values for `-iters` to be set to are 100, 1000, or 10000, since the self-testing code has correct residues 
+for all Fermat numbers [14,33] saved, and any error in the test result will be immediately discovered.
+
+Mlucas does not support Fermat testing for all possible FFT lengths. Generally, FFTs for testing Mersenne 
+numbers are available for any length _k_·2<sup>_n_</sup> in kilobytes, where _k_ = [8,15], _n_ ≥ 0. 
+In the case of Fermat numbers however, only the subset _k_ = {2, 7, 15} supports Fermat testing, along with
+_k_ = 63 when _n_ ≥ 4.
+
+When self-testing Mlucas for Mersenne numbers, a command such as `./Mlucas -s all` configures 
+Mlucas for the majority of FFT lengths and saves the results in `mlucas.cfg`. A similar process must be 
+carried out for the Fermat numbers, using the `./Mlucas -f` command above, which saves results in the 
+file `fermat.cfg`.
+
+A script [`config-fermat.sh`](https://github.com/primesearch/Mlucas/blob/main/config-fermat.sh) has been written to automate the generation of the `fermat.cfg` file using 
+the set of possible FFT lengths. After the license, the next few lines declare some variables which may 
+be customised as desired (for example, 10000 iterations will require significant runtime at the largest 
+possible FFT sizes):
+```bash
+# Mlucas
+MLUCAS=./Mlucas
+
+# Number of iterations (use 100, 1000, or 10000 to match pre-computed values)
+ITERS=100
+
+# Minimum Fermat number (15 or greater)
+MIN=15
+
+# Maximum Fermat number (33 or less)
+MAX=29
+```
+The script should be invoked from the build directory, typically `obj`, with the following command, which 
+may include as parameters any cpu-specific options, such as `-core` or `-cpu`:
+```
+bash ../config-fermat.sh -cpu 0:3
+```
+Once self-testing has occurred, production work on Fermat numbers may be carried out using a `worktodo.txt` 
+file (in previous versions of Mlucas this file was named `worktodo.ini`) with entries in one of the several 
+formats below.
+
+Only assignments for ECM factoring of Fermat numbers are distributed by Primenet, so work assignments 
+cannot be obtained for Fermat numbers using the `primenet.py` script. However _p_–1 results for Fermat 
+numbers up to _F_<sub>29</sub> may be submitted to [mersenne.org](https://www.mersenne.org/manual_result/) as Manual results.
+
+## Primality testing: Pépin’s test
+
+For the Pépin test, the `worktodo` format is simply:
+```
+Fermat,Test=<exponent>
+```
+
+Fermat numbers in the range [13,22] may select a non-optimal FFT length by default in production mode.
+If this is the case, the FFT may be overridden using the command:
+```
+./Mlucas -fft FFT_LENGTH -shift 0
+```
+The optimal FFT lengths vary from 2K up to 512M, as shown in the table below under [testable Fermat numbers](#the-testable-fermat-numbers-fm--22m--1).
+
+Currently, all Fermat numbers up to _F_<sub>30</sub> have received a [Pépin test](https://www.mersenneforum.org/showthread.php?t=18748), and moreover _F_<sub>31</sub> and _F_<sub>32</sub> are known 
+to be composite. However, the character of their cofactors is unknown so a Pépin test would be a 
+necessary prerequisite for testing them; and as for _F_<sub>33</sub>, this is yet to be 
+tested for primality.
+
+The Pépin test uses Gerbicz error checking (GEC) to assure the reliability of the computation, however 
+residue shifting is not implemented for Fermat modular squaring with GEC. Thus when invoking Mlucas the 
+flag `-shift 0` _must_ be added to the Mlucas command line:
+```
+./Mlucas -shift 0
+```
+If non-zero residue shifting is used, the Pépin test will not be able to progress beyond the millionth 
+iteration (the default interval for GEC) as error checking will be unable to confirm whether or not the 
+computation is correct.
+
+## Cofactor compositeness testing: Suyama’s test
+Suyama’s test is a Fermat probable primality test on the cofactor of a Fermat number. As a prerequisite, 
+you _must_ already have run a Pépin test, which should have saved a final residue as a file named e.g. `f23` 
+for a Pépin test of _F_<sub>23</sub>. Do **not** try to run a Suyama test as a single work type incorporating the preceding 
+Pépin test.
+
+The `worktodo` format for the Suyama test is:
+```
+PRP=N/A,1,2,<2^m>,+1,99,0,3,5,"known_factor_1[,known_factor_2,...]"
+```
+
+Note that this work format uses the numeric value of 2<sup>_m_</sup> of a Fermat number _F<sub>m</sub>_, rather than just the 
+exponent _m_. At least one known prime factor must be supplied; composite factors are disallowed.
+
+Prior to testing, Mlucas tries an “integrity check” on each known factor, which has the effect of testing 
+whether the Pépin residue _R_ has been correctly calculated. This calculation can be performed at any point 
+up to the final iteration. More information is available [here](https://github.com/primesearch/Mlucas/issues/4).
+
+## Pollard _p_–1 factoring
+
+Work entries for _p_–1 factoring have two variations, however only the `Pminus1` format is supported for 
+Fermat numbers:
+```
+Pminus1=N/A,1,2,<2^m>,+1,B1,B2[,trial_factoring_bits][,B2_start][,"known_factors"]
+```
+
+The same note as regards the exponent for the Suyama test applies here also; however supplying known, prime factors
+is optional. The `-shift 0` flag is again a necessary addition to the Mlucas command line.
+
+The `B2_start` variable supports breaking up stage 2 of the algorithm among multiple instances of Mlucas.
+
+## Fermat number `worktodo.txt` examples
+
+An example of each of the work types, Pépin and Suyama tests, and Pollard _p_–1 factoring:
+```
+Fermat,Test=23
+PRP=N/A,1,2,1073741824,+1,99,0,3,5,"640126220763137,1095981164658689"
+Pminus1=N/A,1,2,8589934592,+1,10000000,10000000
+```
+Note the _p_–1 test of _F_<sub>33</sub> (as actually [performed](https://www.mersenneforum.org/showthread.php?t=29183) by Ernst Mayer) 
+used the same _B1_ and _B2_ bounds to run only the first stage of the algorithm.
+
+## The testable Fermat numbers _F<sub>m</sub>_ = 2<sup>2<sup>_m_</sup></sup> + 1
+
+The following table lists the default FFT lengths selected by Mlucas for the Fermat numbers in the range 
+[13,33] and the known factors, required for some of the test types above.
+
+ _m_|       2<sup>_m_</sup>|Default FFT |Optimal FFTs     |known_factor(s)
+--|---------:|-----------:|----------------:|--------------------------------------------------------------------
+13|      8192|        1K\*|               2K|`"2710954639361,2663848877152141313,3603109844542291969,319546020820551643220672513"`
+14|     16384|        1K\*|           2K, 4K|`"116928085873074369829035993834596371340386703423373313"`
+15|     32768|        2K  |           2K, 4K|`"1214251009,2327042503868417,168768817029516972383024127016961"`
+16|     65536|        3K\*|               4K|`"825753601,188981757975021318420037633"`
+17|    131072|  6K\*, 7K  |           7K, 8K|`"31065037602817,7751061099802522589358967058392886922693580423169"`
+18|    262144|       13K\*|    14K, 15K, 16K|`"13631489,81274690703860512587777"`
+19|    524288|       26K\*|    28K, 30K, 32K|`"70525124609,646730219521,37590055514133754286524446080499713"`
+20|   1048576|       52K\*|    56K, 60K, 64K|
+21|   2097152|      104K\*| 112K, 120K, 128K|`"4485296422913"`
+22|   4194304|      208K\*| 224K, 240K, 256K|`"64658705994591851009055774868504577"`
+23|   8388608|      448K  | 448K, 480K, 512K|`"167772161"`
+24|  16777216|      896K  | 896K, 960K, 1008K, 1M|
+25|  33554432|     1792K  | 1792K, 1920K, 2M|`"25991531462657,204393464266227713,2170072644496392193"`
+26|  67108864|     3584K  | 3584K, 3840K, 4032K, 4M|`"76861124116481"`
+27| 134217728|  7M, 7.5M  | 7M, 7.5M, 7.875M, 8M|`"151413703311361,231292694251438081"`
+28| 268435456|       15M  | 14M, 15M, 15.75M, 16M|`"1766730974551267606529"`
+29| 536870912|       30M  |  30M, 31.5M, 32M|`"2405286912458753"`
+30|1073741824|       60M  |    60M, 63M, 64M|`"640126220763137,1095981164658689"`
+31|2147483648|      120M  | 120M, 126M, 128M|`"46931635677864055013377"`
+32|4294967296|      256M  | 240M, 252M, 256M|`"25409026523137"`
+33|8589934592|      512M  |       504M, 512M|
+
+\* These default FFTs selected by Mlucas have radices that cannot be used for Fermat modular squaring;
+thus _F_<sub>13</sub> to _F_<sub>22</sub> cannot be tested with Mlucas without overriding the default FFT selected, _e.g._:
+```
+./Mlucas -fft 224K -shift 0
+```
+As mentioned above at [Mlucas build notes](#mlucas-build-notes), building Mlucas is highly architecture-specific, and some 
+build types will not support all of the optimal FFT lengths.
+
+Finally, 2K appears to be the smallest usable FFT for the smallest testable Fermat number _F_<sub>13</sub>;
+the 4K FFT seems to be more reliable on a wider range of systems for the next larger Fermat numbers _F_<sub>14</sub> to _F_<sub>16</sub>.