-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Split arb_pow_fmpz_binexp
into a ui
and mpz
version and implement proper arb_sqr
#396
base: master
Are you sure you want to change the base?
Conversation
And inline arb_pow_[fmpz_binexp/fmpz/ui].
This is going to be used for adding exponents when squaring
And fill some gaps in the documentation for arf.h.
|
||
mag_init(resr); | ||
mag_fast_mul(resr, xm, arb_radref(x)); | ||
if (resr->exp < COEFF_MAX && resr->exp >= COEFF_MIN) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These checks are not needed; the exponent can be increment with risk of overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume you meant without risk of overflow ;) Och god jul, Fredrik!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
God jul :)
|
||
mag_init(resr); | ||
mag_mul(resr, xm, arb_radref(x)); | ||
fmpz_add_ui(&(resr->exp), &(resr->exp), 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use MAG_EXPREF
@@ -914,6 +926,12 @@ int arf_mul_rnd_down(arf_ptr z, arf_srcptr x, arf_srcptr y, slong prec); | |||
? arf_mul_rnd_down(z, x, y, prec) \ | |||
: arf_mul_rnd_any(z, x, y, prec, rnd)) | |||
|
|||
int arf_sqr_via_mpfr(arf_ptr res, arf_srcptr x, slong prec, arf_rnd_t rnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is pointless extra code; arf_mul_via_mpfr already does the job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It removes one branching and removes two/three useless assignments/calculations. My thinking with this PR is to strip all the "unnecessary" stuff away from squaring in order to make it as fast as possible. I know it's pretty small improvement, but you do not think that it's worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mul_via_mpfr is only used for products that take 1000+ cycles to compute so a few cycles here will not be measurable.
@@ -111,6 +111,18 @@ Types, macros and constants | |||
test code. If in doubt, simply try with some convenient high precision | |||
instead of using this special value, and check that the result is exact. | |||
|
|||
.. macro:: nn_mul_2x1(r2, r1, r0, a1, a0, b0) | |||
|
|||
Sets `(r_2, r_1, r_0)` to `(a_1, a_0)` multiplied by `b_0`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to document that these are limbs. Also, I think these macros don't allow aliasing, which might be worth documenting.
@@ -553,7 +565,8 @@ Addition and multiplication | |||
|
|||
.. function:: int arf_neg_round(arf_t res, const arf_t x, slong prec, arf_rnd_t rnd) | |||
|
|||
Sets *res* to `-x`. | |||
Sets *res* to `-x`. Returns 0 if this operation was made exactly and 1 if | |||
truncation occurred. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"truncation" is the wrong term (means rounding down) - should be "rounding".
{ | ||
slong cx = *x; | ||
|
||
if (!COEFF_IS_MPZ(*res) && (cx > COEFF_MIN / 2 && cx < COEFF_MAX / 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These conditions are wrong. See _fmpz_add2_fast. Anyway, I think this function is not really needed; the compiler should be able to eliminate the redundant branches in _fmpz_add2_fast when passed the same operand twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has actually one property that I do not think the compiler can transform into, and that is that we now can allow cx
to be up to size COEFF_MAX / 2
instead of COEFF_MAX / 4
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_fmpz_add2_fast
/_fmpz_sub2_fast
need to work for values of c
larger than +/- 1. I think it's confusing if _fmpz_2times_fast
in contrast to the other functions only supports |c| <= 1
.
In practice, it makes no difference to go up to COEFF_MAX / 2
instead of COEFF_MAX / 4
since such huge exponents are very rare. We only care about optimizing for small exponents here. Better to be uniform and conservative.
Considering this, I don't really think this extra function is needed.
Clearly these functions and their assumptions should be documented...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, will remove it then.
Did you do some profiling? |
I am done with the former part, with addition of adding
nn_sqr_2
which simplifiesnn_mul_2x2
by skipping oneumul_ppmm
.I am going to add
arf_sqr_rnd_down
and then implementarb_sqr
from there.