Version 0.50.0
--------------
In development
Version 0.49.1 (May 7, 2020)
----------------------------
This is a bugfix release for 0.49.0. It fixes some residual issues with SSA
form, a critical bug in the branch pruning logic and a number of other smaller
issues:
* PR #5587: Fixed #5586 Threading Implementation Typos
* PR #5592: Fixes #5583 Remove references to cffi_support from docs and examples
* PR #5614: Fix invalid type in resolve for comparison expr in parfors.
* PR #5624: Fix erroneous rewrite of predicate to bit const on prune.
* PR #5627: Fixes #5623, SSA local def scan based on invalid equality
assumption.
* PR #5629: Fixes naming error in array_exprs
* PR #5630: Fix #5570. Incorrect race variable detection due to SSA naming.
* PR #5638: Make literal_unroll function work as a freevar.
* PR #5648: Unset the memory manager after EMM Plugin tests
* PR #5651: Fix some SSA issues
* PR #5652: Pin to sphinx=2.4.4 to avoid problem with C declaration
* PR #5658: Fix unifying undefined first class function types issue
* PR #5669: Update example in 5m guide WRT SSA type stability.
* PR #5676: Restore ``numba.types`` as public API
Authors:
* Graham Markall
* Juan Manuel Cruz Martinez
* Pearu Peterson
* Sean Law
* Stuart Archibald (core dev)
* Siu Kwan Lam (core dev)
Version 0.49.0 (Apr 16, 2020)
-----------------------------
This release is very large in terms of code changes. Large scale removal of
unsupported Python and NumPy versions has taken place along with a significant
amount of refactoring to simplify the Numba code base to make it easier for
contributors. Numba's intermediate representation has also undergone some
important changes to solve a number of long standing issues. In addition some
new features have been added and a large number of bugs have been fixed!
IMPORTANT: In this release Numba's internals have moved about a lot. A backwards
compatibility "shim" is provided for this release so as to not immediately break
projects using Numba's internals. If a module is imported from a moved location
the shim will issue a deprecation warning and suggest how to update the import
statement for the new location. The shim will be removed in 0.50.0!
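As an illustration of the general idea behind such a shim, the sketch below
forwards attribute lookups from an old module location to a new one and emits a
deprecation warning on the way. It is a generic pattern built on a module-level
``__getattr__`` and is not Numba's actual shim; the module names
``demo_pkg_helpers`` and ``demo_pkg_core_helpers`` and the ``helper`` function
are hypothetical.

```python
import sys
import types
import warnings

# Hypothetical module names, for illustration only (not real Numba modules).
NEW_NAME = "demo_pkg_core_helpers"
OLD_NAME = "demo_pkg_helpers"

# The "new" location that actually holds the implementation.
new_home = types.ModuleType(NEW_NAME)
new_home.helper = lambda: "moved helper"
sys.modules[NEW_NAME] = new_home

# The "old" location: an empty module whose attribute lookups are forwarded.
old_home = types.ModuleType(OLD_NAME)


def _forward(name):
    """Look up `name` in the new location and warn about the old import path."""
    target = getattr(new_home, name)  # AttributeError propagates if absent
    warnings.warn(
        f"{OLD_NAME}.{name} has moved; import it from {NEW_NAME} instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return target


old_home.__getattr__ = _forward  # module-level __getattr__ hook
sys.modules[OLD_NAME] = old_home

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from demo_pkg_helpers import helper  # old path still works, but warns

print(helper())
for warning in caught:
    print(warning.category.__name__, warning.message)
```

Code that imports from the old location keeps working but is told where the
object now lives, which is the behaviour described for the shim above.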
Highlights of core feature changes include:
* Removal of all Python 2 related code and also updating the minimum supported
Python version to 3.6, the minimum supported NumPy version to 1.15 and the
minimum supported SciPy version to 1.0. (Stuart Archibald).
* Refactoring of the Numba code base. The code is now organised into submodules
by functionality. This cleans up Numba's top level namespace.
(Stuart Archibald).
* Introduction of an ``ir.Del`` free static single assignment form for Numba's
intermediate representation (Siu Kwan Lam and Stuart Archibald).
* An OpenMP-like thread masking API has been added for use with code using the
parallel CPU backends (Aaron Meurer and Stuart Archibald).
* For the CUDA target, all kernel launches now require a configuration. This
prevents accidental launches of kernels with the old default of a single
thread in a single block. The hard-coded autotuner is also now removed; such
tuning is deferred to CUDA API calls that provide the same functionality
(Graham Markall).
* The CUDA target also gained an External Memory Management plugin interface to
allow Numba to use another CUDA-aware library for all memory allocations and
deallocations (Graham Markall).
* The Numba Typed List container gained support for construction from iterables
(Valentin Haenel).
* Experimental support was added for first-class function types
(Pearu Peterson).
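Two of the highlights above can be sketched briefly: building a typed list
straight from an iterable and masking the number of threads used by the
parallel backends. The helper names ``typed_list_from`` and ``mask_threads``
are hypothetical wrappers written for this example. The sketch assumes
``numba.typed.List``, ``numba.set_num_threads`` and ``numba.get_num_threads``
are available, and falls back to plain Python when Numba is not installed so
that it still runs.

```python
try:
    import numba
    from numba.typed import List
except ImportError:  # Numba missing: fall back so the sketch still runs
    numba = None


def typed_list_from(iterable):
    """Build a typed list from an iterable (a plain list without Numba)."""
    if numba is None:
        return list(iterable)
    return List(iterable)


def mask_threads(requested):
    """Limit the parallel backends to at most `requested` threads."""
    if numba is None:
        return requested
    # The mask cannot exceed the number of threads Numba was started with.
    limit = min(requested, numba.config.NUMBA_NUM_THREADS)
    numba.set_num_threads(limit)
    return numba.get_num_threads()


numbers = typed_list_from(range(5))
print(list(numbers))
print(mask_threads(1))
```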
Enhancements from user contributed PRs (with thanks!):
* Aaron Meurer added support for thread masking at runtime in #4615.
* Andreas Sodeur fixed a long standing bug that was preventing ``cProfile`` from
working with Numba JIT compiled functions in #4476.
* Arik Funke fixed error messages in ``test_array_reductions`` (#5278), fixed
an issue with test discovery (#5239), made it so the documentation would build
again on windows (#5453) and fixed a nested list problem in the docs in #5489.
* Antonio Russo fixed a SyntaxWarning in #5252.
* Eric Wieser added support for inferring the types of object arrays (#5348) and
iterating over 2D arrays (#5115), also fixed some compiler warnings due to
missing (void) in #5222. Also helped improve the "shim" and associated
warnings in #5485, #5488, #5498 and partly #5532.
* Ethan Pronovost fixed a problem with the shim erroneously warning for jitclass
use in #5454 and also prevented illegal return values in jitclass ``__init__``
in #5505.
* Gabriel Majeri added SciPy 2019 talks to the docs in #5106.
* Graham Markall changed the Numba HTML documentation theme to resolve a number
of long standing issues in #5346. Also contributed were a large number of CUDA
enhancements and fixes, namely:
  * #5519: CUDA: Silence the test suite - Fix #4809, remove autojit, delete
    prints
  * #5443: Fix #5196: Docs: assert in CUDA only enabled for debug
  * #5436: Fix #5408: test_set_registers_57 fails on Maxwell
  * #5423: Fix #5421: Add notes on printing in CUDA kernels
  * #5400: Fix #4954, and some other small CUDA testsuite fixes
  * #5328: NBEP 7: External Memory Management Plugin Interface
  * #5144: Fix #4875: Make #2655 test with debug expect to pass
  * #5323: Document lifetime semantics of CUDA Array Interface
  * #5061: Prevent kernel launch with no configuration, remove autotuner
  * #5099: Fix #5073: Slices of dynamic shared memory all alias
  * #5136: CUDA: Enable asynchronous operations on the default stream
  * #5085: Support other itemsizes with view
  * #5059: Docs: Explain how to use Memcheck with Numba, fixups in CUDA
    documentation
  * #4957: Add notes on overwriting gufunc inputs to docs
* Greg Jennings fixed an issue with ``np.random.choice`` not acknowledging the
RNG seed correctly in #3897/#5310.
* Guilherme Leobas added support for ``np.isnat`` in #5293.
* Henry Schreiner made the llvmlite requirements more explicit in
requirements.txt in #5150.
* Ivan Butygin helped fix an issue with parfors sequential lowering in
#5114/#5250.
* Jacques Gaudin fixed a bug for Python >= 3.8 in ``numba -s`` in #5548.
* Jim Pivarski added some hints for debugging entry points in #5280.
* John Kirkham added ``numpy.dtype`` coercion for the ``dtype`` argument to CUDA
device arrays in #5253.
* Leo Fang added a list of libraries that support ``__cuda_array_interface__``
in #5104.
* Lucio Fernandez-Arjona added ``getitem`` for the NumPy record type when the
index is a ``StringLiteral`` type in #5182 and improved the documentation
rendering via additions to the TOC and removal of numbering in #5450.
* Mads R. B. Kristensen fixed an issue with ``__cuda_array_interface__`` not
requiring the context in #5189.
* Marcin Tolysz added support for nested modules in AOT compilation in #5174.
* Mike Williams fixed some issues with NumPy records and ``getitem`` in the CUDA
simulator in #5343.
* Pearu Peterson added experimental support for first-class function types in
#5287 (and fixes in #5459, #5473/#5429, and #5557).
* Ravi Teja Gutta added support for ``np.flip`` in #4376/#5313.
* Rohit Sanjay fixed an issue with type refinement for unicode input supplied to
typed-list ``extend()`` (#5295) and fixed unicode ``.strip()`` to strip all
whitespace characters in #5213.
* Vladimir Lukyanov fixed an awkward bug in ``typed.dict`` in #5361, added a fix
to ensure the LLVM and assembly dumps are highlighted correctly in #5357 and
implemented a Numba IR Lexer and added highlighting to Numba IR dumps in
#5333.
* hdf fixed an issue with the ``boundscheck`` flag in the CUDA jit target in
#5257.
General Enhancements:
* PR #4615: Allow masking threads out at runtime
* PR #4798: Add branch pruning based on raw predicates.
* PR #5115: Add support for iterating over 2D arrays
* PR #5117: Implement ord()/chr()
* PR #5122: Remove Python 2.
* PR #5127: Calling convention adaptor for boxer/unboxer to call jitcode
* PR #5151: implement None-typed typed-list
* PR #5174: Nested modules https://github.com/numba/numba/issues/4739
* PR #5182: Add getitem for Record type when index is StringLiteral
* PR #5185: extract code-gen utilities from closures
* PR #5197: Refactor Numba, part I
* PR #5210: Remove more unsupported Python versions from build tooling.
* PR #5212: Adds support for viewing the CFG of the ELF disassembly.
* PR #5227: Immutable typed-list
* PR #5231: Added support for ``np.asarray`` to be used with
``numba.typed.List``
* PR #5235: Added property ``dtype`` to ``numba.typed.List``
* PR #5272: Refactor parfor: split up ParforPass
* PR #5281: Make IR ir.Del free until legalized.
* PR #5287: First-class function type
* PR #5293: np.isnat
* PR #5294: Create typed-list from iterable
* PR #5295: refine typed-list on unicode input to extend
* PR #5296: Refactor parfor: better exception from passes
* PR #5308: Provide ``numba.extending.is_jitted``
* PR #5320: refactor array_analysis
* PR #5325: Let literal_unroll accept types.Named*Tuple
* PR #5330: refactor common operation in parfor lowering into a new util
* PR #5333: Add: highlight Numba IR dump
* PR #5342: Support for tuples passed to parfors.
* PR #5348: Add support for inferring the types of object arrays
* PR #5351: SSA again
* PR #5352: Add shim to accommodate refactoring.
* PR #5356: implement allocated parameter in njit
* PR #5369: Make test ordering more consistent across feature availability
* PR #5428: Wip/deprecate jitclass location
* PR #5441: Additional changes to first class function
* PR #5455: Move to llvmlite 0.32.*
* PR #5457: implement repr for untyped lists
Fixes:
* PR #4476: Another attempt at fixing frame injection in the dispatcher tracing
path
* PR #4942: Prevent some parfor aliasing. Rename copied function var to prevent
recursive type locking.
* PR #5092: Fix 5087
* PR #5150: More explicit llvmlite requirement in requirements.txt
* PR #5172: fix version spec for llvmlite
* PR #5176: Normalize kws going into fold_arguments.
* PR #5183: pass 'inline' explicitly to overload
* PR #5193: Fix CI failure due to missing files when installed
* PR #5213: Fix ``.strip()`` to strip all whitespace characters
* PR #5216: Fix namedtuple mistreated by dispatcher as simple tuple
* PR #5222: Fix compiler warnings due to missing (void)
* PR #5232: Fixes a bad import that breaks master
* PR #5239: fix test discovery for unittest
* PR #5247: Continue PR #5126
* PR #5250: Part fix/5098
* PR #5252: Trivially fix SyntaxWarning
* PR #5276: Add prange variant to has_no_side_effect.
* PR #5278: fix error messages in test_array_reductions
* PR #5310: PR #3897 continued
* PR #5313: Continues PR #4376
* PR #5318: Remove AUTHORS file reference from MANIFEST.in
* PR #5327: Add warning if FNV hashing is found as the default for CPython.
* PR #5338: Remove refcount pruning pass
* PR #5345: Disable test failing due to removed pass.
* PR #5357: Small fix to have llvm and asm highlighted properly
* PR #5361: 5081 typed.dict
* PR #5431: Add tolerance to numba extension module entrypoints.
* PR #5432: Fix code causing compiler warnings.
* PR #5445: Remove undefined variable
* PR #5454: Don't warn for numba.experimental.jitclass
* PR #5459: Fixes issue 5448
* PR #5480: Fix for #5477, literal_unroll KeyError searching for getitems
* PR #5485: Show the offending module in "no direct replacement" error message
* PR #5488: Add missing ``numba.config`` shim
* PR #5495: Fix missing null initializer for variable after phi strip
* PR #5498: Make the shim deprecation warnings work on python 3.6 too
* PR #5505: Better error message if __init__ returns value
* PR #5527: Attempt to fix #5518
* PR #5529: PR #5473 continued
* PR #5532: Make ``numba.<mod>`` available without an import
* PR #5542: Fixes RC2 module shim bug
* PR #5548: Fix #5537 Removed reference to ``platform.linux_distribution``
* PR #5555: Fix #5515 by reverting changes to ArrayAnalysis
* PR #5557: First-class function call cannot use keyword arguments
* PR #5569: Fix RewriteConstGetitems not registering calltype for new expr
* PR #5571: Pin down llvmlite requirement
CUDA Enhancements/Fixes:
* PR #5061: Prevent kernel launch with no configuration, remove autotuner
* PR #5085: Support other itemsizes with view
* PR #5099: Fix #5073: Slices of dynamic shared memory all alias
* PR #5104: Add a list of libraries that support __cuda_array_interface__
* PR #5136: CUDA: Enable asynchronous operations on the default stream
* PR #5144: Fix #4875: Make #2655 test with debug expect to pass
* PR #5189: __cuda_array_interface__ not requiring context
* PR #5253: Coerce ``dtype`` to ``numpy.dtype``
* PR #5257: boundscheck fix
* PR #5319: Make user facing error string use abs path not rel.
* PR #5323: Document lifetime semantics of CUDA Array Interface
* PR #5328: NBEP 7: External Memory Management Plugin Interface
* PR #5343: Fix cuda spoof
* PR #5400: Fix #4954, and some other small CUDA testsuite fixes
* PR #5436: Fix #5408: test_set_registers_57 fails on Maxwell
* PR #5519: CUDA: Silence the test suite - Fix #4809, remove autojit, delete
prints
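Since PR #5061 a kernel can no longer be launched without a configuration, so
the grid and block sizes have to be chosen explicitly. The sketch below shows
one common way to derive them; ``launch_config``, ``add_one`` and the block
size of 128 are hypothetical choices for this example, and the kernel launch is
only attempted when Numba reports a usable CUDA device.

```python
def launch_config(n_elements, threads_per_block=128):
    """Return (blocks_per_grid, threads_per_block) covering n_elements."""
    # Ceiling division: enough blocks so that every element gets a thread.
    blocks_per_grid = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


try:
    import numpy as np
    from numba import cuda
    have_gpu = cuda.is_available()
except ImportError:  # Numba or NumPy missing: only the arithmetic is shown
    have_gpu = False

print(launch_config(1000))

if have_gpu:
    @cuda.jit
    def add_one(arr):
        i = cuda.grid(1)
        if i < arr.size:  # guard threads that fall past the end of the array
            arr[i] += 1

    data = np.zeros(1000, dtype=np.float32)
    blocks, threads = launch_config(data.size)
    add_one[blocks, threads](data)  # explicit launch configuration
    print(data[:4])
```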
Documentation Updates:
* PR #4957: Add notes on overwriting gufunc inputs to docs
* PR #5059: Docs: Explain how to use Memcheck with Numba, fixups in CUDA
documentation
* PR #5106: Add SciPy 2019 talks to docs
* PR #5147: Update master for 0.48.0 updates
* PR #5155: Explain what inlining at Numba IR level will do
* PR #5161: Fix README.rst formatting
* PR #5207: Remove AUTHORS list
* PR #5249: fix target path for See also
* PR #5262: fix typo in inlining docs
* PR #5270: fix 'see also' in typeddict docs
* PR #5280: Added some hints for debugging entry points.
* PR #5297: Update docs with intro to {g,}ufuncs.
* PR #5326: Update installation docs with OpenMP requirements.
* PR #5346: Docs: use sphinx_rtd_theme
* PR #5366: Remove reference to Python 2.7 in install check output
* PR #5423: Fix #5421: Add notes on printing in CUDA kernels
* PR #5438: Update package deps for doc building.
* PR #5440: Bump deprecation notices.
* PR #5443: Fix #5196: Docs: assert in CUDA only enabled for debug
* PR #5450: Docs: remove numbers and add titles to TOC
* PR #5453: fix building docs on windows
* PR #5489: docs: fix rendering of nested bulleted list
CI updates:
* PR #5314: Update the image used in Azure CI for OSX.
* PR #5360: Remove Travis CI badge.
Authors:
* Aaron Meurer
* Andreas Sodeur
* Antonio Russo
* Arik Funke
* Eric Wieser
* Ethan Pronovost
* Gabriel Majeri
* Graham Markall
* Greg Jennings
* Guilherme Leobas
* hdf
* Henry Schreiner
* Ivan Butygin
* Jacques Gaudin
* Jim Pivarski
* John Kirkham
* Leo Fang
* Lucio Fernandez-Arjona
* Mads R. B. Kristensen
* Marcin Tolysz
* Mike Williams
* Pearu Peterson
* Ravi Teja Gutta
* Rohit Sanjay
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* Vladimir Lukyanov
Version 0.48.0 (Jan 27, 2020)
-----------------------------
This release is particularly small as it exists to catch anything that
missed the 0.47.0 deadline (the deadline deliberately coincided with the end of
support for Python 2.7). The next release will be considerably larger.
The core changes in this release are dominated by the start of the clean up
needed for the end of Python 2.7 support, improvements to the CUDA target and
support for numerous additional unicode string methods.
Enhancements from user contributed PRs (with thanks!):
* Brian Wignall fixed more spelling typos in #4998.
* Denis Smirnov added support for string methods ``capitalize`` (#4823),
``casefold`` (#4824), ``swapcase`` (#4825), ``rsplit`` (#4834), ``partition``
(#4845) and ``splitlines`` (#4849).
* Elena Totmenina extended support for string methods ``startswith`` (#4867) and
added ``endswith`` (#4868).
* Eric Wieser made ``type_callable`` return the decorated function itself in
#4760.
* Ethan Pronovost added support for ``np.argwhere`` in #4617.
* Graham Markall contributed a large number of CUDA enhancements and fixes,
namely:
  * #5068: Remove Python 3.4 backports from utils
  * #4975: Make ``device_array_like`` create contiguous arrays (Fixes #4832)
  * #5023: Don't launch ForAll kernels with 0 elements (Fixes #5017)
  * #5016: Fix various issues in CUDA library search (Fixes #4979)
  * #5014: Enable use of records and bools for shared memory, remove ddt, add
    additional transpose tests
  * #4964: Fix #4628: Add more appropriate typing for CUDA device arrays
  * #5007: test_consuming_strides: Keep dev array alive
  * #4997: State that CUDA Toolkit 8.0 required in docs
* James Bourbeau added the Python 3.8 classifier to setup.py in #5027.
* John Kirkham added a clarification to the ``__cuda_array_interface__``
documentation in #5049.
* Leo Fang fixed an indexing problem in ``dummyarray`` in #5012.
* Marcel Bargull fixed a build and test issue for Python 3.8 in #5029.
* Maria Rubtsov added support for string methods ``isdecimal`` (#4842),
``isdigit`` (#4843), ``isnumeric`` (#4844) and ``replace`` (#4865).
General Enhancements:
* PR #4760: Make type_callable return the decorated function
* PR #5010: merge string prs
  This merge PR included the following:
  * PR #4823: Implement str.capitalize() based on CPython
  * PR #4824: Implement str.casefold() based on CPython
  * PR #4825: Implement str.swapcase() based on CPython
  * PR #4834: Implement str.rsplit() based on CPython
  * PR #4842: Implement str.isdecimal
  * PR #4843: Implement str.isdigit
  * PR #4844: Implement str.isnumeric
  * PR #4845: Implement str.partition() based on CPython
  * PR #4849: Implement str.splitlines() based on CPython
  * PR #4865: Implement str.replace
  * PR #4867: Functionality extension str.startswith() based on CPython
  * PR #4868: Add functionality for str.endswith()
* PR #5039: Disable help messages.
* PR #4617: Add coverage for ``np.argwhere``
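A few of the string methods listed under PR #5010 can be exercised together as
in the sketch below. The function name ``describe`` and the sample input are
hypothetical. The example assumes the methods behave inside a JIT compiled
function as they do in CPython, and it falls back to plain Python when Numba is
not installed.

```python
try:
    from numba import njit
except ImportError:  # Numba missing: run the same code as plain Python
    def njit(func):
        return func


@njit
def describe(text):
    # Split "key=value" into its parts, then apply several string methods.
    head, sep, tail = text.partition("=")
    return (
        head.capitalize(),
        tail.swapcase(),
        tail.isdigit(),
        text.replace("=", " is "),
        text.endswith("Fast"),
    )


print(describe("numba=Fast"))
```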
Fixes:
* PR #4724: Only use lives (and not aliases) to create post parfor live set.
* PR #4998: Fix more spelling typos
* PR #5024: Propagate semantic constants ahead of static rewrites.
* PR #5027: Add Python 3.8 classifier to setup.py
* PR #5046: Update setup.py and buildscripts for dependency requirements
* PR #5053: Convert from arrays to names in define() and don't invalidate for
multiple consistent defines.
* PR #5058: Permit mixed int types in wrap_index
* PR #5078: Catch the use of global typed-list in JITed functions
* PR #5092: Fix #5087, bug in bytecode analysis.
CUDA Enhancements/Fixes:
* PR #4964: Fix #4628: Add more appropriate typing for CUDA device arrays
* PR #4975: Make ``device_array_like`` create contiguous arrays (Fixes #4832)
* PR #4997: State that CUDA Toolkit 8.0 required in docs
* PR #5007: test_consuming_strides: Keep dev array alive
* PR #5012: Fix IndexError when accessing the "-1" element of dummyarray
* PR #5014: Enable use of records and bools for shared memory, remove ddt, add
additional transpose tests
* PR #5016: Fix various issues in CUDA library search (Fixes #4979)
* PR #5023: Don't launch ForAll kernels with 0 elements (Fixes #5017)
* PR #5068: Remove Python 3.4 backports from utils
Documentation Updates:
* PR #5049: Clarify what dictionary means
* PR #5062: Update docs for updated version requirements
* PR #5090: Update deprecation notices for 0.48.0
CI updates:
* PR #5029: Install optional dependencies for Python 3.8 tests
* PR #5040: Drop Py2.7 and Py3.5 from public CI
* PR #5048: Fix CI py38
Authors:
* Brian Wignall
* Denis Smirnov
* Elena Totmenina
* Eric Wieser
* Ethan Pronovost
* Graham Markall
* James Bourbeau
* John Kirkham
* Leo Fang
* Marcel Bargull
* Maria Rubtsov
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.47.0 (Jan 2, 2020)
-----------------------------
This release expands the capability of Numba in a number of important areas and
is also significant as it is the last major point release with support for
Python 2 and Python 3.5 included. The next release (0.48.0) will be for Python
3.6+ only! (This follows NumPy's deprecation schedule as specified in
`NEP 29 <https://numpy.org/neps/nep-0029-deprecation_policy.html>`_.)
Highlights of core feature changes include:
* Full support for Python 3.8 (Siu Kwan Lam)
* Opt-in bounds checking (Aaron Meurer)
* Support for ``map``, ``filter`` and ``reduce`` (Stuart Archibald)
Intel also kindly sponsored research and development that led to some exciting
new features:
* Initial support for basic ``try``/``except`` use (Siu Kwan Lam)
* The ability to pass functions created from closures/lambdas as arguments
(Stuart Archibald)
* ``sorted`` and ``list.sort()`` now accept the ``key`` argument (Stuart
Archibald and Siu Kwan Lam)
* A new compiler pass triggered through the use of the function
``numba.literal_unroll`` which permits iteration over heterogeneous tuples
and constant lists of constants. (Stuart Archibald)
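The sketch below touches three of the features above: basic
``try``/``except``, ``sorted`` with a ``key`` and iteration over a
heterogeneous tuple via ``numba.literal_unroll``. All function names are
hypothetical. The key function is itself JIT compiled, which is assumed to be
the safest form to pass as ``key``. The example falls back to plain Python when
Numba is not installed.

```python
try:
    from numba import njit, literal_unroll
except ImportError:  # Numba missing: run the same code as plain Python
    def njit(func):
        return func

    def literal_unroll(container):
        return container


@njit
def guarded_reciprocal(x):
    result = 0.0
    try:
        if x == 0:
            raise ValueError("cannot invert zero")
        result = 1.0 / x
    except Exception:
        result = -1.0  # sentinel returned when the input was zero
    return result


@njit
def magnitude(v):
    return abs(v)


@njit
def sort_by_magnitude(a, b, c):
    # The key is a JIT compiled function defined above.
    return sorted([a, b, c], key=magnitude)


@njit
def sum_mixed(items):
    total = 0.0
    # `items` mixes integers and floats, so the loop is unrolled per element.
    for item in literal_unroll(items):
        total += item
    return total


print(guarded_reciprocal(4), guarded_reciprocal(0))
print(sort_by_magnitude(3, -1, 2))
print(sum_mixed((1, 2.5, 3)))
```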
Enhancements from user contributed PRs (with thanks!):
* Ankit Mahato added a reference to a new talk on Numba at PyCon India 2019 in
#4862
* Brian Wignall kindly fixed some spelling mistakes and typos in #4909
* Denis Smirnov wrote numerous methods to considerably enhance string support
including:
  * ``str.rindex()`` in #4861
  * ``str.isprintable()`` in #4836
  * ``str.index()`` in #4860
  * ``start/end`` parameters for ``str.find()`` in #4866
  * ``str.isspace()`` in #4835
  * ``str.isidentifier()`` in #4837
  * ``str.rpartition()`` in #4841
  * ``str.lower()`` and ``str.islower()`` in #4651
* Elena Totmenina implemented ``str.isalnum()``, ``str.isalpha()`` and
``str.isascii()`` in #4839, #4840 and #4847 respectively.
* Eric Larson fixed a bug in literal comparison in #4710
* Ethan Pronovost updated the ``np.arange`` implementation in #4770 to allow
the use of the ``dtype`` key word argument and also added ``bool``
implementations for several types in #4715.
* Graham Markall fixed some issues with the CUDA target, namely:
  * #4931: Added physical limits for CC 7.0 / 7.5 to CUDA autotune
  * #4934: Fixed bugs in TestCudaWarpOperations
  * #4938: Improved errors / warnings for the CUDA vectorize decorator
* Guilherme Leobas fixed a typo in the ``urem`` implementation in #4667
* Isaac Virshup contributed a number of patches that fixed bugs, added support
for more NumPy functions and enhanced Python feature support. These
contributions included:
  * #4729: Allow array construction with mixed type shape tuples
  * #4904: Implementing ``np.lcm``
  * #4780: Implement np.gcd and math.gcd
  * #4779: Make slice constructor more similar to python.
  * #4707: Added support for slice.indices
  * #4578: Clarify numba ufunc supported features
* James Bourbeau fixed some issues with tooling: #4794 adds ``setuptools`` as a
dependency and #4501 adds pre-commit hooks for ``flake8`` compliance.
* Leo Fang made ``numba.dummyarray.Array`` iterable in #4629
* Marc Garcia fixed the ``numba.jit`` parameter name signature_or_function in
#4703
* Marcelo Duarte Trevisani patched the llvmlite requirement to ``>=0.30.0`` in
#4725
* Matt Cooper fixed a long standing CI problem in #4737 by removing maxParallel
from Azure Pipelines.
* Matti Picus fixed an issue with ``collections.abc`` in #4734
* Rob Ennis patched a bug in ``np.interp`` ``float32`` handling in #4911
* VDimir fixed a bug in array transposition layouts in #4777 and re-enabled and
fixed some idle tests in #4776.
* Vyacheslav Smirnov enabled support for ``str.istitle()`` in #4645
General Enhancements:
* PR #4432: Bounds checking
* PR #4501: Add pre-commit hooks
* PR #4536: Handle kw args in inliner when callee is a function
* PR #4599: Permits closures to become functions, enables map(), filter()
* PR #4611: Implement method title() for unicode based on CPython
* PR #4645: Enable support for istitle() method for unicode string
* PR #4651: Implement str.lower() and str.islower()
* PR #4652: Implement str.rfind()
* PR #4695: Refactor `overload*` and support `jit_options` and `inline`
* PR #4707: Added support for slice.indices
* PR #4715: Add `bool` overload for several types
* PR #4729: Allow array construction with mixed type shape tuples
* PR #4755: Python3.8 support
* PR #4756: Add parfor support for ndarray.fill.
* PR #4768: Update typeconv error message to ask for sys.executable.
* PR #4770: Update `np.arange` implementation with `@overload`
* PR #4779: Make slice constructor more similar to python.
* PR #4780: Implement np.gcd and math.gcd
* PR #4794: Add setuptools as a dependency
* PR #4802: put git hash into build string
* PR #4803: Better compiler error messages for improperly used reduction
variables.
* PR #4817: Typed list implement and expose allocation
* PR #4818: Typed list faster copy
* PR #4835: Implement str.isspace() based on CPython
* PR #4836: Implement str.isprintable() based on CPython
* PR #4837: Implement str.isidentifier() based on CPython
* PR #4839: Implement str.isalnum() based on CPython
* PR #4840: Implement str.isalpha() based on CPython
* PR #4841: Implement str.rpartition() based on CPython
* PR #4847: Implement str.isascii() based on CPython
* PR #4851: Add graphviz output for FunctionIR
* PR #4854: Python3.8 looplifting
* PR #4858: Implement str.expandtabs() based on CPython
* PR #4860: Implement str.index() based on CPython
* PR #4861: Implement str.rindex() based on CPython
* PR #4866: Support params start/end for str.find()
* PR #4874: Bump to llvmlite 0.31
* PR #4896: Specialise arange dtype on arch + python version.
* PR #4902: basic support for try except
* PR #4904: Implement np.lcm
* PR #4910: loop canonicalisation and type aware tuple unroller/loop body
versioning passes
* PR #4961: Update hash(tuple) for Python 3.8.
* PR #4977: Implement sort/sorted with key.
* PR #4987: Add `is_internal` property to all Type classes.
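As a small illustration of two of the entries above (``math.gcd`` from
PR #4780 and ``slice.indices`` from PR #4707), the sketch below calls both
inside one function. The function name ``gcd_and_bounds`` and the chosen slice
are hypothetical, and the example falls back to plain Python when Numba is not
installed.

```python
import math

try:
    from numba import njit
except ImportError:  # Numba missing: run the same code as plain Python
    def njit(func):
        return func


@njit
def gcd_and_bounds(a, b, length):
    window = slice(1, 10, 2)
    # indices() clamps the slice to a sequence of the given length.
    return math.gcd(a, b), window.indices(length)


print(gcd_and_bounds(12, 18, 6))
```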
Fixes:
* PR #4090: Update to LLVM8 memset/memcpy intrinsic
* PR #4582: Convert sub to add and div to mul when doing the reduction across
the per-thread reduction array.
* PR #4648: Handle 0 correctly as slice parameter.
* PR #4660: Remove multiply defined variables from all blocks' equivalence sets.
* PR #4672: Fix pickling of dufunc
* PR #4710: BUG: Comparison for literal
* PR #4718: Change get_call_table to support intermediate Vars.
* PR #4725: Requires llvmlite >=0.30.0
* PR #4734: prefer to import from collections.abc
* PR #4736: fix flake8 errors
* PR #4776: Fix and enable idle tests from test_array_manipulation
* PR #4777: Fix transpose output array layout
* PR #4782: Fix issue with SVML (and knock-on function resolution effects).
* PR #4785: Treat 0d arrays like scalars.
* PR #4787: fix missing incref on flags
* PR #4789: fix typos in numba/targets/base.py
* PR #4791: fix typos
* PR #4811: fix spelling in now-failing tests
* PR #4852: windowing test should check equality only up to double precision
errors
* PR #4881: fix refining list by using extend on an iterator
* PR #4882: Fix return type in arange and zero step size handling.
* PR #4885: suppress spurious RuntimeWarning about ufunc sizes
* PR #4891: skip the xfail test for now. Py3.8 CFG refactor seems to have
changed the test case
* PR #4892: regex needs to accept singular form of "argument"
* PR #4901: fix typed list equals
* PR #4909: Fix some spelling typos
* PR #4911: np.interp bugfix for float32 handling
* PR #4920: fix creating list with JIT disabled
* PR #4921: fix creating dict with JIT disabled
* PR #4935: Better handling of prange with multiple reductions on the same
variable.
* PR #4946: Improve the error message for `raise <string>`.
* PR #4955: Move overload of literal_unroll to avoid circular dependency that
breaks Python 2.7
* PR #4962: Fix test error on windows
* PR #4973: Fixes a bug in the relabelling logic in literal_unroll.
* PR #4978: Fix overload_method problem with stararg
* PR #4981: Add ind_to_const to enable fewer equivalence classes.
* PR #4991: Continuation of #4588 (Let dead code removal handle removing more of
the unneeded code after prange conversion to parfor)
* PR #4994: Remove xfail for test which has since had underlying issue fixed.
* PR #5018: Fix #5011.
* PR #5019: skip pycc test on Python 3.8 + macOS because of distutils issue
CUDA Enhancements/Fixes:
* PR #4629: Make numba.dummyarray.Array iterable
* PR #4675: Bump cuda array interface to version 2
* PR #4741: Update choosing the "CUDA_PATH" for windows
* PR #4838: Permit ravel('A') for contig device arrays in CUDA target
* PR #4931: Add physical limits for CC 7.0 / 7.5 to autotune
* PR #4934: Fix fails in TestCudaWarpOperations
* PR #4938: Improve errors / warnings for cuda vectorize decorator
Documentation Updates:
* PR #4418: Directed graph task roadmap
* PR #4578: Clarify numba ufunc supported features
* PR #4655: fix sphinx build warning
* PR #4667: Fix typo on urem implementation
* PR #4669: Add link to ParallelAccelerator paper.
* PR #4703: Fix numba.jit parameter name signature_or_function
* PR #4862: Addition of PyCon India 2019 talk on Numba
* PR #4947: Document jitclass with numba.typed use.
* PR #4958: Add docs for `try..except`
* PR #4993: Update deprecations for 0.47
CI Updates:
* PR #4737: remove maxParallel from Azure Pipelines
* PR #4767: pin to 2.7.16 for py27 on osx
* PR #4781: WIP/runtest cf pytest
Authors:
* Aaron Meurer
* Ankit Mahato
* Brian Wignall
* Denis Smirnov
* Ehsan Totoni (core dev)
* Elena Totmenina
* Eric Larson
* Ethan Pronovost
* Giovanni Cavallin
* Graham Markall
* Guilherme Leobas
* Isaac Virshup
* James Bourbeau
* Leo Fang
* Marc Garcia
* Marcelo Duarte Trevisani
* Matt Cooper
* Matti Picus
* Rob Ennis
* Rujal Desai
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* VDimir
* Valentin Haenel (core dev)
* Vyacheslav Smirnov
Version 0.46.0
--------------
This release significantly reworked one of the main parts of Numba, the compiler
pipeline, to make it more extensible and easier to use. The purpose of this was
to continue enhancing Numba's ability for use as a compiler toolkit. In a
similar vein, Numba now has an extension registration mechanism to allow other
Numba-using projects to automatically have their Numba JIT compilable functions
discovered. There were also a number of other related compiler toolkit
enhancements added, along with some more NumPy features and a lot of bug fixes.
This release has updated the CUDA Array Interface specification to version 2,
which clarifies the ``strides`` attribute for C-contiguous arrays and specifies
the treatment for zero-size arrays. The implementation in Numba has been
changed and may affect downstream packages relying on the old behavior
(see issue #4661).
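As a rough illustration of the ``strides`` clarification, the sketch below
assumes the version 2 reading that a missing or ``None`` ``strides`` entry means
the array is C-contiguous, so a consumer can derive byte strides from ``shape``
and the item size. The helper names ``c_contiguous_strides`` and
``effective_strides`` are hypothetical and are not part of Numba.

```python
def c_contiguous_strides(shape, itemsize):
    """Return byte strides for a C-contiguous array of the given shape."""
    strides = []
    step = itemsize
    # Walk the dimensions from last to first: the last axis moves by one
    # item, and each earlier axis moves by the size of everything after it.
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def effective_strides(interface, itemsize):
    """Resolve strides from an interface-like dict (hypothetical helper).

    Assumption: a missing or None "strides" entry denotes C-contiguous data.
    """
    strides = interface.get("strides")
    if strides is None:
        return c_contiguous_strides(interface["shape"], itemsize)
    return tuple(strides)


# A 3x4 array of 8-byte items with no explicit strides.
iface = {"shape": (3, 4), "strides": None, "version": 2}
print(effective_strides(iface, itemsize=8))
```

Downstream packages that previously assumed ``strides`` was always a tuple
would need a fallback of this kind.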
Enhancements from user contributed PRs (with thanks!):
* Aaron Meurer fixed some Python issues in the code base in #4345 and #4341.
* Ashwin Srinath fixed a CUDA performance bug via #4576.
* Ethan Pronovost added support for triangular indices functions in #4601 (the
NumPy functions ``tril_indices``, ``tril_indices_from``, ``triu_indices``, and
``triu_indices_from``).
* Gerald Dalley fixed a tear down race occurring in Python 2.
* Gregory R. Lee fixed the use of deprecated ``inspect.getargspec``.
* Guilherme Leobas contributed five PRs, adding support for ``np.append`` and
``np.count_nonzero`` in #4518 and #4386. The typed List was fixed to accept
unsigned integers in #4510. #4463 made a fix to NamedTuple internals and #4397
updated the docs for ``np.sum``.
* James Bourbeau added a new feature to permit the automatic application of the
``jit`` decorator to a whole module in #4331. Some small fixes to the docs
and the code base were also made in #4447 and #4433, along with a fix to
in-place array operations in #4228.
* Jim Crist fixed a bug in the rendering of patched errors in #4464.
* Leo Fang updated the CUDA Array Interface contract in #4609.
* Pearu Peterson added support for Unicode based NumPy arrays in #4425.
* Peter Andreas Entschev fixed a CUDA concurrency bug in #4581.
* Lucio Fernandez-Arjona extended Numba's ``np.sum`` support to now accept the
``dtype`` kwarg in #4472.
* Pedro A. Morales Maries added support for ``np.cross`` in #4128 and also added
the necessary extension ``numba.numpy_extensions.cross2d`` in #4595.
* David Hoese, Eric Firing, Joshua Adelman, and Juan Nunez-Iglesias all made
documentation fixes in #4565, #4482, #4455, and #4375, respectively.
* Vyacheslav Smirnov and Rujal Desai enabled support for ``count()`` on unicode
strings in #4606.
General Enhancements:
* PR #4113: Add rewrite for semantic constants.
* PR #4128: Add np.cross support
* PR #4162: Make IR comparable and legalize it.
* PR #4208: R&D inlining, jitted and overloaded.
* PR #4331: Automatic JIT of called functions
* PR #4353: Inspection tool to check what numba supports
* PR #4386: Implement np.count_nonzero
* PR #4425: Unicode array support
* PR #4427: Entrypoints for numba extensions
* PR #4467: Literal dispatch
* PR #4472: Allow dtype input argument in np.sum
* PR #4513: New compiler.
* PR #4518: add support for np.append
* PR #4554: Refactor NRT C-API
* PR #4556: 0.46 scheduled deprecations
* PR #4567: Add env var to disable performance warnings.
* PR #4568: add np.array_equal support
* PR #4595: Implement numba.cross2d
* PR #4601: Add triangular indices functions
* PR #4606: Enable support for count() method for unicode string
Fixes:
* PR #4228: Fix inplace operator error for arrays
* PR #4282: Detect and raise unsupported on generator expressions
* PR #4305: Don't allow the allocation of mutable objects written into a
container to be hoisted.
* PR #4311: Avoid deprecated use of inspect.getargspec
* PR #4328: Replace GC macro with function call
* PR #4330: Loosen up typed container casting checks
* PR #4341: Fix some coding lines at the top of some files (utf8 -> utf-8)
* PR #4345: Replace "import \*" with explicit imports in numba/types
* PR #4346: Fix incorrect alg in isupper for ascii strings.
* PR #4349: test using jitclass in typed-list
* PR #4361: Add allocation hoisting info to LICM section at diagnostic L4
* PR #4366: Offset search box to avoid wrapping on some pages with Safari.
Fixes #4365.
* PR #4372: Replace all "except BaseException" with "except Exception".
* PR #4407: Restore the "free" conda channel for NumPy 1.10 support.
* PR #4408: Add lowering for constant bytes.
* PR #4409: Add exception chaining for better error context
* PR #4411: Name of type should not contain user facing description for debug.
* PR #4412: Fix #4387. Limit the number of return types for recursive functions
* PR #4426: Fixed two module teardown races in py2.
* PR #4431: Fix and test numpy.random.random_sample(n) for np117
* PR #4463: NamedTuple - Raises an error on non-iterable elements
* PR #4464: Add a newline in patched errors
* PR #4474: Fix liveness for remove dead of parfors (and other IR extensions)
* PR #4510: Make List.__getitem__ accept unsigned parameters
* PR #4512: Raise specific error at typing time for iteration on >1D array.
* PR #4532: Fix static_getitem with Literal type as index
* PR #4547: Update to inliner cost model information.
* PR #4557: Use specific random number seed when generating arbitrary test data
* PR #4559: Adjust test timeouts
* PR #4564: Skip unicode array tests on ppc64le that trigger an LLVM bug
* PR #4621: Fix packaging issue due to missing numba/cext
* PR #4623: Fix issue 4520 due to storage model mismatch
* PR #4644: Updates for llvmlite 0.30.0
CUDA Enhancements/Fixes:
* PR #4410: Fix #4111. cudasim mishandling recarray
* PR #4576: Replace use of `np.prod` with `functools.reduce` for computing size
from shape
* PR #4581: Prevent taking the GIL in ForAll
* PR #4592: Fix #4589. Just pass NULL for b2d_func for constant dynamic
sharedmem
* PR #4609: Update CUDA Array Interface & Enforce Numba compliance
* PR #4619: Implement math.{degrees, radians} for the CUDA target.
* PR #4675: Bump cuda array interface to version 2
Documentation Updates:
* PR #4317: Add docs for ARMv8/AArch64
* PR #4318: Add supported platforms to the docs. Closes #4316
* PR #4375: Add docstrings to inspect methods
* PR #4388: Update Python 2.7 EOL statement
* PR #4397: Add note about np.sum
* PR #4447: Minor parallel performance tips edits
* PR #4455: Clarify docs for typed dict with regard to arrays
* PR #4482: Fix example in guvectorize docstring.
* PR #4541: fix two typos in architecture.rst
* PR #4548: Document numba.extending.intrinsic and inlining.
* PR #4565: Fix typo in jit-compilation docs
* PR #4607: add dependency list to docs
* PR #4614: Add documentation for implementing new compiler passes.
CI Updates:
* PR #4415: Make 32bit incremental builds on linux not use free channel
* PR #4433: Removes stale azure comment
* PR #4493: Fix Overload Inliner wrt CUDA Intrinsics
* PR #4593: Enable Azure CI batching
Contributors:
* Aaron Meurer
* Ashwin Srinath
* David Hoese
* Ehsan Totoni (core dev)
* Eric Firing
* Ethan Pronovost
* Gerald Dalley
* Gregory R. Lee
* Guilherme Leobas
* James Bourbeau
* Jim Crist
* Joshua Adelman
* Juan Nunez-Iglesias
* Leo Fang
* Lucio Fernandez-Arjona
* Pearu Peterson
* Pedro A. Morales Maries
* Peter Andreas Entschev
* Rujal Desai
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* Vyacheslav Smirnov
Version 0.45.1
--------------
This patch release addresses some regressions reported in the 0.45.0 release and
adds support for NumPy 1.17:
* PR #4325: accept scalar/0d-arrays
* PR #4338: Fix #4299. Parfors reduction vars not deleted.
* PR #4350: Use process level locks for fork() only.
* PR #4354: Try to fix #4352.
* PR #4357: Fix np1.17 isnan, isinf, isfinite ufuncs
* PR #4363: Fix np.interp for np1.17 nan handling
* PR #4371: Fix np1.17 random function non-aliasing
Contributors:
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Valentin Haenel (core dev)
Version 0.45.0
--------------
In this release, Numba gained an experimental :ref:`numba.typed.List
<feature-typed-list>` container as a future replacement of the :ref:`reflected
list <feature-reflected-list>`. In addition, functions decorated with
``parallel=True`` can now be cached to reduce compilation overhead associated
with the auto-parallelization.
Enhancements from user contributed PRs (with thanks!):
* James Bourbeau added the Numba version to reportable error messages in #4227,
added the ``signature`` parameter to ``inspect_types`` in #4200, improved the
docstring of ``normalize_signature`` in #4205, and fixed #3658 by adding
reference counting to ``register_dispatcher`` in #4254.
* Guilherme Leobas implemented the dominator tree and dominance frontier
algorithms in #4216 and #4149, respectively.
* Nick White fixed the issue with ``round`` in the CUDA target in #4137.
* Joshua Adelman added support for determining if a value is in a ``range``
(i.e. ``x in range(...)``) in #4129, and added windowing functions
(``np.bartlett``, ``np.hamming``, ``np.blackman``, ``np.hanning``,
``np.kaiser``) from NumPy in #4076.
* Lucio Fernandez-Arjona added support for ``np.select`` in #4077.
* Rob Ennis added support for ``np.flatnonzero`` in #4157.
* Keith Kraus extended the ``__cuda_array_interface__`` with an optional mask
attribute in #4199.
* Gregory R. Lee replaced deprecated use of ``inspect.getargspec`` in #4311.
General Enhancements:
* PR #4328: Replace GC macro with function call
* PR #4311: Avoid deprecated use of inspect.getargspec
* PR #4296: Slacken window function testing tol on ppc64le
* PR #4254: Add reference counting to register_dispatcher
* PR #4239: Support len() of multi-dim arrays in array analysis
* PR #4234: Raise informative error for np.kron array order
* PR #4232: Add unicodetype db, low level str functions and examples.
* PR #4229: Make hashing cacheable
* PR #4227: Include numba version in reportable error message
* PR #4216: Add dominator tree
* PR #4200: Add signature parameter to inspect_types
* PR #4196: Catch missing imports of internal functions.
* PR #4180: Update use of unlowerable global message.
* PR #4166: Add tests for PR #4149
* PR #4157: Support for np.flatnonzero
* PR #4149: Implement dominance frontier for SSA for the Numba IR
* PR #4148: Call branch pruning in inline_closure_call()
* PR #4132: Reduce usage of inttoptr
* PR #4129: Support contains for range
* PR #4112: better error messages for np.transpose and tuples
* PR #4110: Add range attrs, start, stop, step
* PR #4077: Add np select
* PR #4076: Add numpy windowing functions support (np.bartlett, np.hamming,
np.blackman, np.hanning, np.kaiser)
* PR #4095: Support ir.Global/FreeVar in find_const()
* PR #3691: Make TypingError abort compiling earlier
* PR #3646: Log internal errors encountered in typeinfer
Fixes:
* PR #4303: Work around scipy bug 10206
* PR #4302: Fix flake8 issue on master
* PR #4301: Fix integer literal bug in np.select impl
* PR #4291: Fix pickling of jitclass type
* PR #4262: Resolves #4251 - Fix bug in reshape analysis.
* PR #4233: Fixes issue revealed by #4215
* PR #4224: Fix #4223. Looplifting error due to StaticSetItem in objectmode
* PR #4222: Fix bad python path.
* PR #4178: Fix unary operator overload, check with unicode impl
* PR #4173: Fix return type in np.bincount with weights
* PR #4153: Fix slice shape assignment in array analysis
* PR #4152: fix status check in dict lookup
* PR #4145: Use callable instead of checking __module__
* PR #4118: Fix inline assembly support on CPU.
* PR #4088: Resolves #4075 - parfors array_analysis bug.
* PR #4085: Resolves #3314 - parfors array_analysis bug with reshape.
CUDA Enhancements/Fixes:
* PR #4199: Extend `__cuda_array_interface__` with optional mask attribute,
bump version to 1
* PR #4137: CUDA - Fix round Builtin
* PR #4114: Support 3rd party activated CUDA context
Documentation Updates:
* PR #4317: Add docs for ARMv8/AArch64
* PR #4318: Add supported platforms to the docs. Closes #4316
* PR #4295: Alter deprecation schedules
* PR #4253: fix typo in pysupported docs
* PR #4252: fix typo on repomap
* PR #4241: remove unused import
* PR #4240: fix typo in jitclass docs
* PR #4205: Update return value order in normalize_signature docstring
* PR #4237: Update doc links to point to latest not dev docs.
* PR #4197: hyperlink repomap
* PR #4170: Clarify docs on accumulating into arrays in prange
* PR #4147: fix docstring for DictType iterables
* PR #3951: A guide to overloading
CI Updates:
* PR #4300: AArch64 has no faulthandler package