inference-directed codegen via CodeInstance edges and typeinf_ext_toplevel APIs #56880
base: master
Conversation
Sometimes an edge (especially one from a precompile file, but sometimes one from inference) will specify a CodeInstance that does not need to be compiled for its ABI and simply needs to be cloned to point to the existing copy of it.
…tandard helper function
This was failing the h_world_age test sometimes.
This avoids unnecessary compression when running (not generating code). While generating code, we keep the legacy behavior of storing compressed code, since restarting from a ji file without it is quite slow. Eventually, we should remove that code too, once the object file has been generated from it.

This replaces the defective SOURCE_MODE_FORCE_SOURCE option with a new `typeinf_ext_toplevel` batch-mode interface for compilation, which returns all required source code. Only two options remain now:

- SOURCE_MODE_NOT_REQUIRED: require only that the IPO information (e.g. rettype and friends) is present.
- SOURCE_MODE_FORCE_ABI: require that the IPO information is present (for ABI computation) and that the returned CodeInstance can be invoked on the host target (preferably after inference, called directly, but it is perfectly acceptable for Base.Compiler to instead force the runtime to use a stub there, or to call into it with the interpreter, by having failed to provide any code).

This replaces the awkward `jl_create_native` interface (which is now just a shim for calling the new batch-mode `typeinf_ext_toplevel`) with a simpler `jl_emit_native` API, which does not do any inference or other callbacks, but is simply a batch-mode call to `jl_emit_codeinfo` plus the work to build the external wrappers around the results for linkage.
```c
ct->world_age = jl_typeinf_world;
codeinfos = (jl_array_t*)jl_apply(fargs, 4);
ct->world_age = last_age;
```
```c
#define jl_is_array_any(v) jl_typetagis(v,jl_array_any_type)
```
Move to header?
```c
// also be used by external consumers like GPUCompiler.jl to obtain a module containing
// all reachable & inferrable functions.
void *jl_emit_native_impl(jl_array_t *codeinfos, LLVMOrcThreadSafeModuleRef llvmmod, const jl_cgparams_t *cgparams, int _external_linkage)
```
Mark this DLLEXPORT?
I think from a GPUCompiler perspective we would also be okay with an interface that just takes a single CodeInstance + CodeInfo.
This builds upon the many efforts around using CodeInstance everywhere (especially as the return value from `jl_type_infer` and the input format to edges) by moving a lot of hard-coded algorithms that were previously in C (such as `recursive_compile_graph` and `jl_ci_cache_lookup`), and which were therefore also previously slightly broken (especially in concurrent environments), into Julia's Compiler.jl code, where we can most likely maintain them much better going forward. See the descriptions in the individual commits for some of the specifics of the changes and fixes, and for how to change existing code to use these APIs correctly. In followup stages, most of the code relevant to precompile_utils, trim, and even the allocation-checker should be considered for moving into Julia as well, since being written in C/C++ currently provides negative value for maintaining those, and the change in the API boundary should now make that additional conversion easier.

Gives a considerably smaller system image, despite having more code, by using better algorithms which avoid allocating permanent garbage: 155 MB -> 147 MB in `.text`.
Makes a slightly larger Pkg.ji file cache, hopefully mainly due to being more strategic about what code is compiled (because that logic now mostly lives in Julia instead of C), as it appears to have both inferred and compiled about 10% more code according to a high-level analysis of it:
```
$ du -sh usr/share/julia/compiled/v1.12/
237M # on PR
222M # on master
```