Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM2 by default #155

Open
24 of 42 tasks
Tracked by #145
Emoun opened this issue Sep 19, 2024 · 20 comments
Open
24 of 42 tasks
Tracked by #145

LLVM2 by default #155

Emoun opened this issue Sep 19, 2024 · 20 comments

Comments

@Emoun
Copy link
Member

Emoun commented Sep 19, 2024

This issue tracks changing our whole setup to use the new compiler by default.

Things that need doing (This is an ordered list that should mostly be done in-order):

  • 1) Testing of toolchain2 command in misc/build.sh by key persons. Each should read this list and test toolchain2 and only flag their name if everything worked as expected and they agree with the following plan.
  • 2) Tagging of master branches as llvm1 in repos to ensure old compiler may still be built. Affected repos:
    • newlib
    • benchmarks
    • patmos
  • 3) Merge llvm2 branches into respective masters on the following repos:
    • newlib
    • benchmarks
    • patmos
  • 4) Update LLVM2 CI to use merged commits:
    • newlib
    • benchmarks
  • 5) Update misc/build.sh to default to toolchain2:
    • Make ARM Mac use x86 prebuilt binary until update of llvm version (v12 to v19)
    • When no target is given, the default target list should be equivalent to toolchain2
    • The check for whether patmos uses llvm2 branch should be switched to instead check for llvm1 and switch to that tag.
    • A new target should be added for using the old compiler (toolchain1). This target should switch newlib and patmos to the llvm1 tag.
    • The llvm target should default to downloading the new compiler repo. A new llvm1 target should be added for downloading the old compiler repo.
  • 6) Testing that the default misc/build.sh works for key persons:
  • 7) Update the handbook to reflect the change
    • The section on llvm2 should be moved to be default and replaced with one referencing llvm1
    • Updated handbook needs to be downloadable at https://patmos.compute.dtu.dk/patmos_handbook.pdf
    • Also update the README in patmos
@Emoun
Copy link
Member Author

Emoun commented Sep 19, 2024

If anyone has an issue that is a blocker for switching to the new compiler, comment in this issue.

@Emoun
Copy link
Member Author

Emoun commented Sep 19, 2024

From @torurstrom:

There is a small issue regarding compiling/linking on llvm2. In patmos try
make comp APP=argo2_test

The file
cmp/nocinit.c
defines several constants that are used in the mp and noc libraries, and on master these are correctly linked, but on llvm2 this doesn't work for some reason.

@Emoun
Copy link
Member Author

Emoun commented Sep 23, 2024

@torurstrom I looked into the problem with argo2_test and looking at this line in the noc lib, what seems to be missing is integration with poseidon, that should suply the missing symbols.

I don't know how this was done in master and whether the llvm2 branch messed with something.

@torurstrom
Copy link
Contributor

@Emoun the file nocinit.c is generated by Argo.scala and for a 6 core with argo looks like this:

/**
 * AUTO-Generated file DO NOT EDIT!!!
 * Loads the pre calculated schedule into the Slot and Route tables.
 */

const int NOC_CONFS = 1;
const int NOC_CORES = 6;
const int NOC_TABLES = 2;
const int NOC_SCHEDULE_ENTRIES = 6;

const int noc_init_array[NOC_CONFS][NOC_CORES][NOC_TABLES][NOC_SCHEDULE_ENTRIES] = {
	{
		{
			{5,3408930,1836325,525090,852258,459298},
			{5,0,0,0,0,0}
		},
		{
			{5,3409186,1835813,852514,458786,525346},
			{5,0,0,0,0,0}
		},
		{
			{5,3408674,1836069,459042,525602,852002},
			{5,0,0,0,0,0}
		},
		{
			{5,1835554,3408165,524322,853026,460066},
			{5,0,0,0,0,0}
		},
		{
			{5,1835042,3408421,853282,459554,524578},
			{5,0,0,0,0,0}
		},
		{
			{5,1835298,3407909,459810,524834,852770},
			{5,0,0,0,0,0}
		}
	}
};

the Makefile in the c folder then uses it via the NOCINIT variable:

$(BUILDDIR)/%.elf: %.c $(NOCINIT) $(LIBMP) $(LIBNOC) $(LIBCORETHREAD) $(LIBETH) $(LIBSD) $(LIBAUDIO) $(LIBELF) Makefile
	mkdir -p $(BUILDDIR)/$(dir $*)
	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(filter %.c %.s,$^) -Wl,-L$(BUILDDIR),-lmp,-lnoc,-lcorethread,-leth,-lelf,-lsd,-laudio

For whatever reason this no longer works.

@schoeberl
Copy link
Member

schoeberl commented Nov 27, 2024

I tried llvm2 on MacOS/ARM:
compile native fails
download with -q, compiling hello world gives the following error:

Error: Block without region: sw.epilog7351102[186], SCCSize: 0
Assertion failed: (UnassignedBlocks == 0), function computeRegions, file PatmosFunctionSplitter.cpp, line 1381.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:

I get a similar error on an Ubuntu 22.04 (in AWS) when installing it with -q

Error: Block without region: sw.epilog7351102[186], SCCSize: 0
patmos-llc: /home/runner/work/patmos-llvm-project/patmos-llvm-project/llvm/lib/Target/Patmos/PatmosFunctionSplitter.cpp:1381: void llvm::agraph::computeRegions(llvm::ablocks&): Assertion `UnassignedBlocks == 0' failed.

The reason is that -O2 is broken. Not a kill to stop migration, but something to look at some time. See also: t-crest/patmos-llvm-project#14

@schoeberl
Copy link
Member

schoeberl commented Nov 28, 2024

My mistake to try llvm2 and not toolchain2. How did we get any GNU specific tar options in?

Error: Must have GNU Tar installed as 'tar'.
Error: Your installed 'tar':
Error: bsdtar 3.5.3 - libarchive 3.5.3 zlib/1.2.12 liblzma/5.4.3 bz2lib/1.0.8 
Error: Hint: if you are using macos, bsdtar is the default 'tar' installed. You can use brew to get GNU tar instead

Usually I do not like to add gnu specific tools to mu Unix version, but I will try.

After installing tar the compiler builds, but errors when building newlib:

../../../../patmos-newlib/newlib/doc/makedoc.c:1316:31: error: incompatible integer to pointer conversion passing 'long' to parameter of type 'stinst_type' (aka 'void (*)()') [-Wint-conversion]
                     add_to_definition(ptr, atol(word));

Now it works on my side. Ignore that one here.

@Emoun
Copy link
Member Author

Emoun commented Nov 28, 2024

@schoeberl I'm currently working on updating the compiler to LLVM 19. This is taking a long time but should resolve some problems we have on newer platforms (Mac ARM, ubuntu 24). I have no ETA on when this is done.

To install GNU Tar, look at this part of the CI. We have to use GNU Tar as part of packaging, because bsdtar (mac default) does not have the features needed. Also, the github MacOS environment now comes with GNU Tar by default and the README highlights this requirement. I realize it is suboptimal to depend on a specific tar (GNU tar vs bsd tar) but I have not found a way to achieve what is needed in a portable way.

@Emoun
Copy link
Member Author

Emoun commented Nov 28, 2024

Since the update to LLVM19 is taking longer than hoped, I'm thinking of just making LLVM2 default now without the update. Please let me know if you got it compiling/working on you machine, then I might do the merge today or tomorrow.

@schoeberl
Copy link
Member

As I have not runnig any version on my Mac/ARM, I will not oppose it. The old -q download that worked on my Mac (some weeks/months) ago is gone. So making llvm2 default or not makes no difference on Mac/ARM. Then let's go ahead, especially as you plan to preserve the old compiler for some time. I will now run on helena to check, as she is up again.

@schoeberl
Copy link
Member

Success! toolchain2 is working on Helena and on my Mac/ARM! The issue was the missing -O2. See also: t-crest/patmos-llvm-project#14

@Emoun
Copy link
Member Author

Emoun commented Dec 2, 2024

LLVM2 has been made default in all repositories. I have tested on the compute cluster that a clean build produces the new compiler by default. I'm currently testing that the old compiler may be used with the command ./misc/build.sh toolchain1.

@schoeberl @lucapezza @torurstrom @tjarker Please test at your convenience. Testing ideas:

  • Clean build from empty t-crest folder using ./misc/build.sh
  • Clean quick build from empty t-crest folder using ./misc/build.sh -q
  • Clean build of the old compiler toolchain using the same as above two commands with toolchain1 added
  • Make sure you can synthesize patmos and psuh to fpga

Please use the new compiler for all things you need. When you feed everything is working as expected, please but a checkout by your name under point 6 in the above list.

@schoeberl
Copy link
Member

Native build on Mac/ARM fails with the newlib error, as reported above.
Build with -q somehow works, but fails with Patmos as it needs poseidon, which should be built before Patmos. I fixed the build.sh. Otherwise I am fine on my Mac.

@schoeberl
Copy link
Member

After the build.sh fix, it also compiled and run fine on Helena (Ubuntu 20.04).
BTW, I changed the Patmos default configuration to 4 cores, as we should do more multicore work.

@Emoun
Copy link
Member Author

Emoun commented Dec 3, 2024

Native build on Mac/ARM fails with the newlib error, as reported above.

@schoeberl Is it the issue from this comment? Have you built from scratch? If not, could you try to go into the llvm-project/patmos-newlib folder and do git pull and then try to build again?

@schoeberl
Copy link
Member

I have built from scratch. Yes, it is that comment.

@Emoun
Copy link
Member Author

Emoun commented Dec 5, 2024

@schoeberl That line has been fixed with an explicit cast. From your comment, I do not see the explicit cast, which makes me think you somehow are using an outdated version of newlib. Try deleting the newlib folder llvm-project/patmos-newlib and newlib build folder llvm-project/build-newlib and run the build script again

@schoeberl
Copy link
Member

I will try a new build. After using '-q' I do not have llvm-project.

@schoeberl
Copy link
Member

That is super strange, as I still get the error:

../../../../patmos-newlib/newlib/doc/makedoc.c:1316:31: error: incompatible integer to pointer conversion passing 'long' to parameter of type 'stinst_type' (aka 'void (*)()') [-Wint-conversion]
                     add_to_definition(ptr, atol(word));

And it is a clean build.sh from a plain git clone of misc.

@Emoun
Copy link
Member Author

Emoun commented Dec 12, 2024

Could you maybe send me the whole build log? Run the build with ./misc/build.sh > log.txt 2>&1 and then send me the log

@schoeberl
Copy link
Member

Fixing the ceckout of newlib in build.sh fixed the problem above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants