Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Add scripts to run samples automatically #829

Merged
merged 4 commits into from
Sep 23, 2020
Merged

Conversation

AntonioND
Copy link
Contributor

@AntonioND AntonioND commented Sep 15, 2020

The scripts are a mix of bash and a bit of GNU expect for flexibility, as each sample needs to be tested in slightly different ways.

Each sample must have a test.sh script in its folder so that the CI detects it and runs it. They are executed by run_sample.sh script, created using run_test.sh as an example.

This PR adds several samples to the CI, but not all of them:

  • The ml folder samples, for example, take far too long to be part of a regular CI run, and need caching of docker images, which is a task on its own.

  • The openmp and nodejs samples seem to be broken.

This PR also fixes the following samples:

  • languages/dotnet: The following error happened when trying to run it:
FailFast:
Couldn't find a valid ICU package installed on the system. Set the configuration flag
System.Globalization.Invariant to true if you want to run with no globalization support.

By adding a new file with this configuration flag set to true, the sample works again. dotnet/core#2186 (comment)

  • basic/helloworld: This PR also simplifies the sample so that it matches the helloworld test.

  • languages/java: When the number of ethreads is 1 the java app hangs after printing the message. With a higher number of ethreads it exits as expected.

This is a partial fix for #231, and is intended to make #332 easier by adding this new bash/expect framework.

@AntonioND AntonioND force-pushed the anninodi/samples branch 2 times, most recently from 31eb4b8 to 31143f5 Compare September 15, 2020 14:30
@AntonioND
Copy link
Contributor Author

I need to make this work in the CI, it looks like the build output isn't saved where I expected it to be.

@AntonioND AntonioND force-pushed the anninodi/samples branch 7 times, most recently from 7c674b3 to 04e98b9 Compare September 16, 2020 14:06
@AntonioND
Copy link
Contributor Author

So now the 8 samples I've added to the CI pass (the CI failures are not this PR's fault).

They also pass in my VM (with run-hw and run-sw):

run_mode=run-sw \
SGXLKL_ETHREADS=1 \
SGXLKL_BUILD_MODE=debug \
SGXLKL_ROOT=~/sgx-lkl \
.azure-pipelines/scripts/test_runner.sh samples  

Copy link
Contributor

@SeanTAllen SeanTAllen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to say based on the number of things with 20 minute time outs that if 1 or 2 or these hang that not all of them will be run before the CI run is terminated?

samples/basic/attack/read_memory.sh Outdated Show resolved Hide resolved
samples/containers/encrypted/test.sh Show resolved Hide resolved
samples/containers/encrypted/test.sh Outdated Show resolved Hide resolved
samples/containers/redis/README.md Outdated Show resolved Hide resolved
samples/languages/dotnet/README.md Show resolved Hide resolved
samples/languages/java/README.md Outdated Show resolved Hide resolved
samples/languages/java/test.sh Outdated Show resolved Hide resolved
@SeanTAllen
Copy link
Contributor

@AntonioND rather than using expect, would it be reasonable to merely check the output value like we do with existing tests?

@AntonioND
Copy link
Contributor Author

AntonioND commented Sep 16, 2020

@AntonioND rather than using expect, would it be reasonable to merely check the output value like we do with existing tests?

That's not possible in all cases. For example, the language/java sample doesn't exit after it's done printing the demo message, I have to wait until it's printed and then send a "^C" for the test to finish. For the basic/attack sample you have to wait until the initial process is running and it prints "I'm ready to be attacked", attack it from a second process, and send a "\n" to the initial one to end. I tried doing this by having a countdown in the initial process and launching the second one at the right time, but it wasn't reliable at all. It is pretty reliable with expect.

EDIT: Maybe that behaviour of the java sample is a bug that could be added to the issue I'm going to create to fix the other samples...

@SeanTAllen
Copy link
Contributor

It is pretty reliable with expect.

That sounds like it is flakey. Are you saying it should be expected to fail sometimes with expect?

@AntonioND
Copy link
Contributor Author

That sounds like it is flakey. Are you saying it should be expected to fail sometimes with expect?

I can't be 100% sure it is always going to work, specially after seeing that some samples (like the java one) don't exit when I would expect them to.

Regarding the ones I'm testing with expect, I've never seen them fail so far. Of course, if someone changes the text output of the demo code the expect script will break, that's what I was thinking about when I wrote "pretty reliable".

In general I agree that expect is not a nice thing to use, but in the case of basic/attack I don't know how to do it in a different way, and in the case of languages/java, assuming that the hangup is not a bug (I don't know the expected behaviour after printing the test output), I don't know how to make the test end either.

@AntonioND
Copy link
Contributor Author

Would it be reasonable to say based on the number of things with 20 minute time outs that if 1 or 2 or these hang that not all of them will be run before the CI run is terminated?

I've also fixed this, now there is only one sample that has that timeout. It's not needed when I run the CI, but it is needed when I run the test in my VM.

@AntonioND AntonioND force-pushed the anninodi/samples branch 3 times, most recently from 4d9e471 to 1cb8976 Compare September 17, 2020 09:10
@AntonioND
Copy link
Contributor Author

I've left a note in the languages/java sample that mentions that the sample isn't supposed to need expect scripts, and to remove them when the issue is fixed. I also mention in the comment the GitHub issue that needs to be fixed.

Copy link

@wintersteiger wintersteiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be nice if we could get of expect, but I don't have a good alternative solution either.

samples/basic/helloworld/enclave_config.json Outdated Show resolved Hide resolved
When trying to run the dotnet sample the following error happened:

```
FailFast:
Couldn't find a valid ICU package installed on the system. Set the
configuration flag System.Globalization.Invariant to true if you want to
run with no globalization support.
```

By adding this file, the configuration flag mentioned in the error
message is set to true, and the sample works again. [1]

[1] dotnet/core#2186 (comment)
@AntonioND
Copy link
Contributor Author

So @prp mentioned to me that the java sample is failing because the number of ethreads was 1. I've increased the number of ethreads in that sample and now it's working fine and it doesn't need expect scripts. However, the basic/attack one still needs them (but it's the only sample that does).

@AntonioND AntonioND mentioned this pull request Sep 18, 2020
@SeanTAllen
Copy link
Contributor

LGTM other than the CI failures which I believe are unrelated.

@vtikoo vtikoo requested a review from hukoyu September 21, 2020 14:54
Copy link
Contributor

@vtikoo vtikoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@hukoyu hukoyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AntonioND overall looks good. I provided some feedback. After addressing them this PR is good to be merged.

.azure-pipelines/scripts/run_sample.sh Outdated Show resolved Hide resolved
.azure-pipelines/scripts/run_sample.sh Outdated Show resolved Hide resolved
make "$run_mode"
elif [[ "$test_mode" == "gettimeout" ]]; then
# Default
exit 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 1 second and it is too short to be default value. May be 60 seconds is better.

Copy link
Contributor Author

@AntonioND AntonioND Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this means "default". This is how it works in the case of the Makefiles of the tests. If there is no target called gettimeout, make will return 1. If the target is there, it prints the timeout length.

Take a look at tests/ltp/batch.mk, target gettimeout, for example.

samples/common.mk Outdated Show resolved Hide resolved
samples_dir=$(dirname $(realpath "$BASH_SOURCE"))
SGXLKL_ROOT=$(realpath "${samples_dir}/..")

if [[ -z "${SGXLKL_PREFIX}" ]]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already exported in Makefile. Do we need to export again? Also SGXLKL_SETUP_TOOL exists in common.sh and not in common.mk just FYI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some samples need this in the bash script because not all of the commands are run in the Makefile. For example, SGXLKL_SETUP_TOOL is needed in containers/redis/test.sh. A couple of them are needed in languages/python/test.sh.

I only added SGXLKL_SETUP_TOOL to the bash script because it's not needed in the Makefiles.

samples/containers/redis/run-redis-client.sh Outdated Show resolved Hide resolved
samples/containers/redis/test.sh Show resolved Hide resolved
samples/languages/dotnet/test.sh Show resolved Hide resolved
samples/languages/java/test.sh Show resolved Hide resolved
samples/languages/python/test.sh Show resolved Hide resolved
The helloworld demo used a really complicated system to build the elf
file on the host and then transfer it to the docker image. It's better
to simply build it inside docker, as the test does.

This commit copies Dockerfile, Makefile and helloworld.c from the test
to the sample. The Makefile has been modified to still use
enclave_config.json (the test wasn't using it).
When the number of ethreads is 1 the java app hangs after printing the
message. With a higher number of ethreads it exits as expected.
The scripts are a mix of bash and a bit of GNU expect for flexibility,
as each sample needs to be tested in slightly different ways.

Each sample must have a test.sh script in its folder so that the CI
detects it and runs it. They are executed by run_sample.sh script,
created using run_test.sh as an example.

This commit adds several samples to the CI, but not all of them:

- The ml folder samples, for example, take far too long to be part of a
  regular CI run, and need caching of docker images, which is a task on
  its own.

- The openmp and nodejs samples seem to be broken.
@AntonioND AntonioND force-pushed the anninodi/samples branch 2 times, most recently from fed1ece to 570c78a Compare September 23, 2020 09:58
@AntonioND
Copy link
Contributor Author

I'm going to merge this PR now after addressing the last few comments. I'm still going to be working on the samples for a while, so I'm happy to push a PR with more changes if needed.

@AntonioND AntonioND merged commit f8f09a3 into oe_port Sep 23, 2020
@AntonioND AntonioND deleted the anninodi/samples branch September 23, 2020 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants