-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add config option to deploy custom elastic-agents as test services #786
Conversation
🌐 Coverage report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach looks cleaner to me than the previous one.
Could you add an example test package to test/packages
that uses this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good, I added some comments, the only blocking one would be to decide how we want to configure the custom agent. Allowing any kind of docker-compose configuration may complicate maintainability of elastic-package
, see my comment about this.
case "agent": | ||
dockerComposeYMLPath := filepath.Join(serviceDeployerPath, "docker-compose.yml") | ||
if _, err := os.Stat(dockerComposeYMLPath); err != nil { | ||
return nil, errors.Wrap(err, "can't find expected file docker-compose.yml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Include the path where the file was expected to be found.
test/packages/with-custom-agent/apache/_dev/build/docs/README.md
Outdated
Show resolved
Hide resolved
environment: | ||
FLEET_ENROLL: "1" | ||
FLEET_INSECURE: "1" | ||
FLEET_URL: "http://fleet-server:8220" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, this makes me think that we might need some way to pass variables to these configs. These connection settings depend on the stack and how it is started. For example in #789 I am changing these options, and this would break scenarios with these environment variables.
An option can be to create a different configuration file, expected for example in _dev/deploy/agent/config.yml
, that is used to "patch" a base agent configuration. So for example in this case, it could be something like this:
image: "docker.elastic.co/beats/elastic-agent-complete:8.2.0"
pid: host
cap_add:
- AUDIT_CONTROL
- AUDIT_READ
user: root
This way every package using this deployer only needs to configure the minimal relevant set of settings. And we can control the settings needed for enrollment, or other settings that may be dependent of elastic-package stack
.
Another middle-ground option could be to use env_file
here to include an environment file that would be generated by the deployer, and that includes this kind of environment variables needed by agents to connect with the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you're referring to this issue (Extend "profiles" with local patches).
Do you think that we need a proposal to sketch the final look and then iterate on that? Maybe we should focus on this specific issue. It might be tricky if we want "patches" to be backward compatible with older stacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider this deployer a separate thing to possible general patches for profiles or the stack subcommand. I think that something like this "agent" deployer has value on itself. Even if later we consider more advanced configurations for profiles or the stack subcommand.
The problem I see with general local patches for elastic-package stack
is that they may become a source of many reproducibility problems. You may patch the stack to work on one package, and later you start working on a different package and something unexpectedly doesn't work. Or it may be difficult to know or remember that a certain package needs a patched stack.
This could cause a new set of problems similar to the ones with the packages contained in the registry depending on where elastic-package stack up
was executed (#599).
And as you mention it may be difficult to support patches with multiple versions of the stack in a general way.
I think it is fine to look for a way to start/patch specialized agents on test time as is being done in this PR. These agents are disposable and developers have more awareness on when they are being started. Packages could also select the version of the agent to use, limiting the problems of supporting multiple versions. Variants could help to test with multiple versions or configurations if there are differences.
And if some day we also have stack-level patches, I think it would be ok if the patches are different to the ones used for this "agent" deployer, at the end they are different things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to look for a way to start/patch specialized agents on test time as is being done in this PR.
What about unpatching? Did you plan for this too or is it like the Kubernetes agent, once installed it stays there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to look for a way to start/patch specialized agents on test time as is being done in this PR.
What about unpatching? Did you plan for this too or is it like the Kubernetes agent, once installed it stays there.
No need to unpatch. Current implementation starts this patched agent as a docker compose service, and destroys it on tear down. So it doesn't stay. I think this is a good approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(If I understand it correctly, please @marc-gr correct me if I am wrong 🙂 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is correct 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach :)
Added a couple of comments about implementation details.
internal/testrunner/runners/system/servicedeployer/custom_agent.go
Outdated
Show resolved
Hide resolved
internal/testrunner/runners/system/servicedeployer/custom_agent.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! I left a few comments, but maybe it's better to sync over zoom to explain the technical decisions. Up to you, Marc :)
environment: | ||
- FLEET_ENROLL=1 | ||
- FLEET_INSECURE=1 | ||
- FLEET_URL=http://fleet-server:8220 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this file, isn't FLEET_TOKEN_POLICY_NAME
not required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would only be needed if we want it to be enrolled to the current stack policy, but we are just interested in it during the test, so the policy will change right after for the test one, running the config useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're right! Could you please drop a comment there in case somebody else will come here?
internal/testrunner/runners/system/servicedeployer/custom-agent-base-config.yml
Outdated
Show resolved
Hide resolved
ExtraArgs: []string{"--build", "-d"}, | ||
} | ||
err = p.Up(opts) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "could not boot up service using Docker Compose") | ||
} | ||
|
||
// Connect service network with stack network (for the purpose of metrics collection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to move docker.ConnectToNetwork
before checking if the service is healthy? Do you think that it won't break if the service doesn't boot up in time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding it makes no difference for the other cases, and in this one is a requirement (healthcheck only passes once enrolled, and needs to be in the same network to be enrolled). Do you think of any specific scenarios where this might be troublesome?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but the compose.go
code is generic, so it may affect also other services like kind or docker based.
You could verify the behavior by simulating a broken Docker service, for instance: a container that immediately fails setup.
|
||
cd, err := newDockerComposeServiceDeployer( | ||
append(ymlPaths, path), | ||
appConfig.StackImageRefs(install.DefaultStackVersion).AsEnv(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean that it won't be possible to use other stack versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other stacks could be used by adding the image
option in the custom-agent.yml file or using service variants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: does it mean that it's possible to boot up Elastic stack 8.1 and the agent 8.3? I'm wondering if we need a helper to select the proper stack. See kind deployer
internal/testrunner/runners/system/servicedeployer/custom_agent.go
Outdated
Show resolved
Hide resolved
internal/testrunner/runners/system/servicedeployer/custom_agent.go
Outdated
Show resolved
Hide resolved
} | ||
|
||
for k, v := range cv { | ||
bc.Services["custom-agent"][k] = v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify a bit what is the purpose of the processing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we take all the custom options from custom-agent.yml
and use them to override the base config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the base config one from ~/.elastic-agent
?
|
||
cd, err := newDockerComposeServiceDeployer( | ||
append(ymlPaths, path), | ||
appConfig.StackImageRefs(install.DefaultStackVersion).AsEnv(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: does it mean that it's possible to boot up Elastic stack 8.1 and the agent 8.3? I'm wondering if we need a helper to select the proper stack. See kind deployer
return "", errors.Wrap(err, "marshal custom-agent config") | ||
} | ||
|
||
tf, err := os.CreateTemp("", dockerCustomAgentName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: There is ~/.elastic-package/tmp
directory and probably also a function to reach out to it.
} | ||
|
||
func createCustomAgentYaml(cfgPath string) (string, error) { | ||
bc := struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could you please replace shortcut-named vars with something meaningful? I admit that I got lost, cv, cb, bc, .. :)
This is useful if you need different capabilities than the provided by the | ||
`elastic-agent` used by the `elastic-package stack` command. | ||
|
||
`custom-agent.yml` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the custom-agent.yml
will be part of a package, we need to cover it with package-spec. You will need to open one more PR.
if len(outCtxt.Ports) > 0 { | ||
outCtxt.Port = outCtxt.Ports[0] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't remember if this condition is still required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we keep it then the port can be referenced from the test config and that can be useful in some cases, not a hard requirement though so we can remove it if there is any concern
For some integrations eg: auditbeat or winlogbeat dependant ones. Would be useful to define custom agents. This adds the ability to define them as a system test service and enroll them during the tests instead of the elastic-package stack ones.
Example test service definition:
Example service test config: