Create different agent per each test execution #1724

Merged
merged 120 commits into from
Apr 9, 2024

@mrodm mrodm commented Mar 13, 2024

This PR adds support for creating a new agent for each test run.

Created a new module agentdeployer that is in charge of creating agents separately from the servicedeployer module.

Added a new environment variable, ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT, that enables creating independent agents for system tests. To enable this feature, set the variable to true:

ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT=true elastic-package test system -v

ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT=true elastic-package test system --config-file <path> --setup --test-independent-agent
ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT=true elastic-package test system --test-independent-agent --no-provision
ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT=true elastic-package test system --test-independent-agent --tear-down
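
As a rough illustration of how a boolean opt-in variable like this can be read, here is a minimal Go sketch. The helper name independentAgentEnabled and the choice to treat unset or unparsable values as disabled are assumptions for illustration, not the code added in this PR.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envIndependentAgent is the variable named in the PR description.
const envIndependentAgent = "ELASTIC_PACKAGE_TEST_ENABLE_INDEPENDENT_AGENT"

// independentAgentEnabled is a hypothetical helper. It reports whether the
// variable holds a true-like value. Unset or unparsable values are treated
// as "disabled" here; that default is an assumption of this sketch.
func independentAgentEnabled(lookup func(string) (string, bool)) bool {
	value, ok := lookup(envIndependentAgent)
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(value)
	if err != nil {
		return false
	}
	return enabled
}

func main() {
	// os.LookupEnv distinguishes "unset" from "set to an empty string".
	fmt.Println("independent agent enabled:", independentAgentEnabled(os.LookupEnv))
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the real process environment.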

Pending:

  • Test with Terraform services (gcp, aws and aws_logs).
  • Issues using stages in system tests
  • Issues with tests that use elastic-agent as the hostname for requests
    • Choose between environment variable or hard-coded variables
    • Added environment variables, while ensuring that the elastic-agent alias can still be used as before.
  • Test with previous versions
    • Elastic stack 7.17.x (e.g. 7.17.18) - Policy name is not the same
    • Elastic stack >=8.0.0,<8.3.0 - Tags are not available in these versions.
      • Use the agent hostname, with the test Run ID (integer) appended as a suffix.
      • Kubernetes needs tags to distinguish its agent from other agents. Agents are filtered by Policy ID (created in advance).
  • Ensure logs are reviewed after each test: elastic-agent does not run in the same docker-compose project
    • Working for docker-compose scenarios
  • Update internal/service/boot.go if needed
    • Added a new options field to keep the previous behavior (DeployIndependentAgent).
  • Update the system benchrunner to also use the agentdeployer module
    • Added a new options field to keep the previous behavior (DeployIndependentAgent).
  • Issues with enrollment token after some executions
  • Should custom agents (_dev/deploy/agent) be moved to the agentdeployer module?
    • These compose scenarios contain both agent and service definitions
  • Add system configuration name into the agent hostname?
    • Not sure if this is needed. Would it allow running more tests in parallel?
  • Include variants in custom agents (servicedeployer)?
  • Allow more than one test run in parallel for Kubernetes? It would require allowing more than one Elastic agent in Kubernetes.
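
For the stack versions without tags mentioned in the list above, the idea of telling agents apart by a hostname that carries the test Run ID can be sketched as follows. The agent type, the function names and the "prefix-runID" layout are all hypothetical; the exact hostname format is not stated in this description.

```go
package main

import "fmt"

// agent is a hypothetical, trimmed-down view of an enrolled agent.
type agent struct {
	ID       string
	Hostname string
}

// agentHostname builds a per-run hostname by appending the test Run ID.
// The "prefix-runID" layout is an assumption made for this sketch.
func agentHostname(prefix string, runID int) string {
	return fmt.Sprintf("%s-%d", prefix, runID)
}

// findByHostname returns the first agent whose hostname matches exactly.
func findByHostname(agents []agent, hostname string) (agent, bool) {
	for _, a := range agents {
		if a.Hostname == hostname {
			return a, true
		}
	}
	return agent{}, false
}

func main() {
	// Made-up agents standing in for a list returned by the stack.
	agents := []agent{
		{ID: "a1", Hostname: "elastic-agent-10001"},
		{ID: "a2", Hostname: "elastic-agent-10002"},
	}
	want := agentHostname("elastic-agent", 10002)
	if a, ok := findByHostname(agents, want); ok {
		fmt.Println("matched agent", a.ID, "with hostname", a.Hostname)
	}
}
```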

Relates #787

@mrodm mrodm self-assigned this Mar 13, 2024
@mrodm mrodm force-pushed the add_agent_deployer branch from 8064fe9 to 50ad13d Compare March 13, 2024 18:46

mrodm commented Mar 14, 2024

Containers use the alias elastic-agent to send requests to the agent. That happens, for instance, in these test packages: hit_count_assertion and ti_anomali. Example:

Two different solutions here:

  • Update each package to use the corresponding new hostname in these commands
  • Try to add a new environment variable that contains this information.
    • Example: e4d893e
    • This does not work in the example above, where the hostname appears in the configuration file (ndjson).
    • Set hostname as environment variable:
      {"message_type":"start","integrator_metadata":{"user_metadata":"{\"url\":\"http://${AGENT_HOSTNAME}:9080/\",\"secret\":...}}
      
    • Error from the container log:
      [2024-03-14 12:12:20,506] [ERROR] HTTPConnectionPool(host='$%7bagent_hostname%7d', port=9080): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x786e2cfba7c0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
      

WDYT @jsoriano? Should we go for the environment variables?
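
For context on the failure above: the error shows the literal placeholder reaching the HTTP client (lower-cased and URL-escaped as $%7bagent_hostname%7d), which suggests nothing expanded it inside the ndjson file. Below is a minimal Go sketch of expanding such a placeholder before the configuration is used. The helper expandHostname and the hostname value are made up for illustration; this is not what the PR implements.

```go
package main

import (
	"fmt"
	"os"
)

// expandHostname replaces the AGENT_HOSTNAME placeholder in one line of
// configuration. It is a hypothetical helper for this sketch.
func expandHostname(line, hostname string) string {
	return os.Expand(line, func(name string) string {
		if name == "AGENT_HOSTNAME" {
			return hostname
		}
		// Put any other placeholder back so it is not silently emptied.
		return "${" + name + "}"
	})
}

func main() {
	line := `{"url":"http://${AGENT_HOSTNAME}:9080/"}`
	// "agent-12345" is a made-up hostname used only for the example.
	fmt.Println(expandHostname(line, "agent-12345"))
}
```

os.Expand passes every placeholder to the mapping function, so the sketch restores unknown names explicitly instead of dropping them.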

I also tried not defining the hostname in the compose scenario


but doing so:

  • the elastic-agent enrolls with just a hash (the container ID) as its name

  • there is an error in the hit_count_assertion test package:

    {"level":"debug","ts":"2024-03-14T17:58:32.839Z","caller":"output/util.go:23","msg":"Connecting...","address":"elastic-package-agent-hit_count_assertion-test-docker-custom-agent-1:9999"}
    Error: dial tcp: lookup elastic-package-agent-hit_count_assertion-test-docker-custom-agent-1: no such host
    

So I tried setting the hostname for the agents directly in the docker-compose scenario. That hostname is shared with the service through environment variables, so the service can send requests to the elastic-agent if needed (e.g. to send some logs).

@@ -113,15 +117,14 @@ func (tsd TerraformServiceDeployer) SetUp(ctx context.Context, svcInfo ServiceIn
env: tfEnvironment,
shutdownTimeout: 300 * time.Second,
}
outCtxt := svcInfo

Avoid creating a temporary variable

Comment on lines +334 to +341
if r.options.RunTearDown || r.options.RunTestsOnly {
	logger.Debug("Skip creating output directory")
} else {
	outputDir, err := servicedeployer.CreateOutputDir(r.locationManager, svcInfo.Test.RunID)
	if err != nil {
		return servicedeployer.ServiceInfo{}, fmt.Errorf("could not create output dir for terraform deployer %w", err)
	}
	svcInfo.OutputDir = outputDir

Ensure that this folder is only created with --setup or regular test commands.

The OutputDir and RunID values are retrieved from ServiceState when testing is run in stages (e.g. --no-provision).
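
The decision described in this comment can be sketched in isolation as follows. The types stageOptions and savedState and the function resolveOutputDir are hypothetical stand-ins, not the runner's real types; only the branching mirrors the comment.

```go
package main

import (
	"errors"
	"fmt"
)

// stageOptions mirrors the two flags checked in the diff (hypothetical type).
type stageOptions struct {
	RunTearDown  bool
	RunTestsOnly bool
}

// savedState stands in for the state persisted by an earlier stage.
type savedState struct {
	OutputDir string
	RunID     string
}

// resolveOutputDir creates a new output directory for setup or regular runs,
// and reuses the saved one when only tests or tear-down are executed.
func resolveOutputDir(opts stageOptions, state *savedState, create func() (string, error)) (string, error) {
	if opts.RunTearDown || opts.RunTestsOnly {
		if state == nil {
			return "", errors.New("no saved state from a previous stage")
		}
		return state.OutputDir, nil
	}
	dir, err := create()
	if err != nil {
		return "", fmt.Errorf("could not create output dir: %w", err)
	}
	return dir, nil
}

func main() {
	// The creator below returns a fixed, made-up path for the example.
	create := func() (string, error) { return "/tmp/output/12345", nil }

	dir, err := resolveOutputDir(stageOptions{}, nil, create)
	fmt.Println(dir, err)

	saved := &savedState{OutputDir: "/tmp/output/12345", RunID: "12345"}
	dir, err = resolveOutputDir(stageOptions{RunTestsOnly: true}, saved, create)
	fmt.Println(dir, err)
}
```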

serviceDeployer, err := servicedeployer.Factory(serviceOptions)
if err != nil {
return nil, fmt.Errorf("could not create service runner: %w", err)
// Configure package (single data stream) via Fleet APIs.

Move the creation of the test Agent Policy to one of the first steps, so that it can be used to enroll the new agents.

@mrodm mrodm requested a review from jsoriano April 4, 2024 17:14
Comment on lines +1112 to +1115
// In case of custom agent (servicedeployer) enabling independent agents, update serviceOptions to include test policy too
if r.options.RunIndependentElasticAgent {
	serviceOptions.PolicyName = agentInfo.Policy.Name
}

Currently, the Agent Policy name is not set in all scenarios, to avoid changing the existing behaviour. For now, servicedeployer uses the default policy, Elastic-Agent (elastic-package).

@elasticmachine

💚 Build Succeeded

History

cc @mrodm

@mrodm mrodm merged commit 26e9a98 into elastic:main Apr 9, 2024
3 checks passed
@mrodm mrodm deleted the add_agent_deployer branch April 9, 2024 15:37