Enable auto-scaling for web and streaming API #1

Open
andreaswittig opened this issue Nov 16, 2022 · 10 comments
Labels
enhancement New feature or request

Comments

@andreaswittig
Contributor

Evaluate and implement auto-scaling for ECS services web and streaming API.

andreaswittig added the enhancement label Nov 16, 2022
@scrappydog

+1 for this feature

(some documentation of best practices on manual scale up process would be nice too)

@compuguy

compuguy commented Nov 25, 2022

Based on several days of working with the three services, one can get an HA and auto-scaling configuration out of the box by setting AutoScaling to true and setting DesiredCount, MaxCapacity, and MinCapacity. The only service that doesn't scale well is the sidekiq service. According to https://docs.joinmastodon.org/admin/scaling/#sidekiq, you can run multiple sidekiq services on different queues, except for the scheduler queue: there can only be one of those. My fork already has a few of these changes in the istoleyourpw-deploy branch: https://github.com/compuguy/mastodon-on-aws

Edit: I came across this article (https://nora.codes/post/scaling-mastodon-in-the-face-of-an-exodus/), which explains how to split up the sidekiq tasks. You can have multiple instances handling the default, push, and pull queues, and a single instance for mailer and scheduler.
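
The queue split described above can be sketched roughly as follows. This is only an illustration in Python and not part of the template: the function names, the service names `sidekiq-workers` and `sidekiq-singleton`, and the concurrency value are hypothetical. It assumes Sidekiq's `-c` (concurrency) and `-q` (queue) command-line options, as used in the Mastodon scaling documentation linked above.

```python
from typing import Dict, List

# Queues that may run in several Sidekiq processes at the same time.
SCALABLE_QUEUES = ["default", "push", "pull"]
# Queues kept on a single process; the scheduler queue must only run once.
SINGLETON_QUEUES = ["mailer", "scheduler"]


def sidekiq_command(queues: List[str], concurrency: int) -> str:
    """Build a Sidekiq command line that only processes the given queues."""
    queue_flags = " ".join(f"-q {name}" for name in queues)
    return f"bundle exec sidekiq -c {concurrency} {queue_flags}"


def plan_sidekiq_services(worker_count: int, concurrency: int = 10) -> Dict[str, dict]:
    """Return a hypothetical layout: N scalable workers plus one singleton."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    return {
        "sidekiq-workers": {
            "desired_count": worker_count,
            "command": sidekiq_command(SCALABLE_QUEUES, concurrency),
        },
        "sidekiq-singleton": {
            # Always exactly one task, because it owns the scheduler queue.
            "desired_count": 1,
            "command": sidekiq_command(SINGLETON_QUEUES, concurrency),
        },
    }


if __name__ == "__main__":
    for name, spec in plan_sidekiq_services(worker_count=3).items():
        print(name, spec["desired_count"], spec["command"])
```

Only the worker service would be a candidate for scaling out; the singleton stays at one task regardless of load.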

@scrappydog

My Sidekiq task is regularly pegging at 100% CPU utilization... definitely need some guidance on configuring scaling...

@michaelwittig
Contributor

@scrappydog Same for us. I'm not sure if that is an issue. It likely doesn't matter if the background tasks utilize all resources as long as they finish without much delay. For us, we see spikes to 100%, but only for minutes. Do you see the same pattern?
[Image: Screenshot 2022-11-28 at 09 42 10]

@scrappydog

That looks very similar to utilization on my instance.

My inner system admin really "wants" to add another task... but I agree as long as jobs are completing in a reasonable time it's not an immediate issue.

BUT we are running tiny instances for testing... we NEED a way to scale up... :-)

@scrappydog

I bumped the CPU allocation up on the Sidekiq task to CPU .5 vCPU | Memory 3 GB...

This feels happier for now... but it doesn't address the real scalability question...

@scrappydog

scrappydog commented Nov 30, 2022

[Image: graph, not preserved]

Upgraded about halfway through this graph... definitely a lot better!

@michaelwittig
Contributor

I opened up #20 for sidekiq. This issue is about auto-scaling for web and streaming API.

Enabling auto-scaling is not the big deal here. What we need is a good metric to trigger scale out/in. And we need a test workload to test this with. I have no idea how we can simulate Mastodon load. If anyone reading this is running an instance with enough users to benefit from auto-scaling, please let us know.
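
To make the metric question a bit more concrete, here is a minimal sketch of proportional scaling: the task count is adjusted so that a metric moves back toward its target, then clamped to the MinCapacity and MaxCapacity values mentioned earlier in this thread. It is an approximation for illustration only, not the algorithm AWS uses. The function name and the 60% CPU target are assumptions, and CPU utilization is just an example since no metric has been agreed on here.

```python
import math


def desired_task_count(
    current_tasks: int,
    metric_value: float,
    target_value: float,
    min_capacity: int,
    max_capacity: int,
) -> int:
    """Scale the task count in proportion to metric/target, within bounds."""
    if current_tasks < 1:
        raise ValueError("current_tasks must be at least 1")
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    # Round up so that the metric ends at or below the target after scaling.
    raw = math.ceil(current_tasks * metric_value / target_value)
    return max(min_capacity, min(max_capacity, raw))


if __name__ == "__main__":
    # Hypothetical numbers: 2 web tasks at 90% average CPU, target 60%.
    print(desired_task_count(2, 90.0, 60.0, min_capacity=1, max_capacity=10))
```

Whatever metric is chosen, a sketch like this still needs a realistic test workload before the target value means anything.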

@nodomain

nodomain commented Dec 4, 2022

Just add a relay server and you will have CPU load in a minute.

https://github.com/brodi1/activitypub-relays

@compuguy

compuguy commented Dec 4, 2022

> I opened up #20 for sidekiq. This issue is about auto-scaling for web and streaming API.
>
> Enabling auto-scaling is not the big deal here. What we need is a good metric to trigger scale out/in. And we need a test workload to test this with. I have no idea how we can simulate Mastodon load. If anyone reading this is running an instance with enough users to benefit from auto-scaling, please let us know.

Yeah, it's quite easy to autoscale the web and streaming APIs. But for most people it's #20 that's more important, since Sidekiq does most of the heavy lifting for Mastodon...

5 participants