Stacks may end up with weight of 0 in Route 53 #490

Open
musiKk opened this issue Sep 22, 2017 · 7 comments
@musiKk

musiKk commented Sep 22, 2017

Closely related to #449.

We somehow got a stack with a weight of 0. This is no problem if it is the only stack, but when deploying a new stack with senza create it will also get a weight of 0 which will cause both stacks to immediately get 50% traffic each which might not be desired. Additionally, any senza scale will lead to an immediate 100% traffic for the stack that's being scaled, even if specifying a lower percentage.

$ senza traffic my-service
Stack Name│Version│Identifier        │Weight%
my-service 1x16b11 my-service-1x16b11     0.0 
my-service 1x17b13 my-service-1x17b13     0.0 
$ senza traffic my-service 1x17b13 15
Calculating new weights.. OK
Stack Name│Version│Identifier         │Old Weight%│Delta│Compensation│New Weight%│Current
my-service 1x16b11 my-service-1x16b11         0.0                            0.0
my-service 1x17b13 my-service-1x17b13         0.0 100.0         85.0       100.0 <
Setting weights for my-service.my-team.my-domain... OK
$ senza traffic my-service
Stack Name│Version│Identifier        │Weight%
my-service 1x16b11 my-service-1x16b11     0.0 
my-service 1x17b13 my-service-1x17b13   100.0

I could understand this happening when there are two stacks with a 100/0% distribution and the stack with 100% is forcefully deleted, but we definitely did NOT do this. We only delete stacks that don't receive any traffic.

As of yet I have no idea how we ended up in this situation. It is extremely undesirable.

@lotharschulz
Contributor

Weights attached to senza stacks usually reflect the percentage of traffic. The behaviour in this corner case can indeed be surprising. We are going to add a warning similar to https://github.com/zalando-stups/senza/blob/master/senza/traffic.py#L122 to notify people, in addition to the compensation detail you already got as per your description.

@musiKk
Author

musiKk commented Sep 22, 2017

When I'm already in the situation of having two stacks with a weight of 0 each, I find the resulting behavior quite understandable. I'm more concerned with getting into this situation in the first place, which should be avoided at all costs. I'm looking forward to any improvement there. :)

By the way, when I speak of weight, I'm actually referring to the weight in Route 53, which only indirectly maps to the weight given in senza. I guess I should be clearer about that in the future. So DNS weights of 0/0 lead to an actual distribution of 50%/50%, and any subsequent scaling changes them to 200/0, i.e. 100%/0%. This creates a very rough and surprising traffic progression of 0->50->100%.
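
For illustration, the mapping from Route 53 weights to actual traffic can be sketched as below. This is a minimal model of weighted routing, not senza or AWS code: each record receives its weight divided by the sum of all weights, and when every weight is 0 Route 53 answers with all records with equal probability. The function name `traffic_share` and the dictionary layout are assumptions made for this example.

```python
from typing import Dict


def traffic_share(weights: Dict[str, int]) -> Dict[str, float]:
    """Return the percentage of traffic each record receives (hypothetical helper)."""
    if not weights:
        return {}
    total = sum(weights.values())
    if total == 0:
        # All weights are zero: every record is returned with equal probability.
        return {name: 100.0 / len(weights) for name in weights}
    # Otherwise each record gets weight / sum(weights) of the traffic.
    return {name: 100.0 * weight / total for name, weight in weights.items()}


# The two states from this issue: 0/0 behaves like 50%/50%, 0/200 like 0%/100%.
print(traffic_share({"1x16b11": 0, "1x17b13": 0}))    # {'1x16b11': 50.0, '1x17b13': 50.0}
print(traffic_share({"1x16b11": 0, "1x17b13": 200}))  # {'1x16b11': 0.0, '1x17b13': 100.0}
```

There is no weight assignment between 0/0 and "something non-zero on one side only" that yields an intermediate split, which is why the traffic jumps straight from 50% to 100%.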

@jmcs
Member

jmcs commented Jan 12, 2018

> it will also get a weight of 0 which will cause both stacks to immediately get 50% traffic each which might not be desired.

This (two stacks with the same weight receive the same share of requests, no matter what that weight is) is a result of AWS Route 53 behavior that we cannot control (unless we redo the entire routing approach in the STUPS ecosystem). Stacks are created with weight 0 because in a "normal" situation this means they will not get any traffic.

I also believe we can't change the traffic-increase behavior so that it also touches the other stack at 0%, because until now people could always rely on the fact that stacks with 0% of traffic wouldn't get traffic by accident, except in the specific corner case where everything has weight 0.

Do you have any suggestion to improve the behavior in a way that will not break deployment setups that are currently working?

@jmcs
Member

jmcs commented Jan 12, 2018

> Additionally, any senza scale will lead to an immediate 100% traffic for the stack that's being scaled, even if specifying a lower percentage.

One potential fix for this specific issue could be to treat the case where both/all stacks have 0 traffic weight as a special case and not filter them out when redistributing the traffic weight. Would that solve this part of the issue for you?

@musiKk
Author

musiKk commented Jan 12, 2018

> One potential fix for this specific issue could be to treat the case where both/all stacks have 0 traffic weight as a special case and not filter them out when redistributing the traffic weight. Would that solve this part of the issue for you?

In some sense the setting of weights from senza is just buggy. Say I want to give 10% to some stack: that stack has to get a weight of 20 and the rest (180) has to be distributed among all other stacks. Right now, in the "all weights zero" scenario, the stack would get 200 and that's it.

There is still one special case where the stack is the only stack available, in which case it should receive the 200 regardless of what is provided on the command line (or senza should reject anything but 100% with an appropriate error message).
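
The arithmetic above could look roughly like the following sketch. It is not senza's actual implementation: `set_percentage`, `TOTAL_WEIGHT` and the dictionary layout are hypothetical names, and the total of 200 is taken from the numbers in this comment. When all other stacks are at 0, the remainder is split evenly among them instead of being dropped; otherwise the other stacks keep their relative proportions, so a stack at 0 stays at 0.

```python
from typing import Dict

TOTAL_WEIGHT = 200  # assumption: total weight as used in the discussion above


def set_percentage(weights: Dict[str, int], target: str, percent: float) -> Dict[str, int]:
    """Give `target` the requested percentage and distribute the rest (hypothetical helper)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    others = [name for name in weights if name != target]
    if not others:
        # Single stack: it receives everything regardless of the requested percentage.
        return {target: TOTAL_WEIGHT}

    target_weight = round(TOTAL_WEIGHT * percent / 100)
    remainder = TOTAL_WEIGHT - target_weight
    new_weights = {target: target_weight}

    others_total = sum(weights[name] for name in others)
    if others_total == 0:
        # Special case: all other stacks are at 0, so split the remainder evenly.
        share, extra = divmod(remainder, len(others))
        for index, name in enumerate(others):
            new_weights[name] = share + (1 if index < extra else 0)
    else:
        # Keep relative proportions; stacks at 0 stay at 0.
        for name in others:
            new_weights[name] = remainder * weights[name] // others_total
        leftover = remainder - sum(new_weights[name] for name in others)
        heaviest = max(others, key=lambda name: weights[name])
        new_weights[heaviest] += leftover  # rounding leftover goes to the heaviest stack
    return new_weights


print(set_percentage({"1x16b11": 0, "1x17b13": 0}, "1x17b13", 10))
print(set_percentage({"1x17b13": 0}, "1x17b13", 10))
```

Rejecting anything but 100% for a single stack, as suggested above, would be a small change: raise an error in the `if not others` branch instead of returning the full weight.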

I think my two paragraphs above translate into a "yes", but I'm not 100% sure. 😉 I hope it's clear what I mean.

This would solve the scaling, but creation of new stacks would still be affected by the "existing stack has 0 weight and a new stack will also get a 0, so 50/50% is the result" problem. I don't know how/if this can be fixed. Maybe check for other stacks during creation and fix weights as necessary? That way an existing 0 weight could be fixed to 200/0 when adding a new stack, or even a 0/0 could be fixed to 100/100/0, and so on. This would also open the door to fixing, say, a 1/0 to 200/0/0.
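
A rough sketch of that idea, under the same assumptions as before (a total weight of 200, hypothetical function and variable names, no real senza or Route 53 calls): before the new stack is added with weight 0, the existing weights are renormalised so that they sum to the full total.

```python
from typing import Dict

TOTAL_WEIGHT = 200  # assumption: total weight as used in the discussion above


def weights_after_create(existing: Dict[str, int], new_stack: str) -> Dict[str, int]:
    """Renormalise existing weights, then add the new stack at 0 (hypothetical helper)."""
    fixed: Dict[str, int] = {}
    if existing:
        total = sum(existing.values())
        if total == 0:
            # 0 -> 200, 0/0 -> 100/100: spread the total evenly over the existing stacks.
            share, extra = divmod(TOTAL_WEIGHT, len(existing))
            for index, name in enumerate(existing):
                fixed[name] = share + (1 if index < extra else 0)
        else:
            # 1/0 -> 200/0: scale up while keeping the relative proportions.
            for name, weight in existing.items():
                fixed[name] = TOTAL_WEIGHT * weight // total
            leftover = TOTAL_WEIGHT - sum(fixed.values())
            heaviest = max(existing, key=lambda name: existing[name])
            fixed[heaviest] += leftover
    fixed[new_stack] = 0  # the new stack starts without traffic
    return fixed


print(weights_after_create({"v1": 0}, "v2"))           # {'v1': 200, 'v2': 0}
print(weights_after_create({"v1": 0, "v2": 0}, "v3"))  # {'v1': 100, 'v2': 100, 'v3': 0}
print(weights_after_create({"v1": 1, "v2": 0}, "v3"))  # {'v1': 200, 'v2': 0, 'v3': 0}
```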

@a1exsh
Contributor

a1exsh commented Jan 12, 2018

I think the low-hanging fruit here is to add a check in senza create to stop if all currently running stack versions have zero traffic.
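
Such a guard might look like the sketch below. It only shows the check itself; how senza create would obtain the current weights is left out, and `check_before_create` and `AllStacksZeroWeightError` are hypothetical names, not existing senza code.

```python
from typing import Dict


class AllStacksZeroWeightError(Exception):
    """Raised when every running stack version has a weight of 0 (hypothetical)."""


def check_before_create(existing_weights: Dict[str, int]) -> None:
    """Abort stack creation if all running versions currently have zero weight."""
    if existing_weights and all(weight == 0 for weight in existing_weights.values()):
        versions = ", ".join(sorted(existing_weights))
        raise AllStacksZeroWeightError(
            "All running stack versions have zero traffic weight ({}); "
            "a new stack with weight 0 would immediately share traffic "
            "equally with them.".format(versions)
        )


check_before_create({})                              # no running stacks: fine
check_before_create({"1x16b11": 200})                # normal situation: fine
try:
    check_before_create({"1x16b11": 0})              # the situation from this issue
except AllStacksZeroWeightError as error:
    print("refusing to create stack:", error)
```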

jmcs self-assigned this Jan 16, 2018

@musiKk
Author

musiKk commented Oct 21, 2019

It's still not clear what triggers the bug, but recently we triggered it repeatedly after issuing a senza update to various stacks. The changes compared to the running stack were to the Minimum and Maximum of the AutoScaling object. In all cases there was only one version of the stack running, so having traffic at zero only created some small issues. But it's unclear whether having multiple versions would have triggered the bug as well.

For all intents and purposes I deem the update command unusable unless only a single version of a particular application is running.
