Stacks may end up with weight of 0 in Route 53 #490
Comments
Weights attached to senza stacks usually reflect the percentage of traffic. The behaviour in this corner case can indeed be surprising. We are going to add a warning similar to https://github.com/zalando-stups/senza/blob/master/senza/traffic.py#L122 to notify people, in addition to the compensation detail you already got as per your description.
When I'm already in the situation of having two stacks with a weight of 0 each, I find the resulting behavior quite understandable. I'm more concerned with getting into this situation in the first place, which should be avoided at all costs. I'm looking forward to any improvement there. :)
By the way, when I speak of weight, I'm actually referring to the weight in Route 53, which only indirectly maps to the weight given in senza. I guess I should be clearer about that in the future. So a DNS weight of 0/0 leads to an actual distribution of 50%/50%, and any subsequent scaling changes it to 200/0, that is, 100%/0%. This creates a very rough and surprising scaling of 0->50->100%.
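To make the 0/0 case concrete, here is a minimal sketch of how weighted records translate into traffic shares. It is not senza code: `traffic_shares` and the stack names are made up for illustration, and it only models the Route 53 rule that each record gets its weight divided by the sum of all weights, with equal probability for every record when all weights are 0.

```python
def traffic_shares(weights):
    """Return the fraction of traffic each weighted record receives.

    `weights` maps a (hypothetical) stack name to its DNS weight.
    """
    total = sum(weights.values())
    if total == 0:
        # All weights are 0: every record is served with equal probability.
        return {name: 1 / len(weights) for name in weights}
    # Normal case: each record gets weight / sum of weights.
    return {name: weight / total for name, weight in weights.items()}


print(traffic_shares({"v1": 0, "v2": 0}))    # {'v1': 0.5, 'v2': 0.5}
print(traffic_shares({"v1": 200, "v2": 0}))  # {'v1': 1.0, 'v2': 0.0}
```

The two calls reproduce the jump described above: 0/0 is served as 50%/50%, and the first scaling step lands on 100%/0%.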
This (2 stacks with the same weight will receive the same amount of requests, no matter what that weight is) is a result of AWS Route 53 behavior we cannot control (unless we redo the entire routing approach in the STUPS ecosystem). Stacks are created with a weight of 0 because in a "normal" situation this means they will not get any traffic. I also believe we can't change the behavior of increasing the traffic so that it also touches the other stacks at 0%, because until now people could always rely on the fact that stacks with 0% of traffic wouldn't get traffic by accident, except in the specific corner case where everything has weight 0. Do you have any suggestion to improve the behavior in a way that will not break deployment setups that are currently working?
One potential fix for this specific issue could be to treat the case where both/all stacks have 0 traffic weight as a special case and not filter them out when redistributing the traffic weight. Would that solve this part of the issue for you?
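A rough sketch of what that special case could look like, purely for discussion. This is not the code in senza/traffic.py: `set_traffic`, `FULL_WEIGHT` and the stack names are hypothetical, and the total of 200 is only assumed from the 200/0 figures mentioned in this thread.

```python
FULL_WEIGHT = 200  # assumed total weight, taken from the 200/0 figures in this thread


def set_traffic(weights, stack, percentage):
    """Give `stack` the requested percentage and spread the rest over the others."""
    others = [name for name in weights if name != stack]
    if not others:
        # A lone stack takes all traffic, whatever percentage was requested.
        return {stack: FULL_WEIGHT}

    target = round(FULL_WEIGHT * percentage / 100)
    remaining = FULL_WEIGHT - target

    # Normally only stacks that already receive traffic share the remainder.
    active = [name for name in others if weights[name] > 0]
    # Special case: if every other stack has weight 0, do not filter them out.
    receivers = active or others
    old_total = sum(weights[name] for name in receivers)

    new_weights = {name: 0 for name in weights}
    new_weights[stack] = target
    for name in receivers:
        # Keep the existing proportions, or split evenly in the all-zero case.
        share = weights[name] / old_total if old_total else 1 / len(receivers)
        new_weights[name] = round(remaining * share)
    return new_weights


print(set_traffic({"v1": 0, "v2": 0}, "v2", 10))  # {'v1': 180, 'v2': 20}
```

With this, asking for 10% in the all-zero situation gives 20/180 instead of 200/0, while a stack sitting at 0 next to a stack that does receive traffic still stays at 0. Rounding can make the weights sum to slightly more or less than 200 when there are several receivers; the sketch does not correct for that.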
In some sense the setting of weights from senza is just buggy. Say I want to give 10% to some stack: that stack has to get a weight of 20, and the rest (180) has to be distributed among all other stacks. Right now, in the "all weights zero" scenario, the stack would get 200 and that's it.
Of course there is still one special case where the stack is the only stack available, in which case it should receive the 200 regardless of what is provided on the command line (or reject anything but 100% with an appropriate error message).
I think my two paragraphs above translate into a "yes", but I'm not 100% sure. 😉 I hope it's clear what I mean.
This would solve the scaling, but the creation of new stacks would still be affected by "the existing stack has 0 weight and a new stack will also get 0, so 50%/50% is the result". I don't know how/if this can be fixed. Maybe check for other stacks during creation and fix weights as necessary? That way an existing 0 weight could be fixed to 200/0 when adding a new stack, or even a 0/0 could be fixed to 100/100/0, and so on. This would also open the door to fixing, say, a 1/0 to 200/0/0...
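The creation-time check suggested in the last paragraph could look roughly like the sketch below. It is only an illustration of the idea, not existing senza behavior: `weights_after_create` and `FULL_WEIGHT` are hypothetical names, and the total of 200 is again assumed from the figures in this thread.

```python
FULL_WEIGHT = 200  # assumed total weight, as in the 200/0 examples above


def weights_after_create(existing, new_stack):
    """Rescale existing weights to the full total, then add the new stack at 0."""
    total = sum(existing.values())
    if total == 0:
        # Everything is at 0: make the implicit equal split explicit.
        fixed = {name: FULL_WEIGHT // len(existing) for name in existing}
    else:
        # Keep the current proportions but scale them up to the full total.
        fixed = {name: round(FULL_WEIGHT * weight / total)
                 for name, weight in existing.items()}
    fixed[new_stack] = 0  # a new stack must not receive traffic by accident
    return fixed


print(weights_after_create({"v1": 0}, "v2"))           # {'v1': 200, 'v2': 0}
print(weights_after_create({"v1": 0, "v2": 0}, "v3"))  # {'v1': 100, 'v2': 100, 'v3': 0}
print(weights_after_create({"v1": 1, "v2": 0}, "v3"))  # {'v1': 200, 'v2': 0, 'v3': 0}
```

The three calls correspond to the 0 to 200/0, 0/0 to 100/100/0 and 1/0 to 200/0/0 cases mentioned above. In each of them the existing stacks keep the share of traffic they were effectively receiving, and the new stack starts at a weight of 0 that really means no traffic.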
I think the low-hanging fruit here is to add a check in |
It's still not clear how the bug behaves but recently we triggered it repeatedly after issuing a For all intents and purposes I deem the |
Closely related to #449.
We somehow got a stack with a weight of 0. This is no problem if it is the only stack, but when deploying a new stack with `senza create`, the new stack will also get a weight of 0, which causes both stacks to immediately get 50% of the traffic each. That might not be desired. Additionally, any `senza scale` will lead to an immediate 100% of traffic for the stack being scaled, even when a lower percentage is specified.
I could understand this happening when there are two stacks with a 100/0% distribution and the stack with 100% is forcefully deleted, but we definitely did NOT do this. We only delete stacks that don't receive any traffic.
As of yet I have no idea how we ended up in this situation. It is extremely undesirable.