metrics: Internalize metric label sets
The major memory consumers in the metric subsystem are the per-series
label sets.

This is for two reasons:
 - The labels are stored multiple times in different places: twice as
   part of `registered_metric` (the current labels and the original ones
   kept for relabeling, which are usually identical), once as the lookup
   key in the values map, and a final copy in the metadata used for
   scraping.
 - Many label sets are duplicated across metric series. For example, a
   single io-queue class/mountpoint/iogroup combination produces 15
   series that all carry the same labels. The same holds for other
   metrics such as the reactor or scheduler metrics. In Redpanda the
   pattern is even more extreme: per-partition metrics create more than
   60 series with the same label set.

One idea to improve this would be to intern the individual label
strings. In theory this would give savings even beyond deduplicating
whole label sets, since most label keys and values are repeated even
when the label sets as a whole differ (e.g. the shard label).

On closer inspection, however, this wouldn't be very effective. Most
labels are fairly short, e.g. looking at some seastar and Redpanda
metrics:

```
io_queue...{class="raft",iogroup="0",mountpoint="none",shard="0"}
redpanda...{namespace="kafka",partition="9985",shard="2",topic="foobar"}
```

Each of the label keys and values is shorter than 16 bytes and hence
fits into the seastar short string optimization (SSO), so each string
only takes 16 bytes. In the above example maybe one of the eight
strings exceeds the SSO length (e.g. mountpoint or topic), but those
are in the minority.

Interning a string in less than 16 bytes is not trivial (deletion needs
to be thread safe, because the metadata might be copied to different
shards at scrape time), so the gain would actually not be that large.

There is further hidden overhead: labels are stored in a `std::map`
for fast lookup, and each node in the map carries a ~32 byte overhead
for the node struct. So even with zero-overhead interning of the
individual strings we could only save about 2x in the most likely case.
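
A rough back-of-the-envelope illustration of that bound, using the
~32 byte node overhead and the 16 bytes per SSO string from above:

```
one label pair in std::map: ~32 B node overhead + 16 B key + 16 B value = ~64 B
four labels per series:     ~256 B per label set (before any outer copies)
perfect string interning:   removes at most the 2 x 16 B of strings per pair,
                            i.e. about half, hence the ~2x bound
```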

Hence this patch takes the strategy of internalizing the whole label
set (the `std::map`). This way we also deduplicate the map overhead and
still get the overall reduction described above. For the actual
internalization we store the labels `std::map` in a `std::shared_ptr`
(for cross-shard safety), and the shared pointers are kept in a
`std::set` for lookup when adding new metrics.
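
A minimal sketch of the idea (hypothetical names; the committed code in
the diff below uses `internalized_holder`, `internalized_labels_ref`
and `internalized_set`):

```
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>

using labels_type = std::map<std::string, std::string>;
using labels_ref = std::shared_ptr<const labels_type>;

// One interned label set, roughly what internalized_holder does.
struct holder {
    labels_ref ref;
    explicit holder(labels_type l)
        : ref(std::make_shared<labels_type>(std::move(l))) {}
};

// Heterogeneous comparisons so lookups can use a labels_type directly,
// without first constructing (i.e. copying into) a holder.
inline bool operator<(const holder& a, const labels_type& b) { return *a.ref < b; }
inline bool operator<(const labels_type& a, const holder& b) { return a < *b.ref; }
inline bool operator<(const holder& a, const holder& b) { return *a.ref < *b.ref; }

std::set<holder, std::less<>> interned;

labels_ref internalize(labels_type l) {
    auto it = interned.find(l);      // transparent lookup, no copy on a hit
    if (it == interned.end()) {
        it = interned.emplace(std::move(l)).first;
    }
    return it->ref;
}

int main() {
    auto a = internalize({{"shard", "0"}, {"class", "raft"}});
    auto b = internalize({{"shard", "0"}, {"class", "raft"}});
    assert(a == b);  // e.g. the 15 series of one io-queue class share one map
}
```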

In Redpanda this patch reduces the metrics memory usage by a factor of
3-4 and hence saves gigabytes of memory at high partition counts (yes,
we likely have too many metrics).

This also makes `update_metrics_if_needed` faster because less copying
is needed, and it improves cache locality at scrape time, making that
cheaper as well (a yet-to-be-committed microbenchmark by Travis Downs
shows a ~10% improvement).

Once all other `shared_ptr` references have gone out of scope we need
to delete the entry from the internalization set. We do this scan in
`update_metrics_if_needed`. The overhead is minimal because the set
doesn't actually get very big, and the extra work is compensated by
`update_metrics_if_needed` becoming cheaper, as mentioned above.
Ideally we could use custom-deleter tricks instead, but that is
complicated by thread-safety issues when the last reference is dropped
on a shard other than the origin shard.
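
Continuing the sketch above (same hypothetical `holder` type), the
reclamation pass boils down to the loop below; the committed version is
`gc_internalized_labels()`, with the `use_count()` check wrapped in
`has_users()`:

```
// An entry whose only remaining reference is the interning set itself
// (use_count() == 1) has no live series and no in-flight scrape copy
// pointing at it, so it can be dropped.
void gc(std::set<holder, std::less<>>& interned) {
    for (auto it = interned.begin(); it != interned.end();) {
        if (it->ref.use_count() == 1) {
            it = interned.erase(it);
        } else {
            ++it;
        }
    }
}
```
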
StephanDollberg committed Nov 28, 2024
1 parent 665fed0 commit 3a8bcbe
Showing 5 changed files with 105 additions and 31 deletions.
4 changes: 2 additions & 2 deletions apps/io_tester/io_tester.cc
@@ -642,8 +642,8 @@ class io_class_data : public class_data {
const auto& mf = values.find(m_name);
assert(mf != values.end());
for (auto&& mi : mf->second) {
auto&& cname = mi.first.find("class");
if (cname != mi.first.end() && cname->second == name()) {
auto&& cname = mi.first.labels().find("class");
if (cname != mi.first.labels().end() && cname->second == name()) {
out << YAML::Key << m_name << YAML::Value << mi.second->get_function()().d();
}
}
68 changes: 57 additions & 11 deletions include/seastar/core/metrics_api.hh
@@ -42,6 +42,8 @@ namespace metrics {
namespace impl {

using labels_type = std::map<sstring, sstring>;
using internalized_labels_ref = std::shared_ptr<const labels_type>;

}
}
}
@@ -106,7 +108,7 @@ class metric_id {
public:
metric_id() = default;
metric_id(group_name_type group, metric_name_type name,
labels_type labels = {})
internalized_labels_ref labels)
: _group(std::move(group)), _name(
std::move(name)), _labels(labels) {
}
@@ -123,17 +125,20 @@
_group = name;
}
const instance_id_type & instance_id() const {
return _labels.at(shard_label.name());
return _labels->at(shard_label.name());
}
const metric_name_type & name() const {
return _name;
}
const labels_type& labels() const {
return _labels;
return *_labels;
}
labels_type& labels() {
internalized_labels_ref internalized_labels() const {
return _labels;
}
void update_labels(internalized_labels_ref labels) {
_labels = labels;
}
sstring full_name() const;

bool operator<(const metric_id&) const;
@@ -144,7 +149,7 @@ private:
}
group_name_type _group;
metric_name_type _name;
labels_type _labels;
internalized_labels_ref _labels;
};
}
}
@@ -191,11 +196,44 @@ struct metric_family_info {
*/
struct metric_info {
metric_id id;
labels_type original_labels;
internalized_labels_ref original_labels;
bool enabled;
skip_when_empty should_skip_when_empty;
};

class internalized_holder {
internalized_labels_ref _labels;
public:
explicit internalized_holder(labels_type labels) : _labels(std::make_shared<labels_type>(std::move(labels))) {
}

explicit internalized_holder(internalized_labels_ref labels) : _labels(std::move(labels)) {
}

internalized_labels_ref labels_ref() const {
return _labels;
}

const labels_type& labels() const {
return *_labels;
}

size_t has_users() const {
// Getting the count wrong isn't a correctness issue but will just make internalization worse
return _labels.use_count() > 1;
}
};

inline bool operator<(const internalized_holder& lhs, const labels_type& rhs) {
return lhs.labels() < rhs;
}
inline bool operator<(const labels_type& lhs, const internalized_holder& rhs) {
return lhs < rhs.labels();
}
inline bool operator<(const internalized_holder& lhs, const internalized_holder& rhs) {
return lhs.labels() < rhs.labels();
}


class impl;

@@ -234,7 +272,7 @@ public:
};

using register_ref = shared_ptr<registered_metric>;
using metric_instances = std::map<labels_type, register_ref>;
using metric_instances = std::map<internalized_holder, register_ref, std::less<>>;
using metrics_registration = std::vector<register_ref>;

class metric_groups_impl : public metric_groups_def {
@@ -268,12 +306,12 @@ public:
metric_family(metric_instances&& instances) : _instances(std::move(instances)) {
}

register_ref& operator[](const labels_type& l) {
return _instances[l];
register_ref& operator[](const internalized_labels_ref& l) {
return _instances[internalized_holder(l)];
}

const register_ref& at(const labels_type& l) const {
return _instances.at(l);
const register_ref& at(const internalized_labels_ref& l) const {
return _instances.at(internalized_holder(l));
}

metric_family_info& info() {
@@ -356,6 +394,8 @@ struct config {
sstring hostname;
};

using internalized_set = std::set<internalized_holder, std::less<>>;

class impl {
value_map _value_map;
config _config;
@@ -365,6 +405,7 @@ class impl {
std::vector<std::deque<metric_function>> _current_metrics;
std::vector<relabel_config> _relabel_configs;
std::vector<metric_family_config> _metric_family_configs;
internalized_set _internalized_labels;
public:
value_map& get_value_map() {
return _value_map;
@@ -375,6 +416,7 @@ public:
}

register_ref add_registration(const metric_id& id, const metric_type& type, metric_function f, const description& d, bool enabled, skip_when_empty skip, const std::vector<std::string>& aggregate_labels);
internalized_labels_ref internalize_labels(labels_type labels);
void remove_registration(const metric_id& id);
future<> stop() {
return make_ready_future<>();
@@ -412,6 +454,10 @@ public:
void set_metric_family_configs(const std::vector<metric_family_config>& metrics_config);

void update_aggregate(metric_family_info& mf) const noexcept;

private:
void gc_internalized_labels();
bool apply_relabeling(const relabel_config& rc, metric_info& info);
};

const value_map& get_value_map();
55 changes: 41 additions & 14 deletions src/core/metrics.cc
@@ -145,7 +145,7 @@ const std::vector<metric_family_config>& get_metric_family_configs() {
return impl::get_local_impl()->get_metric_family_configs();
}

static bool apply_relabeling(const relabel_config& rc, impl::metric_info& info) {
bool impl::impl::apply_relabeling(const relabel_config& rc, metric_info& info) {
std::stringstream s;
bool first = true;
for (auto&& l: rc.source_labels) {
@@ -181,14 +181,18 @@ static bool apply_relabeling(const relabel_config& rc, impl::metric_info& info)
}
case relabel_config::relabel_action::drop_label: {
if (info.id.labels().find(rc.target_label) != info.id.labels().end()) {
info.id.labels().erase(rc.target_label);
auto new_labels = info.id.labels();
new_labels.erase(rc.target_label);
info.id.update_labels(internalize_labels(std::move(new_labels)));
}
return true;
};
case relabel_config::relabel_action::replace: {
if (!rc.target_label.empty()) {
std::string fmt_s = match.format(rc.replacement);
info.id.labels()[rc.target_label] = fmt_s;
auto new_labels = info.id.labels();
new_labels[rc.target_label] = fmt_s;
info.id.update_labels(internalize_labels(std::move(new_labels)));
}
return true;
}
@@ -219,7 +223,7 @@ registered_metric::registered_metric(metric_id id, metric_function f, bool enabl
_info.enabled = enabled;
_info.should_skip_when_empty = skip;
_info.id = id;
_info.original_labels = id.labels();
_info.original_labels = id.internalized_labels();
}

metric_value metric_value::operator+(const metric_value& c) {
@@ -302,6 +306,14 @@ metric_groups_impl::~metric_groups_impl() {
}
}

internalized_labels_ref impl::internalize_labels(labels_type labels) {
auto it = _internalized_labels.find(labels);
if (it == _internalized_labels.end()) {
it = _internalized_labels.emplace(labels).first;
}
return it->labels_ref();
}

metric_groups_impl& metric_groups_impl::add_metric(group_name_type name, const metric_definition& md) {
// We could just do this in the constructor but some metric groups (like the
// smp queue ones) are actually constructed on a different shard originally
@@ -311,7 +323,9 @@ metric_groups_impl& metric_groups_impl::add_metric(group_name_type name, const m
_impl = get_local_impl();
}

metric_id id(name, md._impl->name, md._impl->labels);
auto internalized_labels = get_local_impl()->internalize_labels(md._impl->labels);

metric_id id(name, md._impl->name, internalized_labels);

auto reg = get_local_impl()->add_registration(id, md._impl->type, md._impl->f, md._impl->d, md._impl->enabled, md._impl->_skip_when_empty, md._impl->aggregate_labels);

@@ -404,6 +418,16 @@ instance_id_type shard() {
return to_sstring(this_shard_id());
}

void impl::gc_internalized_labels() {
for (auto it = _internalized_labels.begin(); it != _internalized_labels.end();) {
if (!it->has_users()) {
it = _internalized_labels.erase(it);
} else {
++it;
}
}
}

void impl::update_metrics_if_needed() {
if (_dirty) {
// Forcing the metadata to an empty initialization
@@ -435,6 +459,8 @@ void impl::update_metrics_if_needed() {
_current_metrics.resize(i);
_metadata = mt_ref;
_dirty = false;

gc_internalized_labels();
}
}

Expand Down Expand Up @@ -463,7 +489,7 @@ register_ref impl::add_registration(const metric_id& id, const metric_type& type
if (metric.info().type != type.base_type) {
throw std::runtime_error("registering metrics " + name + " registered with different type.");
}
metric[rm->info().id.labels()] = rm;
metric[rm->info().id.internalized_labels()] = rm;
for (auto&& i : rm->info().id.labels()) {
_labels.insert(i.first);
}
@@ -474,7 +500,7 @@ register_ref impl::add_registration(const metric_id& id, const metric_type& type
_value_map[name].info().name = id.full_name();
_value_map[name].info().aggregate_labels = aggregate_labels;
impl::update_aggregate(_value_map[name].info());
_value_map[name][rm->info().id.labels()] = rm;
_value_map[name][rm->info().id.internalized_labels()] = rm;
}
dirty();

@@ -487,14 +513,13 @@ future<metric_relabeling_result> impl::set_relabel_configs(const std::vector<rel
for (auto&& family : _value_map) {
std::vector<shared_ptr<registered_metric>> rms;
for (auto&& metric = family.second.begin(); metric != family.second.end();) {
metric->second->info().id.labels().clear();
metric->second->info().id.labels() = metric->second->info().original_labels;
metric->second->info().id.update_labels(metric->second->info().original_labels);
for (auto rl : _relabel_configs) {
if (apply_relabeling(rl, metric->second->info())) {
dirty();
}
}
if (metric->first != metric->second->info().id.labels()) {
if (metric->first.labels() != metric->second->info().id.labels()) {
// If a metric labels were changed, we should remove it from the map, and place it back again
rms.push_back(metric->second);
family.second.erase(metric++);
@@ -504,7 +529,7 @@ future<metric_relabeling_result> impl::set_relabel_configs(const std::vector<rel
}
}
for (auto rm : rms) {
labels_type lb = rm->info().id.labels();
auto ilb = rm->info().id.internalized_labels();
if (family.second.find(rm->info().id.labels()) != family.second.end()) {
/*
You can not have a two metrics with the same name and label, there is a problem with the configuration.
@@ -515,12 +540,14 @@ future<metric_relabeling_result> impl::set_relabel_configs(const std::vector<rel
*/
seastar_logger.error("Metrics: After relabeling, registering metrics twice for metrics : {}", family.first);
auto id = get_unique_id();
lb["err"] = id;
auto new_labels = *ilb;
new_labels["err"] = id;
ilb = internalize_labels(new_labels);
conflicts.metrics_relabeled_due_to_collision++;
rm->info().id.labels()["err"] = id;
rm->info().id.update_labels(ilb);
}

family.second[lb] = rm;
family.second[ilb] = rm;
}
}
return make_ready_future<metric_relabeling_result>(conflicts);
7 changes: 4 additions & 3 deletions src/core/scollectd.cc
@@ -97,8 +97,9 @@ registration::registration(type_instance_id&& id)
}

seastar::metrics::impl::metric_id to_metrics_id(const type_instance_id & id) {
return seastar::metrics::impl::metric_id(id.plugin(), id.type_instance(),
{{seastar::metrics::shard_label.name(), seastar::metrics::impl::shard()}});
seastar::metrics::impl::labels_type labels {{seastar::metrics::shard_label.name(), seastar::metrics::impl::shard()}};
auto internalized_labels = std::make_shared<seastar::metrics::impl::labels_type>(std::move(labels));
return seastar::metrics::impl::metric_id(id.plugin(), id.type_instance(), std::move(internalized_labels));
}


@@ -575,7 +576,7 @@ options::options(program_options::option_group* parent_group)

static seastar::metrics::impl::register_ref get_register(const scollectd::type_instance_id& i) {
seastar::metrics::impl::metric_id id = to_metrics_id(i);
return seastar::metrics::impl::get_value_map().at(id.full_name()).at(id.labels());
return seastar::metrics::impl::get_value_map().at(id.full_name()).at(id.internalized_labels());
}

std::vector<collectd_value> get_collectd_value(
2 changes: 1 addition & 1 deletion tests/unit/rpc_test.cc
@@ -1668,7 +1668,7 @@ SEASTAR_THREAD_TEST_CASE(test_rpc_metric_domains) {
const auto& mf = values.find(name);
BOOST_REQUIRE(mf != values.end());
for (auto&& mi : mf->second) {
for (auto&&li : mi.first) {
for (auto&&li : mi.first.labels()) {
if (li.first == "domain" && li.second == domain) {
return mi.second->get_function()().i();
}
