Merge pull request #19 from jetstack/site-content-changes
Added extra site content, improved existing content and renamed some sections
Showing 8 changed files with 91 additions and 761 deletions.
742 changes: 0 additions & 742 deletions
charts/opencost-config/dashboards/OpenCost-multiday-summary.json
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# Grafana dashboards & Prometheus | ||
|
||
Depending on how you are running Prometheus and Grafana, you might need to tweak the PromQL used in the various dashboards provided in [our chart](https://github.com/jetstack/finops-stack/tree/main/charts/opencost-config/dashboards). This page contains notes and troubleshooting tips based on our experience. | ||
|
||
## Prometheus | ||
|
||
### Duplicated labels | ||
|
||
The OpenCost metrics use the `instance` and `namespace` labels to identify the _pod and namespace to which the metric relates_, i.e. _not_ the OpenCost Prometheus Exporter pod. Depending on how your Prometheus service is configured, or if you are using Google Managed Prometheus (GMP), the values of these labels could be overwritten: | ||
|
||
- GMP replaces the `instance` and `namespace` values with the name and namespace of the pod the metrics were scraped from. In our case this is always the Exporter pod in the `finops-stack` namespace. | ||
|
||
- In standalone Prometheus, the same behaviour occurs unless the `honor_labels` scrape configuration option is set to `true`. | ||
|
||
In both cases the original values of the namespace and instance labels will be copied to `exported_namespace` and `exported_instance` respectively. | ||
|
||
__NOTE:__ Our dashboards use the `exported_*` labels as GMP does not support the `honor_labels` configuration. | ||
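The relabelling described above can be sketched in a few lines of Python. This is an illustrative model of Prometheus's documented handling of label conflicts, not code from the chart; the function name `resolve_labels` and the sample label values are assumptions made for the example.

```python
def resolve_labels(scraped, target, honor_labels=False):
    """Model how scraped labels are merged with the labels of the scrape target.

    Hypothetical helper for illustration only; not part of Prometheus or this chart.
    """
    merged = dict(scraped)
    for name, value in target.items():
        if name in scraped and honor_labels:
            # honor_labels: true keeps the value exposed by the exporter.
            continue
        if name in scraped:
            # Conflict: the scraped value is preserved under an exported_ prefix.
            merged["exported_" + name] = scraped[name]
        merged[name] = value
    return merged


# Labels as exposed by the OpenCost exporter (sample values, assumed).
scraped = {"namespace": "team-a", "instance": "node-1"}
# Labels attached by the scraper for the exporter pod itself (sample values, assumed).
target = {"namespace": "finops-stack", "instance": "10.0.0.5:9003"}

default = resolve_labels(scraped, target)
honored = resolve_labels(scraped, target, honor_labels=True)
```

With the default behaviour, `default["namespace"]` holds the exporter's namespace and the workload's namespace moves to `default["exported_namespace"]`, which is why the dashboards query the `exported_*` labels.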
|
||
### Query execution time | ||
|
||
Several of the PromQL queries use `group_left` 'joins'; be aware that these can become expensive and time-consuming if you select very broad time ranges. | ||
|
||
### Accessing Google Managed Prometheus APIs | ||
|
||
**Only applicable if you are using GKE + Google Managed Prometheus** | ||
|
||
GMP provides endpoints for almost all of Prometheus' API (as documented [here](https://cloud.google.com/stackdriver/docs/managed-prometheus/query-api-ui#http-api-details)). Like all of GCP's APIs, they require a valid Bearer token in the `Authorization` header, so the Grafana GMP data source needs a valid token. Google provides two suggestions for automating this process ([here](https://cloud.google.com/stackdriver/docs/managed-prometheus/query-api-ui)). Neither solution is simple and both have downsides, so we provide an alternative via the [`gmp-proxy` chart](https://github.com/jetstack/finops-stack/tree/main/charts/gmp-proxy) in the `finops-stack` repo. This chart installs an Envoy Proxy workload which is configured to add a valid Bearer token to each GMP API request before forwarding it to the GMP endpoint. | ||
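To make the token requirement concrete, here is a minimal Python sketch of the kind of request that has to reach GMP. The helper name `build_gmp_query_request`, the project ID and the placeholder token are assumptions for illustration; the URL pattern is taken from the GMP HTTP API documentation linked above and should be checked against it. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL pattern assumed from the GMP HTTP API documentation.
GMP_QUERY_URL = (
    "https://monitoring.googleapis.com/v1/projects/{project}"
    "/location/global/prometheus/api/v1/query"
)


def build_gmp_query_request(project_id, promql, token):
    """Build (but do not send) an instant-query request carrying a Bearer token.

    Hypothetical helper for illustration; the proxy in the chart does this in Envoy.
    """
    url = GMP_QUERY_URL.format(project=project_id) + "?" + urlencode({"query": promql})
    request = Request(url)
    # Without this header the GMP endpoint rejects the request.
    request.add_header("Authorization", "Bearer " + token)
    return request


# Placeholder values; a real token would come from GCP credentials.
request = build_gmp_query_request("my-project", "up", "PLACEHOLDER_TOKEN")
```

Grafana's Prometheus data source has no built-in way to refresh such a token, which is the gap the proxy fills.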
|
||
#### Usage | ||
|
||
##### Pre-requisites | ||
|
||
The GMP Proxy Pod uses GCP Workload Identity, so you'll need to associate the Kubernetes service account (SA) used by the Pod with a GCP SA that has the following roles: | ||
|
||
- monitoring.viewer | ||
- iam.serviceAccountTokenCreator | ||
|
||
##### Envoy Proxy image | ||
|
||
The standard Envoy Docker image can't be used as is because an additional Lua library is required, so we have provided a custom image which includes this library. If you prefer to use your own image, this is the `Dockerfile` we used: | ||
|
||
``` | ||
# Image from https://hub.docker.com/r/envoyproxy/envoy | ||
FROM envoyproxy/envoy:v1.31-latest | ||
RUN apt update && apt install -y luarocks | ||
RUN luarocks install lua-cjson | ||
``` | ||
|
||
## Dashboards | ||
|
||
Currently the working and tested dashboards are: | ||
|
||
### FinOps Stack: Cluster cost & efficiency | ||
|
||
The purpose of this dashboard is to provide an overview of the cost of your cluster and namespaces using OpenCost's calculations. The time range control is disabled; you can only select from three time periods using a drop-down in the top left. This is because some of the queries are expensive and, over too long a time period, cause the query request to time out. | ||
|
||
### FinOps Stack: Used | Wasted Resources for cluster & namespace | ||
|
||
The purpose of this dashboard is to provide details of resource usage and wastage at the cluster, namespace and pod level. You can select different time periods using the standard Grafana time range control in the top right. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,6 @@ GRAFANA_INGRESS="false" | |
# GRAFANA_PUBLIC_IP_NAME="name-of-public-ip" | ||
# GRAFANA_FQDN="grafana.host.name" | ||
|
||
## GCP SA for workload identity for cert-manager (only required if using ingress) | ||
# CERT_MANAGER_SA_ANNOTATION="iam.gke.io/gcp-service-account: [email protected]" | ||
# CERT_MANAGER_EMAIL="[email protected]" | ||
## GCP SA for workload identity for cert-manager (need to be defined but only used if cert-manager is being installed) | ||
CERT_MANAGER_SA_ANNOTATION="iam.gke.io/gcp-service-account: [email protected]" | ||
CERT_MANAGER_EMAIL="[email protected]" |