PerfectScale Prometheus Exporter

The PerfectScale Prometheus Exporter integrates PerfectScale insights into your existing monitoring and alerting infrastructure.

The PerfectScale Prometheus Exporter converts PerfectScale's optimization recommendations, cost insights, and resource utilization metrics into Prometheus format. Exposing these insights as Prometheus metrics lets you use PerfectScale's analysis in your standard monitoring and alerting workflows, with visibility into cost efficiency, resource usage, and overall system performance across your infrastructure.

To expose PerfectScale insights as Prometheus metrics, you need a Kubernetes cluster with the PerfectScale agent installed and running, and Helm v3 or higher.

Installing the Chart

Install the chart in two steps:

Add the PerfectScale Helm repository

helm repo add perfectscale https://perfectscale-io.github.io --force-update

Install the chart

helm upgrade --install --namespace perfectscale psc-prom-exporter perfectscale/psc-prom-exporter

The exporter (psc-prom-exporter) must be installed in the same namespace as the PerfectScale agent so that it can reuse the existing credentials secret.

This installs the Prometheus exporter with a basic configuration, without any scrapers, dashboards, or alerts configured. To configure it for your needs, see the Quick install, Configuration, and Monitoring integrations sections.

The general behavior of the Prom-Exporter is defined in the config section of the values file. Monitoring tools integration is configured in the scrapers, alerts, and dashboards sections of the values file.

Quick install:

Prom-Exporter with Prometheus Operator, example alerts, and the Grafana dashboard

helm upgrade --install --namespace perfectscale psc-prom-exporter \
--set scrapers.serviceMonitor.enabled=true \
--set alerts.prometheusRule.enabled=true \
--set dashboards.grafana.enabled=true \
perfectscale/psc-prom-exporter

Prom-Exporter with DataDog and DataDog alerts

helm upgrade --install --namespace perfectscale psc-prom-exporter \
--set scrapers.datadog.enabled=true \
--set alerts.datadogMonitor.enabled=true \
perfectscale/psc-prom-exporter

Configuration

config:
  metrics:
    recommendations:
      enabled: true    # Enables resource recommendation metrics
    costs:
      enabled: true    # Enables cost analysis metrics
    indicators:
      enabled: true    # Enables resource utilization indicators

  filters:
    workloads:
      includeMuted: true           # Includes workloads marked as muted in PerfectScale
      minRunningMinutes: 30        # Only includes workloads running longer than 30 minutes
      types:                       # Workload types to include
        - "*"                      # "*" includes all workload types (Deployments, StatefulSets, etc.)
    namespaces:
      exclude:                     # Namespaces to exclude from metrics
        - kube-system
        - default
    indicators:
      types:                       # Types of indicators to expose
        - waste                    # Resource waste indicators
        - risk                     # Risk indicators

  labels:
    includedLabels:
      - "*"                        # Includes all Kubernetes labels
    excludedLabels:
      - "/.*perfectscale.*/"      # Excludes labels matching this regex pattern
    excludeClusterUID: false       # Includes cluster UID in metrics

Metrics configuration

metrics:
    recommendations:
      enabled: true    # Enable resource recommendation metrics
    costs:
      enabled: true    # Enable cost analysis metrics
    indicators:
      enabled: true    # Enable resource utilization indicators

recommendations

recommendations allows managing the exposure of resource recommendation metrics and, when enabled, exposes the following metrics:

ps_recommended_memory_request_bytes
ps_recommended_cpu_request_cores
ps_recommended_memory_limit_bytes
ps_recommended_cpu_limit_cores
ps_current_memory_request_bytes
ps_current_cpu_request_cores
ps_current_memory_limit_bytes
ps_current_cpu_limit_cores
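
One way to use these metrics is to compare each current value with its recommendation. The sketch below is illustrative only: the sample numbers are made up, and the helper name request_gap is hypothetical, not part of the exporter.

```python
def request_gap(current: float, recommended: float) -> float:
    """Return the difference between current and recommended, as a fraction of current.

    Positive: the workload requests more than recommended (over-provisioned).
    Negative: it requests less than recommended (under-provisioned).
    """
    if current <= 0:
        raise ValueError("current request must be positive")
    return (current - recommended) / current


# Hypothetical samples for one container, keyed by metric name.
samples = {
    "ps_current_cpu_request_cores": 2.0,
    "ps_recommended_cpu_request_cores": 0.5,
    "ps_current_memory_request_bytes": 1024 ** 3,           # 1 GiB
    "ps_recommended_memory_request_bytes": 1536 * 1024 ** 2,  # 1.5 GiB
}

cpu_gap = request_gap(samples["ps_current_cpu_request_cores"],
                      samples["ps_recommended_cpu_request_cores"])
mem_gap = request_gap(samples["ps_current_memory_request_bytes"],
                      samples["ps_recommended_memory_request_bytes"])

print(f"CPU request gap: {cpu_gap:+.0%}")
print(f"Memory request gap: {mem_gap:+.0%}")
```

In this made-up case the CPU request is well above the recommendation, while the memory request is below it.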

costs

costs allows managing the exposure of cost analysis metrics and, when enabled, exposes the following metrics:

ps_cost_usd - hourly workload cost in USD
ps_waste_usd - estimated hourly waste in USD
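
As a rough illustration of how the two cost metrics relate, the sketch below parses a hand-written sample in the Prometheus text format and computes the share of each workload's hourly cost that is waste. The namespace and workload_name labels are the ones referenced by the alert examples on this page; the values and the function names are made up for illustration.

```python
import re

# Hand-written sample in the Prometheus text exposition format (values are made up).
SAMPLE = '''\
ps_cost_usd{namespace="shop",workload_name="checkout"} 0.40
ps_waste_usd{namespace="shop",workload_name="checkout"} 0.10
ps_cost_usd{namespace="shop",workload_name="search"} 1.20
ps_waste_usd{namespace="shop",workload_name="search"} 0.90
'''

LINE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Yield (metric_name, labels_dict, value) for each labelled sample line."""
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            labels = dict(LABEL.findall(match["labels"]))
            yield match["name"], labels, float(match["value"])


def waste_share(text):
    """Map (namespace, workload_name) to waste as a fraction of hourly cost."""
    cost, waste = {}, {}
    for name, labels, value in parse(text):
        key = (labels.get("namespace"), labels.get("workload_name"))
        if name == "ps_cost_usd":
            cost[key] = value
        elif name == "ps_waste_usd":
            waste[key] = value
    # Skip workloads with a missing or zero cost to avoid dividing by zero.
    return {key: waste[key] / cost[key] for key in waste if cost.get(key)}


shares = waste_share(SAMPLE)
for (namespace, workload), share in sorted(shares.items()):
    print(f"{namespace}/{workload}: {share:.0%} of hourly cost is waste")
```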

indicators

indicators allows managing the exposure of resource utilization indicators and, when enabled, exposes the ps_workload_indicators metric with labels for different indicator types.

Filters configuration

filters:
    workloads:
      includeMuted: true          # Include muted workloads
      minRunningMinutes: 30       # Minimum runtime to include
      types:                      # Workload types to include
        - "*"                     # All types
        # Or specify specific types:
        # - "Deployment"
        # - "StatefulSet"
        # - "DaemonSet"
    namespaces:
      exclude:                     # Namespaces to exclude from metrics
        - kube-system
        - default
    indicators:
      types:                       # Types of indicators to expose
        - waste                    # Resource waste indicators
        - risk                     # Risk indicators

workloads

workloads controls which workloads are included in the metrics, based on workload type, running time, and muted status.

namespaces

namespaces controls which namespaces are excluded from the metrics.

indicators

indicators controls which indicator types (waste, risk) are exposed.
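
The sketch below models how the workload and namespace filters could combine. It is an assumption, not the exporter's actual implementation: it treats the filters as a logical AND, treats minRunningMinutes as an inclusive minimum, and uses hypothetical Workload and Filters classes.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """Hypothetical workload record; field names are illustrative."""
    name: str
    namespace: str
    kind: str
    running_minutes: int
    muted: bool = False


@dataclass
class Filters:
    """Mirrors the keys of the filters section of the values file."""
    include_muted: bool = True
    min_running_minutes: int = 30
    types: list = field(default_factory=lambda: ["*"])
    excluded_namespaces: list = field(
        default_factory=lambda: ["kube-system", "default"])


def is_exported(w: Workload, f: Filters) -> bool:
    """Assumed semantics: a workload must pass every filter to be exported."""
    if w.muted and not f.include_muted:
        return False
    # Assumption: a workload running exactly the minimum is included.
    if w.running_minutes < f.min_running_minutes:
        return False
    if "*" not in f.types and w.kind not in f.types:
        return False
    return w.namespace not in f.excluded_namespaces


defaults = Filters()
api = Workload("api", "shop", "Deployment", running_minutes=120)
dns = Workload("coredns", "kube-system", "Deployment", running_minutes=9000)
job = Workload("migrate", "shop", "Job", running_minutes=5)

for workload in (api, dns, job):
    print(workload.name, is_exported(workload, defaults))
```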

Labels configuration

The labels section allows managing Kubernetes labels to be included in the metrics.

excludedLabels takes precedence over includedLabels: a label that matches both lists is excluded.

labels:
  includedLabels:
    - "*"                    # Include all labels
    # Or specify specific labels:
    # - "label_app"
    # - "label_environment"

  excludedLabels:
    - "/.*perfectscale.*/"   # Exclude labels matching regex
    # - "label_internal_id"  # Exclude specific label

  excludeClusterUID: false   # Include cluster UID in metrics

In the Prometheus format, all special label characters are replaced with an underscore _. Example:

Kubernetes: app.kubernetes.io/instance

Prometheus: label_app_kubernetes_io_instance
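
The conversion can be sketched as follows. This is an approximation inferred from the example above: it assumes that "special characters" means anything other than ASCII letters, digits, and underscores, and that every exported label gets a label_ prefix. The function name is hypothetical.

```python
import re


def to_prometheus_label(kubernetes_label: str) -> str:
    """Convert a Kubernetes label key to the assumed Prometheus label name."""
    # Replace every character that is not a letter, digit, or underscore.
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", kubernetes_label)
    return "label_" + sanitized


print(to_prometheus_label("app.kubernetes.io/instance"))
```

Use the converted name, not the original Kubernetes key, when listing labels under includedLabels or excludedLabels.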

Label pattern support

Pattern - Description
"label_full_name" - Matches the exact label name
"*" - Matches all labels
"/some-regexp/" - Matches labels against the given regular expression
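
The three pattern forms, together with the rule that excludedLabels takes precedence, can be modelled roughly as below. This is a sketch under assumptions: the function names are hypothetical, and the regular expression is applied with Python's re.search, which may differ in detail from the regex engine the exporter uses.

```python
import re


def matches(pattern: str, label: str) -> bool:
    """Return True if a single pattern matches the label name."""
    if pattern == "*":
        return True  # wildcard: matches every label
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        # "/.../" form: treat the inner text as a regular expression
        return re.search(pattern[1:-1], label) is not None
    return pattern == label  # exact label name


def is_label_exported(label: str, included: list, excluded: list) -> bool:
    """Exclusions win over inclusions, as described above."""
    if any(matches(p, label) for p in excluded):
        return False
    return any(matches(p, label) for p in included)


included = ["*"]
excluded = ["/.*perfectscale.*/"]
for label in ("label_app", "label_perfectscale_io_managed"):
    print(label, is_label_exported(label, included, excluded))
```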

Configuration examples

The following example shows a complete configuration for a common scenario.

Recommended configuration for production environments with short-living workloads excluded

config:
  metrics:
    recommendations:
      enabled: true
    costs:
      enabled: true
    indicators:
      enabled: true

  filters:
    workloads:
      includeMuted: false
      minRunningMinutes: 60
      types:
        - "Deployment"
        - "StatefulSet"
        - "DaemonSet"
    namespaces:
      exclude:
        - kube-system
        - default
        - monitoring
    indicators:
      types:
        - waste
        - risk

  labels:
    includedLabels:
      - "label_app"
      - "label_environment"
      - "label_team"
    excludedLabels:
      - "/.*internal.*/"
      - "/.*perfectscale.*/"

    excludeClusterUID: false

Monitoring integrations

The PerfectScale Prometheus Exporter integrates out of the box with several monitoring systems, so you can start using PerfectScale's insights without extensive configuration.

Prometheus Operator integration

This integration creates a ServiceMonitor custom resource so that Prometheus Operator scrapes the exporter.

scrapers:
  serviceMonitor:
    enabled: true
    interval: 5m
    path: /metrics
    timeout: 30s
    # Additional Label to the serviceMonitor
    # labels:
    #   prometheus: kube-prometheus

Prometheus (Standard Discovery)

This integration allows the use of annotations for Prometheus auto-discovery.

scrapers:
  prometheus:
    enabled: true
    path: "/metrics"
    port: "http"
    interval: "30s"
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "8080"
      prometheus.io/path: "/metrics"
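
To check what a scraper would receive, you can read the endpoint yourself. The sketch below assumes the exporter is reachable at http://localhost:8080/metrics, for example through a port-forward to port 8080 and the /metrics path shown above; the URL and both function names are assumptions for illustration.

```python
from urllib.request import urlopen


def fetch_metrics(url: str = "http://localhost:8080/metrics",
                  timeout: float = 10.0) -> str:
    """Download the raw Prometheus text exposition from the exporter."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def perfectscale_samples(text: str) -> list:
    """Keep only sample lines for ps_ metrics (skips # HELP and # TYPE lines)."""
    return [line for line in text.splitlines() if line.startswith("ps_")]


# Example usage, once the endpoint is reachable:
#   for line in perfectscale_samples(fetch_metrics()):
#       print(line)
```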

Datadog Autodiscovery

This integration allows you to pull metrics directly into Datadog. It supports both v1 and v2 autodiscovery annotations.

scrapers:
  datadog:
    enabled: true
    # Choose AD version: "v1" or "v2"
    adVersion: "v2"
    containerName: "psc-prom-exporter"
    # Common configuration for both versions
    config:
      endpoint: "/metrics"
      port: 8080
      namespace: "perfectscale"
      # Set maxReturnedMetrics, DataDog default value is 2000. Increase if needed.
      maxReturnedMetrics: 10000
      # List of metrics to collect with optional renaming
      metrics:
        # Wildcard pattern for ps_ metrics
        - name: "ps_.*"
        # Renaming example:
        #- name: "ps_waste_usd"
          #rename: "perfectscale.waste.usd"
      # Additional configuration options
      options: {}

Dashboards

PerfectScale provides pre-built dashboards that let you visualize key metrics with minimal configuration.

Single Workload Dashboard

This dashboard offers a comprehensive view of your application's performance. It showcases detailed workload metrics and a clear overview of how resources are utilized, enabling you to monitor efficiency trends and identify optimization opportunities. It includes the following data:

Detailed workload metrics
Resource usage patterns
Cost breakdown
Optimization opportunities

Grafana dashboard

It's available as part of the Helm chart:

dashboards:
  grafana:
    enabled: true
    namespace: monitoring
    labels:
      grafana_dashboard: "1"
      team: perfectscale
    annotations:
      grafana.folder: "PerfectScale"

You can also find it in:

Our monitoring Git repo: https://github.com/perfectscale-io/observability
On Grafana Dashboards: https://grafana.com/grafana/dashboards/22278

DataDog dashboard

You can find it in our monitoring Git repo: https://github.com/perfectscale-io/observability

Alert rules

You can define custom alert rules in the prometheusRule section of the Helm values. Alerts help you notice changes in key indicators early, so you can address potential issues before they affect the system.

Configuring alerts

Cost optimization alert

The following example defines two cost optimization alerts, for Prometheus and DataDog, that help you manage expenses and resource efficiency proactively.

  rules:
    - alert: "PerfectScale Waste Cost Surge"
      enabled: true
      expr: |
        (
          ps_waste_usd > 0
          and
          (
            ps_waste_usd
            /
            (ps_waste_usd offset 1h)
          ) > 1.5
        )
      for: 15m
      labels:
        severity: warning
        team: cost-optimization
        type: waste
      annotations:
        summary: "Waste cost increased by more than 50%"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} has increased its waste by more than 50% in the last hour. Current waste: {{ $value | humanize }} USD/hour"

    - alert: "PerfectScale High Absolute Waste"
      enabled: true
      expr: |
        ps_waste_usd > 100
      for: 30m
      labels:
        severity: warning
        team: cost-optimization
        type: waste
      annotations:
        summary: "High waste cost detected"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} has waste cost exceeding 100 USD/hour. Current waste: {{ $value | humanize }} USD/hour"
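
The condition of the first rule can be checked by hand: a ratio above 1.5 between the current waste and the waste one hour earlier means an increase of more than 50%. The sketch below mirrors that expression in plain Python for a single workload. The function name is hypothetical, and the sketch does not model the for: 15m clause, which additionally requires the condition to hold for 15 minutes.

```python
import math
from typing import Optional


def waste_surge(current: float, hour_ago: Optional[float]) -> bool:
    """Mirror of: ps_waste_usd > 0 and (ps_waste_usd / (ps_waste_usd offset 1h)) > 1.5"""
    if hour_ago is None:
        # No sample an hour ago: the division has no result, so nothing fires.
        return False
    if current <= 0:
        return False
    # PromQL evaluates a positive value divided by zero as +Inf.
    ratio = math.inf if hour_ago == 0 else current / hour_ago
    return ratio > 1.5


print(waste_surge(3.0, 1.5))   # waste doubled within the hour
print(waste_surge(1.5, 1.0))   # exactly +50%, which is not "more than 50%"
print(waste_surge(2.0, None))  # workload has no data from an hour ago
```

Note that a workload whose waste was zero an hour ago and is positive now satisfies the condition, because the ratio is infinite.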

Enabling alerts

To activate the pre-configured alerts for Alertmanager (PrometheusRule):

alerts:
  prometheusRule:
    enabled: true
    # Additional labels to the PrometheusRule
    #labels:
      #release: prometheus

To enable alerts in DataDog (Alert rules):

alerts:
  datadogMonitor:
    enabled: true
    namespace: "datadog"
    labels:
      team: cost-optimization
      severity: warning

