# PerfectScale Prometheus Exporter

The PerfectScale Prometheus Exporter converts PerfectScale's optimization recommendations, cost insights, and resource utilization metrics into Prometheus format. Exposing these insights as Prometheus metrics lets you incorporate PerfectScale's analysis into your existing monitoring and alerting workflows, giving you visibility into cost efficiency, resource usage across your infrastructure, and overall system performance.

{% hint style="info" %}
To expose PerfectScale insights as Prometheus metrics, you need Helm v3 or later and a Kubernetes cluster with the PerfectScale agent installed and running.
{% endhint %}

## Installing the Chart

Install the chart in two steps:

1. Add the PerfectScale Helm repository

```sh
helm repo add perfectscale https://perfectscale-io.github.io --force-update
```

2. Install the chart

```sh
helm upgrade --install --namespace perfectscale psc-prom-exporter \
      --set settings.psUrl="https://api-{your-web-ui-url}" \
      --set settings.telemetryUrl="https://api-{your-web-ui-url}/psc-telemetry" \
      perfectscale/psc-prom-exporter
```

{% hint style="info" %}
The `psc-prom-exporter` chart must be installed in the same namespace as the PerfectScale Agent so that it can reuse the existing credentials secret.
{% endhint %}

This installs the Prometheus exporter with a basic configuration, without any scrapers, dashboards, or alerts configured. To tailor it to your needs, see the [Quick install](#quick-install), [Configuration](#configuration), and [Monitoring integrations](#monitoring-integrations) sections.

{% hint style="info" %}
The general behavior of the Prom-Exporter is defined in the `config` section of the values file. Monitoring tools integration is configured in the `scrapers`, `alerts`, and `dashboards` sections of the values file.
{% endhint %}
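
As a rough sketch, the `--set` flags from the install command can instead be collected in a values file. The key names below are the ones used elsewhere on this page. The file name `values.yaml` and the particular combination of enabled integrations are only an illustration, and the URL placeholders must be replaced with your own values.

```yaml
# values.yaml (hypothetical file name)
settings:
  psUrl: "https://api-{your-web-ui-url}"                       # placeholder, replace with your URL
  telemetryUrl: "https://api-{your-web-ui-url}/psc-telemetry"  # placeholder, replace with your URL

config:
  filters:
    namespaces:
      exclude:              # namespaces left out of the exported metrics
        - kube-system

scrapers:
  serviceMonitor:
    enabled: true           # scrape via Prometheus Operator

alerts:
  prometheusRule:
    enabled: true           # create the example alert rules

dashboards:
  grafana:
    enabled: true           # ship the Grafana dashboard
```

You would then pass the file with Helm's `-f` flag, for example `helm upgrade --install --namespace perfectscale psc-prom-exporter -f values.yaml perfectscale/psc-prom-exporter`.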

## Quick install

### Prom-Exporter with Prometheus Operator, example alerts, and Grafana dashboard

```sh
helm upgrade --install --namespace perfectscale psc-prom-exporter \
    --set scrapers.serviceMonitor.enabled=true \
    --set alerts.prometheusRule.enabled=true \
    --set dashboards.grafana.enabled=true \
    --set settings.psUrl="https://api-{your-web-ui-url}" \
    --set settings.telemetryUrl="https://api-{your-web-ui-url}/psc-telemetry" \
    perfectscale/psc-prom-exporter
```

### Prom-Exporter with DataDog and DataDog alerts

```sh
helm upgrade --install --namespace perfectscale psc-prom-exporter \
    --set scrapers.datadog.enabled=true \
    --set alerts.datadogMonitor.enabled=true \
    --set settings.psUrl="https://api-{your-web-ui-url}" \
    --set settings.telemetryUrl="https://api-{your-web-ui-url}/psc-telemetry" \
    perfectscale/psc-prom-exporter
```

## Configuration

```yaml
config:
  metrics:
    recommendations:
      enabled: true    # Enables resource recommendation metrics
    costs:
      enabled: true    # Enables cost analysis metrics
    indicators:
      enabled: true    # Enables resource utilization indicators

  filters:
    workloads:
      includeMuted: true           # Includes workloads marked as muted in PerfectScale
      minRunningMinutes: 30        # Only includes workloads running longer than 30 minutes
      types:                       # Workload types to include
        - "*"                      # "*" includes all workload types (Deployments, StatefulSets, etc.)
    namespaces:
      exclude:                     # Namespaces to exclude from metrics
        - kube-system
        - default
    indicators:
      types:                       # Types of indicators to expose
        - waste                    # Resource waste indicators
        - risk                     # Risk indicators

  labels:
    includedLabels:
      - "*"                        # Includes all Kubernetes labels
    excludedLabels:
      - "/.*perfectscale.*/"      # Excludes labels matching this regex pattern
    excludeClusterUID: false       # Includes cluster UID in metrics
```

### Metrics configuration <a href="#metrics-configuration" id="metrics-configuration"></a>

```yaml
metrics:
    recommendations:
      enabled: true    # Enable resource recommendation metrics
    costs:
      enabled: true    # Enable cost analysis metrics
    indicators:
      enabled: true    # Enable resource utilization indicators
```

#### recommendations

`recommendations` controls whether resource recommendation metrics are exposed. When enabled, it exposes the following metrics:

* `ps_recommended_memory_request_bytes`
* `ps_recommended_cpu_request_cores`
* `ps_recommended_memory_limit_bytes`
* `ps_recommended_cpu_limit_cores`
* `ps_current_memory_request_bytes`
* `ps_current_cpu_request_cores`
* `ps_current_memory_limit_bytes`
* `ps_current_cpu_limit_cores`

#### costs

`costs` controls whether cost analysis metrics are exposed. When enabled, it exposes the following metrics:

* `ps_cost_usd` - hourly workload cost in USD
* `ps_waste_usd` - estimated hourly waste in USD

#### indicators

`indicators` controls whether resource utilization indicators are exposed. When enabled, it exposes the `ps_workload_indicators` metric, with labels that distinguish the indicator types.

### Filters configuration <a href="#filters-configuration" id="filters-configuration"></a>

```yaml
filters:
    workloads:
      includeMuted: true          # Include muted workloads
      minRunningMinutes: 30       # Minimum runtime to include
      types:                      # Workload types to include
        - "*"                     # All types
        # Or specify specific types:
        # - "Deployment"
        # - "StatefulSet"
        # - "DaemonSet"
    namespaces:
      exclude:                     # Namespaces to exclude from metrics
        - kube-system
        - default
    indicators:
      types:                       # Types of indicators to expose
        - waste                    # Resource waste indicators
        - risk                     # Risk indicators
```

#### workloads

`workloads` controls which workloads are included in the metrics, based on workload type, minimum running time, and muted status.

#### namespaces

`namespaces` lists the namespaces to exclude from the metrics.

#### indicators

`indicators` selects which indicator types to expose, such as `waste` and `risk`.
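
For instance, to expose only risk indicators for long-running Deployments and StatefulSets while skipping muted workloads, the three filter groups could be combined as in the sketch below. The threshold and the `staging` namespace are illustrative assumptions, not recommended values.

```yaml
config:
  filters:
    workloads:
      includeMuted: false      # skip workloads muted in PerfectScale
      minRunningMinutes: 120   # assumed threshold: ignore workloads running less than 2 hours
      types:
        - "Deployment"
        - "StatefulSet"
    namespaces:
      exclude:
        - kube-system
        - staging              # hypothetical namespace name
    indicators:
      types:
        - risk                 # expose risk indicators only, omit waste
```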

### Labels configuration <a href="#labels-configuration" id="labels-configuration"></a>

The `labels` section controls which Kubernetes labels are included in the metrics.

{% hint style="info" %}
`excludedLabels` takes precedence over `includedLabels`.
{% endhint %}

```yaml
labels:
  includedLabels:
    - "*"                    # Include all labels
    # Or specify specific labels:
    # - "label_app"
    # - "label_environment"

  excludedLabels:
    - "/.*perfectscale.*/"   # Exclude labels matching regex
    # - "label_internal_id"  # Exclude specific label

  excludeClusterUID: false   # Include cluster UID in metrics
```

{% hint style="info" %}
In the Prometheus format, all special label characters are replaced with an underscore `_`.\
\
**Example**:

Kubernetes: `app.kubernetes.io/instance`

Prometheus: `label_app_kubernetes_io_instance`
{% endhint %}

#### **Label pattern support**

| Pattern             | Description                                         |
| ------------------- | --------------------------------------------------- |
| `"label_full_name"` | Matches the exact label name                        |
| `"*"`               | Matches all labels                                  |
| `"/some-regexp/"`   | Matches labels against the given regular expression |

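As a sketch of how the pattern types might be combined, the snippet below includes one exact label name and every label under a hypothetical `label_team_` prefix, then excludes anything containing `perfectscale`. The `label_app_kubernetes_io_instance` name follows the conversion example above. The `label_team_` prefix is an assumption made for illustration.

```yaml
config:
  labels:
    includedLabels:
      - "label_app_kubernetes_io_instance"   # exact name, after Prometheus conversion
      - "/label_team_.*/"                    # regex: hypothetical team-related labels
    excludedLabels:
      - "/.*perfectscale.*/"                 # regex: takes precedence over includedLabels
    excludeClusterUID: false                 # keep the cluster UID in the metrics
```
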
### Configuration examples

The following example shows how the configuration options above can be combined to fit PerfectScale's insights into your monitoring workflow.

#### Recommended configuration for production environments with short-lived workloads excluded

```yaml
config:
  metrics:
    recommendations:
      enabled: true
    costs:
      enabled: true
    indicators:
      enabled: true

  filters:
    workloads:
      includeMuted: false
      minRunningMinutes: 60
      types:
        - "Deployment"
        - "StatefulSet"
        - "DaemonSet"
    namespaces:
      exclude:
        - kube-system
        - default
        - monitoring
    indicators:
      types:
        - waste
        - risk

  labels:
    includedLabels:
      - "label_app"
      - "label_environment"
      - "label_team"
    excludedLabels:
      - "/.*internal.*/"
      - "/.*perfectscale.*/"

    excludeClusterUID: false
```

## Monitoring integrations <a href="#monitoring-integrations" id="monitoring-integrations"></a>

The PerfectScale Prometheus Exporter integrates out of the box with several monitoring systems, so you can start using PerfectScale's insights without extensive configuration.

### Prometheus Operator integration <a href="#id-1.-prometheus-operator-integration" id="id-1.-prometheus-operator-integration"></a>

This integration creates a ServiceMonitor custom resource so that Prometheus Operator scrapes the Prom-Exporter.

```yaml
scrapers:
  serviceMonitor:
    enabled: true
    interval: 5m
    path: /metrics
    timeout: 30s
    # Additional labels for the ServiceMonitor
    # labels:
    #   prometheus: kube-prometheus
```

### Prometheus (Standard Discovery) <a href="#id-2.-prometheus-standard-discovery" id="id-2.-prometheus-standard-discovery"></a>

This integration adds the annotations that Prometheus uses for auto-discovery.

```yaml
scrapers:
  prometheus:
    enabled: true
    path: "/metrics"
    port: "http"
    interval: "30s"
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "8080"
      prometheus.io/path: "/metrics"
```

### Datadog Autodiscovery <a href="#id-3.-datadog-operator" id="id-3.-datadog-operator"></a>

This integration allows you to pull metrics directly into Datadog. It supports both v1 and v2 autodiscovery annotations.

<pre class="language-yaml"><code class="lang-yaml">scrapers:
<strong>  datadog:
</strong>    enabled: true
    # Choose AD version: "v1" or "v2"
    adVersion: "v2"
    containerName: "psc-prom-exporter"
    # Common configuration for both versions
    config:
      endpoint: "/metrics"
      port: 8080
      namespace: "perfectscale"
      # Set maxReturnedMetrics, DataDog default value is 2000. Increase if needed.
      maxReturnedMetrics: 10000
      # List of metrics to collect with optional renaming
      metrics:
        # Wildcard pattern for ps_ metrics
        - name: "ps_.*"
        # Renaming example:
        #- name: "ps_waste_usd"
          #rename: "perfectscale.waste.usd"
      # Additional configuration options
      options: {}
</code></pre>
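
If the wildcard collects more series than you need, one option is to list metrics explicitly and rename them, following the commented-out example in the block above. This is a minimal sketch: `ps_waste_usd` and `ps_cost_usd` are metric names from this page, while `perfectscale.cost.usd` is an assumed name chosen to mirror the `perfectscale.waste.usd` example.

```yaml
scrapers:
  datadog:
    enabled: true
    adVersion: "v2"
    containerName: "psc-prom-exporter"
    config:
      endpoint: "/metrics"
      port: 8080
      namespace: "perfectscale"
      metrics:
        # Collect only the two cost metrics and rename them for Datadog
        - name: "ps_waste_usd"
          rename: "perfectscale.waste.usd"
        - name: "ps_cost_usd"
          rename: "perfectscale.cost.usd"   # assumed name, not taken from the chart
```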

## Dashboards

PerfectScale provides pre-built dashboards that let you visualize key metrics with minimal configuration.

**Single Workload Dashboard**

This dashboard shows detailed metrics for a single workload and how its resources are used, so you can monitor efficiency trends and identify optimization opportunities. It includes the following data:

* Detailed workload metrics
* Resource usage patterns
* Cost breakdown
* Optimization opportunities

### Grafana dashboard

It is available as part of the Helm chart:

```yaml
dashboards:
  grafana:
    enabled: true
    namespace: monitoring
    labels:
      grafana_dashboard: "1"
      team: perfectscale
    annotations:
      grafana.folder: "PerfectScale"
```

You can also find it in:

* Our monitoring Git repo: <https://github.com/perfectscale-io/observability>
* On Grafana Dashboards: <https://grafana.com/grafana/dashboards/22278>

### DataDog dashboard

You can find it in our monitoring Git repo: <https://github.com/perfectscale-io/observability>

## Alert rules

You can define custom alert rules in the `prometheusRule` section of the Helm values. Alerts help you catch changes in key indicators early, so you can address potential issues before they affect the system.

### Configuring alerts <a href="#cost-optimization-alerts" id="cost-optimization-alerts"></a>

**Cost optimization alert**

The following example shows cost optimization alerts that help you proactively manage expenses and improve resource efficiency for Prometheus and DataDog.

```yaml
  rules:
    - alert: "PerfectScale Waste Cost Surge"
      enabled: true
      expr: |
        (
          ps_waste_usd > 0
          and
          (
            ps_waste_usd
            /
            (ps_waste_usd offset 1h)
          ) > 1.5
        )
      for: 15m
      labels:
        severity: warning
        team: cost-optimization
        type: waste
      annotations:
        summary: "Waste cost increased by more than 50%"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} has increased its waste by more than 50% in the last hour. Current waste: {{ $value | humanize }} USD/hour"

    - alert: "PerfectScale High Absolute Waste"
      enabled: true
      expr: |
        ps_waste_usd > 100
      for: 30m
      labels:
        severity: warning
        team: cost-optimization
        type: waste
      annotations:
        summary: "High waste cost detected"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} has waste cost exceeding 100 USD/hour. Current waste: {{ $value | humanize }} USD/hour"
```
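
The same rule format can express other conditions. The sketch below flags workloads whose waste makes up more than half of their cost, and workloads whose current CPU request is more than twice the recommendation. It assumes that the metrics being compared carry identical label sets, so that PromQL's default one-to-one vector matching pairs them per workload. The alert names, thresholds, and durations are illustrative only.

```yaml
  rules:
    - alert: "PerfectScale High Waste Ratio"   # hypothetical alert name
      enabled: true
      # "and" keeps the left-hand sample, so $value is the hourly cost
      expr: |
        ps_cost_usd > 0
        and
        (ps_waste_usd / ps_cost_usd) > 0.5
      for: 30m
      labels:
        severity: warning
        type: waste
      annotations:
        summary: "More than half of the workload cost is waste"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} wastes more than 50% of its hourly cost. Current cost: {{ $value | humanize }} USD/hour"

    - alert: "PerfectScale CPU Request Over-Provisioned"   # hypothetical alert name
      enabled: true
      # A filtering comparison returns the left-hand sample, so $value is the current request
      expr: |
        ps_current_cpu_request_cores
        >
        2 * ps_recommended_cpu_request_cores
      for: 1h
      labels:
        severity: info
        type: waste
      annotations:
        summary: "CPU request is more than twice the recommendation"
        description: "Workload {{ $labels.workload_name }} in namespace {{ $labels.namespace }} requests {{ $value | humanize }} CPU cores, more than twice the recommended amount"
```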

### Enabling alerts

To enable the pre-configured alerts for Alertmanager (PrometheusRule):

```yaml
alerts:
  prometheusRule:
    enabled: true
    # Additional labels to the PrometheusRule
    #labels:
      #release: prometheus
```

To enable alerts in DataDog (Alert rules):

```yaml
alerts:
  datadogMonitor:
    enabled: true
    namespace: "datadog"
    labels:
      team: cost-optimization
      severity: warning
```
