# GPU optimization

{% hint style="info" %}
PerfectScale currently supports GPU monitoring only through NVIDIA Data Center GPU Manager (DCGM). GPU support is available starting with exporter version 1.0.55.
{% endhint %}

PerfectScale provides visibility into GPU utilization so you can monitor and optimize GPU resources within your Kubernetes clusters. This feature helps teams identify optimization opportunities, reduce resource waste, and improve overall Kubernetes efficiency.

{% hint style="warning" %}
To enable GPU visibility in PerfectScale, install the **NVIDIA DCGM exporter** and set the GPU-specific configuration parameters when deploying or upgrading the PerfectScale agent. For details, see [GPU support](https://docs.perfectscale.io/2.0-self-hosted-or-perfectscale-documentation/getting-started/how-to-onboard-a-cluster#gpu-support).
{% endhint %}
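
Before onboarding, it can help to confirm that the DCGM exporter is actually reporting GPU utilization. The following is a minimal sketch, assuming the exporter serves Prometheus text format and exposes the `DCGM_FI_DEV_GPU_UTIL` gauge. The sample text and its label set are hand-written for illustration, the function name `gpu_utilization` is hypothetical, and this page does not state which metrics PerfectScale itself reads.

```python
import re

# Hand-written sample in Prometheus text exposition format.
# The labels shown here are illustrative, not a real exporter dump.
SAMPLE = '''\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",pod="trainer-0"} 87
DCGM_FI_DEV_GPU_UTIL{gpu="1",pod="trainer-1"} 0
'''

# Matches lines of the form: metric_name{label="value",...} number
LINE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def gpu_utilization(text, metric="DCGM_FI_DEV_GPU_UTIL"):
    """Return a list of (labels, value) pairs for the given metric."""
    samples = []
    for line in text.splitlines():
        match = LINE.match(line)  # comment lines start with '#' and do not match
        if match and match.group("name") == metric:
            labels = dict(LABEL.findall(match.group("labels")))
            samples.append((labels, float(match.group("value"))))
    return samples


for labels, value in gpu_utilization(SAMPLE):
    print(f"GPU {labels.get('gpu')} ({labels.get('pod')}): {value}%")
```

If the function returns an empty list for your exporter's real output, the exporter is likely not reporting GPU utilization, and GPU widgets may not appear.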

When PerfectScale detects active GPU resources within a cluster, it automatically enables GPU-specific widgets and utilization insights in the UI. These components provide detailed metrics on GPU usage, allocation efficiency, and workload distribution, enabling data-driven K8s optimization.

## PodFit GPU visibility

To find GPU-allocated workloads in **PodFit**, select the **GPU** tab in the view selector to switch to the GPU view, as shown below, then sort the table by GPU usage. This brings all GPU-consuming workloads to the top.

<figure><img src="https://3591580169-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzCh9aABpk7yLeToPr6vk%2Fuploads%2FPyYp7dkwPRDeEdbjMAKN%2Fimage.png?alt=media&#x26;token=aa50ecd1-7f9b-4e2a-ad92-534a97a9f9b0" alt=""><figcaption><p>GPU view - PodFit</p></figcaption></figure>

Click a workload to open its detailed zoom-in view. This panel shows the workload’s current state and behavior, along with historical data on resource allocation and utilization. It includes GPU utilization metrics, other resource usage, and detected performance risks. To learn more about the zoom-in capabilities, see [Detailed workload analysis](https://docs.perfectscale.io/2.0-self-hosted-or-perfectscale-documentation/podfit-or-vertical-pod-right-sizing#detailed-workload-analysis).

<figure><img src="https://3591580169-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzCh9aABpk7yLeToPr6vk%2Fuploads%2FqiJy0jTri6cCto19hAMu%2Fimage.png?alt=media&#x26;token=e4190f64-47b9-4076-bb94-0e9c26d9594f" alt=""><figcaption><p>Workload details - GPU widget</p></figcaption></figure>

## InfraFit GPU visibility

To see detailed GPU usage across your infrastructure, go to **InfraFit**. The GPU chart shows how much GPU capacity is in use versus how much was requested, making it easy to spot inefficiencies and find ways to optimize.

<figure><img src="https://3591580169-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzCh9aABpk7yLeToPr6vk%2Fuploads%2FK4GiUQIYQGqjFkz6wWN3%2Fimage.png?alt=media&#x26;token=52a25fbf-dd04-4a1e-b9e9-f9d5bcd9de13" alt=""><figcaption><p>GPU view - InfraFit</p></figcaption></figure>

Use this view to compare requested GPU resources with actual usage and to pinpoint underutilized or idle GPU capacity across your clusters.
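
The requested-versus-used comparison can be illustrated with a small calculation. This is a minimal sketch with made-up numbers. The `NodeGroupGpu` class, its fields, and the `gpu_gaps` helper are hypothetical names chosen for illustration and are not part of PerfectScale, and the formulas are a plain reading of "requested versus used" rather than the product's documented method.

```python
from dataclasses import dataclass


@dataclass
class NodeGroupGpu:
    """Hypothetical per-node-group GPU figures, in whole GPUs."""
    name: str
    capacity: float   # GPUs available in the node group
    requested: float  # GPUs requested by workloads
    used: float       # average GPUs actually in use


def gpu_gaps(group: NodeGroupGpu) -> dict:
    """Return the gaps between capacity, requests, and actual usage."""
    return {
        # Capacity that no workload has requested.
        "unrequested": group.capacity - group.requested,
        # Capacity that was requested but is not being used.
        "requested_but_idle": group.requested - group.used,
        # Share of requested GPUs that is actually in use.
        "request_utilization": (
            group.used / group.requested if group.requested else 0.0
        ),
    }


# Illustrative data only.
groups = [
    NodeGroupGpu("training", capacity=8, requested=8, used=6),
    NodeGroupGpu("inference", capacity=4, requested=2, used=0.5),
]

# List the groups with the most requested-but-idle capacity first.
ranked = sorted(
    groups, key=lambda g: gpu_gaps(g)["requested_but_idle"], reverse=True
)
for group in ranked:
    print(group.name, gpu_gaps(group))
```

Separating unrequested capacity from requested-but-idle capacity points to different fixes: the first suggests removing or consolidating nodes, the second suggests lowering workload GPU requests.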

Click a node group to see a granular breakdown of the individual instances in that group, along with key metrics for each one.

<figure><img src="https://3591580169-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzCh9aABpk7yLeToPr6vk%2Fuploads%2FbSztGndDF999rM8LTThi%2Fimage.png?alt=media&#x26;token=7213e0f5-826a-4e36-a8c8-988513425c54" alt=""><figcaption><p>GPU utilization by instance</p></figcaption></figure>

Click a specific instance to list the workloads running on that machine for deeper investigation and analysis.
