Enable GPU monitoring
GPU monitoring is a new feature implemented in Nezha Monitoring v0.17.x. Before using the feature, please ensure your Dashboard version is higher than v0.17.2 and Agent version is higher than v0.17.0.
Enable
From Command-Line Flag
Append the --gpu
flag to the Agent argument. For example:
/opt/nezha/agent/nezha-agent -s example.com:5555 -p example --gpu
From configuration file
Execute the following command to modify Agent configuration to enable GPU monitoring.
/opt/nezha/agent/nezha-agent edit
In the interactive menu returned, choose to enable GPU monitoring.
Enable GPU utilization monitoring
GPU model and GPU utilization are two different monitor items, which uses different approaches to obtain their value.
Windows and macOS supports getting GPU utilization without extra dependencies, and support multiple graphics card brands.
Linux distros support only NVIDIA and AMD cards and need to install extra dependencies.
Below are the instructions on how to enable GPU utilization monitoring on Linux for NVIDIA / AMD graphics cards.
NVIDIA
NVIDIA cards need the nvidia-smi
utility to get GPU utilization. This utility is included in the official driver by default.
If you use unofficial drivers like nouveau
, then it's not possible to get GPU utilization.
AMD
AMD cards need to install the open source amdgpu
driver and the rocm-smi
utility.
Mainstream distros have already packaged rocm-smi
, below are commands to install the utility on these distros:
# Arch Linux
pacman -Sy rocm-smi-lib
# Debian / Ubuntu
apt install rocm-smi
# Fedora / RHEL 8+
dnf install rocm-smi
If your distro doesn't have the package, then you will need to compile rocm_smi_lib
manually.
Required dependencies:git
cmake
gcc
First, clone the git repository of rocm_smi_lib
:
git clone https://github.com/ROCm/rocm_smi_lib
Then compile the libraries and install them on your system.
cd rocm_smi_lib
mkdir -p build
cd build
cmake ..
make -j $(nproc)
# Install library file and header; default location is /opt/rocm
make install