Replies: 18 comments 65 replies
-
If you're running Turing or Volta, then your best bet currently is to use nouveau; it will set the clocks to the lowest it can!
-
You actually can undervolt on Linux. It's just that the way you do it is not what you'd have guessed. This post describes the process, but the basic idea is that, due to how overclocking works on modern GPUs, overclocking is undervolting. The GPU has a set of pre-defined voltage levels, and each voltage level has a range of GPU speeds - when the speed moves outside of the range, the voltage is moved up or down accordingly. So when you overclock, you are increasing the allowed GPU speeds for each voltage level - that means that for a given MHz, the voltage will be lower - and that's undervolting.

The next observation is that these GPUs have a power limit (eg: 350W for a 3090 FE), so the GPU will run as fast as it can within that power limit. So, if you overclock, which is undervolting, you will be running at a higher speed when you hit the power limit - unless you overclock too much and the GPU crashes first. But this means you will not save any power by undervolting if you don't do something about the power; you'll just run faster. To control power, you have two mechanisms - you can reduce the max speed and you can reduce the power limit - and both have their uses.

Setting the power limit: If you reduce the power limit (eg: 300W instead of 350W in my example), then you will obviously save power, and this will happen whether you overclock/undervolt or not. But when you hit the power limit, it will be at some speed, and GPU behaviour at the power limit is often spiky, with the speed changing constantly as it bumps up and down. Overclocking/undervolting lets you reach a higher speed before you hit the power limit.

Setting the max speed: On the other hand, if you set the max speed, then the GPU will not go any faster, regardless of the power limit. This gives you a smooth, constant speed but perhaps leaves some performance on the table, as the power consumption at max speed varies depending on the work the GPU is asked to do. Some 10 year old game can hit max speed at 200W while a brand new one might hit 300W, etc.

If your goal is to reduce power consumption without giving up performance, then the general process is to run whatever you consider your representative workload to be (your most resource-intensive game, etc). While it's running, observe the max speed reached before/at the point where you hit the power limit; you can use a monitoring tool for this. Now you set your max speed to that value and begin experimenting with overclock values that allow you to hit that speed while staying under the power limit and without crashing. If you're not absolutely focused on always running at lower power, you might set your goal as getting your most intensive game to run at your target speed just at the power limit - then most of the time you'll be well below it, but you still allow full power to be available in the most extreme case. If you want to always reduce power, then set your reduced power limit and then iterate to find your right overclock.

Note that your goals might be incompatible. It might turn out that the overclock required to hit your max speed at your desired power level is unstable and you crash. Then you have to decide whether to reduce the max speed even more or increase the power level. But something has to give. Clock offsets are always multiples of 15MHz; while you can specify a value in between, it will be rounded down.

The commands: First, you must turn on Persistence Mode, or the overclock will reset when work units change (which is often). Then, set your max speed. In my example, I'm setting it to 1830MHz. Then, you set your overclock, which is done as a clock offset. In my example, the best overclock I can set that allows me to hit 1830MHz in my worst-case workload is +240MHz. In most games I could go higher, but I don't want to micromanage it. In all of this, I haven't talked about voltages, because it's not necessary information, but if you want to see what voltage level you are using, you can query it through nvidia-settings. Unfortunately, this isn't exposed in nvidia-smi. So, there you go.
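As a rough sketch of that sequence (persistence mode, clock cap, clock offset, optional power limit) in Python via the pynvml bindings that come up later in this thread - GPU index 0, the 210MHz floor and the nvmlDeviceSetGpcClkVfOffset entry point are assumptions here, not something the comment above specifies, the offset call needs a fairly recent driver and nvidia-ml-py build, and the whole thing has to run as root:

```python
# Rough sketch only, not the poster's exact commands: the persistence-mode /
# clock-cap / clock-offset / power-limit steps driven through pynvml
# (pip install nvidia-ml-py). The 1830MHz cap, +240MHz offset and 300W limit
# are the example numbers from the comment above; GPU index 0 and the 210MHz
# floor are assumptions.
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

# 1. Persistence mode, so the settings survive between work units
#    (CLI equivalent: nvidia-smi -pm 1).
pynvml.nvmlDeviceSetPersistenceMode(gpu, 1)

# 2. Cap the core clock at the target speed
#    (CLI equivalent: nvidia-smi -lgc 210,1830).
pynvml.nvmlDeviceSetGpuLockedClocks(gpu, 210, 1830)

# 3. Apply the clock offset - the "overclock that is really an undervolt".
#    Needs a recent driver/nvidia-ml-py; otherwise nvidia-settings'
#    GPUGraphicsClockOffsetAllPerformanceLevels attribute is the usual route.
pynvml.nvmlDeviceSetGpcClkVfOffset(gpu, 240)

# 4. Optionally lower the board power limit as well (value is in milliwatts).
pynvml.nvmlDeviceSetPowerManagementLimit(gpu, 300_000)

pynvml.nvmlShutdown()
```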
-
@philipl thank you for the hint with the clock-offset approach. It would be really cool if undervolting support could be directly integrated into nvidia-smi or so. It would be even better if undervolting were not necessary in the first place and the voltage curve were more efficient out of the box. Any reason why this is the case?
-
I really hope this open driver has proper power management when it gets implemented. The proprietary driver's power management is ultra trash, rendering Linux + Nvidia unusable in some cases, especially on laptops.
-
Still, we have to waste power and put up with the fan noise.
-
Bumping this because I would really like to see support for undervolting in Linux, especially for my small-form-factor 7.4L gaming rig with a 3070. Things get toasty.
-
Up!
-
Memory temperature readings, like HWiNFO shows, would be nice too.
-
Undervolting my Nvidia card is the last feature that is missing for me under Linux. If this were implemented, I could finally delete Win10. I just keep it for gaming, because there I can undervolt my RTX 3070 and save 90W.
-
As we are getting close to the end of 2023 and Linux has seen significant growth: any news on this?
-
Proper undervolting support already exists. Ask Linux's "many" programmers to make an app for you.
-
So we only have the clock-offset workaround?
-
I created a simple script, which technically does an undervolt by applying offsets based on temperature. At the moment it works with one GPU. Also, I was wrong that the instability comes from the driver - it's in the nature of FinFETs and power MOSFETs. Plus, my MOSFETs can't provide more than 150A; the controller limits that. On one side that's good for their temperature, on the other: the lower the voltage, the lower the stress on the MOSFETs and, in theory, the higher the stability.
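Not the author's actual script, just a minimal sketch of the idea as described - temperature-dependent clock offsets applied through pynvml. The thresholds, offsets and poll interval are invented illustration values, and nvmlDeviceSetGpcClkVfOffset again assumes a recent driver and nvidia-ml-py build:

```python
# Minimal sketch: poll the core temperature and pick a clock offset from a small
# table, so the card is pushed harder when cool and backed off when hot.
import time
import pynvml

# (temperature ceiling in C, clock offset in MHz) - offsets are multiples of 15.
OFFSET_TABLE = [(60, 240), (75, 180), (999, 120)]

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

current = None
try:
    while True:
        temp = pynvml.nvmlDeviceGetTemperature(gpu, pynvml.NVML_TEMPERATURE_GPU)
        offset = next(off for ceiling, off in OFFSET_TABLE if temp <= ceiling)
        if offset != current:  # only touch the driver when the offset changes
            pynvml.nvmlDeviceSetGpcClkVfOffset(gpu, offset)
            current = offset
        time.sleep(2)
finally:
    pynvml.nvmlShutdown()
```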
-
I got a question for folks here. I have a pair of NVLinked 3090s in my workstation running Ubuntu 20.04 desktop, and I use it primarily remotely, so the monitor isn't plugged in (well, I do have a PiKVM, so one "monitor" is plugged in and X11 is active; that will not change, as it's needed for certain admin situations). Typically when I access it over SSH, nvidia-smi usually reports the core clock at 0MHz instead of the usual minimum of 210MHz. When I apply my undervolting nvidia-settings calls as detailed above to set the clock range, it makes the GPUs draw 44W or so instead of 23W at idle. Anyone know what's up with this? Do I just have to put up with it and choose between lower power consumption when running workloads or lower idle power consumption, and just can't have both? I'll also note that my 3080 Ti tends to stay at 210MHz idle all the time but also draws in the realm of 20-something watts at idle. Though I reckon this is due to not having RAM on the backside of the card; possibly if I had 3090 Tis, with a similar one-sided VRAM config, there'd be an idle power efficiency gain.
-
@unphased, it looks like your GPUs are not switching to the "Level 0" power level. That idle frequency is identical for a lot of Nvidia GPUs; there is no "ZeroCore power" like on AMD GPUs, so Nvidia GPUs always run at an idle frequency of around 210MHz. There was some discussion on the forums about that.
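For comparing the two idle behaviours, a small sketch using standard pynvml queries that prints the performance state, graphics clock and power draw of each GPU - the expectation that a fully idle board sits in P8 is an assumption, not a guarantee for every card:

```python
# Print per-GPU performance state, graphics clock and power draw, to see
# whether a card actually drops into its deep idle state.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    gpu = pynvml.nvmlDeviceGetHandleByIndex(i)
    pstate = pynvml.nvmlDeviceGetPerformanceState(gpu)      # 0 = P0 ... 15 = P15
    clock = pynvml.nvmlDeviceGetClockInfo(gpu, pynvml.NVML_CLOCK_GRAPHICS)
    power = pynvml.nvmlDeviceGetPowerUsage(gpu) / 1000.0    # NVML reports mW
    print(f"GPU {i}: P{pstate}, {clock} MHz, {power:.1f} W")
pynvml.nvmlShutdown()
```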
-
Reimplemented the script in Python to use pynvml. Result: vkcube - no problems at all; real games - a stuttery mess. IDK what to do next. I don't have the knowledge to rewrite the code in C.
-
If nvidia could simply give a method / API to view and set the frequency target for each voltage point, we could easily have everything solved. Undervolting? Done. Easily make a script with an array of frequencies (which you could create yourself, or just grab from MSI Afterburner on Windows) to set at each voltage. Overclocking? Easy. Just loop through each voltage, view its frequency target, and apply that target with a user-supplied offset. AMD already has this solved. Now it's your turn. We need Nvidia to be viable under Linux, but with these clear missing features, it's hard to justify purchasing a new Nvidia GPU. You guys have done a really good job on Explicit Sync and the drivers as a whole for the past year or two. There's only very little left to make Nvidia a recommendable option under Linux. However, the little things that are left are some of the biggest bugs / issues / missing features.
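Purely as an illustration of how little tooling would be needed: the functions below (get_vf_curve, set_vf_point) are hypothetical and do not exist in any NVIDIA API today - they only stand in for the kind of interface being asked for, roughly comparable to what AMD exposes through pp_od_clk_voltage in sysfs:

```python
# Hypothetical sketch: shows how small the user-side tooling would be if the
# V/F curve were exposed. Neither get_vf_curve nor set_vf_point exists today.
from typing import List, Tuple

def get_vf_curve(gpu_index: int) -> List[Tuple[int, int]]:
    """Hypothetical: return (voltage_mV, frequency_MHz) points for the GPU."""
    raise NotImplementedError("no such API is exposed on Linux today")

def set_vf_point(gpu_index: int, voltage_mv: int, frequency_mhz: int) -> None:
    """Hypothetical: pin a frequency target to a voltage point."""
    raise NotImplementedError("no such API is exposed on Linux today")

def apply_offset(gpu_index: int, offset_mhz: int) -> None:
    # "Overclocking? Easy": shift every point of the curve by a user offset.
    for voltage_mv, frequency_mhz in get_vf_curve(gpu_index):
        set_vf_point(gpu_index, voltage_mv, frequency_mhz + offset_mhz)

def apply_curve(gpu_index: int, curve: List[Tuple[int, int]]) -> None:
    # "Undervolting? Done": replay a curve built by hand or exported elsewhere.
    for voltage_mv, frequency_mhz in curve:
        set_vf_point(gpu_index, voltage_mv, frequency_mhz)
```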
-
I am planning to switch to Linux in the near future and right now I'm in a thorough "getting ready" phase. So, in preparation, I made a Python script that can achieve an "undervolt" using cheap trickery. Don't judge my code - I don't work with Python at all, but it was the only suitable language to use in this case. I am posting this in good faith that someone picks up on it and develops a proper application on top of it.
-
Is your feature request related to a problem? Please describe.
On Windows it is possible to use third-party utilities like MSI Afterburner to undervolt GPUs; on Linux this functionality does not exist, even through nvidia-smi and nvidia-settings, from which it was removed. Utilities such as gwe also cannot support this because it is not exposed: https://gitlab.com/leinardi/gwe/-/issues/118
Describe the solution you'd like
Offer the functionality to be able to undervolt supported GPUs