What exactly is pulling 100-110W when running local LLM?

k_michael@alien.top · 2 years ago

What exactly is pulling 100-110W when running local LLM?

inyourfaceplate@alien.top · 2 years ago

What tools are you using to monitor these metrics in your screenshot?

k_michael@alien.top · 2 years ago

At the top, in the menu bar, it’s good old iStat Menus
On the right side this is MX Power Gadget
And in the terminal, on the left, it’s asitop

I think they all read from the same source, powermetrics, but interpret it in slightly different ways.

Infamous_Charge2666@alien.top · 2 years ago

Training models on a laptop is counterintuitive. It will eventually kill your battery and damage the laptop. A laptop doesn’t have the airflow to run intensive workloads for long periods. Apple laptops run cooler, but you’ll eventually realize you’re better off building a server and using the laptop to log in remotely for training or inference, or renting pods online.

k_michael@alien.top · 2 years ago

I’m not training, just running inference for fun every now and then. This question was mostly just for curiosity. But thank you, you are right!

FlishFlashman@alien.top · 2 years ago

I think part of the answer is that RAM uses more power than you think when it’s running near full-tilt, like it is during generation. Micron’s advice is to figure 3 W per 8 GB for DDR4, and more than that for the highest-performance parts. The fact that the RAM is on package probably offsets that somewhat, but that’s still more than single digits.
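
To put rough numbers on that rule of thumb, here is a minimal sketch. The 3 W per 8 GB figure is the DDR4 guideline mentioned above; the capacities are just example values I picked, and the function name is made up for illustration. On-package memory will likely come in lower, so treat the result as a ballpark and not a measurement.

```python
def dram_power_estimate_w(capacity_gb: float, watts_per_8gb: float = 3.0) -> float:
    """Rough DRAM power under heavy load, scaled linearly with capacity.

    watts_per_8gb defaults to the DDR4 rule of thumb quoted in this thread;
    it is an assumption, not a measured value for on-package memory.
    """
    return capacity_gb / 8 * watts_per_8gb


# Example capacities (assumed for illustration only).
for capacity_gb in (32, 64):
    estimate = dram_power_estimate_w(capacity_gb)
    print(f"{capacity_gb} GB -> roughly {estimate:.0f} W")
```

Even with a generous discount for on-package memory, that lands in double digits for the larger configurations.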

Power consumption on my 24-core GPU M1 Max is similar to yours, though somewhat lower as you’d expect, according to both iStat Menus and Stats.app.

There is also the question of how accurate they are.

k_michael@alien.top · 2 years ago

Oh wow, that would be much more than I expected. Could make a lot of sense as the LLM is probably hitting the RAM very hard. Thank you 🙌

Herr_Drosselmeyer@alien.top · 2 years ago

Under full load and if thermals allow it, that machine can draw up to 120 W from the wall. Likely the tool isn’t reading the SoC power draw correctly.

k_michael@alien.top · 2 years ago

Hm, you are right, I do also remember the AnandTech article on M1 Max power draw. Maybe the tool really isn’t reading the draw correctly 🤔 It’s still interesting though: if I run a 3D game on my MBP, I draw maybe 65-70W under full load. The LLM must be using some component that the 3D game isn’t 🤷‍♂️

FlishFlashman@alien.top · 2 years ago

Game GPU use probably hits the cache. An LLM really won’t, since each token involves reading all the model data.
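
A back-of-the-envelope sketch of why: if every generated token streams the full set of weights from RAM, memory traffic is roughly model size times tokens per second. The 400 GB/s figure is Apple's quoted peak memory bandwidth for the M1 Max; the 20 GB model size and 10 tokens/s rate are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def required_bandwidth_gb_s(model_size_gb: float, tokens_per_second: float) -> float:
    """Memory traffic if each token reads every weight once from RAM."""
    return model_size_gb * tokens_per_second


def max_tokens_per_second(model_size_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on generation speed when memory bandwidth is the limit."""
    return bandwidth_gb_s / model_size_gb


MODEL_SIZE_GB = 20.0      # assumed size of the quantized weights
PEAK_BANDWIDTH = 400.0    # GB/s, Apple's quoted peak for the M1 Max

print(required_bandwidth_gb_s(MODEL_SIZE_GB, tokens_per_second=10.0), "GB/s needed")
print(max_tokens_per_second(MODEL_SIZE_GB, PEAK_BANDWIDTH), "tokens/s at best")
```

Under those assumptions, even a modest generation speed keeps the memory system at a large fraction of its peak the whole time, which a cache-friendly game workload would not do.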

k_michael@alien.top · 2 years ago

That’s a good point actually! Thanks