Hacker News | stephlow's comments

I own a single R9700 for the same reason you mentioned, and I'm looking into getting a second one. It took a lot of fiddling to get working on Arch, but RDNA4 and ROCm have come a long way. Every once in a while an Arch package update breaks things, but that's not exclusive to ROCm.

LLMs run great on it; it's happily running gemma4 31b at the moment and I'm quite impressed. For the amount of VRAM you get it's hard to beat, apart from the Intel cards maybe, but the driver support doesn't seem to be that great there either.

Had some trouble running ComfyUI, but it's not my main use case, so I haven't spent a lot of time figuring that out yet.


Thanks for the answer, that brings my hopes up. Looking at my local shops, I can get three cards for the price of one 5090.

May I ask what kind of tok/s you're getting with the R9700? I assume you have it fully in VRAM?


Stock install, no tuning.

  $ uname -r
  6.8.0-107-generic
  $ ollama --version
  ollama version is 0.20.2
  $ ollama run "gemma4:31b" --verbose "write fizzbuzz in python."
  [...]
  total duration:       45.141599637s
  load duration:        143.633498ms
  prompt eval count:    21 token(s)
  prompt eval duration: 48.047609ms
  prompt eval rate:     437.07 tokens/s
  eval count:           1057 token(s)
  eval duration:        44.676612241s
  eval rate:            23.66 tokens/s


I have a dual-R9700 machine, with both cards in PCIe Gen4 x8 slots. The 256-bit GDDR6 memory bandwidth is the main limiting factor and makes dense models above 9b fairly slow.
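To put a number on why bandwidth is the limit: every decoded token has to stream the full set of active weights through the memory bus once, so bandwidth divided by model size gives a hard ceiling on dense-model decode speed. A quick sketch of that arithmetic; the 644 GB/s figure and the ~4.5 bits/weight quant size are my assumptions, not measurements, so plug in your own:

```python
# Rough decode-speed ceiling for a memory-bandwidth-bound dense model.
# Assumptions: ~644 GB/s for a 256-bit GDDR6 card, ~4.5 bits/weight
# (about 0.56 bytes/param) for a Q4-class quant. Adjust to your setup.

def max_decode_tok_s(param_count_b: float, bytes_per_param: float,
                     bandwidth_gb_s: float) -> float:
    """Ceiling: each generated token streams all weights once."""
    model_gb = param_count_b * bytes_per_param
    return bandwidth_gb_s / model_gb

# A 27b dense model at ~0.56 bytes/param:
print(round(max_decode_tok_s(27, 0.56, 644), 1))  # → 42.6
```

Real throughput lands well below this ceiling (compute overhead, KV-cache reads, imperfect overlap), but the scaling with model size is why dense models slow down so sharply while MoE models with few active parameters stay fast.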

The model that is currently loaded full time for all workloads on this machine is Unsloth's Q3_K_M quant of Qwen 3.5 122b, which has 10b active parameters. With almost no context usage it will generate 59 tok/sec. At 10,000 input tokens it will prefill at about 1500 tok/sec and generate at 51 tok/sec. At 110,000 input tokens it will prefill at about 950 tok/sec and generate at 30 tok/sec.

Smaller MoE models with 3b active will push 70 tok/sec at 10,000 context. Dense models like Qwen 3.5 27b and Devstral Small 2 at 24b only generate at around 13-15 tok/sec with 10,000 context.
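If you want to turn those rates into wall-clock estimates: total latency is just prompt tokens over the prefill rate plus output tokens over the decode rate. A minimal sketch using my measured numbers (the 500-token answer length is an arbitrary example):

```python
# Back-of-envelope response latency from separate prefill/decode rates.
# Rates below are the ones I measured at 10,000 context; yours will vary.

def response_time_s(prompt_tokens: int, output_tokens: int,
                    prefill_tok_s: float, decode_tok_s: float) -> float:
    return prompt_tokens / prefill_tok_s + output_tokens / decode_tok_s

# 10,000-token prompt, 500-token answer at ~1500 / ~51 tok/s:
t = response_time_s(10_000, 500, 1500, 51)
print(f"{t:.1f} s")  # → 16.5 s
```

Even with a 10k prompt, decode time (~9.8 s) outweighs prefill (~6.7 s) here, which is why the decode rate is the number that matters for interactive use.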

This is all on llama.cpp with the Vulkan backend. I didn't get too far in testing or using anything that requires ROCm because of an outstanding ROCm bug where the GPU clock stays at 100% (drawing around 60 watts) even when the model isn't processing anything. The issue is now closed, but multiple commenters indicate it is still a problem. With the Vulkan backend my per-card idle draw is between 1 and 2 watts with the display outputs shut down and no kernel framebuffer.


This is correct. I have a couple of analog oscillators in my modular rack and the case needs to heat up for a couple of minutes before they generate a stable pitch. Not familiar with the exact reasons to be honest.


The temperature-sensitive part in a VCO is the exponential converter inside, which has a strong temperature dependence. This is because the current through a transistor depends exponentially on the ratio V_BE / V_T, and the thermal voltage V_T is proportional to absolute temperature.

It's a neat trick that lets you build good temperature sensors out of transistors, which is very convenient, but in these VCOs you have to add temperature compensation to the exponential converter, and the temperature compensation is far from perfect.

If you take linear VCOs (instead of 1V/octave exponential), they'll be much more temperature stable.
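To see how big the effect is: V_T = kT/q, so an expo converter calibrated for 1V/octave at one temperature has its whole octave scale stretch in proportion to absolute temperature. A small sketch of the uncompensated drift (the 10 °C warm-up and 3-octave offset are illustrative numbers, not from any particular VCO):

```python
import math

# Physical constants
k_B = 1.380649e-23   # Boltzmann constant, J/K
q   = 1.602177e-19   # elementary charge, C

def thermal_voltage(T: float) -> float:
    """V_T = kT/q in volts, for absolute temperature T in kelvin."""
    return k_B * T / q

# V_T near room temperature, and the V_BE step for one octave (V_T * ln 2):
T0 = 300.0
print(f"V_T = {thermal_voltage(T0) * 1e3:.1f} mV, "
      f"octave step = {thermal_voltage(T0) * math.log(2) * 1e3:.1f} mV")
# → V_T = 25.9 mV, octave step = 17.9 mV

def drift_cents(octaves_above_ref: float, T0: float, T1: float) -> float:
    """Pitch error (cents, magnitude) of an uncompensated exponential
    converter calibrated at T0 and operated at T1, both in kelvin.
    The exponent scales as 1/T, so pitch goes flat as the chip warms."""
    return 1200 * octaves_above_ref * (T1 - T0) / T1

# 3 octaves above the reference, chip warms from 25 C to 35 C:
print(f"{drift_cents(3, 298.15, 308.15):.0f} cents flat")  # → 117 cents flat
```

Over a semitone of drift for a 10 °C warm-up is why every serious exponential converter pairs the transistor with a tempco resistor (or a matched-pair compensation scheme), and why the remaining error still makes the case-warm-up period audible.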


Higher quality equipment that needs a stable timebase will use ovenized oscillators with an integrated heater and a control loop to maintain a stable temperature. These are all inaccurate before they are up to operating temperature.


It was common for TVs and CRT monitors to change the shape and color of the picture as they came up to temperature. You never made critical color decisions on a monitor that hadn't been on for at least 15 minutes, preferably 30.


Basically all components have temperature-dependent values.


I feel your pain, I’ve been there. If you go to the person in the photos app, there’s an option to feature that person less in memories.


It was actually a beautiful vignette, though seeing it pop up on my main screen while I was on a date was a bit of a weird experience.

