A note that this setup runs the 671B model in Q4 quantization at 3-4 TPS; running Q8 would need something beefier. To run the 671B model in the original Q8 at 6-8 TPS you'd need a dual-socket EPYC server motherboard with 768GB of RAM.
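As a rough sanity check on those RAM figures, here's a back-of-the-envelope sketch (plain Python, assuming a uniform bits-per-weight, which real quantization formats like Q4_K_M only approximate, and ignoring KV cache and runtime overhead) of how much memory the weights alone take at Q4 vs Q8:

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate storage for the model weights in gigabytes (1 GB = 10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 671e9  # 671B parameters

q4 = weight_memory_gb(PARAMS, 4)
q8 = weight_memory_gb(PARAMS, 8)

print(f"Q4 weights: ~{q4:.0f} GB")  # ~336 GB
print(f"Q8 weights: ~{q8:.0f} GB")  # ~671 GB
```

At ~671 GB just for Q8 weights, 768GB of RAM leaves only modest headroom for the KV cache and the OS, which is why the Q8 setup needs the dual-socket board.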
小莱卡 - 5mon
Brb gotta start my FundMe campaign for one of these servers lol
FuckBigTech347 - 5mon
DW. In like two years from now, companies will start throwing out similar machines. Just keep an eye on second-hand markets and dumpsters.
CriticalResist8 - 5mon
btw do you recommend running a quantized higher-parameter model (locally) or lower-parameter but not quantized, if I had to pick between the two?
☆ Yσɠƚԋσʂ ☆ - 5mon
I find higher parameter counts tend to produce better output, but it depends on what you're doing too. For stuff like code generation, accuracy matters more, so even a smaller model that's not quantized might do better. It also depends on the specific model.
yogthos in technology
How To Run Deepseek R1 671b Fully Locally On a $2000 EPYC Server
https://digitalspaceport.com/how-to-run-deepseek-r1-671b-fully-locally-on-2000-epyc-rig/
Thanks I'll have to try them both then it seems