In my two-week absence I've been building out my first home server. One, to run all or most of my AI programs on the server so they won't bog down my personal rig for other stuff, and two, to use it as a place to store the hundreds of gigabytes of TV shows and movies I've downloaded and ripped from thrift store and yard sale DVDs, using Jellyfin to stream everything.
I also needed to redo my ethernet cables, since most were run under my house and were starting to break down from lying on the ground and being run through bricks.
I probably won't be posting anything for a couple more days because I've gotta get Stable Diffusion set up again and copy all my main models and LoRAs over, and then troubleshoot PyTorch and Python since they seem to break themselves every time I install them XD
Now details for the computer nerds like myself:
The main PC is a repurposed Dell OptiPlex 9010, with a Xeon E3-1220 v2 CPU (4 cores, 4 threads at 3.2GHz), 24GB of RAM, and the odd one out, an Nvidia Titan Xp Star Wars edition GPU... 12GB of VRAM and more CUDA cores than a 1080 Ti.
Now, it's not as fast as my 3070 because it doesn't have the AI accelerator cores, but it's got 4GB more VRAM than a 3070, which makes a big difference when loading large models. While I haven't tested it yet with Stable Diffusion, I know it will be slower, but I can crank the settings up more than I could on my 3070.
The CPU is by far the weakest part of the whole system, but the entire PC was $20 at a yard sale (no RAM, hard drive, or GPU), and the network card it had was worth $20 on its own, so a good deal all in all.
I do plan on upgrading it to a Ryzen 5 5600 one day, since Dell motherboards have a lot of weird proprietary plugs and cables. The 9010 is one that's compatible with a regular PSU, but the SATA plugs are under the GPU, so I can fit at most 2 right-angle cables under it. They engineer the motherboard like a modern car... stupid design that only serves to cause problems (*cough* Ford Bronco oil pan *cough*)
But from what I've tested, the GPU can fit entire 13B LLMs into memory and gets between 25-30 tokens per second (7B models get about 50 tps), which is basically half of my 3070. Considering this card has zero AI-specific cores and is just running on CUDA cores, that's pretty good.
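For anyone wondering why the 12GB matters here, this is the rough napkin math, assuming the usual 4-bit quantization that lets a 13B model fit on a card like this (the quantization level is my assumption, not something I benchmarked):

```python
# Back-of-the-envelope VRAM estimate for a quantized 13B model.
# Assumes 4-bit quantization (0.5 bytes per parameter); KV cache and
# framework overhead eat into the rest of the 12GB.
params = 13e9            # 13B parameters
bytes_per_param = 0.5    # 4-bit quantization (assumption)

weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.1f} GB")  # ~6.5 GB, comfortable on 12GB
```

An 8GB card like the 3070 can still squeeze that in, but with much less headroom, which is where the extra 4GB earns its keep.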
I also had a 1050 2GB to see if it could even handle any sort of AI workload, and it "will" work, but it's so slow it makes 2 snails racing look like a Formula 1 race. At its peak it managed to get 1.63 tokens per second, or about 15 times slower than the Titan Xp and 34 times slower than a 3070. It will work, but reading a book is faster than waiting on it to slog out words :p
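The "15 times slower" figure is just the ratio of the speeds I measured:

```python
# Speed ratio between the two cards, using my measured tokens/sec.
titan_xp_tps = 25.0    # low end of the 25-30 tps range on 13B models
gtx_1050_tps = 1.63    # peak the 1050 2GB managed

ratio = titan_xp_tps / gtx_1050_tps
print(f"Titan Xp is ~{ratio:.0f}x faster than the 1050")  # ~15x
```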
Oh, and the Titan Xp was a $1200 GPU 7 years ago; now they're $230 for the collector's edition on eBay. The price to performance is an absolute steal if you're willing to sacrifice a bit of speed for more VRAM. It's still a beast in any rasterized game at 1080p and 1440p with high settings too.
The networking stuff is just the typical 'modem-to-router-to-switch' setup, networking 101 stuff.
TL;DR: The Star Wars Titan Xp is a Sith relic that can run the Empire's biggest models.
Edit: I put the Titan Xp in my personal setup and it gets around 50 tps on all models once they're loaded into VRAM; my 3070 ranges from 60-100 tps depending on the model. So I can confirm, old Titans and 1080 Tis are good for LLMs. Stable Diffusion takes about a minute to generate 4 768x768 images with 5 LoRAs in a stack at 30 steps. Not the best, but by far not the worst (the 1050 2GB took 2.5 hours on the same test).
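Same napkin math for the Stable Diffusion test, straight from the timings above:

```python
# How much faster the Titan Xp ran the identical Stable Diffusion test
# (4 images, 768x768, 5 LoRAs, 30 steps).
titan_xp_minutes = 1.0        # about a minute on the Titan Xp
gtx_1050_minutes = 2.5 * 60   # 2.5 hours on the 1050 2GB

print(f"~{gtx_1050_minutes / titan_xp_minutes:.0f}x faster")  # ~150x
```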
Added: 08 Jul 2025 22:56 CEST