Forget the High-Cost Nvidia GPUs: Run an LLM on a 1997 Pentium II CPU

A 1997-era Pentium II CPU can run a modern AI model, albeit far more slowly than contemporary GPUs.

Conventional wisdom holds that you need a vast number of Nvidia GPUs, costing around $50,000 each, to run state-of-the-art AI models. A recent claim by EXO Labs, however, suggests otherwise.

The team managed to get the Llama 2 LLM running on a Pentium II processor from 1997, acquired for less than $120 on eBay. The only downside? It’s approximately 20,000 times slower than a modern GPU.

EXO Labs faced significant challenges in getting the ancient machine ready for modern software, including compatibility issues with its outdated USB ports and the need to compile the necessary code for the old processor’s architecture.

Once everything was sorted, the 260K-parameter version of Llama 2 achieved 39.31 tokens per second on this nostalgic setup, while the larger 15M-parameter version managed only 1.03 tokens per second. An attempt to run a one-billion-parameter model crawled along at 0.0093 tokens per second.
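To put those rates in perspective, here is a quick back-of-the-envelope sketch. The throughput figures are the ones quoted above; the 100-token reply length is an assumption chosen purely for illustration.

```python
# Rough wall-clock estimate: how long would a short reply take at the
# throughput figures reported for the Pentium II setup?
# The rates below come from the article; the reply length is an assumption.

RATES_TOKENS_PER_SEC = {
    "260K-parameter model": 39.31,
    "15M-parameter model": 1.03,
    "1B-parameter model": 0.0093,
}

REPLY_LENGTH_TOKENS = 100  # assumed length of a short chat reply

for model, rate in RATES_TOKENS_PER_SEC.items():
    seconds = REPLY_LENGTH_TOKENS / rate
    print(f"{model}: {seconds:,.1f} s (~{seconds / 3600:.2f} h)")
```

At 0.0093 tokens per second, that assumed 100-token reply would take roughly three hours to generate, versus a couple of seconds for the tiny 260K-parameter model.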

To summarize: getting a modern language model running on such a legacy system is an impressive feat, but the performance gap underscores just how dependent today’s AI workloads are on modern hardware.
