@OnurCetinkaya

OnurCetinkaya@alien.top · 2 years ago

I am just gonna do some bad maths.

For the price of a single 4090 you can get:

A CPU and mainboard combo with 16 RAM slots: $1,320

Total: 512 GB of RAM

Mistral 7B runs at around 7 tokens per second on a regular CPU, which is about 5 words per second.

With the 512 GB of RAM in the setup above we can fit a 512B-parameter model (assuming roughly one byte per parameter). If speed scales inversely with model size, that will run at 5*7/512 = 0.068 words per second with the current architecture. If this new architecture actually works and gives a 78x speedup, it will be 5.3 words per second. The average person's reading speed is around 4 words per second, and the average person's speaking speed is around 2 words per second.
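Here is the same bad maths as a quick Python sketch, in case anyone wants to plug in their own numbers. The inputs are just the rough figures from this comment, and the function name is made up for illustration. It assumes throughput drops in direct proportion to parameter count, which is a simplification rather than a measured result.

```python
# Back-of-envelope estimate using the rough figures from the comment above.
BASE_PARAMS_B = 7       # Mistral 7B, in billions of parameters
BASE_WORDS_PER_S = 5    # ~7 tokens/s on a regular CPU, taken as ~5 words/s
TARGET_PARAMS_B = 512   # model sized to fill 512 GB (assumes ~1 byte per parameter)
SPEEDUP = 78            # speedup claimed for the new architecture


def scaled_words_per_second(base_wps, base_params_b, target_params_b):
    """Scale throughput assuming it is inversely proportional to parameter count."""
    return base_wps * base_params_b / target_params_b


current = scaled_words_per_second(BASE_WORDS_PER_S, BASE_PARAMS_B, TARGET_PARAMS_B)
boosted = current * SPEEDUP

print(f"current architecture: {current:.3f} words/s")  # 0.068
print(f"with 78x speedup:     {boosted:.1f} words/s")  # 5.3
```

Change `TARGET_PARAMS_B` or `SPEEDUP` to see how quickly the result falls below reading speed.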

Fingers crossed this can put a small dent in Nvidia’s stock price.

OnurCetinkaya@alien.top · 2 years ago

Waiting for someone to stitch 100 phi-1.3Bs together :D