With the proof of concept done and users able to get over 180 GB/s on a PC with AMD's 3D V-Cache, it sure would be nice if we could figure out a way to use that bandwidth for CPU-based inference. I think it only worked on Windows, but if that's the case we should be able to come up with a way to do it under Linux too.

  • FaustBargain@alien.topB
    1 year ago

    So there are CPU intrinsics for prefetching data. If we can get better at anticipating which pieces of data will be needed next, we can sprinkle in those prefetch instructions and get more speed.
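
    Here's a rough sketch of what that could look like, not anything taken from an actual inference engine. It uses `__builtin_prefetch` from GCC/Clang (on x86 you could use `_mm_prefetch` instead). The function name `dot_prefetch` and the `PREFETCH_AHEAD` distance are made up for illustration, and the distance would need tuning per CPU.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical tuning knob: how many floats ahead of the current
       position to prefetch. The right value depends on the CPU. */
    #define PREFETCH_AHEAD 64

    /* Dot product that hints upcoming elements into cache before use. */
    float dot_prefetch(const float *a, const float *b, size_t n)
    {
        assert(a != NULL && b != NULL);

        float sum = 0.0f;
        for (size_t i = 0; i < n; i++) {
            /* Every 16 floats (64 bytes, a common cache line size),
               hint a later part of each array.
               Arguments: address, 0 = read access, 3 = high temporal locality. */
            if ((i & 15) == 0 && i + PREFETCH_AHEAD < n) {
                __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
                __builtin_prefetch(&b[i + PREFETCH_AHEAD], 0, 3);
            }
            sum += a[i] * b[i];
        }
        return sum;
    }
    ```

    For a straight linear scan like this the hardware prefetcher usually does a fine job already, so I'd expect the real gains where the access pattern is harder for the hardware to predict but the code knows what it needs next.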