Title sums it up.
If model fits completely inside 12Gb than it would work faster on a desktop, if model not fits into 12Gb but fits fully in 16Gb then you have a good chances it would run faster on a laptop with 16Gb GPU.
i can’t speak for the desktop 3080ti, but i have that laptop card and it’s roughly equivalent in performance to my 4060ti desktop card.
You mind shooting a few test to have real word numbers? Like what kind of speeds are you getting for a 7b q6 and 13b q6, they should fully fit in VRAM
That’s odd considering the 4060 Ti desktop is 8GB VRAM. But are you saying just speed or are you able to run larger parameter LLMs on your laptop that your desktop wouldn’t be able to?
I have the 16gb version of 4060ti, so the cards have nearly identical capabilities.
You mind shooting a few test to have real word numbers for the laptop version? Like what kind of speeds are you getting for a 7b q6 and 13b q6, they should fully fit in VRAM