lans_throwaway@alien.top to LocalLLaMA · English · 2 years ago

Lookahead decoding offers a substantial (~1.5x) speedup for inference

  • cross-posted to:
  • localllama
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org
lmsys.org

TL;DR: We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Look...
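As a rough illustration of why lookahead decoding can be *exact* while emitting several tokens per step: candidate n-grams are only accepted where they match what greedy decoding would have produced, so the output is guaranteed identical to plain greedy decoding. This is a toy sketch, not the LMSYS implementation — `next_token` stands in for a model's greedy step, and the n-gram cache is prebuilt here rather than generated by Jacobi iteration as in the real algorithm.

```python
def next_token(ctx):
    # Toy stand-in for an LLM's greedy (argmax) next-token function:
    # a deterministic rule over integer tokens.
    return (sum(ctx) * 31 + 7) % 100

def verify(ctx, candidate):
    """Accept the longest prefix of `candidate` consistent with greedy decoding."""
    accepted = []
    for tok in candidate:
        if next_token(ctx + accepted) == tok:
            accepted.append(tok)
        else:
            break
    return accepted

def generate(ctx, steps, ngram_cache):
    """Decode `steps` tokens, using cached n-gram guesses when they verify."""
    out = list(ctx)
    while len(out) < len(ctx) + steps:
        # Look up a candidate n-gram keyed on the last emitted token.
        guess = ngram_cache.get(out[-1], [])
        accepted = verify(out, guess)
        if accepted:
            out.extend(accepted)          # several tokens accepted in one step
        else:
            out.append(next_token(out))   # fall back to one greedy token
    return out[: len(ctx) + steps]
```

Because `verify` rejects any token that diverges from the greedy continuation, the result matches token-for-token what sequential decoding would produce; the speedup comes from how often the guesses verify.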

  • CasimirsBlake@alien.top · 2 years ago

    Any chance P40s can benefit from this through llama.cpp?

LocalLLaMA (!localllama@poweruser.forum)

Community to discuss Llama, the family of large language models created by Meta AI.
