Ke Li 🍁

@KL_Div · 6,482 subscribers

Assistant Professor of Computing Science @SFU. Ph.D. from @Berkeley_EECS and Bachelor's from @UofTCompSci. Formerly @GoogleAI and Member of @the_IAS.

Shorts

LLMs require more GPU memory as they generate longer responses. Can we make GPU memory constant without significantly sacrificing accuracy? IceCache is a new method for managing KV caches that leverages Dynamic Continuous Indexing (DCI) to efficiently group and retrieve tokens by semantics. Joint work with Yuzhen Mao, Qitong Wang and Martin Ester. For details, check out the links below.

21,163 views

Videos

No more content available