Converge · Powered by BayAI Circle

Members-only session

Recording · Infrastructure · 51 min

Serving big models at small latency

Lena Vasquez · Distinguished Engineer, Helix Compute

Batching, KV caching and the systems tricks behind fast, cheap inference.
