Hierarchical Verification of Speculative Beams for Accelerating LLM Inference

  • 2026
  • OriginalPaper
  • Chapter
Published in:

Abstract

The chapter introduces the Hierarchical Verification Tree (HVT) framework, a novel approach to accelerating Large Language Model (LLM) inference. It addresses the computational bottlenecks in LLM inference by prioritizing high-quality speculative beams and pruning low-likelihood candidates early in the decoding process. The text delves into the theoretical foundations of HVT, providing a formal mathematical framework that supports its design. It also outlines the implementation details and experimental setup, including the use of PyTorch and HuggingFace libraries. The chapter presents extensive evaluations across three benchmark datasets: WikiText-103, CNN/DailyMail, and XSum. These evaluations compare HVT against conventional decoding methods, highlighting its superiority in terms of latency, perplexity, ROUGE scores, and energy consumption. The results demonstrate that HVT achieves a speedup factor of 2.3 over greedy decoding on WikiText-103 and consistently yields higher ROUGE scores for summarization tasks. The chapter concludes by discussing the broader applicability of HVT and potential future enhancements, making it a comprehensive resource for professionals interested in optimizing LLM inference.
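
The abstract describes pruning low-likelihood candidates early but does not give the HVT algorithm itself. As a rough illustration of the general idea only, the sketch below keeps the candidate beams whose cumulative log-probability lies within a margin of the best one. The function name `prune_beams`, the `margin` and `max_keep` parameters, and the toy scores are all assumptions made for this example and are not taken from the chapter.

```python
import math

def prune_beams(beams, margin=2.0, max_keep=4):
    """Generic score-based pruning (illustrative, not the chapter's HVT method).

    beams: list of (tokens, cumulative_log_prob) pairs.
    margin, max_keep: hypothetical tuning knobs introduced for this sketch.
    """
    if not beams:
        return []
    # Rank candidates from most to least likely.
    ranked = sorted(beams, key=lambda b: b[1], reverse=True)
    best_score = ranked[0][1]
    # Drop candidates that trail the best score by more than `margin`.
    survivors = [b for b in ranked if best_score - b[1] <= margin]
    # Cap how many candidates go on to the more expensive verification step.
    return survivors[:max_keep]

# Toy candidates with made-up probabilities.
candidates = [
    (("the", "cat"), math.log(0.40)),
    (("the", "dog"), math.log(0.30)),
    (("a", "cat"), math.log(0.05)),
    (("zebra", "of"), math.log(0.001)),
]
kept = prune_beams(candidates, margin=1.0, max_keep=3)
```

With these toy scores, only the two most likely candidates survive, so a later verification pass would run on two beams instead of four. How HVT actually orders and verifies beams is specified in the chapter, not here.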

Title
Hierarchical Verification of Speculative Beams for Accelerating LLM Inference
Authors
Jaydip Sen
Harshitha Puvvala
Subhasis Dasgupta
Copyright Year
2026
DOI
https://doi.org/10.1007/978-3-032-07735-6_19