1 Introduction
2 Background and Related Work
3 Access Interval Prediction
3.1 System Overview
access
and adr
. access
is set whenever an access (read or write) on the address adr
takes place. In addition, the AIP unit intercepts the program counter pc
of the core.pred
whenever it determines that the HPM will request the data memory in the next cycle. When this happens, the arbiter pre-allocates the memory to the HPM, and to the LPM otherwise.3.2 Theory and Notation
pred
is set at cycle \(k = K_i + \hat{I}_i - 1\).4 Prediction by Partial Matching
i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
\(a_i\) | \(a^\textrm{a}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) | \(a^\textrm{b}\) | \(a^\textrm{a}\) |
\(I_i\) | 1 | 1 | 2 | 1 | 2 | 3 | 4 | 1 | 1 | 1 | 2 | 3 | 4 |
5 TAGE Access Interval Predictor
5.1 Design
tag
and the useful counter u
of bit-widths \(w_\mathrm {\iota }\), \(w_\textrm{t}\) and \(w_\textrm{u}\), respectively, are read. The subprediction \(\hat{I}_i(t)\) of width \(w_\textrm{I}\) is computed from \(\iota _i(t)\) by means of the subpredictor \(\Pi \). In contrast to the TAGE-BP, we consider a variety of subpredictors (c.f. section 5.3).u
is used as in [5] to keep track of entries that can be overwritten because they provide the same prediction as another entry with a shorter context. This is akin to the removal or deactivation of redundant nodes in the PPM suffix tree detailed in Sect. 4.pred
when the next cycle is \(\hat{I}_i\) cycles after the last access.5.2 Interval Representation
5.3 Subprediction and Update Method
6 Performance Evaluation
Program | Variant | \(n_\textrm{access}\) | \(K_\textrm{end}\) |
---|---|---|---|
basicmath | “tiny” | 9 565 757 | 120 790 227 |
bitcount | “small” | 1 846 690 | 38 840 684 |
dijkstra | “tiny” | 3 971 887 | 20 154 946 |
fft | “tiny” | 4 997 606 | 29 598 468 |
gsm | “small” | 5 750 522 | 28 265 373 |
jpeg | “small” | 8 631 635 | 40 778 800 |
qsort | “small” | 5 082 398 | 23 157 456 |
sha | “small” | 2 109 141 | 17 489 771 |
stringsearch | “large” | 1 654 440 | 7 433 104 |
susan | “small” | 4 951 145 | 35 398 006 |
access
, adr
, and pc
during execution. MiBench offers a “small” and “large” test case for each program. We usually opted for the “small” variant. But in some cases we chose the “large” or our own reduced “small” variant dubbed “tiny” instead to keep the number of accesses of each trace in roughly the same order of magnitude. Table 2 documents the selected programs, variants, access counts, and program lengths. Technical limitations prevented the use of the entire MiBench suite.access
, adr
, and pc
according to the traces. These signals are fed into the predictor-under-evaluation. A tracker compares access
and pred
and records \(n_\textrm{hit}\) and \(n_\textrm{fp}\).6.1 Prediction by Partial Matching
6.2 Tagged Geometric History Length
Parameter | Value | Range |
---|---|---|
\(n_\textrm{T}\) | 5 | 1–15 |
L(1) | 1 | 1–50 |
\(\alpha \) | 2.88 | 0.5–50 |
\(w_\textrm{t}\) | 9 | 4–20 |
\(w_\textrm{u}\) | 2 | 1–5 |
\(n_\textrm{I}\) | 1024 | 2–4096 |
\(w_\textrm{I}\) | 6 | 2–8 |
\(\Pi \) | EXP2 | ML,AVG,KEEP,EXP1,EXP2 |
\(a_i\) | \(P_i\) | \(A_i,P_i,(A_i, P_i)\) |
Predictor | \(P_\textrm{hit}\) [%] | \(U_\textrm{I}\) [%] | M [kB] |
---|---|---|---|
CBE | 34.7 | 91.3 | 0 |
ALB (1024 entries) | 51.4 | 92.9 | 0.63 |
ALB (16384 entries) | 56.4 | 93.6 | 10.00 |
PPM | 99.0 | 99.9 | Not applicable |
TAGE-AIP | 97.0 | 99.6 | 9.75 |