1 Extended version
-
The evaluation now covers both fixed priority and EDF scheduling.
-
We examined how the schedulability of a group of tasks sharing a partition depends upon partition size.
-
We present an alternative cache partitioning algorithm which both optimises schedulability and minimises processor utilization. We examine the improvement in processor utilization obtained using this algorithm as compared to the original cache partitioning algorithm, and the tradeoff in terms of increased analysis time.
2 Introduction
3 System model, terminology and notation
3.1 Static timing analysis
3.2 Pre-emption costs
4 Schedulability tests
4.1 Fixed priority pre-emptive scheduling
4.1.1 Pre-emption cost aware schedulability test
4.1.2 Pre-emption cost computation
4.2 EDF scheduling
4.2.1 Pre-emption cost aware schedulability test
4.2.2 Pre-emption cost computation
4.3 Optimal task layout
5 Review of cache partitioning for real-time systems
6 Partition-size sensitivity
6.1 Partition-size sensitivity (task level)
6.1.1 Monotonicity
6.2 Partition-size sensitivity (task group level)
7 Optimal cache partitioning
7.1 Schedulability
7.2 Schedulability and minimal utilization
8 Case study
-
The default taskset size was 10.
-
Task utilizations were generated using the UUnifast (Bini and Buttazzo 2005) algorithm.
-
Task periods were set based on the utilization and execution times: \(C_i = U_i \cdot T_i\).
-
Task deadlines were implicit,5 i.e., \(D_i = T_i\).
-
For fixed priority scheduling, priorities were assigned in Rate Monotonic priority order.
Description | UCBs | ECBs | WCET\(^1\)
| WCET\(^2\)
| Period | |
---|---|---|---|---|---|---|
I4 | Interrupt-modem | 2 | 10 | 303 \(\upmu \)s | 520 \(\upmu \)s | – |
I5 | Interrupt-spi-1 | 1 | 10 | 251 \(\upmu \)s | 447 \(\upmu \)s | – |
I6 | Interrupt-spi-2 | 1 | 4 | 151 \(\upmu \)s | 228 \(\upmu \)s | – |
I7 | Interrupt-gps | 3 | 26 | 283 \(\upmu \)s | 493 \(\upmu \)s | – |
T5 | Altitude-control | 20 | 66 | 1478 \(\upmu \)s | 1660 \(\upmu \)s | 250 ms |
T6 | Climb-control | 1 | 210 | 5429 \(\upmu \)s | 6241 \(\upmu \)s | 250 ms |
T7 | Link-fbw-send | 1 | 10 | 233 \(\upmu \)s | 471 \(\upmu \)s | 250 ms |
T8 | Navigation | 1 | 256 | 44, 42 ms | 54, 35 ms | 50 ms |
T9 | Radio-control | 0 | 256 | 15, 6 ms | 21, 1 ms | 50 ms |
T10 | Receive-gps-data | 22 | 194 | 5987 \(\upmu \)s | 6659 \(\upmu \)s | 25 ms |
T11 | Reporting | 2 | 256 | 12, 22 ms | 5 ms | 100 ms |
T12 | Stabilization | 11 | 194 | 5681 \(\upmu \)s | 6654 \(\upmu \)s | 50 ms |
8.1 PapaBench
Description | UCBs | ECBs | WCET\(^1\)
| WCET\(^2\)
| Period | |
---|---|---|---|---|---|---|
I4 | Interrupt-modem | 3 | 10 | 335 \(\upmu \)s | 790 \(\upmu \)s | – |
I5 | Interrupt-spi-1 | 2 | 10 | 287 \(\upmu \)s | 644 \(\upmu \)s | – |
I6 | Interrupt-spi-2 | 1 | 4 | 135 \(\upmu \)s | 338 \(\upmu \)s | – |
I7 | Interrupt-gps | 3 | 26 | 278 \(\upmu \)s | 712 \(\upmu \)s | – |
T5 | Altitude-control | 2 | 66 | 654 \(\upmu \)s | 3860 \(\upmu \)s | 250 ms |
T6 | Climb-control | 5 | 210 | 2375 \(\upmu \)s | 14, 21 \(\upmu \)s | 250 ms |
T7 | Link-fbw-send | 2 | 10 | 298 \(\upmu \)s | 634 \(\upmu \)s | 250 ms |
T8 | Navigation | 10 | 256 | 23, 38 ms | 138 ms | 50 ms |
T9 | Radio-control | 14 | 256 | 10, 2 ms | 51 ms | 50 ms |
T10 | Receive-gps-data | 4 | 194 | 3058 \(\upmu \)s | 20, 5 ms s | 25 ms |
T11 | Reporting | 6 | 242 | 12, 8 ms | 32 ms | 100 ms |
T12 | Stabilization | 6 | 194 | 2711 \(\upmu \)s | 16, 1 ms s | 50 ms |
Description | UCBs | ECBs | WCET\(^1\)
| WCET\(^2\)
| |
---|---|---|---|---|---|
M | Adpcm | 24 | 226 | 5541 s | 6521 s |
M | Compress | 25 | 114 | 3664 s | 8426 s |
M | Edn | 56 | 98 | 244, 8 ms | 458, 2 ms |
M | Fir | 28 | 50 | 21, 52 ms | 497 ms |
M | Jfdctinit | 40 | 162 | 13, 89 ms | 32, 98 ms |
M | Ns | 17 | 26 | 73, 38 ms | 168 ms |
M | Nsichneu | 53 | 256 | 77, 96 ms | 163 ms |
M | Statemate | 3 | 256 | 9757 s | 20, 07 s |
S | Cruise control system | 25 | 107 | 1959 s | 3548 s |
S | Flight control system | 70 | 256 | 2138 s | 4083 s |
S | Navigation system | 45 | 82 | 1409 s | 3712 s |
S | Stopwatch | 58 | 130 | 3786 s | 5533 s |
S | Elevator simulation | 40 | 114 | 1586 s | 2917 s |
S | Robotics systems | 68 | 256 | 4311 s | 6377 s |
8.2 Mälardalen and SCADE benchmarks
Description | UCBs | ECBs | WCET\(^1\)
| WCET\(^2\)
| |
---|---|---|---|---|---|
M | Adpcm | 7 | 242 | 5856 s | 43, 17 s |
M | Compress | 6 | 242 | 9740 s | 25, 26 s |
M | Edn | 5 | 98 | 518, 9 ms | 1422 s |
M | Fir | 5 | 50 | 42, 65 ms | 121 ms |
M | Jfdctinit | 8 | 242 | 23, 2 ms | 73, 63 ms |
M | Ns | 3 | 26 | 133, 7 ms | 466, 9 ms |
M | Nsichneu | 8 | 242 | 66, 74 ms | 178, 3 ms |
M | Statemate | 30 | 242 | 8143 s | 22, 45 s |
S | Cruise control system | 15 | 98 | 1, 77 s | 6207 s |
S | Flight control system | 12 | 242 | 3, 24 s | 11, 02 s |
S | Navigation system | 3 | 82 | 2, 96 s | 7566 s |
S | Stopwatch | 9 | 130 | 4417 s | 25, 03s |
S | Elevator simulation | 4 | 114 | 1863 s | 5432 s |
S | Robotics systems | 5 | 242 | 3427 s | 22, 45 s |
8.3 Utilization versus analysis time
9 Synthetic tasksets
-
The default taskset size was 10.
-
Task utilizations were generated using the UUnifast (Bini and Buttazzo 2005) algorithm.
-
Task periods were generated according to a log-uniform distribution with a factor of 1000 difference between the minimum and maximum possible task period and a minimum period of 5 ms. This represents a spread of task periods from 5 ms to 5 s, thus providing reasonable correspondence with real systems.
-
Task execution times were set based on the utilization and period selected: \(C_i = U_i \cdot T_i\).
-
Task deadlines were implicit
-
For fixed priority scheduling, priorities were assigned in Deadline Monotonic priority order.
-
The number of cache-sets (\(CS = 256\)).
-
The block-reload time (\(BRT = 8\,\upmu \)s)
-
The cache usage of each task, and thus, the number of ECBs, were generated using the UUnifast (Bini and Buttazzo 2005) algorithm (for a total cache utilization \(CU = \sum _i \vert {\hbox {ECB}}\vert /CS = 4\)). UUnifast may produce values larger than 1 which means a task fills the whole cache.
-
For each task, the UCBs were generated according to a uniform distribution ranging from 0 to the number of ECBs times a reuse factor: \([0, RF \cdot \vert ECB \vert ]\). The factor RF was used to adapt the assumed reuse of cache-sets to account for different types of real-time applications, for example, from data processing applications with little reuse up to control-based applications with heavy reuse (default \(RF =0.3\)).
9.1 Pre-emption costs
9.2 Cache utilization
9.3 Number of tasks
9.4 Cache size
9.5 Precision of the simplified execution-time model
10 Conclusions and future work
-
Sensitivity analysis of WCET with respect to partition size, showing how the precise WCET bound as a function of the size of the partition can be effectively upper and lower bounded by monotonic functions.
-
Sensitivity analysis of the schedulability of groups of tasks with respect to the size of a shared partition, showing that the precise schedulability of the task group is sustainable with respect to the size of the partition whereas the schedulability tests are not sustainable.
-
The introduction of optimal algorithm for cache partitioning which finds a schedulable partioning whenever such a partitioning exists. This algorithm makes use of the monotonic WCET functions.
-
The introduction of an optimal algorithm for cache partitioning which finds a schedulable partitioning with the minimum processor utilisation whenever a schedulable partitioning exists. This algorithm also makes use of the monotonic WCET functions.
-
A thorough evaluation of the relative performance of optimal per task cache partitioning versus no partitioning for static and dynamic priority assignment.
-
An evaluation of the trade-off of mininal processor utilization against increased analysis time.