Introduction
Motivation
Applications | WordCount | Grep | Inverted index |
---|---|---|---|
Average CPU usage | 68% | 45% | 82% |
Variance of CPU usage | 42 | 17.5 | 30.5 |
Coefficient of variation of CPU usage | 0.51 | 0.65 | 0.64 |
Average processing time | 14.9 | 6.97 | 2283.5 |
Variance of processing time | 31.3 | 4.12 | 6,881,847.7 |
Coefficient of variation of processing time | 2.1 | 0.6 | 3013.8 |
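As a quick arithmetic check on the processing-time rows of the table above, the sketch below recomputes two candidate dispersion ratios from the reported mean and variance: the textbook coefficient of variation (standard deviation divided by mean) and the variance-to-mean ratio. The names `reported`, `std_over_mean`, and `variance_over_mean` are illustrative and do not come from the source. With these numbers, the reported processing-time values agree with variance divided by mean to within rounding. This is an observation about the figures, not a statement of how the authors computed them.

```python
import math

# Mean, variance, and reported dispersion of processing time, copied from the table.
reported = {
    "WordCount": (14.9, 31.3, 2.1),
    "Grep": (6.97, 4.12, 0.6),
    "Inverted index": (2283.5, 6881847.7, 3013.8),
}


def std_over_mean(mean, variance):
    """Textbook coefficient of variation: standard deviation / mean."""
    return math.sqrt(variance) / mean


def variance_over_mean(mean, variance):
    """Variance-to-mean ratio (also called the index of dispersion)."""
    return variance / mean


for app, (mean, variance, table_value) in reported.items():
    print(
        f"{app}: std/mean={std_over_mean(mean, variance):.2f}, "
        f"var/mean={variance_over_mean(mean, variance):.2f}, "
        f"table={table_value}"
    )
```

Either way, the inverted-index workload shows far larger dispersion in processing time than WordCount or Grep, which is the point the table makes.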
Approach
Contributions
-
Why use DVFS in big data processing?
Organization
Related works
Using DVFS for energy reduction
Using other techniques to reduce energy consumption
Methods
Notation | Description |
---|---|
D | Deadline |
EC | Energy consumption |
FT | Finish time |
UF | Utilization factor |
TS | Time slot |
PTi | Processing time of the i-th block |
TPT | Total processing time |
SFBi | Suitable frequency for processing Bi |
Ui | Utilization of server i |
Pi | Processing power of server i |
NDP | Number of data portions |
Bi | Data block i |
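To make the notation concrete, here is a minimal sketch of how D, NDP, TS, PTi, and SFBi could relate. It is not the authors' algorithm. It rests on two assumptions that the source does not state: the deadline is split evenly into one time slot per data portion (TS = D / NDP), and processing time scales inversely with CPU frequency. The function names, the frequency list, and the block times are all hypothetical.

```python
from typing import Optional, Sequence


def time_slot(deadline: float, num_portions: int) -> float:
    """Assumed: the deadline D is split evenly into NDP slots (TS = D / NDP)."""
    return deadline / num_portions


def suitable_frequency(
    pt_at_fmax: float, ts: float, frequencies: Sequence[float]
) -> Optional[float]:
    """Return the lowest frequency at which the block fits in its time slot.

    Assumed model: PT_i(f) = PT_i(f_max) * f_max / f.
    Returns None when the block misses the slot even at f_max.
    """
    f_max = max(frequencies)
    for f in sorted(frequencies):
        if pt_at_fmax * f_max / f <= ts:
            return f
    return None


# Hypothetical values for illustration only.
freqs_ghz = [1.2, 1.6, 2.0, 2.4]
ts = time_slot(deadline=1500.0, num_portions=100)  # 15 seconds per block
block_times = [6.0, 9.5, 14.0, 20.0]  # PT_i measured at f_max, in seconds
sfb = [suitable_frequency(pt, ts, freqs_ghz) for pt in block_times]
print(sfb)
```

Under this model, blocks that finish well inside their slot can run at a lower frequency, while a block that cannot meet the slot even at the highest frequency is flagged with `None`.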
Problem statement
Problem formulation
Our algorithm
Implementation
Results and discussion
Applications
-
WordCount: This application counts the number of words in a file.
-
Grep: This application searches a file for a pattern and counts its occurrences.
-
Inverted Index: This application builds an index data structure that maps content to its locations in a database file.
-
We also consider AVG (average) for TPC-H datasets and SUM for Amazon datasets.
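The three text-processing benchmarks listed above can be illustrated with a small single-machine sketch. These are simplified stand-ins, not the implementations evaluated in the paper. The function names and sample documents are hypothetical, and two behaviors are assumptions: `word_count` returns per-word counts (the total is their sum), and `grep_count` counts matching lines.

```python
import re
from collections import Counter, defaultdict


def word_count(text):
    """Count occurrences of each word; the total is sum(counts.values())."""
    return Counter(text.split())


def grep_count(text, pattern):
    """Count the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return sum(1 for line in text.splitlines() if regex.search(line))


def inverted_index(documents):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample input.
docs = {
    "a.txt": "big data needs energy",
    "b.txt": "energy aware big clusters",
}
index = inverted_index(docs)
print(index["energy"])
```

The cost of all three depends on the content of each block (word frequencies, match density, vocabulary size), which is why processing time can vary from block to block.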
Comparison
Sensitivity analysis
Sensitivity to the data variety
Modeling data variety
Sensitivity to the Deadline
Benchmarks | Tight deadline (s) | Firm deadline (s) |
---|---|---|
WordCount | 1350 | 1500 |
Grep | 670 | 730 |
Inverted index | 27,000 | 30,000 |
TPC | 1250 | 1400 |
Amazon | 1150 | 1350 |
-
Discussion of the overhead. Our approach has very low overhead. Sampling incurs less than 1% overhead to achieve a 5% error margin at a 95% confidence level. We describe this in detail in [9].
-
Discussion of the use cases.
-
This approach is applicable to cloud service providers and to any cloud user who can manage the infrastructure.
-
Because variety is one of the defining features of big data, this approach can be used for processing big data applications.
-
This approach reduces energy consumption and therefore energy cost, so cloud providers clearly benefit from it.
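For a sense of scale behind the 5% error margin and 95% confidence level in the overhead discussion, the sketch below applies a standard textbook sample-size formula for estimating a proportion (Cochran's formula), with an optional finite-population correction. This is an assumption made for illustration and is not necessarily the sampling method described in [9]. The function names and the population of 10,000 blocks are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(margin, confidence, p=0.5):
    """Cochran's formula: n0 = z^2 * p * (1 - p) / margin^2, rounded up.

    p = 0.5 is the most conservative choice (it maximizes the variance).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


def finite_population_correction(n0, population):
    """Shrink n0 when sampling from a finite population of known size."""
    return math.ceil(n0 / (1 + (n0 - 1) / population))


n0 = sample_size(margin=0.05, confidence=0.95)
n = finite_population_correction(n0, population=10_000)
print(n0, n)
```

Under these assumptions the required sample size stays in the hundreds regardless of how large the dataset grows, which is consistent with sampling being a small fraction of the total work.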