1 Introduction
2 Background
2.1 The experiment setup and creation of test sequences
2.2 Data collection
2.2.1 Design of the questionnaire
2.2.2 The test subjects
- the survey is anonymous;
- participation in the survey is not mandatory;
- those who wish to participate will be asked to:
  - take one envelope and the attached video;
  - keep the envelope sealed and open it only after watching the video;
  - watch the video only once, under the conditions in which they would normally watch television;
  - open the envelope immediately after watching the video and read the instructions;
  - complete the questionnaire;
  - pass the second questionnaire to one person in their company (if applicable and if that person also watched the video with them);
  - return the completed questionnaire(s);
- the video content is a 1-h documentary film about the solar system;
- the questionnaire takes approximately 10 min to complete;
- the illustration printed on the envelopes reminds them of the steps of participation in the survey;
- the survey lasts 2 weeks.
2.3 Discussion about the obtained results of the subjective evaluation
2.4 Critical overview of the methodology
- the primary objective was to collect the rating data and use it to develop a QoE assessment model able to produce more lifelike QoE assessments; thus, the data had to be collected from uncontrolled experiments (in a home environment);
- the test sequences had to be distributed to the subjects for rating in a manner that bypasses downloading or streaming of the video (due to its size);
- the accuracy of the model depends on the fuzzification and defuzzification processes, i.e., indirectly on the size of the data set used for the model development; that is, a sufficient number of test sequences with different properties had to be generated and evaluated by a sufficient number of test subjects;
- it was infeasible to interview such a large number of test subjects (602); thus, the subjects’ opinions were collected using hard copy questionnaires.
3 Related work
4 Test results that are used to develop the inference system of the model
TS no. | TS properties | MOS | Linguistic meaning |
---|---|---|---|
1 | (0.05%; 1; 1 s; 1 s) | 8.41 | Excellent quality |
2 | (0.1%; 1; 1 s; 1 s) | 8.08 | Excellent quality |
3 | (0.5%; 1; 1 s; 1 s) | 8.10 | Excellent quality |
4 | (1%; 1; 1 s; 1 s) | 8.81 | Excellent quality |
5 | (1.5%; 1; 1 s; 1 s) | 8.96 | Excellent quality |
6 | (2%; 1; 1 s; 1 s) | 8.40 | Excellent quality |
7 | (0.05%; 1; 4 s; 4 s) | 8.42 | Excellent quality |
8 | (0.1%; 1; 4 s; 4 s) | 7.89 | Good quality |
9 | (0.5%; 1; 4 s; 4 s) | 8.10 | Excellent quality |
10 | (1%; 1; 4 s; 4 s) | 8.75 | Excellent quality |
11 | (1.5%; 1; 4 s; 4 s) | 8.26 | Excellent quality |
12 | (2%; 1; 4 s; 4 s) | 8.17 | Excellent quality |
13 | (0.05%; 1; 7 s; 7 s) | 8.32 | Excellent quality |
14 | (0.1%; 1; 7 s; 7 s) | 7.90 | Good quality |
15 | (0.5%; 1; 7 s; 7 s) | 7.46 | Good quality |
16 | (1%; 1; 7 s; 7 s) | 8.63 | Excellent quality |
17 | (1.5%; 1; 7 s; 7 s) | 7.85 | Good quality |
18 | (2%; 1; 7 s; 7 s) | 8.19 | Excellent quality |
19 | (0.05%; 4; 1 s; 4 s) | 8.41 | Excellent quality |
20 | (0.1%; 4; 1 s; 4 s) | 7.84 | Good quality |
21 | (0.5%; 4; 1 s; 4 s) | 8.59 | Excellent quality |
22 | (1%; 4; 1 s; 4 s) | 8.23 | Excellent quality |
23 | (1.5%; 4; 1 s; 4 s) | 7.66 | Good quality |
24 | (2%; 4; 1 s; 4 s) | 7.67 | Good quality |
25 | (0.05%; 4; 4 s; 16 s) | 8.20 | Excellent quality |
26 | (0.1%; 4; 4 s; 16 s) | 7.73 | Good quality |
27 | (0.5%; 4; 4 s; 16 s) | 6.83 | Good quality |
28 | (1%; 4; 4 s; 16 s) | 7.56 | Good quality |
29 | (1.5%; 4; 4 s; 16 s) | 7.06 | Good quality |
30 | (2%; 4; 4 s; 16 s) | 6.70 | Good quality |
31 | (0.05%; 4; 7 s; 28 s) | 7.74 | Good quality |
32 | (0.1%; 4; 7 s; 28 s) | 7.60 | Good quality |
33 | (0.5%; 4; 7 s; 28 s) | 7.60 | Good quality |
34 | (1%; 4; 7 s; 28 s) | 6.54 | Good quality |
35 | (1.5%; 4; 7 s; 28 s) | 6.29 | Good quality |
36 | (2%; 4; 7 s; 28 s) | 5.88 | Fair quality |
37 | (0.05%; 7; 1 s; 7 s) | 8.07 | Excellent quality |
38 | (0.1%; 7; 1 s; 7 s) | 7.89 | Good quality |
39 | (0.5%; 7; 1 s; 7 s) | 7.03 | Good quality |
40 | (1%; 7; 1 s; 7 s) | 7.80 | Good quality |
41 | (1.5%; 7; 1 s; 7 s) | 6.84 | Good quality |
42 | (2%; 7; 1 s; 7 s) | 6.02 | Good quality |
43 | (0.05%; 7; 4 s; 28 s) | 8.20 | Excellent quality |
44 | (0.1%; 7; 4 s; 28 s) | 7.13 | Good quality |
45 | (0.5%; 7; 4 s; 28 s) | 6.84 | Good quality |
46 | (1%; 7; 4 s; 28 s) | 6.33 | Good quality |
47 | (1.5%; 7; 4 s; 28 s) | 5.88 | Fair quality |
48 | (2%; 7; 4 s; 28 s) | 5.63 | Fair quality |
49 | (0.05%; 7; 7 s; 49 s) | 7.63 | Good quality |
50 | (0.1%; 7; 7 s; 49 s) | 6.75 | Good quality |
51 | (0.5%; 7; 7 s; 49 s) | 6.74 | Good quality |
52 | (1%; 7; 7 s; 49 s) | 6.23 | Good quality |
53 | (1.5%; 7; 7 s; 49 s) | 5.04 | Fair quality |
54 | (2%; 7; 7 s; 49 s) | 4.84 | Fair quality |
55 | (0.05%; 10; 1 s; 10 s) | 7.89 | Good quality |
56 | (0.1%; 10; 1 s; 10 s) | 7.87 | Good quality |
57 | (0.5%; 10; 1 s; 10 s) | 7.67 | Good quality |
58 | (1%; 10; 1 s; 10 s) | 6.89 | Good quality |
59 | (1.5%; 10; 1 s; 10 s) | 6.01 | Good quality |
60 | (2%; 10; 1 s; 10 s) | 6.03 | Good quality |
61 | (0.05%; 10; 4 s; 40 s) | 7.87 | Good quality |
62 | (0.1%; 10; 4 s; 40 s) | 7.18 | Good quality |
63 | (0.5%; 10; 4 s; 40 s) | 6.62 | Good quality |
64 | (1%; 10; 4 s; 40 s) | 6.41 | Good quality |
65 | (1.5%; 10; 4 s; 40 s) | 5.17 | Fair quality |
66 | (2%; 10; 4 s; 40 s) | 5.35 | Fair quality |
67 | (0.05%; 10; 7 s; 70 s) | 7.59 | Good quality |
68 | (0.1%; 10; 7 s; 70 s) | 7.10 | Good quality |
69 | (0.5%; 10; 7 s; 70 s) | 5.93 | Fair quality |
70 | (1%; 10; 7 s; 70 s) | 4.68 | Fair quality |
71 | (1.5%; 10; 7 s; 70 s) | 4.57 | Fair quality |
72 | (2%; 10; 7 s; 70 s) | 4.16 | Fair quality |
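
The structure of the table above can be sketched in a few lines of Python. The sketch rests on two readings of the table that are assumptions rather than statements from the text: the fourth entry of each TS property tuple equals the number of PLOs multiplied by the duration of one PLO (e.g., 7 × 7 s = 49 s), and the linguistic labels change at MOS values of 8 and 6, which is consistent with every row shown. The cut points at 4 and 2 are extrapolated by symmetry, since no sequence in the table scored that low. The function names are hypothetical.

```python
def total_plo_duration(num_plos, plo_duration_s):
    """Total duration of all PLOs in seconds (assumed: count x single duration)."""
    return num_plos * plo_duration_s


def linguistic_label(mos):
    """Map a MOS on the 0-10 scale to a linguistic label.

    Cut points 8 and 6 are inferred from the table rows; 4 and 2 are
    extrapolated assumptions, not values taken from the study.
    """
    if mos >= 8.0:
        return "Excellent quality"
    if mos >= 6.0:
        return "Good quality"
    if mos >= 4.0:
        return "Fair quality"
    if mos >= 2.0:
        return "Poor quality"
    return "Bad quality"


if __name__ == "__main__":
    # TS no. 54: (2%; 7; 7 s; 49 s) with MOS 4.84
    print(total_plo_duration(7, 7), linguistic_label(4.84))
```

Hard cut points are only a convenience for reading the table; the model itself replaces them with overlapping fuzzy clusters.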
5 Fuzzification of scalar values
- The test was conducted with a large number of test subjects, using a large number of test sequences of varying quality; thus, the natural ambiguity of human opinions surfaced.
- The subjects watched the sequences in real-life and uncontrolled test conditions.
- The design of the questionnaire and the use of 11-point numerical scales allowed collecting continuous data.
- The combined impact of all three objective parameters on test subjects’ perception.
5.1 Defining the clusters and fuzzy membership functions for the input parameters
Figure | First membership function (dashed line): \(\bar{x}\) | First membership function (dashed line): \(\sigma\) | Second membership function (solid line): \(\bar{x}\) | Second membership function (solid line): \(\sigma\) | Third membership function (dashed–dotted line): \(\bar{x}\) | Third membership function (dashed–dotted line): \(\sigma\) |
---|---|---|---|---|---|---|
Figure 5b | 0.4545 | 0.6574 | 0.8758 | 0.5398 | 1.3937 | 0.4887 |
Figure 6b | 1.6513 | 2.4 | 6.5083 | 1.748 | 9.3728 | 2.061 |
Figure 7b | 6.4254 | 13.73 | 33.0713 | 10.92 | 67.1134 | 16.33 |
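
A minimal sketch of how the tabulated parameters could be used for fuzzification follows. It assumes that each membership function is Gaussian with mean \(\bar{x}\) and standard deviation \(\sigma\), and that Figure 5b refers to PLR (in percent); both are assumptions suggested by the parameter names and value ranges, not statements from the table. The cluster names are borrowed from the rule table of the model, and `gaussian_mf` and `fuzzify` are hypothetical helper names.

```python
import math


def gaussian_mf(x, mean, sigma):
    """Membership degree of x in a Gaussian fuzzy set (assumed shape)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


# (mean, sigma) pairs from the Figure 5b row; mapping to PLR is an assumption.
PLR_CLUSTERS = {
    "Imperceptible quality distortion": (0.4545, 0.6574),
    "Slightly annoying quality distortion": (0.8758, 0.5398),
    "Very annoying quality distortion": (1.3937, 0.4887),
}


def fuzzify(x, clusters):
    """Return the membership degree of x in every cluster."""
    return {name: gaussian_mf(x, m, s) for name, (m, s) in clusters.items()}


if __name__ == "__main__":
    for name, degree in fuzzify(2.0, PLR_CLUSTERS).items():
        print(f"{name}: {degree:.3f}")
```

Because the clusters overlap, a single scalar input usually belongs to several clusters at once, each with its own degree; this is what lets the rule base blend neighbouring conclusions instead of switching abruptly.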
5.2 Development of fuzzy membership functions for the output parameter
Membership function of the cluster | \(\bar{x}\) | \(\sigma\) |
---|---|---|
Bad quality | 1.42 | 0.648 |
Poor quality 1 | 2.5 | 0.5308 |
Poor quality 2 | 3.5 | 0.5308 |
Fair quality 1 | 4.5 | 0.5308 |
Fair quality 2 | 5.5 | 0.5308 |
Good quality 1 | 6.5 | 0.5308 |
Good quality 2 | 7.5 | 0.5308 |
Excellent quality | 8.44 | 0.648 |
6 Defuzzification to the output of the model
6.1 A set of fuzzy rules of the model
 | First input parameter: PLR | | Second input parameter: number of PLOs | | Third input parameter: total duration of PLOs | | Output: QoE |
---|---|---|---|---|---|---|---|
IF | Imperceptible quality distortion | AND | Negligible frequency | AND | Negligible duration | THEN | Excellent quality |
IF | Imperceptible quality distortion | AND | Slightly annoying frequency | AND | Negligible duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Very annoying frequency | AND | Negligible duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Negligible frequency | AND | Slightly annoying duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Slightly annoying frequency | AND | Slightly annoying duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Very annoying frequency | AND | Slightly annoying duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Slightly annoying frequency | AND | Very annoying duration | THEN | Good quality 2 |
IF | Imperceptible quality distortion | AND | Very annoying frequency | AND | Very annoying duration | THEN | Good quality 2 |
IF | Slightly annoying quality distortion | AND | Negligible frequency | AND | Negligible duration | THEN | Excellent quality |
IF | Slightly annoying quality distortion | AND | Slightly annoying frequency | AND | Negligible duration | THEN | Excellent quality |
IF | Slightly annoying quality distortion | AND | Very annoying frequency | AND | Negligible duration | THEN | Good quality 2 |
IF | Slightly annoying quality distortion | AND | Negligible frequency | AND | Slightly annoying duration | THEN | Good quality 1 |
IF | Slightly annoying quality distortion | AND | Slightly annoying frequency | AND | Slightly annoying duration | THEN | Good quality 1 |
IF | Slightly annoying quality distortion | AND | Very annoying frequency | AND | Slightly annoying duration | THEN | Good quality 1 |
IF | Slightly annoying quality distortion | AND | Slightly annoying frequency | AND | Very annoying duration | THEN | Good quality 1 |
IF | Slightly annoying quality distortion | AND | Very annoying frequency | AND | Very annoying duration | THEN | Fair quality 1 |
IF | Very annoying quality distortion | AND | Negligible frequency | AND | Negligible duration | THEN | Good quality 2 |
IF | Very annoying quality distortion | AND | Slightly annoying frequency | AND | Negligible duration | THEN | Good quality 1 |
IF | Very annoying quality distortion | AND | Very annoying frequency | AND | Negligible duration | THEN | Fair quality 2 |
IF | Very annoying quality distortion | AND | Negligible frequency | AND | Slightly annoying duration | THEN | Fair quality 2 |
IF | Very annoying quality distortion | AND | Slightly annoying frequency | AND | Slightly annoying duration | THEN | Fair quality 2 |
IF | Very annoying quality distortion | AND | Very annoying frequency | AND | Slightly annoying duration | THEN | Fair quality 1 |
IF | Very annoying quality distortion | AND | Slightly annoying frequency | AND | Very annoying duration | THEN | Poor quality 2 |
IF | Very annoying quality distortion | AND | Very annoying frequency | AND | Very annoying duration | THEN | Poor quality 2 |
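
The rule base above can be exercised with a short sketch. It is not the authors' implementation: it assumes Gaussian membership functions built from the tabulated \(\bar{x}\) and \(\sigma\) values, Mamdani-style inference (minimum for AND, maximum to combine rules with the same conclusion) and centroid defuzzification over a 0 to 10 output scale. The input units (PLR in percent, duration in seconds) and all names in the code are likewise assumptions.

```python
import math


def gauss(x, mean, sigma):
    """Gaussian membership degree of x (assumed MF shape)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


# (mean, sigma) per cluster, copied from the membership function tables.
# I = imperceptible, N = negligible, S = slightly annoying, V = very annoying.
PLR = {"I": (0.4545, 0.6574), "S": (0.8758, 0.5398), "V": (1.3937, 0.4887)}
FREQ = {"N": (1.6513, 2.4), "S": (6.5083, 1.748), "V": (9.3728, 2.061)}
DUR = {"N": (6.4254, 13.73), "S": (33.0713, 10.92), "V": (67.1134, 16.33)}
QOE = {
    "Bad": (1.42, 0.648), "Poor 1": (2.5, 0.5308), "Poor 2": (3.5, 0.5308),
    "Fair 1": (4.5, 0.5308), "Fair 2": (5.5, 0.5308), "Good 1": (6.5, 0.5308),
    "Good 2": (7.5, 0.5308), "Excellent": (8.44, 0.648),
}

# The 24 rules of the table: (PLR, frequency, duration) -> QoE cluster.
RULES = {
    ("I", "N", "N"): "Excellent", ("I", "S", "N"): "Good 2", ("I", "V", "N"): "Good 2",
    ("I", "N", "S"): "Good 2", ("I", "S", "S"): "Good 2", ("I", "V", "S"): "Good 2",
    ("I", "S", "V"): "Good 2", ("I", "V", "V"): "Good 2",
    ("S", "N", "N"): "Excellent", ("S", "S", "N"): "Excellent", ("S", "V", "N"): "Good 2",
    ("S", "N", "S"): "Good 1", ("S", "S", "S"): "Good 1", ("S", "V", "S"): "Good 1",
    ("S", "S", "V"): "Good 1", ("S", "V", "V"): "Fair 1",
    ("V", "N", "N"): "Good 2", ("V", "S", "N"): "Good 1", ("V", "V", "N"): "Fair 2",
    ("V", "N", "S"): "Fair 2", ("V", "S", "S"): "Fair 2", ("V", "V", "S"): "Fair 1",
    ("V", "S", "V"): "Poor 2", ("V", "V", "V"): "Poor 2",
}


def estimate_qoe(plr_percent, num_plos, total_duration_s, step=0.01):
    """Crisp QoE estimate via min/max inference and centroid defuzzification."""
    mu_p = {k: gauss(plr_percent, *v) for k, v in PLR.items()}
    mu_f = {k: gauss(num_plos, *v) for k, v in FREQ.items()}
    mu_d = {k: gauss(total_duration_s, *v) for k, v in DUR.items()}

    # Firing strength: min over the three antecedents; max across rules
    # that share the same output cluster.
    strength = {}
    for (p, f, d), out in RULES.items():
        w = min(mu_p[p], mu_f[f], mu_d[d])
        strength[out] = max(strength.get(out, 0.0), w)

    # Centroid of the aggregated, clipped output sets on a sampled 0-10 axis.
    num = den = 0.0
    for i in range(int(round(10.0 / step)) + 1):
        y = i * step
        mu = max(min(w, gauss(y, *QOE[out])) for out, w in strength.items())
        num += y * mu
        den += mu
    return num / den


if __name__ == "__main__":
    print(round(estimate_qoe(0.05, 1, 1), 2))   # mildest test sequence
    print(round(estimate_qoe(2.0, 10, 70), 2))  # most degraded test sequence
```

Other operator choices (product for AND, a different defuzzifier) would give different numbers, so the sketch is useful for seeing how the rules interact rather than for reproducing the reported model output.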
6.2 The model output
7 Results of the model
Reference | Number of test subjects | Number of test sequences | Number of input parameters | Mechanism of the inference system | FR, RR or NR | Pearson correlation coefficient |
---|---|---|---|---|---|---|
Chan et al. [31] | 21 | 30 video clips | Two models were developed. The first model is based on averaged PSNR values of all distorted frames. The second model is based on the ratio of distorted frame rate and averaged PSNR of distorted frames | Linear regression | FR | 0.8664 and 0.935 (for two types of the PSNR modification) |
Leszczuk et al. [33] | 24 | 59 video clips | Three parameters (SSIM index averaged over the worst second, number of separate losses, spatial activity) | Multiple linear regression analysis | FR | 0.933 |
da Silva Cruz et al. [37] | 35 | 2962 video clips, each lasting 10 s, used to train the artificial neural network (first stage of the cascaded estimator); 20 video clips, each lasting 10 s, used for subjective evaluation (second stage of the cascaded estimator) | Three parameters (PLR, frame degradation ratio and information loss ratio) | Two-stage cascaded estimator consisting of an artificial neural network (ANN) used to estimate the SSIM index and a logistic model (ITU-R BT.500-13) used to estimate differential MOS | NR | 0.894 |
 | 77 | 18 video clips | Three parameters (PLR, delay and encoding bit rate) | Offline construction of a QoE (k-dimensional Euclidean) space | NR | 0.9964 |
Robalo and Velez [45] | Uses the results from [44] | Uses the results from [44] | Uses the parameters from [44] | Sixth-degree polynomial equation | NR | 0.9154 |
Anegekuh et al. [47] | 97 | 6 video clips | Three parameters (content type metric combined with quantization parameter and PLR) | Combination of encoding quality prediction (depends on quantization parameter and content type in relation to MOS) and impact of PLR on the perceived quality. Model coefficients are obtained through regression fitting | FR | 0.92 |
Konuk et al. [48] | Not available. MOS values were taken from different video databases | Not reported | Two parameters (MOS for video and MOS for audio) | Linear combination of video and audio quality and their product. Model also includes video spatio-temporal characteristics | NR | 0.8132 and 0.8443 for subjective and objective video MOS, respectively |
Hameed et al. [51] | 100 | 288 video clips | Seven parameters extracted from a video stream (the parameters describe video content characteristics, encoding distortions and packet loss effects) | A decision tree | NR | Not available. The accuracy of the model is calculated by comparing the model output (poor, bad, fair, good, very good or excellent quality) with the subjects' ratings, which were categorized into the same groups. The model evaluated 128 out of 144 videos correctly (88.9%) |
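
The last column of the comparison table reports the Pearson correlation coefficient between model output and subjective scores. For reference, a minimal sketch of that computation follows; `pearson_r` is a hypothetical helper, and the numbers in the usage example are toy values for illustration only, not data from any of the cited studies.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only: subjective scores vs. hypothetical model estimates.
    subjective = [8.0, 7.0, 6.0, 5.0]
    estimated = [7.9, 7.2, 5.8, 5.1]
    print(round(pearson_r(subjective, estimated), 4))
```

A coefficient close to 1 indicates that the model ranks and scales the sequences much as the subjects did; it does not by itself show that the absolute values agree.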