01.12.2019 | Research | Issue 1/2019 | Open Access

# Performance bottleneck analysis and resource optimized distribution method for IoT cloud rendering computing system in cyber-enabled applications

## 1 Introduction

## 2 Related work

## 3 Methods

The time parameters (we regard \( t_0 \) and \( t_1 \) as constants in this section) are as follows: (1) \( t_0 \) is the time required for a user to start an operation and for the load balancer to route it to the resource management server. (2) \( t_1 \) is \( t_0 \) plus the time to query, fetch the pictures, and return the results. (3) \( t_{\mathrm{dispatch}} \) is the time to send out the instructions. (4) \( t_{\mathrm{render}} \) is the time to execute the render instruction on the render machine. (5) \( t_{\mathrm{upload}} \) is the time to upload the render results to the file server and the database server.

### 3.1 Instructions distribution time consumption

\( t_{\mathrm{dispatch}} = t_{\mathrm{dis\_wait}} + t_{\mathrm{ask}} + t_{\mathrm{send}} \). Here, \( t_{\mathrm{dis\_wait}} \) is the time a command waits in the distribution queue, which can be ignored when the queue is empty; \( t_{\mathrm{ask}} \) is the time the scheduling system needs to query the performance status of all rendering machines; and \( t_{\mathrm{send}} \) is the time the scheduling system needs to send out the commands.

### 3.2 Rendering command processing time consumption

\( t_{\mathrm{render}} = t_{\mathrm{rend\_wait}} + t_{\mathrm{scene\_create}} + t_{\mathrm{take\_photo}} \). Here, \( t_{\mathrm{rend\_wait}} \) is the waiting time in the render queue, which can be ignored when the render queue is empty; \( t_{\mathrm{scene\_create}} \) is the time required to execute the instructions that create a scene; and \( t_{\mathrm{take\_photo}} \) is the time required to shoot all the pictures.

## 4 The theory analysis of cloud rendering system business process

### 4.1 Single task rendering process

In the \( t_{\mathrm{dispatch}} \) part, \( t_{\mathrm{dis\_wait}} \) is omitted if there is no wait in the queue, and \( t_{\mathrm{ask}} \) is the parallel TCP request time. Each exchange runs over a long connection, so \( {t}_{\mathrm{ask}}=2\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \); two exchanges are needed to determine whether the instruction communication succeeded or not, so \( {t}_{\mathrm{dispatch}}=4\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \).
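As a minimal sketch, the two dispatch-time formulas above can be evaluated numerically. The function names and all parameter values below (instruction size \( \Delta_d \), bandwidth W, per-hop delay \( \Delta_w \)) are our own illustrative assumptions, not measurements from the paper.

```python
def t_ask(delta_d: float, W: float, delta_w: float) -> float:
    """Parallel TCP query over a long connection: t_ask = 2 * (delta_d/W + delta_w)."""
    return 2 * (delta_d / W + delta_w)

def t_dispatch(delta_d: float, W: float, delta_w: float) -> float:
    """Two exchanges confirm the instruction, so t_dispatch = 4 * (delta_d/W + delta_w)."""
    return 4 * (delta_d / W + delta_w)

if __name__ == "__main__":
    delta_d = 2_000      # instruction size in bytes (assumed)
    W = 1_000_000        # bandwidth in bytes/s (assumed)
    delta_w = 0.005      # network delay in seconds (assumed)
    print(t_ask(delta_d, W, delta_w))       # 2 * (0.002 + 0.005) = 0.014
    print(t_dispatch(delta_d, W, delta_w))  # 4 * (0.002 + 0.005) = 0.028
```

Note that \( t_{\mathrm{dispatch}} \) is exactly twice \( t_{\mathrm{ask}} \): the confirmation round repeats the same transfer-plus-delay cost.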

For \( t_{\mathrm{render}} \): \( t_{\mathrm{rend\_wait}} \) is omitted if there is no wait in the queue; the scene creation process consumes \( t_{\mathrm{scene\_create}} = P\Delta_p \); shooting the pictures requires \( t_{\mathrm{take\_photo}} = C\Delta_c \); finally, \( t_{\mathrm{render}} = P\Delta_p + C\Delta_c \).

\( t_{\mathrm{upload}} \) is the time for all pictures to be uploaded. Within it, the picture upload over a TCP long connection requires \( {t}_{\mathrm{up}\_\mathrm{pic}}=C\left(\frac{D}{W}+{\Delta }_w\right)+{\Delta }_{\mathrm{tcp}} \), and uploading the results to the database requires \( {t}_{\mathrm{up}\_\mathrm{database}}=\frac{\Delta _d}{W}+{\Delta }_w+{\Delta }_{\mathrm{tcp}} \).

The time costs of \( T_{SR} \) lie mainly in the scene creation time \( P\Delta_p \) and in the picture shooting and transmission time \( C\left({\Delta }_c+\frac{D}{W}\right) \).
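The single-task cost terms above can be collected into a short sketch. The helper names (`t_render`, `dominant_cost`, and so on) and every numeric value are our own assumptions for illustration; only the formulas come from the text.

```python
def t_render(P, dp, C, dc):
    """Scene creation (P instructions at dp seconds each) plus shooting C pictures."""
    return P * dp + C * dc

def t_up_pic(C, D, W, dw, d_tcp):
    """Upload C pictures of size D at bandwidth W over one TCP long connection."""
    return C * (D / W + dw) + d_tcp

def t_up_database(dd, W, dw, d_tcp):
    """Write the result record of size dd to the database server."""
    return dd / W + dw + d_tcp

def dominant_cost(C, dc, D, W):
    """Main single-task bottleneck: shooting plus transmitting the C pictures."""
    return C * (dc + D / W)
```

For example, with 1000 scene instructions at 1 ms each and 10 pictures at 0.1 s each, `t_render(1000, 0.001, 10, 0.1)` gives 2.0 s, and the dominant cost grows with both picture count and picture size.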

### 4.2 Multi-task rendering process

The multi-task rendering time \( T_{MR} \) can be expressed as

In the \( t_{\mathrm{render}} \) part, because multiple render machines render in parallel, tasks are evenly distributed to each machine according to the task scheduling strategy, and a rendering machine receives at most \( \left\lceil \frac{\mathrm{S}}{M}\right\rceil \) tasks. Because there are M renderers, each renderer's task distribution cycle \( \overline{t_d} \) is \( \overline{t_d}=\frac{M}{N}\left({t}_{\mathrm{ask}}+{t}_{\mathrm{send}}\right)=\frac{4M}{N}\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \). The average time a task occupies a site, \( \overline{t_o} \), is defined as \( \overline{t_o}=P{\Delta }_{\mathrm{p}}+\mathrm{C}{\Delta }_{\mathrm{c}} \). Under these definitions, the paper gives an important conclusion: for each renderer, if \( K\overline{t_d}\ge \overline{t_o} \) or S ≤ M · K, the rendering engine will never have tasks waiting in the render queue. Therefore, as long as \( K\overline{t_d}\ge \overline{t_o} \) or S ≤ M · K holds, any arriving task will be assigned immediately to a site for rendering without congestion. Conversely, if neither condition holds, that is, \( K\overline{t_d}<\overline{t_o} \) and S > M ∙ K, rendering system congestion will occur.
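The congestion criterion can be written as a small predicate. This is a sketch under the paper's definitions of \( \overline{t_d} \) and \( \overline{t_o} \); the function name and the test values are our own assumptions.

```python
def is_blocked(S, N, M, K, delta_d, W, delta_w, P, dp, C, dc):
    """Congestion occurs iff K * t_d_bar < t_o_bar AND S > M * K."""
    t_d_bar = (4 * M / N) * (delta_d / W + delta_w)  # task distribution cycle
    t_o_bar = P * dp + C * dc                        # average site occupation time
    return K * t_d_bar < t_o_bar and S > M * K
```

With a heavy scene (large \( P\Delta_p + C\Delta_c \)) and more tasks than sites (S > M ∙ K), the predicate is true; dropping S to at most M ∙ K, or shortening the render cost below \( K\overline{t_d} \), clears the congestion.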

#### 4.2.1 Multi-task rendering process in blocking status

The time consumption of the render part in the blocking status, \( t_{r2} \), is

\( t_s(i) \) represents the time required before each rendering site reaches full-load working status, where the parameter i is the index of the rendering site; \( t_s(i) \) is expressed by the following formula:

The \( t_s(i) \) of each rendering site is not equal and increases with the growth of i, so the number of tasks assigned to each site can be defined as the function \( \mathrm{dis}_c(i) \), expressed by the following formula:

where i takes values in \( \mathbb{Z}^{+} \) in the paper; it then becomes an increasing function.

#### 4.2.2 Multi-task rendering results upload time consumption

In the \( t_{\mathrm{upload}} \) part, since all upload tasks are carried out in parallel, this part is consistent with the single-task system, as shown in the following formula:

\( T_{MR(A)} \) is defined as the time consumption in the non-blocking status, and \( T_{MR(B)} \) as the time consumption in the blocking status. Non-blocking task scheduling uses a single master processor together with worker/client processors. Each task has all the data it needs to compute but gets the index to work on from the master; after the computation, the worker returns some data to the master. The bottom line is that if a task takes too long to compute, it becomes the limiting factor: the master cannot move on to assign an index to the next worker even if non-blocking techniques are used, and the question is whether it can skip that worker and move on to the next one.

\( T_{MR(A)} \) can be expressed as

Good communication gives \( \Delta_w \approx 0 \), so the formula can be further simplified to \( {\mathrm{T}}_{MR(A)}=4\left\lceil \frac{\mathrm{S}}{N}\right\rceil \frac{\Delta _d}{W}+\left(\mathrm{P}{\Delta }_p+\mathrm{C}{\Delta }_c\right)+\frac{\mathrm{CD}}{W}+{\mathrm{Q}}_1 \). \( T_{MR(B)} \) can be further simplified to \( T_{MR(B)} = t_0 + t_{r2} + t_{\mathrm{upload}} + t_1 \). Again assuming good communication, \( \Delta_w \approx 0 \), we obtain after expansion and simplification:
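The simplified non-blocking expression can be coded directly. \( Q_1 \) is kept as an opaque constant term, since its expansion is not shown here, and all example values are assumed.

```python
import math

def t_mr_a(S, N, delta_d, W, P, dp, C, dc, D, Q1=0.0):
    """T_MR(A) = 4*ceil(S/N)*delta_d/W + (P*dp + C*dc) + C*D/W + Q1,
    valid in the non-blocking status with delta_w ≈ 0."""
    return 4 * math.ceil(S / N) * delta_d / W + (P * dp + C * dc) + C * D / W + Q1
```

The expression is stepwise linear in S (through the ceiling term) and inversely affected by the web server count N, matching the observations summarized below.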

\( T_{MR} \) can be summarized as

(1) The relationship between the number of tasks S and the multi-task rendering time \( T_{MR} \) is linear. (2) The number of web servers N is inversely proportional to \( T_{MR} \). (3) The rendering time \( P\Delta_p + C\Delta_c \) is the coefficient of the parameter S in the blocking status. (4) The number of render machines M and the number of sites K contribute directly to \( T_{MR} \) in the blocking status. (5) The number of concurrent users is mainly restricted by the average response delay, because increasing the number of concurrent users directly increases the average response delay.

## 5 System performance optimization results and discussion

### 5.1 The first test experiments

### 5.2 The second test experiments

| Model name | Patch number | Vertex number |
|---|---|---|
| Virtual park scene | 3,319,336 | 105,072 |
| Vertex number | 1,707,673 | 106,898 |
| Frame number | 12,035 | 142,379 |

### 5.3 Status transition function

### 5.4 Average response time analysis

Subtracting \( T_{MR(A)} \) from \( T_{MR(B)} \) gives the time difference \( {T}_{\Delta }={T}_{MR(B)}-{T}_{MR(A)}=\frac{4{\Delta }_d\left(M\left(\mathrm{K}-2\right)-\mathrm{S}\right)}{N\bullet W}+\frac{S}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right) \). The average response time difference is \( T_{\Delta} \) divided by S, namely \( \overline{T_{\Delta }}=\frac{T_{\Delta }}{S}=\frac{4M{\Delta }_d\left(\mathrm{K}-2\right)}{N\bullet W\bullet S}+\frac{1}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right)-\frac{4{\Delta }_d}{N\bullet W} \). According to the status transfer function, when G(N, M, K) = 0 the system is at the critical point, so \( G\left(N,M,K\right)=0\to \frac{\mathrm{M}\bullet \mathrm{K}}{N}\bullet \frac{4{\Delta }_d}{W}-\left(P{\Delta }_p+C{\Delta }_c\right)=0\to \frac{1}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right)-\frac{4{\Delta }_d}{N\bullet W}=0 \). Because K ≥ 2, \( \frac{4M{\Delta }_d\left(K-2\right)}{N\bullet W\bullet S}\ge 0 \) holds. We can therefore conclude that, at the critical point between the blocking and non-blocking statuses, the average response time of \( T_{MR(A)} \) is no worse than that of \( T_{MR(B)} \).
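The difference formulas can be checked numerically. The sketch below evaluates \( T_\Delta \) and \( \overline{T_\Delta} \), and at a parameter point satisfying G(N, M, K) = 0, i.e. \( P\Delta_p + C\Delta_c = \frac{M\bullet K}{N}\bullet \frac{4\Delta_d}{W} \), confirms that \( \overline{T_\Delta} \) collapses to the nonnegative term \( \frac{4M\Delta_d(K-2)}{N\bullet W\bullet S} \). All numbers are assumed examples.

```python
def t_delta(S, N, M, K, delta_d, W, r):
    """T_delta = T_MR(B) - T_MR(A), with r = P*dp + C*dc the per-task render cost."""
    return 4 * delta_d * (M * (K - 2) - S) / (N * W) + S * r / (M * K)

def avg_t_delta(S, N, M, K, delta_d, W, r):
    """Average response time difference, T_delta / S."""
    return t_delta(S, N, M, K, delta_d, W, r) / S

# At the critical point, r = (M*K/N) * 4*delta_d/W, so avg_t_delta reduces to
# 4*M*delta_d*(K-2) / (N*W*S): exactly zero for K = 2, positive for K > 2.
```

This mirrors the conclusion above: at the critical point the non-blocking average response time is never worse, and the gap widens as K grows beyond 2.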

### 5.5 System performance decline rate analysis

\( T_{MR}(S, N, M, K) \) represents the time consumed by the multi-task rendering process. S can be seen as the main variable of the function, because its value changes at any time with the number of users, while the remaining variables can be regarded as secondary variables and are set to default values in the cloud rendering system. Taking the partial derivative of \( T_{MR} \) with respect to S, and in order to facilitate the discussion, the following variables are further defined:

The growth of \( T_{MR} \) on the interval (M ∙ K, +∞) is discussed while the conditions \( \Delta_w = 0 \) and G(N, M, K) < 0 are satisfied, and the growth rate \( {T}_{s2}^{\prime } \) is:

On the interval (M ∙ K, +∞), \( T_{MR} \) will always tend to grow faster. However, after determining the growth rate of \( T_{MR} \), the parameters can still be adjusted to improve the performance of the system. To test the algorithm of this paper, we chose currently popular algorithms [21–23] for system pressure testing. After blocking, the average response time of each algorithm is shown in Table 2.

### 5.6 System performance tuning strategy

When the growth rate of \( T_{MR} \) is \( {T}_{s1}^{\prime } \), there is an upper limit on N,

When the growth rate of \( T_{MR} \) is \( {T}_{s2}^{\prime } \), we can reduce the growth rate in the following ways: increase the product M ∙ K or reduce \( P\Delta_p + C\Delta_c \). We can therefore choose to upgrade M ∙ K to its upper limit.
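The tuning rule can be illustrated with the blocking-status per-task coefficient \( \frac{1}{M\bullet K}\left(P\Delta_p + C\Delta_c\right) \): raising M ∙ K (more render machines or sites) or shrinking the scene cost lowers the growth rate of \( T_{MR} \) with S. A sketch with assumed numbers:

```python
def blocking_coefficient(M, K, P, dp, C, dc):
    """Per-task cost coefficient of S in the blocking status: (P*dp + C*dc) / (M*K)."""
    return (P * dp + C * dc) / (M * K)

# Doubling the renderer pool halves the coefficient (all values assumed):
base = blocking_coefficient(M=4, K=2, P=1000, dp=0.001, C=10, dc=0.1)
tuned = blocking_coefficient(M=8, K=2, P=1000, dp=0.001, C=10, dc=0.1)
```

Here `base` is 0.25 s per task and `tuned` is 0.125 s per task, showing why upgrading M ∙ K toward its upper limit is the preferred lever when the scene cost cannot be reduced.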

## 6 Conclusion

reduce the growth rate of \( T_{MR} \). This is a new performance optimization scheme for the cloud rendering system.