About this book

This book constitutes the refereed proceedings of the 22nd CCF Conference on Computer Engineering and Technology, NCCET 2018, held in Yinchuan, China, in August 2018.

The 17 full papers presented were carefully reviewed and selected from 120 submissions. They address topics such as processor architecture; application-specific processors; computer application and software optimization; and technology on the horizon.

Table of Contents

Frontmatter

Design and Application for Sealing of Strengthening Computer for Anti-hard Environment

Abstract
Proper sealing is an effective way to protect a reinforced computer from corrosion and electromagnetic interference in harsh environments. In this paper, we analyse the sealing problem of the reinforced computer from the aspects of sealing material selection and sealing structure design, and we propose a solution for sealing computers against harsh environments. Taking a certain type of reinforced computer as an example, the sealing design was completed and the relevant tests were carried out successfully, in accordance with the requirements of the national military standard. The test results show that problems such as leakage and electromagnetic interference did not occur in this type of reinforced computer. The solution therefore improves the reliability of the reinforced computer.
Qianqian Yang

An Independent VGA Controller Based on SOPC with Three Pixel-Mapped Schemes

Abstract
As a standard display interface, VGA (Video Graphics Array) has been widely used. In this paper, we propose an independent VGA controller that does not require the CPU (Central Processing Unit) to control it or to transmit data, which saves hardware resources and increases data processing speed compared to the regular VGA design. Specifically, the controller consists of a synchronizing module, a memory module, and a palette module. We implement three pixel-mapped schemes (bit-mapped, block-mapped, and object-mapped) and compare them with the traditional mapping scheme. Their signal activities, which measure the display efficiency of the VGA controller, are \(5.00 \times 10^7\), \(1.67 \times 10^7\), \(3.33 \times 10^7\), and \(5.67 \times 10^7\), respectively. Their synthesized register counts are 2348, 2412, 2560, and 2072, which reflect different resource utilization. Our functional simulations and logic syntheses show that the proposed VGA controller design offers strong flexibility, a short design cycle, and low production cost under the given application circumstances.
Zerun Li, Yande Jiang, Yang Guo
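
The synchronizing module mentioned in the abstract above can be illustrated with a small behavioral model. This is only a sketch: the abstract does not state a display mode, so the widely published 640x480 at 60 Hz timing is assumed, and the name `sync_state` is hypothetical rather than taken from the paper.

```python
# Assumption: standard 640x480 @ 60 Hz VGA timing (not stated in the abstract).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48  # pixels per line
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33   # lines per frame
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK       # 800 pixel clocks
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK       # 525 lines


def sync_state(pixel_index):
    """Return (x, y, in_hsync, in_vsync, visible) for a running pixel counter.

    in_hsync / in_vsync only say whether the counter is inside the sync
    pulse window; electrical polarity is left out of this model.
    """
    x = pixel_index % H_TOTAL
    y = (pixel_index // H_TOTAL) % V_TOTAL
    in_hsync = H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC
    in_vsync = V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC
    visible = x < H_VISIBLE and y < V_VISIBLE
    return x, y, in_hsync, in_vsync, visible
```

In hardware the same logic would be two counters and a few comparators; the pixel-mapping schemes then decide how `(x, y)` is turned into a memory address while `visible` is true.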

An Inter-Layer-Distance Based Routing Algorithm for 3D Network-on-Chip

Abstract
In recent years, the three-dimensional Network-on-Chip (3D NoC) has been proposed to resolve complex on-chip communication issues in multicore systems by using die-stacking technology. Guaranteeing performance is more difficult in a 3D NoC system than in a 2D one because of the stacked dies and the unequal thermal conductance of different logic layers. To ensure system performance and availability, we propose an Inter-Layer-Distance based Routing (ILDR) algorithm, which distributes traffic according to the inter-layer distance from the source node to the destination node. We simultaneously consider the buffer status and node temperature of the neighbors on the path to determine the horizontal route of the next hop. The simulation results show that the proposed ILDR algorithm markedly reduces network latency and improves network throughput under different experimental traffic patterns. Although energy consumption increases, the energy-delay product (EDP) is reduced, so ILDR is a power-efficient solution for 3D NoC.
Tong Zou, Chengyi Zhang, Xuefeng Peng, Yuanxi Peng
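
The closing claim of the ILDR abstract above (energy rises, yet the energy-delay product falls) follows from the definition EDP = energy × delay. A minimal sketch with purely hypothetical numbers, not taken from the paper, shows how this can happen:

```python
def energy_delay_product(energy_joules, delay_seconds):
    """EDP = energy x delay; lower is better."""
    return energy_joules * delay_seconds


# Hypothetical figures for illustration only (not results from the paper):
# the candidate spends 10% more energy but cuts latency by 20%.
baseline_edp = energy_delay_product(1.00e-6, 100e-9)
candidate_edp = energy_delay_product(1.10e-6, 80e-9)
edp_ratio = candidate_edp / baseline_edp  # about 0.88, i.e. EDP improves
```

Whenever the relative latency reduction outweighs the relative energy increase, the ratio stays below 1.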

Impact of Temperature Characteristics on High-Speed Optical Communication Modules

Abstract
This paper presents a method to evaluate the impact of temperature characteristics on a vertical cavity surface emitting laser (VCSEL) module. As one of the core modules in the optical communication system, the VCSEL strongly influences the communication quality of a high-speed optical communication system. However, it is difficult to analyze the temperature change of a VCSEL directly. To solve this problem, batches of laser sources are integrated into the optical communication module, so that the physical properties of the laser beams can be easily measured at different temperatures (low temperature −5 °C, room temperature 25 °C, and high temperature 70 °C). By analyzing the wavelength, extinction ratio, and eye-diagram margin of these laser beams, we calculate a percentage value relative to an engineering-experience standard value and use it as the evaluator of the quality of the optical communication system. Communication quality is evaluated under different parameters, including amplitude, emphasis, mode, and bias. Several tests have been carried out, all of which yielded satisfactory results.
Tengyue Li, Libing Liu, Yifan Song, Mingche Lai

A Parallel 1-D FFT Implementation Method for Multi-core Vector Processors

Abstract
This paper presents an efficient parallel 1-D FFT implementation method based on the architectural features of a multi-core vector processor. It divides the parallel computation of a large-point 1-D FFT into an (n-m)-level parallel FFT computation and an M-point parallel FFT computation, according to the number of data points M that can be accommodated in the global cache (GC). In the (n-m)-level FFT computation, the parallel FFT computation for each stage is performed using a shared DDR data method. For the M-point parallel FFT computation, a parallel method based on the matrix Fourier algorithm is designed. It converts the original M-point 1-D FFT computation into a 2-D FFT computation and achieves parallel FFT computation using a shared GC data method, which avoids multiple data transfers between GC and AM and reduces data transmission overhead. The column FFT computation is merged with the factor matrix multiplication, and the column FFT results are kept in the AM, which further reduces the number of data transfers between AM and GC and significantly improves the efficiency of the M-point FFT computation. The experimental results on Matrix show that, compared with the TMS320C6678 at the same frequency, the average speedup of the single-core single-precision 1-D FFT is 8.26 times and the average speedup of the dual-core single-precision 1-D FFT is 6.78 times.
Zhong Liu, Xi Tian
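
The matrix Fourier algorithm named in the abstract above (often called the four-step FFT) can be sketched in a few lines. This is a plain-Python illustration of the general technique, not the paper's vectorized implementation: the sub-transforms use a naive DFT for clarity, and the names `dft` and `four_step_fft` are hypothetical.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT, used here as the reference and as the sub-transform."""
    n = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """DFT of len(x) == n1 * n2 points via the matrix Fourier algorithm.

    The input is viewed as an n1 x n2 row-major matrix a[r][c] = x[n2*r + c].
    """
    n = n1 * n2
    assert len(x) == n
    # Step 1: length-n1 DFT down every column; cols[c][k1].
    cols = [dft([x[n2 * r + c] for r in range(n1)]) for c in range(n2)]
    # Step 2: multiply by the twiddle (factor) matrix W_n^(c * k1).
    for c in range(n2):
        for k1 in range(n1):
            cols[c][k1] *= cmath.exp(-2j * cmath.pi * c * k1 / n)
    # Step 3: length-n2 DFT along every row; step 4: transposed write-out.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[c][k1] for c in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out
```

Steps 1 and 2 touch the same column data, which is why fusing the column FFT with the factor multiplication, as the abstract describes, can save a pass over memory.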

The Analysis and Countermeasures of Mobile Terminal RE or RSE Problem

Abstract
To ensure the stability of electronic products and personal safety, most countries have formulated strict compulsory EMC certification standards in the field of information technology. However, during the certification testing of mobile terminals, Radiated Emission (RE) or Radiated Spurious Emission (RSE) problems occur often and are usually difficult to solve. In this paper, the causes of the RE and RSE problems of mobile terminals are analyzed first. The architecture of the related test system is then introduced, and a new method to solve the problems is proposed. Detailed experimental countermeasures and procedures are illustrated to solve the problem. Finally, some design guidance for the RE/RSE problem is summarized. This paper shows that our method is an effective way to eliminate or reduce the probability of RE/RSE problems in mobile terminal design.
Liangliang Kong, Lin Chen

Robust Option Pricing Under Change of Numéraire

Abstract
In this paper, we consider the problem of option pricing from the perspective of the minimax algorithm, an online learning framework. We introduce a numéraire, which is a unit of account in economics, into the market dynamics, which we treat as a multi-round game between two players: the investor and nature. In this way, we are able to apply the minimax algorithm, an online learning framework from game theory. We model the repeated games between the investor and nature as a price process under different numéraires, thus permitting an arbitrary choice of numéraire, and study this model under the no-arbitrage condition of a complete market. We also relax the constraint of convex payoff functions imposed in previous works by characterizing the explicit mixed-strategy Nash equilibrium in a single-round game, and we then generalize this result to multi-round games.
Guyue Hu, Weixia Xu

Numerical Simulation Study on Heat Exchange Effect of Open Computer

Abstract
This article conducts a thermal simulation analysis of an open computer. Based on the simulation results, the module structure and the chassis structure are optimized, and the reliability of the thermal design of the chassis is verified. The work provides a reference for the thermal simulation analysis and thermal optimization design of other similar electronic devices.
Xiangci Meng

A High-Matching Low Noise Differential Charge Pump for PLL

Abstract
This paper presents a high-matching, low-noise charge pump (CP). Two pairs of charge pumps in a differential structure alleviate charge sharing and improve static mismatch. This simple structure, with a minimum number of transistors, reduces the noise of the CP. A differential low-amplitude buffer stage is proposed to reduce the dynamic mismatch of the CP. A 3.125 GHz PLL is implemented with the proposed charge pumps in a 65 nm CMOS process. In simulation, the proposed CP achieves good static and dynamic mismatch over a dynamic range larger than half of VDD. The simulated noise at 1 kHz is −227 dB. The reference spur measured at 25 MHz is lower than −51.5 dBc. The test results show the good performance of the proposed CPPLL.
Hengzhou Yuan, Yang Guo

A 12.5-Gb/s Equalizer with CTLE and a 4-Tap Quarter-Rate DFE in 40 nm Technology

Abstract
Because of finite channel bandwidth and reflections, inter-symbol interference (ISI) prevents the receiver from recovering data accurately. To satisfy the transmission requirements of PCIE3.1 and Rapid IO3.2, this paper presents a 12.5 Gb/s equalizer in 40 nm CMOS. It uses a Continuous-Time Linear Equalizer (CTLE) and a quarter-baud-rate decision feedback equalizer (DFE) with 4 taps. The receiver can effectively equalize the data and restore the eye diagram with a channel loss of 28 dB at 12.5 Gb/s. The layout area of the equalizer is 0.66 mm², and its power consumption is 33.08 mW from a 1.1-V supply.
Qing Xu, Jianjun Chen, Yueyue Chen, Bin Liang, Bo Xiong, Yuan Luo, Jizuo Zhang
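
How a 4-tap DFE of the kind named in the abstract above cancels post-cursor ISI can be shown with a small behavioral model. This is a sketch under stated assumptions: the channel taps, feedback weights, and symbol sequence are hypothetical, the model runs at full rate, and it does not represent the paper's CTLE or quarter-rate circuit.

```python
def slicer(value):
    """Binary decision: map a sample to +1 or -1."""
    return 1 if value >= 0 else -1


def channel(symbols, taps):
    """Apply a causal FIR channel; taps[0] is the main cursor."""
    return [
        sum(h * symbols[n - k] for k, h in enumerate(taps) if n - k >= 0)
        for n in range(len(symbols))
    ]


def dfe(received, feedback_taps):
    """Subtract ISI estimated from previous decisions, then slice."""
    decisions = []
    for n, sample in enumerate(received):
        isi = sum(
            w * decisions[n - 1 - k]
            for k, w in enumerate(feedback_taps)
            if n - 1 - k >= 0
        )
        decisions.append(slicer(sample - isi))
    return decisions


# Hypothetical channel: main cursor plus four post-cursors (assumed values).
channel_taps = [1.0, 0.7, 0.5, 0.2, 0.1]
symbols = [-1, -1, 1, -1, 1, 1, -1, -1, -1, 1]
rx = channel(symbols, channel_taps)
plain_decisions = [slicer(v) for v in rx]  # no equalization
dfe_decisions = dfe(rx, channel_taps[1:])  # 4 feedback taps matched to channel
```

With these assumed taps the unequalized slicer makes errors (the eye is closed), while the DFE recovers the transmitted sequence because each decision removes the ISI contributed by the four preceding symbols.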

Design and Implementation of a Domain Specific Rule Engine

Abstract
Security strings are often needed in identity authentication mechanisms. Security string recovery is a reverse process that performs many calculations on a large number of candidate strings to find the right one, so that lost or forgotten strings can be recovered and access to valuable information regained. In this reverse process, we first need to process basic strings according to transformation rules in order to generate new ones quickly. Rule processing is complex and places high demands on computing power, processing time, and especially system power consumption. In response to these requirements, this work proposes, for the first time, accelerating rule processing in hardware, and a domain-specific rule engine is designed and implemented on an existing FPGA platform. The experimental results show that the performance of the rule engine on a single Xilinx Zynq 7z030 FPGA is better than that of a CPU, and that its performance-to-power ratio is 3 times higher than that of a GPU and 50 times higher than that of a CPU. The speed and energy efficiency of rule processing are thus improved effectively.
Mengdong Chen, Xinjian Zhou, Dong Wu, Xianghui Xie
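
The idea of generating candidate strings from basic strings through transformation rules, as described in the abstract above, can be sketched in software. The rule letters and the functions `apply_rule` and `generate` below are a hypothetical mini rule set invented for illustration; the abstract does not specify the paper's actual rule syntax.

```python
# Hypothetical rule set: each single-letter opcode maps to one transformation.
RULES = {
    "l": str.lower,              # lowercase the whole word
    "u": str.upper,              # uppercase the whole word
    "c": str.capitalize,         # first letter upper, rest lower
    "r": lambda word: word[::-1],  # reverse the word
    "d": lambda word: word + word,  # duplicate the word
}


def apply_rule(word, rule):
    """Apply the opcodes of one rule to a word, left to right."""
    for opcode in rule:
        word = RULES[opcode](word)
    return word


def generate(base_words, rules):
    """Yield every candidate produced by applying each rule to each base word."""
    for word in base_words:
        for rule in rules:
            yield apply_rule(word, rule)
```

For example, `list(generate(["ab"], ["u", "r"]))` yields `["AB", "ba"]`. Each candidate is independent of the others, which is what makes this workload a natural fit for parallel hardware pipelines.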

High-Speed Circuit Power Integrity Design Based on Impedance Characteristic Analysis

Abstract
Power integrity (PI) issues have become increasingly important in today’s high-speed circuit designs, while the complexity of power integrity analysis has also increased. Based on the two-port network model, this paper establishes a small-signal model of the power system and a transmission matrix model of the PCB power/ground plane system. It combines power supply design and PCB design to realize impedance control of the power distribution system and improve the power integrity of the circuit. Taking the design of a certain type of network card as an example, the target impedance is controlled in different frequency bands through simulation based on the impedance model. The method is verified by measuring the dynamic characteristics and noise of the chip power supply.
Guangming Zhang
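
The target impedance mentioned in the abstract above is commonly estimated with a rule of thumb: allowed ripple voltage divided by transient current. The sketch below uses that common formula with hypothetical numbers. The abstract does not say which definition or values the paper uses, so both are assumptions.

```python
def target_impedance(supply_voltage, ripple_fraction, transient_current):
    """Rule-of-thumb PDN target impedance in ohms.

    Z_target = (V_supply * allowed ripple fraction) / transient current.
    """
    return supply_voltage * ripple_fraction / transient_current


# Hypothetical rail for illustration: 1.0 V, 5% allowed ripple, 10 A transient.
z_target_ohms = target_impedance(1.0, 0.05, 10.0)  # about 0.005 ohm (5 milliohm)
```

The power distribution network is then designed so that its impedance stays below this value across the frequency bands of interest.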

Applying Convolutional Neural Network for Military Object Detection on Embedded Platform

Abstract
Object detection has always been an important part of image processing. Traditional object detection algorithms have complex structures and operations. With the continuous development of deep learning technology, the Convolutional Neural Network (CNN) has become an advanced object detection method. Because of its high accuracy, stability, and speed, this method is widely used in many fields. In this work, we use a CNN to detect military objects. The model is built on the idea of regression, which makes it fast and accurate and enables real-time detection. Unlike image classification, object detection requires more parameters and calculations, and it is therefore difficult to deploy on a small embedded platform. We analyze several state-of-the-art object detection networks, replace the traditional fully connected layer with a global average pooling layer, generate region proposals using anchor boxes, and apply the resulting network to military object detection. Finally, we deploy it successfully on the TMS320C6678, a low-cost, low-power embedded platform. A well-performing and easy-to-deploy military object detection system is realized, which helps to improve the accuracy and efficiency of military operations.
Guozhao Zeng, Rui Song, Xiao Hu, Yueyue Chen, Xiaotian Zhou
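
Replacing a fully connected layer with global average pooling, as the abstract above describes, reduces each feature map to a single number without any learned weights. A minimal plain-Python sketch of the operation follows. The function name and the nested-list layout (channels, then rows, then columns) are assumptions for illustration, not the paper's code.

```python
def global_average_pool(feature_maps):
    """Reduce each H x W feature map to its mean value.

    feature_maps: list of channels, each a list of rows of numbers.
    Returns one float per channel.
    """
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
```

For example, `global_average_pool([[[1, 2], [3, 4]], [[0, 0], [0, 8]]])` returns `[2.5, 2.0]`. Because the layer has no parameters, it shrinks the model considerably, which matters on a memory-constrained embedded platform.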

An Optimization Scheme for Demosaicing Algorithm on GPU Using OpenCL

Abstract
With the popularity of GPUs and their high-performance computing capability, more and more algorithms have been successfully ported to the GPU platform and have achieved high efficiency. However, existing video and image processing methods, such as demosaicing algorithms, have not fully exploited the parallel computing capacity of heterogeneous processing platforms, and their video frame rates cannot meet real-time requirements. To take full advantage of the computing power of the GPU on a heterogeneous processing platform, an optimization scheme is proposed in this paper. We use a demosaicing algorithm as a case study and modify the algorithm. By exploiting the GPU’s memory hierarchy, the optimization scheme improves the parallelism of the algorithm while reducing memory access latency, and it greatly reduces the execution time. The scheme also achieves zero-copy. The experimental results show a significant performance improvement: in kernel execution, the optimized OpenCL version is up to 6x faster than the basic OpenCL version.
Tongli Wang, Wei Guo, Jizeng Wei

The Various Graphs in Graph Computing

Abstract
The world is full of relationships, and graphs are the most natural representation of them. As data scale increases, graphs become larger and open up a new world of analysis. What can we learn from a graph? How many kinds of graphs are there? How do graphs from one area differ from those of another? All these questions need answers, but previous research on graph computing has mainly focused on computing frameworks and systems, paying little attention to the graphs themselves.
In this paper, we study graphs of different kinds, different scales, and different mining methods, aiming to give a sketch and classification of graph categories. In addition, we study the characteristics of each category and analyze its algorithms. We survey public graph datasets to show the current graph scale and its trend, as a guide for future infrastructure.
Rujun Sun, Lufei Zhang

The Implementation and Evaluation of High-Speed Link Monitoring Tool for Supercomputer

Abstract
With the increase of system scale and link speed, link failure has become the most important type of interconnect fault in supercomputers, which poses great challenges to the maintenance of high-performance interconnect networks. To let operation and maintenance personnel monitor the status and performance of all high-speed links of a supercomputer in real time, this paper designs a high-speed link monitoring tool based on the in-band network, which offers good scalability and robustness for real-time monitoring of high-speed link status and performance information. The tool has been used in practice in the operation and maintenance of domestic supercomputers, where it speeds up the locating and troubleshooting of link failures and effectively reduces supercomputer downtime.
Jiaqing Xu, Jie He, Xiaotao Hu, Jijun Cao, Lei Zhang, Chongfeng Wang

A Survey of Approaches for Promoting Honest Recommendations in Reputation Systems

Abstract
The efficiency of a reputation mechanism depends entirely on the number of recommendations received and the quality of each of them, but a peer pursuing its own interest may be unwilling to actively provide honest recommendations. A number of schemes have been proposed to address this problem, so an overview of the representative schemes is needed. In this paper, we present a comprehensive discussion of approaches for promoting honest recommendations in reputation systems. We first classify the existing schemes into two categories: protecting the privacy of recommenders and providing incentives to recommenders. The latter can be further divided into two categories: market-based incentive schemes and policy-based incentive schemes. We then survey some representative schemes from the literature in each category and summarize their unique characteristics and working principles. Some open problems in each category are also discussed.
Junsheng Chang, Liquan Xiao, Weixia Xu

Backmatter