About this book

This book constitutes the refereed proceedings of the 9th International Forum on Digital TV and Wireless Multimedia Communication, IFTC 2012, held in Shanghai, China, in November 2012. The 69 revised full papers presented were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on image processing and pattern recognition; image and video analysis; image quality assessment; text image and speech processing; content retrieval and security; source coding; multimedia communication; new advances in broadband multimedia; human computer interface; 3D video.



Image Processing and Pattern Recognition

Research on Networked Integration Technology of Remote Sensing Image Processing

Traditional image processing methods cannot share data efficiently or interoperate across processes. This paper presents a server-side image processing method that uses JNI to wrap local processing functions, combined with the Web Processing Service (WPS) standard formulated by the OGC, to design and implement a three-tier network architecture. This architecture is a network integration framework for image processing functions. Experiments demonstrate that image processing algorithms in a local library can be packaged as a WPS service and that clients can call the service over the network.

Chao Jiang, Ze-xun Geng, Xiao-feng Wei, Chen Shen

Steganalysis in High-Dimensional Feature Space Using Selective Ensemble Classifiers

Modern steganalysis increasingly relies on high-dimensional feature spaces, which rapidly increases the complexity of traditional classifiers such as the support vector machine (SVM). This paper proposes a framework of selective ensemble classifiers as an alternative to SVM for steganalysis, applying selective ensemble theory, which combines some rather than all of the available base learners. A family of weak classifiers is built on random subspaces of the high-dimensional feature space. A random weight is then assigned to each classifier, and a genetic algorithm evolves those weights based on a validation set. The final classifier is constructed by fusing the decisions of the individual classifiers whose weights exceed a preset threshold. Experiments with the steganographic algorithms nsF5 and MBS demonstrate the usefulness of the approach over current popular methods.

Bin Chen, Guorui Feng, Fengyong Li

Parallel-Friendly Patch Match Based on Jump Flooding

In this paper, we propose a parallel-friendly algorithm for k-nearest-neighbor patch match. Based on the jump flooding algorithm, an efficient communication pattern, our algorithm is fully parallelized at the patch level. To improve performance, we propose and analyze several variants and implement them on the GPU. The GPU implementation of our algorithm achieves up to a 100-fold speedup over its CPU implementation and is 5 times faster than the GPU implementation of Barnes’s algorithm, a recent state-of-the-art benchmark for approximate patch match.

Pei Yu, Xiaokang Yang, Li Chen
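
The jump flooding pattern named in the abstract above can be illustrated with a minimal sketch. This is not the authors' patch-match algorithm: it is the generic jump flooding pass, used here to propagate the nearest seed over a small grid, and the function name `jump_flood` and the grid layout are assumptions made for illustration. Each pass is independent per cell, which is what makes the pattern GPU-friendly.

```python
def jump_flood(size, seeds):
    """Approximate nearest-seed map on a size x size grid via jump flooding."""
    # nearest[y][x] holds the (sx, sy) seed currently believed closest, or None.
    nearest = [[None] * size for _ in range(size)]
    for sx, sy in seeds:
        nearest[sy][sx] = (sx, sy)

    def dist2(x, y, seed):
        return (x - seed[0]) ** 2 + (y - seed[1]) ** 2

    step = max(size // 2, 1)
    while step >= 1:
        # Double buffering: every cell reads the previous pass only,
        # so all cells could be updated in parallel.
        updated = [row[:] for row in nearest]
        for y in range(size):
            for x in range(size):
                best = nearest[y][x]
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < size and 0 <= ny < size:
                            cand = nearest[ny][nx]
                            if cand is not None and (
                                best is None
                                or dist2(x, y, cand) < dist2(x, y, best)
                            ):
                                best = cand
                updated[y][x] = best
        nearest = updated
        step //= 2  # halve the jump distance each pass
    return nearest
```

The number of passes grows with the logarithm of the grid size, and the result is approximate in general, which matches the approximate nature of patch match.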

Skew Estimation Based on Haar-Like Features

This paper presents a novel approach to skew estimation for scanned documents. Haar-like features are first used to construct the objective function, and then a modified coarse-to-fine search strategy is implemented to reduce computation. Experimental results show that our skew estimation algorithm performs well on general printed documents with different contents, languages and layouts. The accuracy of skew angle estimation is comparable with or better than that of state-of-the-art methods.

Bing Liu, Li Song

An Efficient Synthesizing and Tone-Mapping System for High Dynamic Range Images

Current research on high dynamic range (HDR) images emphasizes the perceptual quality of the reconstructed image, where an enhanced low dynamic range (LDR) image is directly output as an HDR image from a sequence of LDR images. Such work helps compensate for the limited capability of display devices. However, the dynamic range is not actually expanded, and the physical properties of real scenes remain unavailable. For example, the radiance map of the surrounding scene cannot be recovered in such a direct way, which is an important issue in many industrial and aerospace applications. This paper proposes an efficient synthesizing and displaying system for HDR images. It focuses on solving the following open problems: 1) LDR image registration under camera shake and object motion; 2) HDR image reconstruction for physical property analysis of real scenes; 3) structure preservation when compressing the dynamic range of HDR images for LDR display devices.

Tianle Zhao, Yi Xu

Pathological Signal Quantitative Analysis Using Quaternion-Based Image Processing

IVF-ET (In Vitro Fertilization-Embryo Transfer) has become one of the best choices for treating infertility. The success rate of IVF-ET largely depends on the receptivity of the endometrium, and digital image processing has potential application in assessing that receptivity automatically. This paper studies methods for the quantitative analysis of HE (Hematoxylin-Eosin) stained medical images of the womb, including the number, density, area and perimeter of the glands. Instead of applying monochrome analysis to color images, we represent the color medical images as a quaternion matrix to preserve the interrelationship between color channels. Quaternion Singular Value Decomposition (QSVD) is applied to the quaternion matrix to reduce the dimension of the color vector data and to obtain a color texture segmentation using the k-means technique. Quaternion-based edge detection and pattern matching are then applied to the segmentation results to locate glands. The quantitative analysis is then carried out on the extracted glands. Rather than reading a large number of medical images by themselves, doctors can provide diagnoses more efficiently and more accurately using the proposed automatic quantitative analysis of the pathological signal.

Mingjia Han, Lin Zhou, Yi Xu

The De-noising Algorithm of ECG Signal Based on Wavelet Packet

Cardio-electric signals acquired by hardware are always mixed with noise, which impedes diagnosis. The wavelet transform analyzes local signal features well and separates signal from noise effectively by exploiting their differences in both the time and frequency domains. However, wavelet decomposition focuses on the approximation coefficient vector, which leads to a loss of original time-domain features during de-noising preprocessing as decomposition continues toward lower frequencies. The wavelet packet transform is proposed to overcome wavelet decomposition’s low frequency resolution in the high-frequency section and weak time resolution in the low-frequency section. It is a more elaborate method of signal analysis and enhances resolution in both the time and frequency domains. This paper mainly presents the application of the wavelet packet transform to cardio-electric signal de-noising.

Gang Wang, Hairun Wang, Yue Zhao, Longlong Zhao
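
The structural difference between wavelet decomposition and wavelet packet decomposition described in the abstract above can be shown in a few lines. The sketch below uses the Haar filter pair purely for simplicity; the abstract does not say which wavelet or thresholding rule the authors use, so the filter choice and all function names are assumptions.

```python
import math


def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_decompose(signal, levels):
    """Classic wavelet transform: only the approximation band is split again."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands  # `levels` detail bands plus one final approximation band


def wavelet_packet_decompose(signal, levels):
    """Wavelet packet transform: every band, low or high, is split again."""
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes  # 2**levels sub-bands of equal length
```

For an 8-sample signal and two levels, the wavelet transform yields bands of length 4, 2 and 2, while the packet transform yields four bands of length 2, so the high-frequency part is resolved as finely as the low-frequency part. A de-noising scheme would threshold the coefficients in these bands before reconstruction.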

Image and Video Analysis

Improved HSV-Based Gaussian Mixture Modeling for Moving Foreground Segmentation

Obtaining the moving foreground is crucial for a variety of video processing systems in complex scenes. An improved GMM-based method is developed that segments the moving foreground efficiently in real time. The Gaussian mixture model is improved to detect moving foreground objects effectively even when the object moves slowly. Relationships between components in HSV space are adopted to suppress shadows cast by moving objects, which overcomes the shortcoming of methods in the literature that need more parameters to remove shadows. Experimental comparisons show that the proposed method is computationally cost-effective and robust in segmenting the foreground.

Yepeng Guan, Jinhui Du, Changqi Zhang

An Improved Algorithm Based on IHS Transform for the Fusion of Object and Scene

The existing object-scene fusion method based on the IHS transform has two defects: a ghost of the scene image appears in the target area, and the illumination of the fused object is inconsistent. This paper proposes an improved algorithm that rearranges the hue and saturation components of the new scene image used in the inverse IHS transform. Subjective quality evaluation and objective quality assessment of comparative fusion experiments show that the new algorithm effectively removes both drawbacks of the existing method, achieves better results in object-scene fusion, and runs fast.

Youdong Ding, Xiaocheng Wei, Jianliang Lan

A New Method Used in Moving Vehicle Information Acquisition from Aerial Surveillance with a UAV

This paper proposes a framework for detecting and tracking multiple moving vehicles in aerial data captured by an unmanned aerial vehicle (UAV) equipped with a video camera. The system consists of two modules: moving vehicle detection and multi-vehicle tracking. First, moving vehicles are detected by clustering the singular points obtained after motion estimation. Then, a dedicated data structure is built to store the data of multiple vehicles and to track several vehicles in the shaky video sequence from the UAV. After the vehicles are tracked, their speeds are estimated and recorded as traffic flow information. Finally, the method is evaluated on real aerial data, and the experiments demonstrate the effectiveness of the approach.

Fuqiang Liu, Xiaofeng Liu, Pingting Luo, Yingqian Yang, Danqing Shi

A Robust Bus Detection and Recognition Method Based on 3D Model and LSD Method for Public Security Road Crossing Application

Bus detection and recognition in real transportation scenes is a fundamental task for public security road crossing applications. In this paper, a novel system is proposed to overcome two problems of current algorithms: high computational complexity and the difficulty of training a large set of 3D models. In the proposed system, the 3D model is built from the contour information of the vehicle itself, which makes the system more robust and practical. The line features of the vehicle are extracted using the LSD (line segment detector) method. Finally, the line features are matched with the 3D model using a combined matching algorithm that reduces the computational complexity of the matching process. Experiments on real videos show that the proposed method achieves a high recall ratio and a low fall-out ratio.

Wenqi Ma, Hua Yang, Yingkun Wang

Real-Time Human Intrusion Detection Using Audio-Visual Fusion

Human intrusion detection is widely used in intelligent video surveillance systems. It requires not only high accuracy but also real-time performance. In this paper, a real-time human intrusion detection algorithm is proposed to achieve a good trade-off between detection accuracy and real-time performance. First, a fast HOG-based human recognition scheme is designed, in which HOG features are used to increase detection accuracy and a spatial-temporal joint detection region shrinking method is developed to reduce the computational load. Second, because the recognition accuracy of HOG-based human detection drops markedly under occlusion, footstep recognition and a Bayesian network based video-audio fusion model are proposed to reach a joint decision, which further improves detection robustness. Experimental results show that, compared with existing methods, the proposed scheme achieves a better balance between time consumption and detection accuracy.

Defu Wang, Shibao Zheng, Chongyang Zhang

Interactive Image Segmentation Based on Grow Cut of Two Scale Graphs

This paper proposes a novel interactive image segmentation algorithm based on the Grow Cut of two graphs at different scales. First, a Watershed algorithm based on color information partitions the image into many regions, which serve as the cells of Grow Cut in place of image pixels. A segmentation result is then obtained by running Grow Cut on these regions. Finally, the edges of the segmentation result are corrected automatically by running Grow Cut on the pixel-scale graph. Because the number of nodes and edges for the Grow Cut algorithm is reduced by more than fifty times compared with the pixel-based method, the running time of the proposed algorithm is much lower than that of the original Grow Cut. The segmentation performance of the proposed algorithm is also much better than that of the original Grow Cut. Experimental results on the Berkeley image dataset demonstrate the effectiveness of the proposed method.

Xiaoqiang Li, Jingsong Chen, Huafu Fan

An Improved Video Retargeting Technique Based on Discontinuous Seam Carving

We introduce a new method for video retargeting based on discontinuous seam carving. Existing video retargeting methods based on seam carving have difficulty maintaining spatial and temporal coherence simultaneously. To overcome this limitation, the proposed method uses optical flow to help preserve temporal coherence. In addition, a new scheme is designed to select the initial seam for video retargeting, which helps preserve spatial coherence. Experimental results show that the proposed method significantly outperforms others in maintaining spatial and temporal coherence.

Yue Chen, Bo Yan, Bo Yang

Interactive Object Segmentation Using Graph Cut and Contour Refinement

This paper presents an interactive object segmentation approach using graph cut and contour refinement, which can accurately extract any user-interested objects from natural images. Using the user-specified scribbles as the interactive input, the initial object segmentation result is obtained under the framework of graph cut. However, due to the problem of color distribution in some images, in which the color distributions of foreground and background are similar, it is nontrivial to achieve an acceptable segmentation quality using one-shot graph cut. Then, an interactive contour refinement scheme is exploited to correct inaccurate object contours to meet the user’s requirement. Experimental results on a variety of images demonstrate the better segmentation performance of our approach.

Minghua Shen, Lin Zha, Zhi Liu, Shuhua Luo

Image Quality Assessment

Generalized Quality Assessment of Blocking Artifacts for Compressed Images

Yim and Bovik proposed a full-reference quality assessment method named PSNR-B. Based on PSNR, PSNR-B introduces a blocking effect factor (BEF) to measure the blocking effects of the test images, which makes it more effective than PSNR for assessing the quality of deblocked images. However, PSNR-B ignores possible blocking artifacts in the reference images and does not allow the test image and the reference to be interchanged. In these respects, we find that PSNR outperforms PSNR-B in assessing the quality of the test images. This paper aims to improve PSNR-B into a generalized metric for compressed images. We present an improved quality assessment method for blocking artifacts in compressed images. A blocking effect ratio (BER) factor is proposed to evaluate the differences in blocking effects between the test and the reference images. The proposed method integrates the merits of PSNR and PSNR-B, and simulations confirm its better performance.

Zhenxing Qian, Wenwen Wang, Xinpeng Zhang
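
The asymmetry of PSNR-B criticized in the abstract above can be made concrete with a small sketch. It assumes the commonly cited form of PSNR-B, in which a blocking effect factor computed on the test image alone is added to the MSE before taking the logarithm; the single 8x8 block size, the function names and the list-of-lists image format are simplifications for illustration, and the proposed BER factor itself is not reproduced here because the abstract does not define it.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized grayscale images."""
    n = len(ref) * len(ref[0])
    return sum((r - t) ** 2 for rr, tr in zip(ref, test) for r, t in zip(rr, tr)) / n


def psnr(ref, test, peak=255.0):
    err = mse(ref, test)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


def blocking_effect_factor(img, block=8):
    """Excess squared difference across block boundaries (simplified BEF)."""
    height, width = len(img), len(img[0])
    boundary, interior = [], []
    for y in range(height):
        for x in range(width - 1):  # horizontally adjacent pixel pairs
            d = (img[y][x] - img[y][x + 1]) ** 2
            (boundary if (x + 1) % block == 0 else interior).append(d)
    for y in range(height - 1):
        for x in range(width):  # vertically adjacent pixel pairs
            d = (img[y][x] - img[y + 1][x]) ** 2
            (boundary if (y + 1) % block == 0 else interior).append(d)
    if not boundary or not interior or min(height, width) < 2:
        return 0.0
    d_b = sum(boundary) / len(boundary)
    d_bc = sum(interior) / len(interior)
    eta = math.log2(block) / math.log2(min(height, width))
    return eta * (d_b - d_bc) if d_b > d_bc else 0.0


def psnr_b(ref, test, block=8, peak=255.0):
    """PSNR with the blocking effect factor of the test image added to the MSE."""
    err = mse(ref, test) + blocking_effect_factor(test, block)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)
```

Because the blocking term is measured only on the second argument, `psnr_b(a, b)` and `psnr_b(b, a)` generally differ, whereas `psnr` is symmetric. That is the interchange problem the abstract refers to.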

Visual Comfort Assessment Metric Based on Motion Features in Salient Motion Regions for Stereoscopic 3D Video

Visual comfort assessment for stereoscopic 3D video is of great importance for stereoscopic safety and health. To investigate the visual discomfort induced by motion features in salient motion regions, we propose a visual comfort assessment metric that focuses on pixel-level motion features in those regions. In our framework, pixel-level motion features are extracted using a point detector, the Kanade-Lucas-Tomasi (KLT) feature tracker, and a Salient Motion Depth Extraction (SMDE) approach. The motion features are pooled spatially and temporally to predict a visual comfort score. Subjective assessments were conducted to evaluate the proposed metric using natural stereoscopic videos. The experimental results demonstrate that the proposed visual comfort metric improves the correlation with subjective assessments.

Ye Bi, Jun Zhou

A Comparison of Testing Metrics between 3D LCD TV and 3D PDP TV

3D display technologies offer attractive solutions for enriching the multimedia experience. However, both characterization and comparison of 3D displays have been challenging when displays with similar specifications may appear quite different. This paper reviews and compares a series of measurement methods of luminance, turn-off ratio, contrast, cross-talk, angle cross-talk and some color measurements in various stereoscopic displays, including: LCDs with active shutter glasses, LCDs with polarized glasses and PDPs with polarized glasses.

Jing Zhang, Sumei Li, Lili Shen, Chunping Hou

Primary Quality Factor Estimation in Double Compressed JPEG Images Using Quantization Error

In this paper, we propose a new method to estimate the primary quality factor of a double JPEG compressed image using quantization error. The method includes three steps: requantize the DCT coefficients with a set of possible quantization matrices, compute the normalized mean square quantization error, and find the position of the smallest local minimum. Experimental results show that this method can estimate the primary quality factor effectively.

Yang Shuang, Fang Zhen

Measurement Algorithm for Image Structure Noise on Hardcopy

Image structure noise is sometimes a key factor influencing the image quality of hardcopies and is common in presswork produced with halftone technology. This paper describes an algorithm for measuring image structure noise, including moiré and the screen dot effect. To measure the degree of structure noise objectively, the spectrum of a colorful flat-area patch is mainly used to represent the structure noise of images, according to the subtraction principle and the frequency prediction algorithm of moiré. Using human evaluation and rating, a linear model of structure noise is fitted by regression on the moiré index and the screen dot index. Finally, the high correlation between human ratings and the metrics demonstrates the effectiveness of the model.

Cheng Xia, Tsuyoshi Saito, Li Song

Reduced Reference Image Quality Assessment Based on Image Statistics in Pixel Domain

In this paper, we propose a new reduced reference image quality assessment (RR IQA) algorithm based on image statistics. The image statistics are modeled in the pixel domain, based on the gradient distribution of the image. Compared with frequency domain coefficients, gradients are easier to calculate. The change of statistics in the gradient domain is measured to evaluate image distortion. To this end, we fit the marginal distribution of image gradients locally to the integrated Weibull distribution. The estimated model parameters are then extracted as the quality feature. We further propose a new RR IQA metric by quantifying the similarity between the original and the distorted quality features. Experimental results show that the proposed metric outperforms the well-known RR IQA metric and performs comparably to the widely used full reference IQA metric Peak Signal to Noise Ratio (PSNR).

Xiaolin Chen, Shibao Zheng, Rui Zhang

Text Image and Speech Processing

Text Detection from Natural Scene Images Using Scale Space Model

A scale space based approach is proposed to detect text in natural scene images with complicated backgrounds. An edge map containing the edge information of four directions is obtained with Sobel operators. Character areas are detected by connected component analysis and merged into candidate text regions. We construct an N-level scale space model and compute spatial responses to the Laplacian-of-Gaussian operator at these scale levels. The distribution of the strongest responses obtained from the scale space model is used to verify whether a candidate is a true text region. The experimental results demonstrate that the proposed method effectively filters out non-text regions and locates text regions in natural scene images with complicated backgrounds.

Qiaoyu Sun, Yue Lu

Text-Independent Writer Identification Using Texture Feature

This paper proposes an efficient method based on texture feature for text-independent writer identification. In order to extract texture feature, we use the modified 2-D Gabor filter, which can decompose the image into sub-bands with different frequencies and orientations. Nearest neighbor classifier based on weighted chi-square distance is utilized in classification. The experiments on a database containing 203 writers of address images demonstrate that the performance of our modified 2-D Gabor filter is better than that of the traditional 2-D Gabor filter and our proposed method achieves promising results.

Dongli Wang, Ying Wen, Yue Lu

Segmentation of Overlapped and Touching Handwritten Chinese Strings

In this paper, we present segmentation of handwritten Chinese strings in presence of overlapped and touching characters. A contour tracing based method is proposed to segment the overlapped characters. To segment touching characters, a corner point analysis method is carried out to identify the cutting positions. Experimental results on 564 Chinese character strings captured from postal mail pieces show the effectiveness of the proposed methods on the segmentation of handwritten Chinese character strings.

Lei Ai, Shujing Lu, Ying Wen, Yue Lu

Word Detecting in Document Image Based on Two-Stage Model

This paper proposes a word detection method for document images that uses character models and word models to evaluate single-character and between-character features. First, the text line is segmented into several fragments. Second, a candidate character, generated by merging consecutive fragments, is accepted if it conforms to the character models of the query word. Third, a path search strategy is used to search the candidate words constructed from candidate characters. The word model is used to compute the matching cost. Our experimental results on a dataset of document images demonstrate the effectiveness of the proposed method.

Xiujuan Li, Zhimin Huang, Ying Wen, Yue Lu

Competing Model Based Tone Evaluation for Mandarin Speech

Tone is a distinctive feature in Mandarin. This paper describes an attempt to automate tone evaluation for continuous Mandarin speech. A tone model based on HMM forced alignment is used to obtain a tone score for Mandarin syllables. A competing model based approach is introduced to obtain a tonal syllable score. Specifically, we generate the syllable-based competing models using an initial/final net based on simplified linguistic knowledge. To obtain a more objective tone evaluation, we integrate the tonal syllable score and the tone score to produce the overall tone scoring results. The experimental results demonstrate that the proposed competing model based method gives an accurate tone evaluation.

Yang Qu, Yue Lu, Patrick S. P. Wang, Xin He

Content Retrieval and Security

Line-Type Moving Object Detection for Sonar Images

This paper proposes a novel solution for processing sonar images. It uses an intensity Hough transform to find line-type moving objects in B-mode sonar images. Because objects in B-mode sonar images always have sufficiently high intensity and appear as local peaks, mathematical morphology is adopted to suppress noise and extract the peaks. The method operates on intensity images, unlike the standard Hough transform, which uses binary images. Intensity is accumulated in the accumulation space, and line-type moving objects are detected when the accumulated value exceeds a preset threshold. The approach suits a variety of underwater environments because it does not depend on a reverberation model. The experimental results illustrate the effectiveness and robustness of the solution.

Qiuyu Zhu, Yichun Li, Yilong Jiang
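
The intensity accumulation described in the abstract above differs from the standard Hough transform only in what each pixel adds to the accumulator. The sketch below is a minimal illustration under assumptions: the image is a list of rows of non-negative intensities, the angular resolution is one degree, and the morphological preprocessing step is omitted. The function names are hypothetical.

```python
import math


def intensity_hough(image, n_theta=180):
    """Accumulate pixel intensities (not binary votes) over (rho, theta) bins."""
    height, width = len(image), len(image[0])
    max_rho = int(math.ceil(math.hypot(height, width)))
    # Row index is rho shifted by max_rho so that negative rho values fit.
    acc = [[0.0] * n_theta for _ in range(2 * max_rho + 1)]
    for y in range(height):
        for x in range(width):
            value = image[y][x]
            if value <= 0:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
                acc[rho + max_rho][t] += value  # weight the vote by intensity
    return acc, max_rho


def detect_lines(image, threshold, n_theta=180):
    """Return (rho, theta in degrees) bins whose accumulated intensity exceeds threshold."""
    acc, max_rho = intensity_hough(image, n_theta)
    return [
        (r - max_rho, t * 180.0 / n_theta)
        for r, row in enumerate(acc)
        for t, total in enumerate(row)
        if total > threshold
    ]
```

A bright line and a faint line of the same length therefore produce different peak heights, so the threshold acts on accumulated echo strength rather than on pixel count. Neighboring angle bins can exceed the threshold together, so a practical detector would add peak suppression.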

Spatial Detection of Line Scratch Based on Histogram

This paper presents a histogram degradation model of line scratches. A line scratch compresses the histogram of the column it degrades, and this compression can be expressed through the histogram’s key features. Based on the histogram model, a histogram-based approach to line scratch detection is proposed. By combining the scratch histogram model with a series of constraints such as Weber’s Law, width selection and symmetry selection, the proposed algorithm performs well automatically. The experimental results show that the proposed algorithm rejects false alarms better and has low computational complexity.

Xi Hu, Xiaokang Yang, Li Chen

An Efficient Isolation Method for Contextual Object Detection

Recent object detection systems utilize contextual information to boost recognition performance. A state-of-the-art contextual object detection method [7] adopts a structural model with greedy forward search inference algorithm. In this paper, we propose an isolation method for contextual object detection. It decomposes a complicated structural learning problem into several “local” ones, which can be efficiently solved by standard SVMs, to boost the speed of training and inference processes. Moreover, such isolation can readily deal with additional real-valued features to further improve the performance. The experimental results on PASCAL VOC 2007 dataset demonstrate the superiority of our method relative to other state-of-the-art ones both in computational cost and detection accuracy.

Yukun Zhu, Jun Zhu, Rui Zhang

A Novel and Adaptive Method for Image Search Reranking

In this paper, we propose a novel and adaptive method for image search reranking. We first evaluate different visual features based on the results of image classification on objects and scenes separately. Visual features are then chosen adaptively to rerank the initial image search result. A given query is classified as either object or scene using a classifier trained on text features. Low-level visual features are then adaptively selected and fused for image search reranking. Experimental results on the large-scale WebQueries image dataset demonstrate the efficacy of the proposed method.

Hong Lu, Guobao Jiang, Zhuohong Cai, Xiangyang Xue

Video Structured Description—Vitalization Techniques for the Surveillance Video Data

This paper proposes video structured description for video surveillance systems and introduces a prototype system. As a novel video data vitalization technique, it employs a knowledge model of a given surveillance scenario together with image/video understanding to distill visual knowledge from video. Thanks to knowledge modeling, the visual information distilled by the proposed method can be accessed by other information systems much more easily than before. A technical overview is given, and some key characteristics are discussed.

Lin Mei, Xuan Cai, Hongzhou Zhang, Chuanping Hu, Li Qi

Vehicle Trajectory Description for Traffic Events Detection

The trajectory of a moving object is a significant feature for event detection in intelligent video surveillance. In this paper, a novel method of trajectory description is proposed to establish a semantic model for automatic detection of traffic violation events. First, using polynomial fitting, we classify a trajectory into one of two shapes, straight line or parabola, which is used to determine the vehicle’s route type: straight, left/right-turn, or U-turn. In addition, a region description scheme is developed to identify the path that a vehicle has passed through, which can serve as evidence for traffic event decisions. Experimental results show that the proposed scheme is more efficient and more accurate than the traditional MPEG-7 method.

Chao Yu, Chongyang Zhang, Guang Tian, Longfei Liang
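
The shape classification by polynomial fitting mentioned in the abstract above can be sketched as follows. The abstract does not give the decision rule, so this is one plausible reading under stated assumptions: a first-degree least-squares fit is tried, and the trajectory is called a parabola when the residual of that fit exceeds a tolerance. The function names, the tolerance value and the assumption that the trajectory is not vertical in image coordinates are all hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; swap axes before fitting")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def classify_shape(points, tolerance=1.0):
    """Label a trajectory by how well a straight line explains it."""
    a, b = fit_line(points)
    # Root-mean-square vertical distance of the points from the fitted line.
    rms = (sum((y - (a * x + b)) ** 2 for x, y in points) / len(points)) ** 0.5
    return "straight line" if rms <= tolerance else "parabola"
```

In a full system the label would then be combined with the entry and exit regions of the trajectory to decide between straight, turning and U-turn routes.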

An Automatic TV LOGO Detecting Method

This paper outlines a simple and practical technique for detecting the logos that identify a broadcast station in TV programmes. Traditional detection algorithms such as feature extraction and template matching have been used in many applications, but they cannot recognize transparent logos effectively. More sophisticated learning-based methods have addressed this issue, but they typically involve very high computational complexity. We present an automatic TV logo detection method based on the statistical properties of video sequences. Unlike other approaches, a transparency factor is introduced for the first time in this paper. It reflects the level of TV logo transparency and is an indispensable part of the logo information. Combined with the statistical properties of video sequences, the logo information can be obtained and detected clearly. Experimental results show that the proposed method is not only simple but also performs well.

Tao Fan, Jianhua Peng, Haiwu Zhao, Guozhong Wang

Source Coding

Quantization Matrix Coding for High Efficiency Video Coding

The quantization matrix (QM) has been adopted in image coding standards such as JPEG and JPEG-2000, as well as in video standards such as MPEG2, MPEG4 and H.264/AVC. A QM can improve subjective quality by weighting different frequency coefficients differently. In the latest high efficiency video coding (HEVC) standard, quantization block sizes can go up to 32x32. Applying frequency weighting to HEVC therefore requires QMs of multiple sizes (4x4, 8x8, 16x16 and 32x32), and the bits needed to signal these matrices result in a huge overhead. In this paper, a predictive coding method for the quantization matrix is proposed, which significantly reduces the bit consumption for QMs. Experimental results show that the proposed method is about 28 times more efficient (96.4% bit saving) than the quantization matrix compression method used in H.264/AVC. Moreover, the proposed method adds only negligible complexity to the encoder and decoder.

Yijun Mo, Jiaji Xiong, Jianwen Chen, Feng Xu
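
The two efficiency figures quoted in the abstract above describe the same result. As a quick check, writing $R_{\text{ref}}$ for the bits spent by the H.264/AVC method and $R_{\text{new}}$ for the bits spent by the proposed method (both symbols are introduced here for illustration only):

```latex
\[
R_{\text{new}} = (1 - 0.964)\,R_{\text{ref}} = 0.036\,R_{\text{ref}},
\qquad
\frac{R_{\text{ref}}}{R_{\text{new}}} = \frac{1}{0.036} \approx 27.8 \approx 28 .
\]
```

A 96.4% bit saving therefore corresponds to roughly a 28-fold reduction in the bits needed to signal the quantization matrices.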

A Framework of Building Complexity Scalable and Cost-Effective Algorithms for HEVC

More and more inter-frame technologies are adopted in the latest video coding standard, HEVC, with the aim of improving coding efficiency, but they also greatly increase encoder complexity. For application scenarios with constrained and varying computing resources, e.g. portable devices or real-time visual communications, degradation of rate-distortion (R-D) performance is inevitable. Our main concern is how to maximize R-D performance under constrained and varying computing resources. In this paper, we provide a new approach to building a complexity-scalable and cost-effective framework consisting of three hierarchical levels.

Huang Li, Jiyuan Lu, Hongyang Chao

Deblocking Filtering Based on View Correlation for Multiview Video Decoding

A well-designed deblocking filtering scheme should strike a tradeoff between coding efficiency and coding complexity. However, the existing multiview video deblocking filtering algorithm adopts the H.264 deblocking scheme, which improves video quality at high coding complexity and does not consider view correlation. In this paper, we propose a low-complexity deblocking filtering algorithm based on motion skip mode. It uses view correlation to reduce the time needed to compute the boundary strength (BS) of a macroblock (MB) when the coded macroblock is in motion skip mode. Experimental results show that the proposed algorithm significantly reduces decoding time without any perceptual quality loss compared with the deblocking filter in JMVC.

Yongfang Wang, Wei Zhang, Ping An, Zhaoyang Zhang

Compressive Sensing Image Coding with Perceptual Weighting Measuring Matrix

Compressive sensing is a new technology that combines data sampling with compression. Many applications of compressive sensing in image processing and computer vision are being explored. In this paper, we propose a compressive sensing image coding scheme with a weighting measuring matrix based on just noticeable distortion, in which image coefficients are adaptively weighted according to their visual significance. Simulation results demonstrate that the proposed method can greatly improve the quality of the reconstructed image compared with the existing algorithm.

Yundong Song, Yongfang Wang, Xiwu Shang, Zhaoyang Zhang

Error Concealment for Whole Frame Loss in HEVC

The next-generation video coding standard, High Efficiency Video Coding (HEVC), achieves highly efficient compression by using variable-size coding blocks based on a quad-tree structure. However, when an HEVC stream is transmitted over a network, burst packet loss may lead to the loss of a whole frame. Based on an analysis of the texture features of video, this paper proposes motion vector (MV) extrapolation based on variable-size coding blocks to recover the lost frame quickly and effectively. Experimental results show that the proposed method effectively reduces computational complexity compared with traditional error concealment algorithms at similar PSNR.

Chang Liu, Ran Ma, Zhaoyang Zhang

Error Concealment for B-View in Multi-View Video

As one of the effective prediction structures for multi-view video (MVV), the Joint Multi-view Video Model (JMVM) has introduced many more prediction structures to reduce inter-view correlations as well as spatio-temporal correlations. This makes compressed bit-streams very sensitive to transmission errors and easily leads to error propagation. To counter error propagation, JMVM-based error concealment (EC) algorithms are necessary. In this paper, a fast and efficient B-view based error concealment algorithm for MVV is proposed, which takes full advantage of the characteristics of the hierarchical B prediction structure and adjusts the search range of the B-view in neighboring views. Experiments show that the proposed algorithm effectively reduces computational complexity without reducing video quality.

Ran Ma, Deyang Liu, Liang Liang, Ping An

Texture and Correlation Based Fast Intra Prediction Algorithm for HEVC

As the newest video coding standard, HEVC adopts many state-of-the-art techniques to improve coding efficiency, and its performance is therefore better than that of all existing standards. However, these modifications also introduce huge computational complexity. To solve this problem, we propose two fast intra prediction algorithms. By judging the texture complexity of a coding unit (CU), a fast CU size decision algorithm removes some unnecessary CU sizes. In addition, based on the statistical characteristics of intra prediction modes and the correlation of adjacent CUs, a fast intra prediction mode decision algorithm can directly select the optimal intra prediction mode or remove some candidate modes. Compared with the exhaustive intra mode decision method in HEVC, our fast algorithms effectively save coding time, while the decrease in coding quality is negligible.

Wenqiang Zhao, Liquan Shen, Zhiming Cao, Zhaoyang Zhang

A Fast Intra Prediction Algorithm for HEVC

A new fast intra-prediction algorithm for HEVC is proposed in this paper. The algorithm makes full use of spatial correlations in video sequences: it decides the best CU size by referring to the sizes of neighboring coded CUs, and the best PU prediction mode by referring to the upper-layer PU modes. The experimental results show that the algorithm achieves an average 45.2% reduction in intra coding time compared with the default encoding scheme in HM2.0, with a PSNR drop of 0.05 dB and a bit rate increase of 0.10%.

Yilong Cheng, Guowei Teng, Xuli Shi, Huosheng Li

Multimedia Communication

A Review of Information Content Metric for Semantic Similarity

The Information Content (IC) of a concept has long been a hot topic. It is an important dimension in assessing the semantic similarity between two concepts or word senses, and much work has been done on it. This paper illustrates the use of IC in semantic similarity computing and then focuses on IC metrics. It reviews and analyses corpus-dependent and corpus-independent IC approaches. Hyponym-based, leaves-based and relation-based IC metrics are presented in turn. Important related issues are highlighted. Finally, further research directions for improving IC are outlined.

Lingling Meng, Junzhong Gu, Zili Zhou

Training Sample Acquisition Strategy Based Digital Pre-distortion

Digital pre-distortion (PD) is a promising technique for power amplifier (PA) linearization and efficiency enhancement. Parameter estimation for both the PA and PD models is important in all pre-distortion schemes. In this paper we introduce the concept of RCBL (Region Can Be Linearized) and highlight the differences between the training sample acquisition strategies (TSAS) of three different pre-distortion schemes. We prove that improper samples may lead to inaccurate coefficients, or even to errors. We therefore propose to use the PA modeling and inverse structure as our scheme for DPD construction. A 5 MHz WiMAX signal sampled from a real PA is used for loop delay estimation, PA model identification and PD construction. Simulation results show that the AM/AM and AM/PM characteristics are much improved and that an ACPR (Adjacent Channel Power Ratio) improvement of more than 15 dB is achieved.

Biao Peng, Lin Gui, Bo Liu

Stream Maximization Transmission for MIMO System with Limited Feedback Unitary Precoding

Limited feedback precoding (LFP) significantly improves multiple-input multiple-output (MIMO) link reliability with a small amount of feedback from the receiver to the transmitter. One of the key problems in LFP is the criterion for selecting the optimal precoder index (PMI) from a codebook. In this paper, we find that the conventional PMI selection criteria under stream maximization transmission (SMT) are ineffective when linear equalizers are employed at the receiver. Inspired by this finding, we propose a new singular value decomposition (SVD) based criterion that enables PMI selection for SMT and is applicable to both zero-forcing (ZF) and minimum mean square error (MMSE) equalizers. Extensive numerical results are provided to substantiate the effectiveness of the proposed criterion.

Qiu Chen, Chen Peng, Gui Lin

The Application of Digital Elevation Model in the Coverage Network Planning Software

This paper introduces a method for analyzing digital elevation model maps in the GeoTIFF format. Through data reading, coordinate transformation and numerical interpolation of the map, the method converts the map to a new format that can be processed efficiently in broadcasting network planning software. This enables accurate prediction of coverage field strength, with radio wave diffraction correction based on the digital elevation model.

Ning Ding, Yi-he Dai

A Novel Dynamic Spectrum Management and Sharing Approach for the Secondary Networks

To improve spectrum utilization, we propose novel spectrum management and sharing models in this paper. Specifically, the concepts of a spectrum database and the anxiety level of idle spectrum resources are introduced in the dynamic spectrum management model. Accordingly, the QoS requirements are formulated in the spectrum sharing model. In addition, we introduce a multi-hop spectrum relay service scheme that decreases spectrum sharing complexity in applications and can be regarded as more practical and scalable. To improve spectrum utilization and revenue, the spectrum sharing strategy is combined with the anxiety and matching levels of idle spectrum. The simulation results validate that our approach achieves a high level of performance, especially in spectrum sharing revenue.

Haibo Zhou, Bo Liu, Lin Gui, Yongfei Ding, Jianmin Wu, Lifeng Wang

Wireless Multimedia Communication Requirements for Police and PDT+LTE+3G Solution

The demand of police departments for private wireless multimedia communications (WMC) is increasing for applications such as dispatching, disaster relief, rescue, incident response, traffic management and routine police patrols. In this paper, we describe the requirements of wireless multimedia communication for public security applications, analyze the differences between public and private networks, and put forward a PDT (Police Digital Trunking) + LTE (Long Term Evolution) + 3G hybrid network solution. This paper introduces only the network framework of the solution; the technical details of its engineering are outside the scope of the paper.

Qianjin Tang, Xiaoming Xu

Modeling and Simulation of Multi-hop Time Synchronization for Underwater Acoustic Networks Based on OPNET

Time synchronization is important for Underwater Acoustic Networks (UANs) to achieve precise scheduling, localization and low energy consumption. Although time synchronization for UANs has been studied for years, the corresponding simulation has not been researched in detail. UAN simulation faces some special challenges, such as an accurate channel model and the complex state machine of the UAN synchronization algorithm. This paper provides details of the implementation of a multi-hop time synchronization scheme for Underwater Acoustic Networks (MSUAN) based on OPNET. By modifying the Propdel-Stage, Power-Stage and Bkgnoise-Stage pipeline stages, we make our scheme applicable to the underwater channel, and we design a project to evaluate the synchronization performance of MSUAN.

Chuan Sun, Feng Yang, Lianghui Ding, Liang Qian, Cheng Zhi

A Novel Queue Scheduling Scheme for Video Transmission over IEEE 802.11e WLAN

With the widespread application of mass media and the rapid development of wireless networks, much attention has been paid to wireless video transmission. In this paper, we consider video transmission in IEEE 802.11e WLAN and propose an adaptive cross-layer scheduling scheme to reduce video distortion. The scheme is named Relative Minimum Distortion Buffer Management (RMDBM). In RMDBM, network loads and service rates are taken into account to realize cross-layer dynamic queue access. We evaluate the cross-layer scheduling in ns-2, and simulation results demonstrate a significant performance gain for the cross-layer scheduling algorithm. Under different network loads, our proposed scheduling algorithm achieves lower video distortion than pure 802.11e EDCA, a static mapping scheme and a conventional adaptive scheme. In the presence of non-video traffic, the proposed scheduling scheme effectively reduces the negative impact of non-video stream data on video transmission performance.

Dong Shao, Lianghui Ding, Feng Yang, Liang Qian, Xiangzhong Fang

New Advances in Broadband Multimedia

Incentive-Based Bandwidth Auction for Scalable Streaming in Peer-to-Peer Networks

This paper proposes an incentive-based bandwidth allocation for scalable streaming in P2P networks. It is modeled as decentralized dynamic auction games, in which the peers sell and bid the upload bandwidth for the maximum individual benefit. With the unique characteristic of SVC, the content-aware prioritization of SVC is imposed on the underlying bandwidth auction in order to avoid bandwidth wastage and improve the overall video quality at all peers. Also, an incentive mechanism is employed to resolve


issue. Finally, the performance of the proposed scheme is verified by the simulation results.

Lin Chen, Junni Zou

A Client-Driven Selective Streaming System for Multi-view Video Transmission

This paper presents a client-driven selective streaming system for multi-view video transmission. To minimize the total video distortion of all clients, an optimal rate allocation algorithm is proposed in which the views are delivered based on client selections as well as network conditions. To achieve a compromise between compression efficiency and view random access, the KS-IPP encoding structure is employed, in which inter-view prediction is performed only for anchor frames. In addition, a view and frame streaming priority calculation method is proposed, by which both frames and views can be dropped according to their streaming priority when the bandwidth is insufficient. Finally, the performance of the proposed scheme is verified by simulation results.

Zheng Shi, Junni Zou

Clear Browser: A Dedicated Embedded Browser for Quality-Affordable Smart TVs

In recent and coming years, the Smart TVs commonly used in most families are likely to be quality-affordable. This kind of Smart TV has the following features: low-performance mainboard, large TV screen, diversified application support and high user-experience demand. As one of the most important pieces of fundamental software, the browser faces great challenges. We designed a dedicated embedded browser named Clear Browser and proposed a methodology named SDCB to fit the features of the quality-affordable Smart TV. SDCB consists of four steps: tailor the browser’s engineers; set up a proxy server for webpages; build a feedback control module; and discard secondary information. The experimental results show that SDCB can help the browser present webpages in a friendly way on a large screen with limited resources.

Yuan Liu, Dejian Ye

Model-Based Robust Prediction of Cumulative Participant Curve in Large-Scale Events

In this paper, we propose a robust piecewise parametric model for predicting the cumulative number of participants in large-scale events. Based on an analysis of arrival patterns in such events, we establish parametric models for different periods and design an efficient fitting strategy to obtain model parameters from incomplete current data. Moreover, based on historical data, we train parameters with a neural network and obtain a relation prior among parameters and data. With the help of the relation prior, we can update the parameters for the current data and achieve robust prediction in the presence of outliers. Simulation results on the Expo 2010 Shanghai database show the good performance of our proposed method even in abnormal situations.

Dixin Luo, Rong Xie, Jia Wang, Wenjun Zhang

Tourists Flow Prediction by Clustering-Based GRNN

A new tourist flow prediction algorithm based on a clustering-based generalized regression neural network (GRNN) is proposed in this paper. To analyze tourists’ behavior, we use the clustering-based GRNN method to estimate the entering rate of each pavilion in Zone D of the Shanghai Expo site. Extensive experimental results show that the proposed algorithm outperforms other neural network prediction methods, such as the back propagation (BP) method, in efficiency and accuracy.

Yuting Hu, Rong Xie, Wenjun Zhang

Collision Decoding to Improve the Throughput of OFDM-Based WLAN

When a network is congested, packet collisions occur and lead to a severe decrease in throughput. In this paper we propose a novel collision decoding method that decodes collided packets and thus increases network throughput. Our method is specifically designed for OFDM-based WLANs. IFFT/FFT, channel influence and packet collision effects are formulated as linear processes. Taking into account network features such as random jitter and retransmission, a collision can be seen as a group of linear equations and solved. Simulation results show that adopting the collision decoding method improves network throughput significantly, especially when the network is congested.
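The idea in the abstract above, that repeated collisions of the same packets form a solvable set of linear equations, can be illustrated with a toy example. This is only a sketch under strong assumptions and not the authors' formulation: it considers a single subcarrier, two colliding symbols, two collisions (for instance an original transmission and a retransmission) with known and different channel gains, and no noise. The function `solve_2x2` and all numeric values are hypothetical.

```python
def solve_2x2(h, y):
    """Solve h @ x = y for a 2x2 complex system using Cramer's rule."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        # Both collisions carry the same mixture, so they cannot be separated.
        raise ValueError("collisions are linearly dependent")
    x1 = (y[0] * d - b * y[1]) / det
    x2 = (a * y[1] - c * y[0]) / det
    return x1, x2


# Hypothetical symbols sent by two stations on one subcarrier.
s1, s2 = 1 + 1j, -1 + 1j

# Hypothetical channel gains: row k holds the gains seen in collision k.
h = [[0.9 + 0.1j, 0.4 - 0.3j],
     [0.5 + 0.2j, 0.8 - 0.1j]]

# Each received sample is a linear mix of both symbols (noise-free toy case).
y = [h[0][0] * s1 + h[0][1] * s2,
     h[1][0] * s1 + h[1][1] * s2]

r1, r2 = solve_2x2(h, y)
print(abs(r1 - s1) < 1e-9, abs(r2 - s2) < 1e-9)
```

With noise or more than two colliding packets, a least-squares solve over a larger system would be the natural extension of this sketch.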

Jingye Cao, Feng Yang, Lianghui Ding, Liang Qian, Cheng Zhi

3D Webpage Rendering by Canvas

A novel method is proposed to build stereoscopic 3D webpages. This method can bring 3D web experience to 3DTV users with web-based IPTV services. In this method, we use a new HTML5 element, canvas, to accomplish easy downsampling of shapes, images and text. A new JavaScript library is built to accelerate the development process for web designers. A demo webpage with animation is presented to illustrate the capability and potential of this method.

Shaobo Zhang, Jun Zhou, Jun Sun

Context Based Universal Service Recommendation Algorithm

With the growing number of services, selecting high-quality services that meet user preferences has become a key issue. In this paper, we use a context-based universal service recommendation (CBUSR) approach to address this problem. The algorithm proposes a new calculation equation based on context factors such as location and price, and combines these factors to recommend services. Using the CBUSR algorithm, we can select high-quality services that meet user preferences. Experimental results provide strong supporting evidence.

Zhao Zheng-de, Xu Ning, Zhan Hai-ming

Human Computer Interface

Pointing Hand Distinction by Improved HOG and Wavelet Multi-scale Transform

To improve the accuracy of distinguishing left hands from right hands with a single fixed camera, this paper proposes a new method. The method is based on a combination of improved histograms of oriented gradients (HOG) and the wavelet multi-scale transform (WMT). Using the multi-scale transform to extract image edge information allows the global edges of the image to be described well. The HOG descriptor is improved to simplify the calculation procedure, and we use it to extract the detailed features of the image edges. A support vector machine is used to classify left pointing hands, right pointing hands and negative samples. Experimental results demonstrate that the proposed method increases the accuracy rate by 23.1% compared with the edge orientation histogram. It is also less time-consuming than HOG while achieving higher performance.

Yongjie Shi, Yepeng Guan, Jinhui Du

Ontology–Based Personalized Service Discovery Algorithm

To address the shortcomings of current web service discovery algorithms, such as poor matching and difficulty in meeting users' personalized needs, this paper presents a personalized semantic web service discovery algorithm. Based on OWL-S semantic web service theory, we build a personalized domain ontology and, according to its characteristics, derive three semantic distance calculation methods by gradually adding semantic factors. On this basis, we design three personalized semantic web service discovery algorithms and apply them to an Online Shopping Mall system. Experimental results show that algorithm 3 achieves better service discovery results.

Zheng-de Zhao, Hai-ming Zhan, Ning Xu

A Circle Detection Method Based on Optimal Parameter Statistics in Embedded Vision

In this paper, we propose a circle detection method based on optimal parameter statistics (OPSCD). Our method employs a Canny edge detection algorithm based on fast median filtering (FMFCanny) to obtain edge information. In addition, real-time three-point circle determination is implemented in an FPGA circuit that makes use of the optimal parameter statistics. In this algorithm, FIFO pipeline processing and parallel register operations in the FPGA detect a single circle in videos. Experimental results show that the proposed method is more accurate and robust than the previous algorithm.
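The three-point circle determination mentioned in the abstract above rests on the fact that three non-collinear points define exactly one circle. The sketch below shows that geometric step in software using the standard circumcenter formula. It is an illustration only: the paper implements the step in FPGA hardware and combines it with parameter statistics that are not reproduced here, and the function name is hypothetical.

```python
import math


def circle_from_three_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points,
    or None if the points are (nearly) collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Twice the signed area of the triangle; zero means collinear points.
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    r = math.hypot(x1 - cx, y1 - cy)
    return cx, cy, r


# Three edge points taken from a circle centred at (2, 3) with radius 5.
print(circle_from_three_points((7, 3), (2, 8), (-3, 3)))  # (2.0, 3.0, 5.0)
```

In a detector, many such triples of edge points would each vote for a candidate (cx, cy, r), and the most frequent parameters would be kept; how the paper accumulates these statistics is not described in the abstract.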

Xiaofeng Lu, Xiangwei Li, Sumin Shen, Kang He, Songyu Yu

Design and Implementation of an Embedded Face Recognition System on DM642

In this paper, an embedded system based on the TMS320DM642 DSP is designed to implement face recognition in a real environment. An AdaBoost-based face detection algorithm using Haar features is designed to detect the face. After an active face is detected, cubic interpolation is employed to scale the facial image to the predefined size, and histogram equalization is performed to enhance the contrast of the facial image. An embedded hidden Markov model with seven super states and a total of 36 embedded states is constructed to recognize the detected face. The hardware framework and software design are also illustrated, and experiments based on a simulator and the designed hardware platform are performed. The results show that the proposed system achieves an 83% recognition rate under normal lighting conditions and meets the requirements of real-environment applications.
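The histogram equalization step named in the abstract above can be sketched as follows. This is the textbook CDF-based mapping for 8-bit grayscale values, written in plain Python for readability; it is not the authors' DSP implementation, and the function name and sample pixel values are hypothetical.

```python
def equalize_histogram(pixels, levels=256):
    """Map grayscale values so their cumulative distribution becomes
    roughly linear, which stretches the contrast of the image."""
    if not pixels:
        return []
    # Histogram of the input intensities.
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    # Cumulative distribution function (as counts).
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: there is no contrast to stretch.
        return list(pixels)
    # Lookup table from old to new intensity.
    lut = [round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1))
           for c in cdf]
    return [lut[v] for v in pixels]


# A low-contrast patch occupying only intensities 100..103.
patch = [100, 100, 101, 102, 103, 103]
stretched = equalize_histogram(patch)
print(min(stretched), max(stretched))  # 0 255
```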

Jianling Hu, Lei Shao, Honglong Cao

3D Video

A Local Stereo Matching Algorithm Based on Region Growing

Stereo matching is an important part of stereo vision. Because traditional matching algorithms have difficulty satisfying both accuracy and speed requirements, a novel local stereo matching algorithm is presented in this paper. First, an initial disparity estimate is obtained by matching with a dynamic window produced by a region growing algorithm based on a color constraint. In addition, a simple but efficient method is proposed to further improve matching accuracy without additional computational work. Experimental results show that the presented algorithm not only obtains a more accurate disparity map in repetitive areas and at depth discontinuities but also meets real-time requirements.

Pei Wang, Fang Wu

A Key-Frame Selection Method for Semi-automatic 2D-to-3D Conversion

During 2D to 3D conversion, key-frame selection is a very important step as it can directly affect the visual quality of the 3D video. In this paper, a novel key-frame selection method for 2D-to-3D conversion is presented to get fewer errors and much better photorealistic perception. Firstly, the occlusion areas between two consecutive frames are detected and SURF-feature points of the frames are extracted. Secondly, the ratio of feature points to the correspondence is calculated, which is used to select the key-frame candidates. Finally, camera projection matrix in the projective space is computed for every key-frame candidate, and the key-frame candidate that has the least re-projection error is selected as the key-frame. Experimental results show that the propagated depth maps using the proposed method have fewer errors, which is beneficial to generate high quality stereoscopic video.

Jiande Sun, Jiangchuan Xie, Jing Li, Wei Liu

Relationship between Max Stereo Angle and Number of Views in Multi-view Stereo Acquisition/Display System

This paper establishes the model of multi-view stereo acquisition/display system and proposes the relational model between max stereo angle and number of views. Based on the simulation experiments, the relationship between max stereo angle and number of views can be analyzed qualitatively. Through the results of simulation experiments some characteristics of multi-view acquisition/display system can be found out. They can provide a theoretical basis for building and overall optimizing multi-view acquisition/display system.

Hao Cheng, Ping An, Zhaoyang Zhang

A New Method of DIBR Based on Background Inpainting

Depth image based rendering (DIBR) is a key technology in stereo video systems for generating virtual view images. Filling holes is a difficult task, especially without view blending and with a wide baseline. This paper proposes a new DIBR method that combines background estimation and image inpainting. First, an effective division rule is presented to distinguish background from foreground in hole areas, and the estimated background is used to fill the background holes. Finally, image inpainting is used to fill the remaining holes. Compared with other methods, our method can use real pixels of the scene to fill the background holes and greatly improves virtual view quality.

Kui Wang, Ping An, Hao Cheng, Hejian Li, Zhaoyang Zhang

Edge-Based Algorithm for Multi-view Depth Map Generation

Normalized Cross-Correlation (NCC) is a common matching measure that is insensitive to radiometric differences between stereo images. However, traditional rectangle-based NCC tends to expand depth discontinuities. An efficient edge-based algorithm with NCC for multi-view depth map generation is proposed in this paper, which preserves depth discontinuities while retaining robustness to radiometric differences. In addition, all pixels of the initial result are classified as uncovered, occluded, reliable or unreliable by exploiting the Left-Right Consistency (LRC) constraint and a sequential consistency constraint. Since a voting scheme leads to errors when match windows lack reliable information, and a joint-trilateral filter blurs the depth map when a fixed window size is used, especially at depth discontinuities, we combine the voting scheme and the joint-trilateral filter to get a better result. The experimental results show that our method achieves competitive performance.
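The insensitivity of NCC to radiometric differences, which the abstract above relies on, can be checked numerically. The sketch below uses the zero-mean form of NCC on flattened patches, which is invariant to both gain and offset changes; whether the paper uses exactly this variant is an assumption, and the function name and patch values are hypothetical.

```python
import math


def zero_mean_ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length patches,
    in the range [-1, 1]. Returns 0.0 if either patch is constant."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0


# A patch from one view, and the same patch as seen by a camera with
# a different gain (x2) and brightness offset (+15).
left = [10, 20, 30, 40, 25]
right = [2 * v + 15 for v in left]

print(round(zero_mean_ncc(left, right), 6))  # 1.0
```

A plain sum of absolute differences on the same two patches would be large, which is why NCC is preferred when the cameras are not radiometrically calibrated.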

Yifan Zuo, Ping An, Zhaoyang Zhang

Real-Time Rendering Based on GPU for Binocular Stereo System

In a depth-based binocular real-time stereo system, accurately and quickly filling the big holes generated by DIBR is a key problem. This paper presents a real-time rendering algorithm based on GPU. A hole mask image is rendered at the sending end to obtain the corresponding texture information for big holes, which can then be used to fill the big holes at the receiving end. Small holes are filled by a linear interpolation algorithm. The experiments show that the proposed method achieves a better hole filling effect. To meet the real-time requirement, GPU acceleration is applied. First, pixels are projected to 3D space using CUDA; second, these pixels are inversely projected to the 2D plane using OpenGL. Running on an NVIDIA Quadro 600 GPU at resolutions of 1024 × 768 and 1920 × 1080, the proposed method reaches about 45 fps and 28 fps respectively, achieving real-time performance with better quality.
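The small-hole filling by linear interpolation mentioned in the abstract above can be sketched for a single scanline. This is a CPU illustration only, assuming holes are marked with `None` and interpolated horizontally between the nearest valid neighbours; the paper's GPU implementation and its exact interpolation direction are not described in the abstract, and the function name is hypothetical.

```python
def fill_small_holes(row):
    """Fill runs of None in a scanline by linear interpolation between
    the nearest valid pixels; runs touching a border copy the one
    valid neighbour they have."""
    out = list(row)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        left, right = start - 1, i  # indices of valid neighbours, if any
        for k in range(start, i):
            if left >= 0 and right < n:
                t = (k - left) / (right - left)
                out[k] = out[left] + (out[right] - out[left]) * t
            elif left >= 0:
                out[k] = out[left]
            elif right < n:
                out[k] = out[right]
            # A row with no valid pixel at all is left unchanged.
    return out


filled = fill_small_holes([10, None, None, 40, None])
print(filled)
```

For the sample row, the two interior holes receive values one third and two thirds of the way from 10 to 40, and the trailing hole copies its only neighbour, 40.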

Zhuan Zheng, Ping An, Bing Zhao, Zhaoyang Zhang

New HVS-Based Stereoscopic Image Watermarking Algorithm for 3D Media

In this paper, a human visual system (HVS) -based stereoscopic image watermarking algorithm is proposed to protect copyright of three dimensional (3D) media. The proposed algorithm makes use of the features of stereo images. Watermark is embedded in similar areas selected by global disparity. Every bit of watermark is embedded in similar blocks by exploiting relationship embedding. Similar blocks usually have the same change trend when two viewpoints are under the same attack. Experimental results show the effectiveness of the proposed algorithm.

Chunhua Bai, Mei Yu, Gangyi Jiang, Zhongju Peng, Feng Shao

Model of Relationship among Views Number, Stereo Resolution and Max Stereo Angle for Multi-view Acquisition/Stereo Display System

In a multi-view acquisition/stereo display system, the stereo angle, the number of views and the stereo image resolution (referred to here as the three factors) are very important, since they affect the performance of the system. This paper introduces the stereo angle and explores a model of the relationship among the number of views, stereo resolution and max stereo angle. Based on the model, the relationship among these three factors is shown. From the results of simulation experiments, we obtain overall optimization methods for multi-view acquisition/stereo display systems, and the model provides a theoretical basis for building and optimizing such systems.

Hao Cheng, Ping An, Zhaoyang Zhang

