About this book

This book provides a novel method for topic detection and classification in social networks. It addresses several research and technical challenges currently being investigated by the research community: the analysis of relations and communications between members of a community; the quality, authority, relevance and timeliness of content; traffic prediction based on media consumption; spam detection; and security, privacy and protection of personal information. The book also discusses innovative techniques to address those challenges and provides novel solutions based on information theory, sequence analysis and combinatorics, which are applied to real data obtained from Twitter.

Table of Contents

Frontmatter

Chapter 1. Introduction

Social networks have undergone dramatic growth in recent years. Such networks provide a well-suited space for instantly sharing multimedia information between individuals and their neighbours in the social graph. Social networks provide a powerful reflection of the structure and dynamics of society and of the interaction of the Internet generation with both people and technology. Indeed, the dramatic growth of social multimedia and user-generated content is revolutionizing all phases of the content value chain, including production, processing, distribution and consumption. It has also brought to the multimedia sector a previously underestimated and now critical aspect of science and technology: social interaction and networking. The importance of this rapidly evolving research field is clearly evidenced by the many associated emerging technologies and applications, including (a) online content sharing services and communities, (b) multimedia communication over the Internet, (c) social multimedia search, (d) interactive services and entertainment, (e) health care and (f) security applications. It has generated a new research area called social multimedia computing, in which well-established computing and multimedia networking technologies are brought together with emerging social media research.
Dimitrios Milioris

Chapter 2. Background and Related Work

Topic detection and tracking aims to extract topics from a stream of textual information sources, or documents, and to quantify their “trend” in real time. These techniques apply to pieces of text, i.e. posts, produced within social media platforms. Topic detection can produce two complementary types of output: either documents are clustered directly, or terms are selected and then clustered. In the first method, referred to as document-pivot, a topic is represented by a cluster of documents, whereas in the latter, commonly referred to as feature-pivot, a cluster of terms is produced instead. In the following, we review several popular approaches that fall into either of the two categories. Six state-of-the-art methods are described in detail, as they serve as the performance benchmarks for the proposed system: Latent Dirichlet Allocation (LDA), Document-Pivot Topic Detection (Doc-p), Graph-Based Feature-Pivot Topic Detection (GFeat-p), Frequent Pattern Mining (FPM), Soft Frequent Pattern Mining (SFPM) and BNgram.
Dimitrios Milioris

Chapter 3. Joint Sequence Complexity: Introduction and Theory

In this chapter we study joint sequence complexity and introduce its applications to topic detection and text classification, in particular source discrimination. The complexity of a sequence is defined mathematically as the number of its distinct factors. The Joint Complexity of two sequences is thus the number of distinct factors they have in common. Sequences containing many common parts have a higher Joint Complexity. The factors of a sequence are extracted with suffix trees, a simple and fast (low complexity) way to store them and retrieve them from memory. Joint Complexity is used to evaluate the similarity between sequences generated by different sources, and we predict its performance over Markov sources. Markov models describe the generation of natural text well, and their performance can be predicted via linear algebra, combinatorics and asymptotic analysis. This analysis is presented in this chapter. We exploit datasets from different natural languages, for both short and long sequences, with promising results on complexity and accuracy. We also performed automated online sequence analysis on information streams in Twitter.
Dimitrios Milioris
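
The definition of Joint Complexity in the Chapter 3 summary can be illustrated with a minimal sketch. It assumes that "factor" means a non-empty contiguous substring, and it enumerates factors naively with Python sets rather than with the suffix trees the chapter describes, so it is only practical for short strings. The function names `factors` and `joint_complexity` are hypothetical and are not taken from the book.

```python
def factors(sequence: str) -> set[str]:
    """Return the distinct non-empty factors (contiguous substrings) of a sequence.

    Naive quadratic enumeration; a suffix tree would be used for long inputs.
    """
    n = len(sequence)
    return {sequence[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def complexity(sequence: str) -> int:
    """Number of distinct factors of one sequence."""
    return len(factors(sequence))


def joint_complexity(x: str, y: str) -> int:
    """Number of distinct factors that two sequences have in common."""
    return len(factors(x) & factors(y))


if __name__ == "__main__":
    # "abab" and "baba" share the factors a, b, ab, ba, aba, bab.
    print(joint_complexity("abab", "baba"))
    # Sequences over disjoint alphabets share no factor.
    print(joint_complexity("abc", "xyz"))
```

Two texts on the same topic would be expected to share more factors, and hence score higher, than two unrelated texts of similar length.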

Chapter 4. Text Classification via Compressive Sensing

In this chapter we apply the theory of Compressive Sensing (CS) to achieve low-dimensional classification. According to Compressive Sensing theory, signals that are sparse or compressible in a suitable transform basis can be recovered from a highly reduced number of incoherent linear random projections, which overcomes the limitations of traditional signal processing methods. Traditional methods are dominated by the well-established Nyquist–Shannon sampling theorem, which requires the sampling rate to be at least twice the maximum bandwidth. We introduce a hybrid classification and tracking method that extends our recently introduced Joint Complexity method, which was tailored to topic detection and trend sensing of users' tweets. First we employ Joint Complexity, described in detail in the previous chapter, to perform topic detection. Then, based on the nature of the data, we apply the methodology of Compressive Sensing to perform topic classification by recovering an indicator vector. Finally, we add a Kalman filter as a refinement step for updating the tracking process.
Dimitrios Milioris
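
The idea in the Chapter 4 summary of recovering an indicator vector from a small number of random projections can be sketched as follows. This is an illustration under stated assumptions, not the book's algorithm: the indicator vector is taken to have exactly one non-zero entry, the measurement matrix is Gaussian with unit-norm columns, and recovery uses a single step of orthogonal matching pursuit (pick the column most correlated with the measurements). All function names and the sizes used are hypothetical.

```python
import math
import random


def random_unit_columns(m: int, n: int, seed: int = 0) -> list[list[float]]:
    """Return n random Gaussian columns of length m, each scaled to unit norm."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n):
        col = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(v * v for v in col))
        columns.append([v / norm for v in col])
    return columns


def project(columns: list[list[float]], x: list[float]) -> list[float]:
    """Compute the measurements y = A x, with A stored column by column."""
    m = len(columns[0])
    y = [0.0] * m
    for col, xi in zip(columns, x):
        if xi != 0.0:
            for r in range(m):
                y[r] += col[r] * xi
    return y


def recover_indicator(columns: list[list[float]], y: list[float]) -> int:
    """Return the index of the column most correlated with y.

    For a 1-sparse x this is one iteration of orthogonal matching pursuit.
    """
    scores = [abs(sum(c * v for c, v in zip(col, y))) for col in columns]
    return max(range(len(columns)), key=scores.__getitem__)


if __name__ == "__main__":
    n_classes, n_measurements, true_class = 50, 10, 17  # assumed sizes
    A = random_unit_columns(n_measurements, n_classes)
    x = [1.0 if i == true_class else 0.0 for i in range(n_classes)]
    y = project(A, x)  # 10 measurements instead of 50 entries
    print(recover_indicator(A, y))
```

Normalising the columns means the true column has correlation 1 with the measurements while every other column scores strictly less, which is why a single greedy step suffices in this noiseless, 1-sparse setting. With noise or several active classes, a full sparse-recovery solver would be needed.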

Chapter 5. Extension of Joint Complexity and Compressive Sensing

In this chapter, the theory of Joint Complexity and Compressive Sensing is extended to three research subjects: (a) classification encryption via compressed permuted measurement matrices, (b) dynamic classification completeness based on Matrix Completion and (c) encryption based on the Eulerian circuits of original texts. In the first subject, we study the encryption property of Compressive Sensing in order to secure the classification process in Twitter without an extra cryptographic layer. The measurements obtained are considered to be weakly encrypted due to their acquisition process, which was verified by the experimental results. In the second subject, we study the application of Matrix Completion (MC) to topic detection and classification. Based on the spatial correlation of tweets and the spatial characteristics of the score matrices, we apply a novel framework that extends Matrix Completion to dynamically build complete matrices from a small number of randomly sampled Joint Complexity scores. In the third subject, we present an encryption system based on Eulerian circuits that destroys the semantics of a text while keeping its syntax correct. We study the performance on Markov models and perform experiments on real text.
Dimitrios Milioris

Chapter 6. Conclusions and Perspectives

This book introduced and compared two novel topic detection and classification methods based on Joint Complexity and Compressive Sensing. In the first case, joint sequence complexity and its applications were studied, from finding similarities between sequences to discriminating between sources. We exploited datasets from different natural languages using both short and long sequences. We provided models and notation, presented the theoretical analysis, and applied our methodology to real messages from Twitter, where we evaluated it on topic detection, classification and trend sensing and performed automated online sequence analysis.
Dimitrios Milioris

Backmatter
