Coupling learning of complex interactions

doi:10.1016/j.ipm.2014.08.007

Information Processing & Management

Volume 51, Issue 2, March 2015, Pages 167-186

https://doi.org/10.1016/j.ipm.2014.08.007

Highlights

• The concept of coupling and major coupling relationships.
• Coupling layers and forms appearing in complex data and applications.
• Modeling couplings, measuring couplings and the curse of couplings.
• A new theoretical framework for the next-generation recommender systems.
• Case studies of learning couplings in data mining and recommendation.

Abstract

Complex applications such as big data analytics involve different forms of coupling relationships that reflect interactions between factors related to technical, business (domain-specific) and environmental (including socio-cultural and economic) aspects. Diverse forms of couplings are embedded in poorly structured and ill-structured data. Such couplings are ubiquitous, implicit and/or explicit, objective and/or subjective, and heterogeneous and/or homogeneous, presenting complexities to existing learning systems in statistics, mathematics and computer science, such as those built on typical dependency, association and correlation relationships. Modeling and learning such couplings is thus fundamental but challenging. This paper discusses the concept of coupling learning, focusing on the involvement of coupling relationships in learning systems. Coupling learning has great potential for building a deep understanding of the essence of business problems and handling challenges that have not been addressed well by existing learning theories and tools. This argument is verified by several case studies on coupling learning, including handling coupling in recommender systems, incorporating couplings into coupled clustering, coupling document clustering, coupled recommender algorithms and coupled behavior analysis for groups.

Introduction

Complex interactive and unstructured/semi-structured data and applications, especially in big data, present major challenges to current analytic and learning theories and systems. Big data, in particular, presents specific complexities of weakly structured and unstructured data distribution, dynamics, interactions, and structures, which challenge the existing theoretical and commercial systems in mathematics, statistics, and computer science. Examples include the connections between gene combinations and physical and psychological consequences, and between one’s personal traits or preferences in social media and one’s social, behavioral, attitudinal and interest attributes.

This results in a situation where learning big data is analogous to the ancient Indian parable of seven blind men encountering an elephant for the first time. Each touches a different part of the animal, so when the seven share their experiences, each has a completely different idea of what the whole animal must look like. Similarly, when confronted with a big data set, a data modeler or learner may only see a partial set or aspect, hence often only a partial story is told by a learner. Why does this happen? There are many reasons, one of which is the invisibility of sophisticated coupling relationships (coupling for short, see Definition 2.1) hidden between the heterogeneous parts that are ‘visible’ to blind people. They do not have the ability to recognize the visible and invisible couplings between parts to connect those heterogeneous parts to form a global picture as sighted people do. This is representative of certain major challenges of complex relations hidden in complex data (particularly referring here to data with complex couplings and/or mixed distributions, formats, types and variables, and unstructured and weakly structured data). Learning visible and especially invisible coupling relationships can complement and assist in understanding weakly structured and unstructured data.

In many cases, such inherent, locally visible but globally invisible (or vice versa) couplings are presented in a range of forms, structures, and layers and on diverse entities. Often individual learners cannot tell the whole story because they are unable to identify such complex couplings. Effectively learning the widespread, varied, visible and invisible couplings is thus crucial for obtaining a true and total picture of the underlying problem.

This is not a trivial task, however. The difficulty in learning complex couplings lies not only with invisible couplings; even visible couplings are often overlooked. Take the design of recommender algorithms as an example: our ability to recognize interactions and structures is limited, even though they are embedded in applications such as social media networks. For example, only recently have researchers started to incorporate inherent couplings between items and between users into a recommender system (RS) (Jannach et al., 2010, Ricci et al., 2011), after a long period of focusing on rating-based exploration, even though the item-item couplings and user-user couplings (see Fig. 2) have always been intrinsic to such systems.

One reason for this is that visibility is relative to opportunity and capability. The same couplings are implicit to some people but explicit to others. For instance, in social media recommendation, the friendship between Twitter users (Cheng) has only recently been recognized as enhancing social recommendation, yet it has always been a natural built-in feature of social media systems. There is a need to develop our ability to capture and convert as many invisible couplings as possible into visible couplings, and to effectively capture visible couplings in complex data.

In reviewing the existing literature, we unfortunately cannot find systematic methodologies and techniques in learning theories to address the above coupling issues. This raises a fundamental question: how much do we know about coupling? It also raises many other basic questions, including what couplings are, where they are, and in what forms they are present, which we need to address before we can think about how to capture and embed couplings in learning systems. Once these problems have been satisfactorily addressed, more issues follow, such as: how to represent couplings, how to test whether and to what extent couplings exist in a dataset, how to incorporate them into learning models, and how to evaluate the difference they make once they are incorporated into learning systems. These challenges form the basis of the need to study coupling learning, a fundamental but undeveloped area in computer science, to address the intricate coupling relationships embedded in complex data and increasingly seen in information retrieval, data mining and machine learning in particular. This is crucial for big data analytics because most existing analytics and learning theories and systems have been built on the assumption that data is independent and identically distributed (IID), while big data is essentially non-IID (Cao, 2013b). Coupling is one critical aspect of non-IIDness (Cao, 2013b) (the other is heterogeneity or so-called personalization, which is not the main concern in this paper, although coupling may be heterogeneous and involve heterogeneity in data).

Learning the above characteristics of complex couplings in big data fundamentally challenges existing learning theories and systems, including pairwise coupling (Moreira and Mayoraz, 1998, Wu et al., 2004), statistical relational learning (Dzeroski and Lavrač, 2001, Getoor and Taskar, 2007), dependency learning (Neville and Jensen, 2007, Wei et al., 2014), association learning (Ceglar and Roddick, 2006, Lu et al., 2000), correlation analysis (Hair et al., 2009, Székely et al., 2007), linkage analysis (Faloutsos et al., 2011, Miller et al., 2009), community analysis (Arenas et al., 2004, Girvan and Newman, 2002), social network analysis (Arenas et al., 2004, Girvan and Newman, 2002, Knoke and Yang, 2007, Wasserman and Faust, 1994), multivariate time series (Székely et al., 2007), causality analysis (Gujarati & Porter, 2009) and graph analysis (Cook & Holder, 2006). These approaches either essentially treat data as IID or only address specific forms or levels of couplings. No general and competent theories, frameworks, algorithms or tools are available to handle the coupling complexities discussed above.

The above observations motivate this work, namely to systematically state the coupling learning problem, which clearly involves interactive, unstructured and semi-structured data. The aim of this paper is multi-fold:

• High-level: build a conceptual system of coupling learning (Sections 2, 3 and 4) towards a generic and comprehensive understanding of the broad-based coupling relationships that exist in complex data and applications (especially in big data related business).
• Middle-level: illustrate how to advance classic problems to another generation by incorporating coupling learning into a specific existing scientific problem such as recommender systems (Section 5).
• Low-level: showcase specific examples in recommender systems to demonstrate how couplings can be managed in practice to improve analytic outcomes (Section 6).

The purpose of this paper is therefore not to specify one particular technique for learning a particular type of coupling (instead we provide citations to our related work for such discrete discussions), but to disclose the whole nature of the problem and build generic frameworks and examples to show possible ways to address the problem.

Accordingly, the organization of this work is as follows. Section 2 discusses the concept of coupling and major coupling relationships often addressed in current big data communities. Section 3 presents a high-level picture of coupling layers and forms appearing in complex data and applications. In Section 4, the issues of modeling and measuring couplings and the curse of couplings are introduced. An example of comprehensive couplings in recommender systems is discussed in Section 5, which presents a new theoretical framework for next-generation recommender systems. Two case studies are given in Section 6: one presents a coupled K-modes algorithm to identify items with strong coupling relationships, and the other uses couplings to improve Matrix Factorization-based recommendation. Section 7 explores the opportunities for learning couplings in data mining, text mining, information retrieval, and complex behavior analysis. The paper is concluded in Section 8.

Section snippets

Coupling: an important perspective

In this section, we discuss the concept of coupling, and the relevant work in statistics, mathematics and computer science. The following key concepts are used in this paper:

• Coupling: refers to any relationship or interaction that connects two or more aspects (which could be between inputs or between inputs and outputs).
• Aspect: a term broadly referring to entity, entity property (or characteristics such as variations), property value, context, learner or analytic model, learning objective

Ubiquitous couplings

In this section, we expand the above discussions on couplings, aiming to provide an overall picture of couplings widespread in comprehensive learning tasks.

Learning coupling

Learning coupling refers to understanding, formalizing and quantifying the coupling aspects, entities, interactions, layers, forms and strength. This includes extracting, discovering and estimating the interactions and relationships between learning components, including method, objective, task, level, dimension, process, measure and outcome, especially when the learning involves multiples of one of the above components, for instance, multi-methods or multi-tasks. Recently, the concept of
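
As a concrete illustration of what quantifying coupling strength can look like, the sketch below estimates the mutual information between two categorical attributes. This is only one possible measure, chosen here for illustration and not taken from the paper. The function name `mutual_information` and the toy attributes (`price_band`, `brand_tier`, `colour`) are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length categorical sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)            # marginal counts of the first attribute
    count_y = Counter(ys)            # marginal counts of the second attribute
    count_xy = Counter(zip(xs, ys))  # joint counts of value pairs
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: four items described by three categorical attributes.
price_band = ["low", "low", "high", "high"]
brand_tier = ["budget", "budget", "premium", "premium"]
colour = ["red", "blue", "red", "blue"]

coupled = mutual_information(price_band, brand_tier)   # attributes move together
uncoupled = mutual_information(price_band, colour)     # attributes are independent
print(coupled, uncoupled)
```

On this toy data the first pair is perfectly dependent (1 bit) and the second pair is independent (0 bits). A pairwise score of this kind covers only one form of coupling between attribute values; it says nothing about couplings across layers, entities or learning components.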

An example: couplings in recommendation

In recommender systems such as online shopping websites, online broadcasting systems, IPTV, and social media, there are different types of intrinsic interactions: user-user couplings, item-item couplings, and user-item couplings. A user’s behavior may influence his/her friends, which further affects the behaviors of others. Item attributes such as item price and quantity are often associated with each other. The price of one item may affect the price of another. An item may influence the sale
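
The three coupling types named above can be made concrete with a small rating matrix. The sketch below is a minimal illustration under stated assumptions, not the paper's method: it treats the ratings themselves as the user-item couplings and uses cosine similarity between rows and between columns as a simple stand-in for user-user and item-item couplings. The `ratings` matrix and the helper `cosine` are hypothetical, and unrated cells are encoded as 0, which is a simplification.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no coupling information
    return dot / (norm_u * norm_v)


# Hypothetical user-item couplings: rows are users, columns are items, 0 = unrated.
ratings = [
    [5, 4, 0],
    [5, 4, 0],
    [0, 0, 3],
]

n_users = len(ratings)
item_columns = [list(col) for col in zip(*ratings)]  # transpose to get item vectors
n_items = len(item_columns)

# User-user couplings: similarity between the rating rows of two users.
user_user = [[cosine(ratings[i], ratings[j]) for j in range(n_users)]
             for i in range(n_users)]

# Item-item couplings: similarity between the rating columns of two items.
item_item = [[cosine(item_columns[i], item_columns[j]) for j in range(n_items)]
             for i in range(n_items)]

print(user_user[0][1], user_user[0][2])
print(item_item[0][1], item_item[0][2])
```

Here the first two users rate identically and so are strongly coupled, while the third shares no rated items with them. Rating-derived similarity is only one visible source of coupling; attribute-based couplings such as those between item prices, or social ties between users, would need additional data.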

Case study: coupled recommender systems

The discussions about learning different types of couplings in recommender systems in Section 5 inspire us to incorporate couplings into recommendation algorithms. In this section, we discuss two preliminary studies in this direction. The first (for more details, see Yu, Wang, Gao, Cao, & Sun, 2013) considers coupled item recommendation, which incorporates couplings into items and creates a new coupled collaborative filtering (CCF) algorithm: Coupled K-modes (CK-modes). The second (for more

Discussions

Coupling learning is a very promising direction in learning complex relationships between objects, properties, processes, facts, events, and states of affairs which are beyond correlation, association and dependency. Complex couplings are a major characteristic of big data, and together with heterogeneity form the phenomenon of non-IIDness, namely non-independent and non-identically distributed characteristics. Non-IIDness greatly challenges the existing theories and systems in statistics, data

Conclusions

In the real world, diverse coupling relationships are embedded in every business and are associated with objects, properties, processes, events, and states of affairs. Such couplings may present characteristics that are far beyond the association, correlation and dependency relationships that usually concern the statistics, data mining and machine learning communities. Triggered by behavioral, economic, social, cultural, or other driving forces, they may be explicit vs. implicit, syntactic vs.

References (71)

L. Cao. In-depth behavior understanding and use: The behavior informatics approach. Information Science (2010).
Al Mamunur Rashid, S. K. L., Karypis, G., & Riedl, J. (2006). ClustKNN: A highly scalable hybrid model-& memory-based...
A. Arenas et al. Community analysis in social networks. The European Physical Journal B-Condensed Matter and Complex Systems (2004).
Breese, J., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering....
Cao, L., Ou, Y., Yu, P. S., & Wei, G. (2010). Detecting abnormal coupled sequences and sequence changes in group-based...
L. Cao. Combined mining: Analyzing object and pattern relations for discovering and constructing complex yet actionable patterns. WIREs Data Mining and Knowledge Discovery (2013).
L. Cao. Non-IIDness learning in behavioral and social data. The Computer Journal (2013).
Cao, L., Luo, D., & Zhang, C. (2009). Ubiquitous intelligence in agent mining. In Proceedings of ADMI 2009 (pp....
Cao, W., Cao, L., & Song, Y. (2013). Coupled market behavior based financial crisis detection. In...
L. Cao et al. Coupled behavior analysis with applications. IEEE Transactions on Knowledge and Data Engineering (2012).
L. Cao et al. Behavior computing: Modeling, analysis, mining and decision (2012).
L. Cao et al. Domain driven data mining (2010).
L. Cao et al. Combined mining: Discovering informative knowledge in complex data. IEEE Transactions SMC Part B (2011).
L. Cao et al. Mining impact-targeted activity patterns in imbalanced data. IEEE Transactions on Knowledge and Data Engineering (2008).
A. Ceglar et al. Association mining. ACM Computing Surveys (2006).
Cheng, X., Miao, D., Wang, C., & Cao, L. (2013). Coupled term-term relation analysis for document clustering. In...
Cheng, A. Six degrees of separation, Twitter style...
Dietrich, C. F. (1991). Uncertainty, calibration and probability: The statistics of scientific and industrial...
S. Dumais. Latent semantic analysis. Annual Review of Information Science and Technology (2005).
S. Dzeroski et al. Relational data mining (2001).
C. Faloutsos et al. Link mining: Models, algorithms and applications (2011).
L. Getoor et al. Introduction to statistical relational learning (2007).
M. Girvan et al. Community structure in social and biological networks. Proceedings of the National Academy of Sciences (2002).
D. Gujarati et al. Causality in economics: The Granger causality test (2009).
J. Hair et al. Multivariate data analysis (2009).
T. Hastie et al. The elements of statistical learning: Data mining, inference, and prediction (2011).
P. Holland. Statistics and causal inference. Journal of the American Statistical Association (1986).
Hu, L., Cao, J., Xu, G., Cao, L., Gu, Z., & Zhu, C. (2013). Cross-domain collaborative filtering triadic factorization....
Hu, L., Cao, J., Xu, G., Wang, J., Gu, Z., & Cao, L. (2013). Cross-domain collaborative filtering via Bilinear...
Hu, L., Cao, J., Xu, G., Cao, L., Gu, Z., & Cao, W. (2014). Deep modeling of group preferences for group-based...
D. Jannach et al. Recommender systems: An introduction (2010).
G. Klir et al. Fuzzy set theory: Foundations and applications (1997).
D. Knoke et al. Social network analysis (2007).
Koren, Y. (2008). Factorization meets the neighborhood: A multifaceted collaborative filtering model. In KDD (pp....

