Top

2023 | Book

Read chapter Read first chapter

Computational Data and Social Networks

11th International Conference, CSoNet 2022, Virtual Event, December 5–7, 2022, Proceedings

Editors: Thang N. Dinh, Minming Li

Publisher: Springer Nature Switzerland

Book Series : Lecture Notes in Computer Science

Part of: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

About this book

This book constitutes the refereed proceedings of the 11th International Conference on Computational Data and Social Networks, CSoNet 2022, held as a Virtual Event, during December 5–7, 2022. The 17 full papers and 7 short papers included in this book were carefully reviewed and selected from 47 submissions. They were organized in topical sections as follows: Machine Learning and Prediction, Security and Blockchain, Fact-checking, Fake News, and Hate Speech, Network Analysis, Optimization.

Frontmatter

Correction to: A Community Detection Algorithm Using Random Walk

Rajesh Vashishtha, Anurag Singh, Hocine Cherifi

Machine Learning and Prediction

Frontmatter

Incorporating Neighborhood Information and Sentence Embedding Similarity into a Repost Prediction Model in Social Media Networks

Abstract

Predicting repost behaviors within social media networks plays an important role in human activities analysis and influence maximization decision making. Traditional methods for repost prediction can be categorized into stochastic diffusion based models and user profile or content features based machine learning models. In this paper, we propose a new framework combining user profile, content similarity and the neighborhood information around each target link as input features to make the prediction. Here neighborhood information can be interpreted as the combination of neighbors’ user profile. Two different kinds of graph based combination models are introduced in the article. After collecting the input features, we implement the state-of-the-art machine learning methods, e.g., Logistic Regression, K-nearest Neighbors, Gaussian Naive Bayes, Deep Neural Network, Random Forest, XGBoosting and Stacking Model to predict repost probability. We evaluate our model on real dataset Weibo to compare the performance with different features and machine learning methods.

Zhecheng Qiang, Eduardo L. Pasiliao, Alexander Semenov, Qipeng P. Zheng

Driving Factors of Polarization on Twitter During Protests Against COVID-19 Mitigation Measures in Vienna

Abstract

We conduct the analysis of the Twitter discourse related to the anti-lockdown and anti-vaccination protests during the so-called 4th wave of COVID-19 infections in Austria (particularly in Vienna). We focus on predicting users’ protest activity by leveraging machine learning methods and individual driving factors such as language features of users supporting/opposing Corona protests. For evaluation of our methods we utilize novel datasets, collected from discussions about a series of protests on Twitter (40488 tweets related to 20.11.2021; 7639 from 15.01.2022 – the two biggest protests as well as 192 from 22.01.2022; 8412 from 11.12.2021; 3945 from 11.02.2022). We clustered users via the Louvain community detection algorithm on a retweet network into pro- and anti-protest classes. We show that the number of users engaged in the discourse and the share of users classified as pro-protest are decreasing with time. We have created language-based classifiers for single tweets of the two protest sides – random forest, neural networks and a regression-based approach. To gain insights into language-related differences between clusters we also investigated variable importance for a word-list-based modeling approach.

Marcus Röckl, Maximilian Paul, Andrzej Jarynowski, Alexander Semenov, Vitaly Belik

Categorizing Memes About the Ukraine Conflict

Abstract

The Russian disinformation campaign uses pro-Russia memes to polarize Americans, and increase support for the Russian invasion of Ukraine. Thus, it is critical for governments and similar stakeholders to identify pro-Russia memes, countering them with evidence-based information. Identifying broad meme themes is crucial for developing a targeted and strategic counter response. There are also a range of pro-Ukraine memes that bolster support for the Ukrainian cause. As such, we need to identify pro-Ukraine memes and aid with their dissemination to augment global support for Ukraine. We address the indicated issues through the following contributions: 1) Creation of an annotated dataset of pro-Russia (N = 70) and pro-Ukraine (N = 121) memes regarding the Ukraine conflict; 2) Identification of broad themes within the pro-Russia and pro-Ukraine meme categories. Broadly, our findings indicated that pro-Russia memes fall into thematic categories that seek to undermine specific elements of US and their allies’ policy and culture. Pro-Ukraine memes are far more diffuse thematically, highlighting admiration for Ukraine’s people and its leadership. Stakeholders may utilize our findings to develop targeted strategies to mitigate Russian influence operations - possibly reducing effects of the conflict.

Keyu Chen, Ashley Feng, Rohan Aanegola, Koustuv Saha, Allie Wong, Zach Schwitzky, Roy Ka-Wei Lee, Robin O’Hanlon, Munmun De Choudhury, Frederick L. Altice, Kaveh Khoshnood, Navin Kumar

Analyzing Scientometric Indicators of Journals and Chief Editors: A Case Study in Artificial Intelligence (AI) Domain

Abstract

This study investigates the relationship between the rankings of artificial intelligence (AI) journals and the chief editor’s scholarly reputations by mining various scientometric data. The associations between these two types of entities are studied with respect to the top AI journals (selected based on Google Scholar ranking) and journals from various quartiles (based on Scimago quartile ranking). Three quantitative reputation metrics (i.e., citation count, h-index, and i10-index) of editor-in-chief (EiC) and four journal ranking metrics (i.e., h5-index, h5-median, the impact factor (IF), and Scimago Journal Rank (SJR)) of journals are considered to find any relationships. To determine the correlation between various pairs of scholarly metrics of EiC and top AI journals, we employ the Spearman and Kendall correlation coefficients. Furthermore, we investigate whether machine learning (ML) classifiers can predict the SJR and IF of journals utilizing EIC’s scholarly reputation metrics. It is observed that the comparative rankings (based on various metrics) of top AI journals do not correlate with the EiC’s scholarly achievements. The high prediction errors of ML classifiers indicate that the EiC’s scholarly indices are not comprehensive enough to build a good model for predicting the IF or SJR of top AI journals. Nevertheless, when AI journals of various qualities are analyzed, we observe that Q1 journals usually have EiCs with a much higher number of citations and h-index compared to the EiCs from the journals from the bottom two quarterlies (Q3 and Q4). The Mann-Whitney U test indicates the differences between the scholarly metrics of EiCs of Q1 journals and journals from Q3 and Q4 are significant. The results imply that while selecting the EiC of a journal, scientometric indices should be considered prudently.

Salim Sazzed

Link Prediction of Complex Networks Based on Local Path and Closeness Centrality

Abstract

Due to evolving nature of complex network, link prediction plays a crucial role in exploring likelihood of new relationships among nodes. There exist a great number of techniques to apply the similarity-based metrics for estimating proximity of vertices in the network. In this work, a novel similarity-based metric based on local path and closeness centrality is proposed, and the Local Path and Centrality based Parameterized Algorithm (LPCPA) is suggested. The proposed method is a new variant of the well-known index of Common Neighbor and Centrality based Parameterized Algorithm (CCPA). Extensive experiments are conducted on thirteen real networks originating from diverse domains. The experimental results indicate that the proposed index improves the prediction accuracy measured by AUC and has achieved a competitive result on Precision compared to the existing state-of-the-art link prediction methods.

Min Li, Shuming Zhou, Gaolin Chen

The Influence of Color on Prices of Abstract Paintings

Abstract

Determination of price of an artwork is a fundamental problem in cultural economics. In this work we investigate what impact visual characteristics of a painting have on its price. We construct a number of visual features in CIELAB color space measuring complexity of the painting, its points of interest using Discrete symmetry transform, segmentation-based features using Felzenszwalb segmentation and Regions adjacency graph merging, local color features from segmented image, features based on Itten and Kandinsky theories, and utilize mixed-effects model with authors bias as fixed effect to study impact of these features on the painting price. We analyze the influence of the color on the example of the most complex art style - abstractionism, created by Kandinsky, for which the color is the primary basis. We use Itten’s theory - the most recognized color theory in art history, from which the largest number of subtheories was born. For this day it is taken as the base for teaching artists. We utilize novel dataset of 3,885 paintings collected from Christie’s and Sotheby’s and find that color harmony has a little explanatory power, color complexity metrics are impact price negatively and color diversity explains price well.

Maksim Borisov, Valeria Kolycheva, Alexander Semenov, Dmitry Grigoriev

ELA: A Time-Series Forecasting Model for Liner Shipping Based on EMD-LSTM and Attention

Abstract

The development of modern information technology has led to the digital and intelligent transformation of many traditional industries. In order to face the challenge of difficult transition, liner shipping industry is in urgent need of prediction methods for strategy adjustment, and much research remains to be done in this area. In our paper, we put forward a time-series forecasting model named ELA based on modified EMD-LSTM (Empirical Mode Decomposition and Long Short-Term Memory), AR (Autoregressive) and the attention mechanism. Tailored for handling the market characteristics of the liner shipping industry, our method effectively reduces the negative impact of noise on model training and learns the short-term dependencies and long-term trend in time series. Experiments on real-world liner shipping datasets show the powerful prediction performance of ELA.

Jiadong Chen, Xiaofeng Gao, Guihai Chen

Knowledge Transfer via Word Alignment and Its Application to Vietnamese POS Tagging

Abstract

It is not difficult to build a linguistic tagger with a large annotated corpus. Labeled data becomes a big problem with low-resource languages such as Vietnamese. Due to the development and investment in research, there is no large and high-accuracy annotated corpus. This paper proposes a transfer learning strategy to build a high-quality tagger for Vietnamese using a bilingual corpus Vietnamese-English. Particularly, We inherit the strength of a POS tagger in English, which is constructed from a large-scale corpus, then transfer the knowledge to the Vietnamese POS tagger via word alignment. Experimental results show that the proposed method achieves high accuracy with \(94.97\%\). The transfer strategy depends mostly on the source tagger’s accuracy and the word alignment process’s performance, so the proposed strategy can be extended and applied to another low-resource language as long as there is a large bilingual corpus with a rich resource language.

Hao D. Do

A Group Clustering Recommendation Approach Based on Energy Distance

Abstract

Recommendation system focus on the individual recommendation and using relationships between users and users or between items and items by distance or similarity measures. In reality, there are many situations when recommending to individuals is not as important as recommending to groups. Researches on the relationship between a group and a group have not yet been interested in the recommendation. This paper mainly focus on applying the energy distance for group recommendation system. The proposed recommendation model is evaluated on the Jester5k and the MovieLens datasets. The experiment result shows the feasibility of applying the potential energy for the group recommendation problems.

Tu Cam Thi Tran, Lan Phuong Phan, Hiep Xuan Huynh

Security and Blockchain

Frontmatter

An Implementation and Evaluation of Layer 2 for Ethereum with zk-Rollup

Abstract

Ethereum is a very popular blockchain platform. However, due to the limit of the Transaction Per Second (TPS) of this platform, the transaction processing time in Ethereum is very slow, which greatly affects the user experience and wastes time and other fees. Therefore, the scalability of Ethereum becomes an urgent problem to be solved. In this study, we try to improve the scalability problem of Ethereum by building a layer 2 with the zk-Rollup protocol. An evaluation of the implementation is also conducted. Experimental results show that the cost of transactions decreases depending on the batch size, with the gas cost decreasing by more than 85% for a batch size of 50 transactions. Other evaluation results reveal that deposits incur the most cost and increase faster with the batch size.

An Cong Tran, Vu Vo Thanh, Nghi Cong Tran, Hai Thanh Nguyen

Targeted Attack of the Air Transportation Network Global Component

Abstract

Targeted attacks aim to disintegrate a network by removing nodes or links. Classically, in node attacks, one removes nodes in descending order of a given centrality measure. Although most real-world networks exhibit a mesoscopic structure, strategies proposed in the literature do not exploit this ubiquitous property. We introduce an attack strategy based on the network component structure to overcome these drawbacks. The component structure of a network consists of local components and global components. The local components are the dense parts of the network, and the global components are the nodes and links connecting them. We investigate this attack on the world air transportation network using Degree and Betweenness centrality measures. Results show that the degree-based attack on the global component is more effective than the classical attack on the entire network. In contrast, the classical Betweenness attack slightly outperforms the Betweenness attack on the global component. However, the latter is more efficient.

Issa Moussa Diop, Chantal Cherifi, Cherif Diallo, Hocine Cherifi

Measuring Cryptocurrency Mining in Public Cloud Services: A Security Perspective

Abstract

Cryptocurrencies, arguably the most prominent application of blockchain systems, have been on the rise with wide mainstream acceptance. A central entity in cryptocurrencies is “mining pools”, groups of cooperating cryptocurrency miners who agree to share block rewards in proportion to their contributed mining hash power. Despite the many promised benefits of cryptocurrencies, they are equally utilized for malicious activities, e.g., ransomware payments, stealthy command, and control, etc. Thus, understanding the interplay between cryptocurrencies, particularly the mining pools, and other essential infrastructures for profiling and characterization is necessary.

In this paper, we initiate the study of the interplay between mining pools and public clouds by analyzing their communication association through passive domain name system (pDNS) traces. We observe that 24 cloud providers have some association with mining pools as observed from the pDNS query traces, where popular public cloud providers, namely Amazon and Google, have almost 48% of such an association. Moreover, we found that the cloud provider presence and cloud provider-to-mining pool association exhibit a heavy-tailed distribution, emphasizing an intrinsic preferential attachment model with both mining pools and cloud providers. We measure the security risk and exposure of the cloud providers, as that might aid in understanding the intent of the mining. Among the top two cloud providers, we found almost 35% and 30% of their associated endpoints are positively detected to be associated with malicious activities, per the virustotal.com scan. Finally, we found that the mining pools presented in our dataset are predominantly used for mining Metaverse currencies, highlighting a shift in cryptocurrency use and demonstrating the prevalence of mining using public clouds.

Ayodeji Adeniran, David Mohaisen

Do Content Management Systems Impact the Security of Free Content Websites?

Abstract

This paper investigates the potential causes of the vulnerabilities of free content websites to address risks and maliciousness. Assembling more than 1,500 websites with free and premium content, we identify their content management system (CMS) and malicious attributes. We use frequency analysis at both the aggregate and per category of content (books, games, movies, music, and software), utilizing the unpatched vulnerabilities, total vulnerabilities, malicious count, and percentiles to uncover trends and affinities of usage and maliciousness of CMS’s and their contribution to those websites. Moreover, we find that, despite the significant number of custom code websites, the use of CMS’s is pervasive, with varying trends across types and categories. Finally, we find that even a small number of unpatched vulnerabilities in popular CMS’s could be a potential cause for significant maliciousness.

Mohamed Alqadhi, Abdulrahman Alabduljabbar, Kyle Thomas, Saeed Salem, DaeHun Nyang, David Mohaisen

Fact-checking, Fake News, and Hate Speech

Frontmatter

BNnetXtreme: An Enhanced Methodology for Bangla Fake News Detection Online

Abstract

In the last couple of years, the government, and the public have shown a lot of interest in fake news on Bangladesh’s fast-growing online news sites, as there have been significant events in various cities due to unjustifiable rumors. But the overall progress in study and innovation in the detection of Bangla fake and misleading news is still not adequate in light of the prospects for policymakers in Bangladesh. In this study, an enhanced methodology named BNnetXtreme is proposed for Bangla fake news detection. Applying both embedding based (i.e. word2vec, Glove, fastText) and transformer-based (i.e. BERT) models, we demonstrate that the proposed BNnetXtreme achieves promising performance in detection of Bangla fake news online. After a further comparative analysis, it is also discovered that BNnetXtreme performed superior to BNnet-one of the state-of-the-art architectures for Bangla fake news detection introduced previously. The BNnetXtreme especially BERT Bangla base model performed with an accuracy score of 91% and an AUC score of 98%. Our proposed BNnetXtreme has been successful in improving the performance by an increase of 1.1% in accuracy score, 5.6% in precision, 1.1% in F1 score, and about 9% in AUC score.

Zaman Wahid, Abdullah Al Imran, Md Rifatul Islam Rifat

Heuristic Gradient Optimization Approach to Controlling Susceptibility to Manipulation in Online Social Networks

Abstract

Manipulation through inferential attacks in online social networks (OSN) can be achieved by learning the user’s interests through their network and their interactions with the network. Since some users have a higher propensity for disclosure than others, a one-size-fits-all technique for limiting manipulation proves insufficient. In this work, we propose a model that allows the user to adjust their online persona to limit their susceptibility to manipulation based on their preferred disclosure threshold. Our experiment, using real-world data provides a way to measure manipulation gained from a single tweet. We then proffer solutions that show that manipulation gain derived as a result of participating in OSNs can be minimized and adjusted to meet the user’s needs and expectations, giving at least some measure of control to the user.

Abiola Osho, Shuangqing Wei, George Amariucai

Identifying Targeted and Generalized Offensive Speech from Anti-asian Social Media Conversations

Abstract

During the Covid-19 pandemic Asian-Americans have been targets of prejudice and negative stereotyping. There has also been volumes of counter speech condemning this jaundiced attitude. Ironically, however, the dialogue on both sides is filled with offensive and abusive language. While abusive language directed at Asians encourages violence and hate crimes against this ethnic group, the use of derogatory language to insult alternative points of view showcases utter lack of respect and exploits people’s fears to stir up social tensions. It is thus important to identify and demote both types of offensive content from anti-Asian social media conversations. The goal of this paper is to present a machine learning framework that can achieve the dual objective of detecting targeted anti-Asian bigotry as well as generalized offensive content. Tweets were collected using the hashtag #chinavirus. Each tweet was annotated in two ways; either it condemned or condoned anti-Asian bias, and whether it was offensive or non-offensive. A rich set of features both from the text and accompanying numerical data were extracted. These features were used to train conventional machine learning and deep learning models. Our results show that the Random Forest classifier can detect both generalized and targeted offensive content with around 0.88 accuracy and F1-score. Our results are promising from two perspectives. First, our approach outperforms contemporary efforts on detecting online abuse against Asian-Americans. Second, our unified approach detects both offensive speech targeted specifically at Asian-Americans and also identifies its generalized form which has the potential to mobilize a large number of people in socially challenging situations.

Payal Shah, Swapna S. Gokhale

US News and Social Media Framing Around Vaping

Abstract

In this paper, we investigate how vaping is framed differently (2008–2021) between US news and social media. We analyze 15,711 news articles and 1,231,379 Facebook posts about vaping to study the differences in framing between media varieties. We use word embeddings to provide two-dimensional visualizations of the semantic changes around vaping for news and for social media. We detail that news media framing of vaping shifted over time in line with emergent regulatory trends, such as; flavored vaping bans, with little discussion around vaping as a smoking cessation tool. We found that social media discussions were far more varied, with transitions toward vaping both as a public health harm and as a smoking cessation tool. Our cloze test, dynamic topic models, and question answering showed similar patterns, where social media, but not news media, characterizes vaping as combustible cigarette substitute. We use n-grams and LDA topic models to detail that social media data first centered on vaping as a smoking cessation tool, and in 2019 moved toward narratives around vaping regulation, similar to news media frames. Overall, social media tracks the evolution of vaping as a social practice, while news media reflects more risk based concerns. A strength of our work is how the different techniques we have applied validate each other.

Keyu Chen, Marzieh Babaeianjelodar, Yiwen Shi, Rohan Aanegola, Lam Yin Cheung, Preslav Ivanov Nakov, Shweta Yadav, Angus Bancroft, Ashiqur R. KhudaBukhsh, Munmun De Choudhury, Frederick L. Altice, Navin Kumar

Network Analysis

Frontmatter

Social Network Analysis of the Caste-Based Reservation System in India

Abstract

Being as old as human civilization, discrimination based on various grounds such as race, creed, gender, and caste has existed for a long time. To undo the impact of this long-enduring historical discrimination, governments worldwide have adopted various forms of affirmative action, such as positive discrimination, employment equity, and quota system. In India, people are considered to belong to Backward Class (BC) or Forward Class (FC), and the Indian government designed an affirmative action, locally known as the “Reservation" policy, to reduce the discrimination between both groups. Through this affirmative action, the government provides support to people from the backward class (BC). Although being one of the most controversial and frequently debated issues, the reservation system in India lacks rigorous scientific study and analysis. In this paper, we model the dynamics of the reservation system based on the cultural divide among the Indian population using social network analysis. The mathematical model, using the Erdös-Rényi network, shows that the addition of weak ties between the two groups leads to a logarithmic reduction in the social distance. Our experimental simulations establish the claim for the different clans of frequently studied social network models as well as real-world networks. We further show that a small number of links created by the reservation process are adequate for a society to live in harmony.

Akrati Saxena, Nivedita Sethiya, Jaspal Singh Saini, Yayati Gupta, S. R. S. Iyengar

Structure, Stability, Persistence and Entropy of Stock Networks During Financial Crises

Abstract

We investigate the network structures of stocks in SET100, NASDAQ100, and FTSE100 from 2006 to 2022, using the correlation distance and the time-space average of correlations as a threshold for connectivity of two stocks. Structure, stability, multifractality, and entropy of the networks are investigated to compare their behaviors before and after financial crises. The results show that during high volatility periods, such as the global financial crisis in 2008 and the COVID pandemic in 2020, the network characteristic path length decreases, while the clustering coefficient increases, suggesting that the network has shrunk in size, and stocks become tightly linked, similar to trends of price and return behaviors observed in many stocks during financial crises. Furthermore, the minimal level of network entropy implies that the market network stability decreases, and each sector has lost its ability to perform independently. We also find that the persistence of the network structure and the network entropy in SET increase during a period of high volatility as evident by a significant increase of the Holder exponent, while results from NASDAQ and FTSE do not exhibit such pronounced behavior, possibly due to having higher market fluctuation. Network features of SET and FTSE show recovery of same values after the 2008 crisis faster than NASDAQ, and in less than 100 trading days; however, they exhibit slower recovery, except for the network entropy, from the COVID-19 pandemic.

Nawee Jaroonchokanan, Teerasit Termsaithong, Sujin Suwanna

A Community Detection Algorithm Using Random Walk

Abstract

Community structure plays an essential role in analyzing networks. Various algorithms exist to find the community structure that scores high on a graph clustering index called Modularity. In divisive community structure algorithms, initially, all the nodes belong to a single community. Each iteration divides the nodes into two groups, and finally, each node belongs to a single community. The main disadvantage of a divisive algorithm is that it is not able to find whether to divide the community further or not. A divisive community detection algorithm is proposed based on the graph spectra that give the termination method for community detection. We rely on Weighted Spectral Distribution (WSD) to divide the network into small sub-network or not. Experiments with various real-world networks show that the proposed method constantly compares favorably with the popular Girvan Newman’s community detection algorithm.

Rajesh Vashishtha, Anurag Singh, Hocine Cherifi

Learning Heuristics for the Maximum Clique Enumeration Problem Using Low Dimensional Representations

Abstract

Approximate solutions to various NP-hard combinatorial optimization problems have been found by learned heuristics using complex learning models. In particular, vertex (node) classification in graphs has been a helpful method towards finding the decision boundary to distinguish vertices in an optimal set from the rest. By following this approach, we use a learning framework for a pruning process of the input graph towards reducing the runtime of the maximum clique enumeration problem. We extensively study the role of using different vertex representations on the performance of this heuristic method, using graph embedding algorithms, such as Node2vec and DeepWalk, and representations using higher-order graph features comprising local subgraph counts. Our results show that Node2Vec and DeepWalk are promising embedding methods in representing nodes towards classification purposes. We observe that using local graph features in the classification process produce more accurate results when combined with a feature elimination process. Finally, we provide tests on random graphs to show the robustness and scalability of our method.

Ali Baran Taşdemir, Tuna Karacan, Emir Kaan Kırmacı, Lale Özkahya

Optimization

Frontmatter

Competitive Influence Maximisation with Nonlinear Cost of Allocations

Abstract

We explore the competitive influence maximisation problem in the voter model. We extend past work by modelling real-world settings where the strength of influence changes nonlinearly with external allocations to the network. We use this approach to identify two distinct regimes—one where optimal intervention strategies offer significant gain in outcomes, and the other where they yield no gains. The two regimes also vary in their sensitivity to budget availability, and we find that in some cases, even a tenfold increase in the budget only marginally improves the outcome of an intervention in a population.

Sukankana Chakraborty, Sebastian Stein

Frank Wolfe Algorithm for Nonmonotone One-Sided Smooth Function Maximization Problem

Abstract

In this paper, we study the problem of maximizing a nonmonotone one-sided-\(\eta \) smooth (OSS for short) function \(\psi (x)\) under a downwards-closed convex polytope constraint. The concept of OSS was first proposed by Mehrdad et al. [1, 2] to express the properties of multilinear extension of some set functions. It is a generalization of the continuous DR submodular function. The OSS property guarantees an alternative bound based on Taylor expansion. If the objective function is nonmonotone diminishing return (DR) submodular, Bian et al. [3] gave a 1/e approximation algorithm with a regret bound \(O(\frac{LD^{2}}{2K})\). On general convex sets, D\(\ddot{u}\)rr et al. [4] gave a \(\frac{1}{3\sqrt{3}}\) approximation solution with \(O(\frac{LD^{2}}{(\ln K)^{2}})\) regrets. In this paper, we consider maximizing the more general OSS function, and by adjusting the iterative step of the Jump-Start Frank Wolfe algorithm, an approximation of 1/e can still be obtained in the case of a larger regret bound \(O(\frac{L(\mu D)^{2}}{2K})\). (where \(L,\mu , D\) are some parameters, see Table 1). The larger the parameter \(\eta \) we choose, the more regrets we will receive, because of \(\mu =\left( \frac{\beta }{\beta +1}\right) ^{-2\eta }\) \((\beta \in (0,1])\).

Hongxiang Zhang, Chunlin Hao, Wenying Guo, Yapu Zhang

Non-monotone k-Submodular Function Maximization with Individual Size Constraints

Abstract

In the problem of maximizing non-monotone k-submodular function f under individual size constraints, the goal is to maximize the value of k disjoint subsets with size upper bounds \(B_1,B_2,\ldots ,B_k\), respectively. This problem generalized both submodular maximization and k-submodular maximization problem with total size constraint. In this paper, we propose two results about this kind of problem. One is a \(\frac{1}{B_m+4}\)-approximation algorithm, where \(B_m=\max \{B_1,B_2,\ldots ,B_k\}\). The other is a bi-criteria algorithm with approximation ratio \(\frac{1}{4}\), where each subset is allowed to exceed the size constraint by up to \(B_m\), and in the worst case, only one subset will exceed \(B_m\).

Hao Xiao, Qian Liu, Yang Zhou, Min Li

A Heuristic Algorithm for Student-Project Allocation Problem

Abstract

The Student-Project Allocation problem with lecturer preferences over Students with Ties (SPA-ST) is to find a stable matching of students and projects to satisfy the constraints on student preferences over projects, lecturer preferences over students, and the maximum number of students given by each project and lecturer. This problem has attracted many researchers because of its wide applications in allocating students to projects at many universities worldwide. However, the main weakness of existing algorithms is their high computational cost. In this paper, we propose a heuristic algorithm to improve solution quality and execution time for solving the SPA-ST problem of large sizes. Experimental results on randomly generated datasets show that our algorithm outperforms the state-of-the-art algorithm regarding solution quality and execution time .

Nguyen Thi Uyen, Giang L. Nguyen, Canh V. Pham, Tran Xuan Sang, Hoang Huu Viet

Online File Caching on Multiple Caches in Latency-Sensitive Systems

Abstract

Motivated by the presence of multiple caches and the non-negligible fetching latency in practical scenarios, we study the online file caching problem on multiple caches in latency-sensitive systems, e.g., edge computing. Our goal is to minimize the total latency for all file requests, where a file request can be served by a hit locally, fetching from the cloud data center, a delayed hit, relaying to other caches, or bypassing to the cloud. We propose a file-weight-based algorithm, named OnMuLa, to support delayed hits, relaying and bypassing. We conduct extensive simulations on Google’ trace and a benchmark YCSB. The results show that our algorithms significantly outperform the existing methods consistently in various experimental settings. Compared with the state-of-the-art scheme supporting multiple caches and bypassing, OnMuLa can reduce the latency by \(14.77\%\) in Google’s trace and \(49.69\%\) in YCSB.

Guopeng Li, Chi Zhang, Hongqiu Ni, Haisheng Tan

Backmatter

Title: Computational Data and Social Networks
Editors: Thang N. Dinh
Minming Li
Publisher: Springer Nature Switzerland
Electronic ISBN: 978-3-031-26303-3
Print ISBN: 978-3-031-26302-6
DOI: https://doi.org/10.1007/978-3-031-26303-3

Springer Professional

About this book

Table of Contents

Frontmatter

Correction to: A Community Detection Algorithm Using Random Walk

Machine Learning and Prediction

Frontmatter

Incorporating Neighborhood Information and Sentence Embedding Similarity into a Repost Prediction Model in Social Media Networks

Driving Factors of Polarization on Twitter During Protests Against COVID-19 Mitigation Measures in Vienna

Categorizing Memes About the Ukraine Conflict

Analyzing Scientometric Indicators of Journals and Chief Editors: A Case Study in Artificial Intelligence (AI) Domain

Link Prediction of Complex Networks Based on Local Path and Closeness Centrality

The Influence of Color on Prices of Abstract Paintings

ELA: A Time-Series Forecasting Model for Liner Shipping Based on EMD-LSTM and Attention

Knowledge Transfer via Word Alignment and Its Application to Vietnamese POS Tagging

A Group Clustering Recommendation Approach Based on Energy Distance

Security and Blockchain

Frontmatter

An Implementation and Evaluation of Layer 2 for Ethereum with zk-Rollup

Targeted Attack of the Air Transportation Network Global Component

Measuring Cryptocurrency Mining in Public Cloud Services: A Security Perspective

Do Content Management Systems Impact the Security of Free Content Websites?

Fact-checking, Fake News, and Hate Speech

Frontmatter

BNnetXtreme: An Enhanced Methodology for Bangla Fake News Detection Online

Heuristic Gradient Optimization Approach to Controlling Susceptibility to Manipulation in Online Social Networks

Identifying Targeted and Generalized Offensive Speech from Anti-asian Social Media Conversations

US News and Social Media Framing Around Vaping

Network Analysis

Frontmatter

Social Network Analysis of the Caste-Based Reservation System in India

Structure, Stability, Persistence and Entropy of Stock Networks During Financial Crises

A Community Detection Algorithm Using Random Walk

Learning Heuristics for the Maximum Clique Enumeration Problem Using Low Dimensional Representations

Optimization

Frontmatter

Competitive Influence Maximisation with Nonlinear Cost of Allocations

Frank Wolfe Algorithm for Nonmonotone One-Sided Smooth Function Maximization Problem

Non-monotone k-Submodular Function Maximization with Individual Size Constraints

A Heuristic Algorithm for Student-Project Allocation Problem

Online File Caching on Multiple Caches in Latency-Sensitive Systems

Backmatter

Premium Partner