2020 | Book

Web Information Systems Engineering

WISE 2019 Workshop, Demo, and Tutorial, Hong Kong and Macau, China, January 19–22, 2020, Revised Selected Papers

Edited by: Leong Hou U, Prof. Jian Yang, Yi Cai, Kamalakar Karlapalem, An Liu, Xin Huang

Publisher: Springer Singapore

Book series: Communications in Computer and Information Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book constitutes the refereed proceedings presented at the 20th International Conference on Web Information Systems Engineering, WISE 2019, and at the International Workshop on Web Information Systems in the Era of AI, held in Hong Kong and Macau, China. Due to the problems in Hong Kong, WISE 2019 was postponed until January 2020.

The 7 workshop papers, 5 demo papers and 3 tutorial papers presented were carefully reviewed and selected from numerous submissions. The papers are organized in the following sections: tutorials; demos; the International Workshop on Web Information Systems in the Era of AI.

Table of contents

Frontmatter

Correction to: Influence Maximization Based on Community Closeness in Social Networks

Qingqing Wu, Lihua Zhou, Yaqun Huang

Tutorials

Frontmatter

Knowledge Graph Data Management: Models, Methods, and Systems

Abstract

With the rise of artificial intelligence, knowledge graphs have come to be widely regarded as a cornerstone of AI. In recent years, an increasing number of large-scale knowledge graphs have been constructed and published by both academic and industrial communities, such as DBpedia, YAGO, Wikidata, Google Knowledge Graph, Microsoft Satori, Facebook Entity Graph, and others. A knowledge graph is essentially a large network of entities, their properties, semantic relationships between entities, and the ontologies the entities conform to. This kind of graph-based knowledge data poses a great challenge to traditional data management theories and technologies. In this paper, we introduce the state-of-the-art research on knowledge graph data management, which includes knowledge graph data models, query languages, storage schemes, query processing, and reasoning. We also describe the latest development trends of various database management systems for knowledge graphs.

Xin Wang, Weixue Chen
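
To make the "network of entities, their properties, and semantic relationships" in the abstract above concrete, here is a minimal sketch of a triple-based graph with wildcard pattern matching. It assumes a subject-predicate-object triple model; the entity names, the predicates, and the `match` helper are illustrative and are not taken from the tutorial.

```python
from typing import List, Optional, Tuple

# Assumption: each fact is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples that agree with the given pattern.

    A position left as None acts as a wildcard.
    """
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Hypothetical example data.
triples: List[Triple] = [
    ("Hong_Kong", "type", "City"),
    ("Macau", "type", "City"),
    ("WISE_2019", "heldIn", "Hong_Kong"),
    ("WISE_2019", "heldIn", "Macau"),
]

# All places where WISE_2019 was held.
venues = [o for _, _, o in match(triples, s="WISE_2019", p="heldIn")]
```

A linear scan like this is only meant to show the data model; the storage schemes and query processing techniques surveyed in the tutorial exist precisely because such scans do not scale.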

Local Differential Privacy: Tools, Challenges, and Opportunities

Abstract

Local Differential Privacy (LDP), where each user perturbs her data locally before sending it to an untrusted party, is a new and promising privacy-preserving model. Endorsed by both academia and industry, LDP provides a strong and rigorous privacy guarantee for data collection and analysis. As such, it has recently been deployed by several major software and Internet companies, including Google, Apple and Microsoft, in mainstream products such as Chrome, iOS, and Windows 10. Besides industry, it has also attracted a lot of research attention from academia. This tutorial first introduces the rationale of the LDP model behind these deployed systems, which collect and analyze usage data privately, then surveys the current research landscape in LDP, and finally identifies several open problems and research directions in this community.

Qingqing Ye, Haibo Hu
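
The local perturbation described in the abstract above can be illustrated with binary randomized response, a classic mechanism that satisfies ε-LDP. This is a sketch under that assumption; the abstract does not say which mechanisms the tutorial covers, and the function names and parameters are illustrative.

```python
import math
import random
from typing import List

def perturb(bit: int, epsilon: float, rng: random.Random) -> int:
    """Run on the user's device: keep the true bit with probability
    e^eps / (e^eps + 1), otherwise report the flipped bit."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit

def estimate_frequency(reports: List[int], epsilon: float) -> float:
    """Run by the untrusted collector: unbiased estimate of the true
    fraction of 1s, obtained by inverting the known flipping probability."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p), solved for f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

# Hypothetical population: 30% of 20,000 users hold the value 1.
rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000
reports = [perturb(b, epsilon=1.0, rng=rng) for b in true_bits]
estimate = estimate_frequency(reports, epsilon=1.0)
```

The collector never sees a true bit, yet the aggregate estimate approaches the true fraction as the number of users grows; a smaller `epsilon` gives stronger privacy at the cost of a noisier estimate.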

Intelligent Knowledge Lakes: The Age of Artificial Intelligence and Big Data

Abstract

The continuous improvement in connectivity, storage and data processing capabilities allows access to a data deluge from the big data generated on open, private, social and IoT (Internet of Things) data islands. Data Lakes were introduced as a storage repository to organize this raw data in its native format until it is needed. The rationale behind a Data Lake is to store raw data and let the data analyst decide how to curate it later. Previously, we introduced the novel notion of the Knowledge Lake, i.e., a contextualized Data Lake, and proposed algorithms to turn the raw data (stored in Data Lakes) into contextualized data and knowledge using extraction, enrichment, annotation, linking and summarization techniques. In this tutorial, we introduce Intelligent Knowledge Lakes to facilitate linking Artificial Intelligence (AI) and Data Analytics. This will enable AI applications to learn from contextualized data and use it to automate business processes and develop cognitive assistance for facilitating knowledge-intensive processes or generating new rules for future business analytics.

Amin Beheshti, Boualem Benatallah, Quan Z. Sheng, Francesco Schiliro

Demos

Frontmatter

Tourism Analysis on Graphs with Neo4Tourism

Abstract

Tourists’ behavior analysis has become a popular means in Digital Tourism. Traditional ground studies have been extended with massive data analysis to confront models. Tourism actors are faced with the need to deeply understand tourists’ circulation both quantitatively and qualitatively. Thus, the challenge is to deal with data from tourist-oriented social networks by integrating huge volumes of data. In this paper, we propose the Neo4Tourism framework, based on a graph data model specialized in digital tourism analysis. Our model is dedicated to tourists’ circulation and aims at simulating tourists’ behavior. In this demonstration we discuss how our system (1) integrates data from TripAdvisor in a Neo4j graph database, (2) produces circulation graphs, and (3) enhances graph manipulation and in-depth tourist analysis with centrality.

Gaël Chareyron, Ugo Quelhas, Nicolas Travers

Personalised Drug Prescription for Dental Clinics Using Word Embedding

Abstract

The number of drugs in drug databases is constantly expanding, with novel drugs appearing on the market each year. A dentist cannot be expected to recall all the drugs available, let alone potential drug-drug interactions (DDI). This can be problematic when dispensing drugs to patients, especially those with multiple medical conditions who often take multiple medications. Any new medication prescribed must be checked against the patient’s medical history in order to avoid drug allergies and reduce the risk of adverse reactions. Current drug databases that allow the dentist to check for DDI are limited by the lack of integration of the patient’s medical profile with the drug to be prescribed. Hence, this paper introduces software that predicts the possible DDI of a new medication against the patient’s medical profile, based on previous findings that associate similarity ratio with DDI. This system is based conceptually on a three-tier framework consisting of a knowledge layer, a prediction layer and a presentation layer. The novel approach of this system in applying feature vectors for drug prescription will be demonstrated during the conference (http://r.glory.sg/main.php). By engaging with the interactive demonstration, participants will gain first-hand experience in the process from research idea to implementation. Future work includes extending its use from dental to medical institutions, and the system is currently being enhanced to serve as a training tool for medical students.

Wee Pheng Goh, Xiaohui Tao, Ji Zhang, Jianming Yong, XueLing Oh, Elizabeth Zhixin Goh

NRGQP: A Graph-Based Query Platform for Network Reachability

Abstract

This demo presents the design and implementation of a system called NRGQP that can efficiently support a variety of network reachability query services while considering the network security policies. NRGQP constructs a knowledge graph based on the network security policies and designs an algorithm over the graph for network reachability. Furthermore, to support a user-friendly interface, NRGQP proposes a structural query language named NRQL for network reachability queries.

Wenjie Li, Peng Peng, Zheng Qin, Lei Zou

ReInCre: Enhancing Collaborative Filtering Recommendations by Incorporating User Rating Credibility

Abstract

We present ReInCre (Demo video available at https://youtu.be/MyFczz7Vefo) as a solution demo for incorporating user rating credibility in the Collaborative Filtering (CF) approach to enhance recommendation performance. The credibility values of users are calculated according to their rating behavior and are utilized in discovering the neighbors (Code available at https://github.com/NaimeRanjbarKermany/Cred). To the best of our knowledge, it is the first work to incorporate the rating credibility of users in a CF recommendation. Our approach works as a powerful add-on to existing CF-based recommender systems in order to optimize the neighborhood. Experiments are conducted on a real-world dataset from Yahoo! Movies. Compared with the baselines, the experimental results show that our proposed method significantly improves the quality of recommendation in terms of precision and \(F_1\)-measure. In particular, the standard deviation of the errors between the predicted values and the real ratings becomes much smaller by incorporating credibility measurements of the users.

Naime Ranjbar Kermany, Weiliang Zhao, Jian Yang, Jia Wu

SLIND: Stable LINk Detection

Abstract

The evolutionary behavior of Online Social Networks (OSNs) is not yet well understood in many different aspects. Although there have been many developments around social applications like recommendation, prediction, detection and identification which take advantage of past observations of structural patterns, they lack the necessary representative power to adequately account for the sophistication contained within relationships between actors of a social network in real life. In this demo, we extend the innovative developments of SLIND [17] (Stable LINk Detection) to include a novel generative adversarial architecture and the Relational Turbulence Model (RTM) [15] using relational features extracted from real-time Twitter streaming data. Test results show that SLIND\(^+\) is capable of detecting relational turbulence profiles learned from prior feature evolutionary patterns in the social data stream. Representing turbulence profiles as a pivotal set of relational features improves detection accuracy and performance of well-known application approaches in this area of research.

Ji Zhang, Leonard Tan, Xiaohui Tao, Hongzhou Li, Fulong Chen, Yonglong Luo

The International Workshop on Web Information Systems in the Era of AI

Frontmatter

Efficient Privacy-Preserving Skyline Queries over Outsourced Cloud

Abstract

In the cloud computing paradigm, data owners can outsource their databases to the service provider, and thus reap huge benefits from offloading the heavy storage and management tasks to the cloud server. However, sensitive data, such as medical or financial records, should be encrypted before being uploaded to the cloud server. Unfortunately, this introduces new challenges to data utilization. In this paper, we study the problem of skyline queries in a way that preserves data privacy for both the data owner and the client. We propose a hybrid protocol based on an additively homomorphic encryption system and Yao’s garbled circuits. By taking advantage of Yao’s protocol, we design a highly improved protocol which can be used to determine the skyline point and exclude the points dominated by others in an oblivious way. Based on this subroutine, we construct a fully secure protocol for skyline queries. We theoretically prove that the protocols are secure in the semi-honest model. Through analysis and extensive experiments, we demonstrate the efficiency and scalability of our proposed solutions.

Lu Li, Xufeng Jiang, Fei Zhu, An Liu
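
For readers unfamiliar with skyline queries, the following sketch shows what the abstract above means by "skyline point" and "points dominated by others". It works on plaintext data only and contains none of the paper's privacy machinery (no homomorphic encryption, no garbled circuits). It also assumes that smaller values are better in every dimension; the function names and sample data are illustrative.

```python
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical records, e.g. (price, distance).
records = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
result = skyline(records)
```

Here `(3, 3)` and `(4, 4)` are excluded because `(2, 2)` is better in both dimensions. The privacy-preserving protocol in the paper has to perform these same dominance comparisons without revealing the values being compared.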

Leveraging Pattern Mining Techniques for Efficient Keyword Search on Data Graphs

Abstract

Graphs model complex relationships among objects in a variety of web applications. Keyword search is a promising method for exploring data graphs and extracting data from them. However, keyword search faces the so-called performance scalability problem, which hinders its widespread use on data graphs.

In this paper, we address the performance scalability problem by leveraging techniques developed for graph pattern mining. We focus on avoiding the generation of redundant intermediate results when the keyword queries are evaluated. We define a canonical form for the isomorphic representations of the intermediate results and we show how it can be checked incrementally and efficiently. We devise rules that prune the search space without sacrificing completeness and we integrate them in a query evaluation algorithm. Our experimental results show that our approach outperforms previous ones by orders of magnitude and displays smooth scalability.

Xinge Lu, Dimitri Theodoratos, Aggeliki Dimitriou

Range Nearest Neighbor Query with the Direction Constraint

Abstract

In this paper, we study a direction-aware spatial data query method, i.e., range nearest neighbor query with the direction constraint (Range-DCNN query). The traditional DCNN query retrieves the top-k nearest neighbors within an angular range. Our Range-DCNN query finds all nearest neighbors within an angular range for all points in a rectangle. Unlike the traditional DCNN query, the user’s location in the Range-DCNN query is abstracted to a rectangle rather than a point, and the user’s location can be anywhere in the rectangle. In doing so, the user’s precise location is not leaked, which ensures effective privacy protection of the user’s location. In the Range-DCNN query, we observe that splitting points can be utilized to obtain all query results without having to search for all points. We propose some properties for locating splitting points. According to these properties, efficient algorithms are designed with the assistance of the R-tree. Extensive experiments have been conducted on both real and synthetic datasets. The experimental results demonstrate that our algorithms are capable of locating all results precisely and efficiently.

Xue Miao, Xi Guo, Xiaochun Yang, Zhaoshun Wang, Peng Lv
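
As a point of reference for the abstract above, the traditional DCNN query it builds on (top-k nearest neighbors of a single query point within an angular range) can be sketched with a brute-force scan. This is not the paper's Range-DCNN algorithm: it uses neither splitting points nor an R-tree. The sector convention (counter-clockwise from `lo` to `hi`, in radians, narrower than a full circle) and all names are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def in_angular_range(angle: float, lo: float, hi: float) -> bool:
    """True if angle lies in the counter-clockwise sector from lo to hi.
    Assumes the sector is narrower than a full circle."""
    two_pi = 2.0 * math.pi
    return (angle - lo) % two_pi <= (hi - lo) % two_pi

def dcnn(query: Point, points: List[Point],
         lo: float, hi: float, k: int = 1) -> List[Point]:
    """Brute-force top-k nearest neighbors of `query` whose bearing
    from `query` falls inside the sector [lo, hi]."""
    qx, qy = query
    candidates = []
    for x, y in points:
        if (x, y) == query:
            continue  # a point has no bearing to itself
        bearing = math.atan2(y - qy, x - qx)
        if in_angular_range(bearing, lo, hi):
            candidates.append((math.hypot(x - qx, y - qy), (x, y)))
    candidates.sort()
    return [p for _, p in candidates[:k]]

# Hypothetical data: the user at the origin looks between 0 and 45 degrees.
pois = [(1.0, 0.5), (0.5, 2.0), (-1.0, 0.0), (3.0, 1.0)]
nearest = dcnn((0.0, 0.0), pois, lo=0.0, hi=math.pi / 4, k=2)
```

The Range-DCNN query generalizes this from one query point to every point of a rectangle, which is why the paper needs splitting points rather than a per-point scan.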

Cloud Service Access Frequency Estimation Based on a Stream Filtering Method

Abstract

Cloud service discovery forms the foundation of the efficient and agile implementation of complex business processes. The core problem of existing QoS-aware cloud service discovery mechanisms is that the process of cloud service QoS acquisition is difficult. The issue of how to obtain the number of times a cloud service has been accessed over a period of time needs to be addressed, and the access information for the cloud service needs to be fully recorded. It is difficult to adapt traditional means of data processing to the concurrent access requirements of a massive cloud service, resulting in a lack of accurate QoS information support for cloud service aggregation. This paper proposes a method based on bucket filtering to collect cloud service access flow log information. It then explores a way of abstracting cloud service access flow into a binary bit stream, and uses the DGIM algorithm to approximately evaluate cloud service access and thereby analyse cloud service access flow. Our approach enables an estimation of cloud service access frequency and balances the space and time overheads of cloud service access log storage and calculation. Theoretical analysis and experimental verification show that our approach has good universality and good performance.

Shiting Wen, Jinqiu Yang, Chaoyan Zhu, Genlang Chen
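
The DGIM algorithm named in the abstract above approximates the number of 1s in the last N bits of a binary stream using logarithmically many buckets. The sketch below is a textbook-style version, not the paper's implementation: the bucket-filtering log collection step is omitted, and the mapping "1 = the service was accessed in this time slot" as well as the class and variable names are assumptions for illustration.

```python
import random
from typing import List, Tuple

class DGIM:
    """Approximate count of 1s among the last `window` bits of a stream."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.time = 0
        # Each bucket is (timestamp of its most recent 1, size); newest first.
        self.buckets: List[Tuple[int, int]] = []

    def add(self, bit: int) -> None:
        self.time += 1
        # Drop the oldest bucket once its most recent 1 leaves the window.
        while self.buckets and self.buckets[-1][0] <= self.time - self.window:
            self.buckets.pop()
        if bit != 1:
            return
        self.buckets.insert(0, (self.time, 1))
        # Keep at most two buckets per size: whenever a size occurs three
        # times, merge its two oldest buckets into one of double size.
        i = 0
        while (i + 2 < len(self.buckets)
               and self.buckets[i][1] == self.buckets[i + 2][1]):
            newer, older = self.buckets[i + 1], self.buckets[i + 2]
            self.buckets[i + 1:i + 3] = [(newer[0], newer[1] + older[1])]
            i += 1

    def estimate(self) -> int:
        """All buckets count in full except the oldest, which counts half."""
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        return total - self.buckets[-1][1] // 2

# Hypothetical access stream: 1 means the service was hit in that time slot.
rng = random.Random(7)
stream = [1 if rng.random() < 0.3 else 0 for _ in range(1000)]
counter = DGIM(window=100)
for b in stream:
    counter.add(b)
exact = sum(stream[-100:])
approx = counter.estimate()
```

The estimate is guaranteed to be within 50% of the exact count while storing only O(log N) buckets instead of the full window, which is the space and time trade-off the abstract refers to.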

Influence Maximization Based on Community Closeness in Social Networks

Abstract

Influence maximization (IM), which aims to find the most influential users in social networks in order to maximize the reach of influence, has long been a hot research topic in network analysis. In recent years, many studies have focused on improving the efficiency of IM by taking advantage of small-scale community structures. However, the existing community-based methods only consider the number of nodes in a community and ignore the density of edge connections in a community. In addition, existing methods can only be applied to non-overlapping community structures. In this paper, we propose a community closeness-based influence maximization algorithm (CCIM) to select the most influential nodes. CCIM considers both point-to-point and point-to-community influence, reflecting micro-level and meso-level influence. The experimental results on synthetic datasets and three real datasets verify that CCIM outperforms the state-of-the-art baselines.

Qingqing Wu, Lihua Zhou, Yaqun Huang

Can Reinforcement Learning Enhance Social Capital?

Abstract

Social capital captures the positional advantage gained by an individual by being in a social network. A well-known dichotomy defines two types of social capital: bonding capital, which refers to welfare such as trust and norms, and bridging capital, which refers to benefits in terms of influence and power. We present a framework where these notions are mathematically conceptualized. Through the framework, we discuss the process when an individual gains social capital through building new edges. We explore two questions: (1) How would an individual optimally form new relations? (2) What are the impacts of the network structure on the individual’s social capital? For these questions, we adopt a paradigm where the individual is a utility-driven agent who acquires knowledge about the network through repeated trial-and-error. In this paradigm, we propose two reinforcement learning algorithms: one guarantees the convergence to optimal values in theory, while the other is efficient in practice. We conduct experiments over both synthetic and real-world networks. Experimental results indicate that a centralized structure can enhance the performance of learning.

He Zhao, Hongyi Su, Yang Chen, Jiamou Liu, Bo Yan, Hong Zheng

A Novel Event Detection Model Based on Graph Convolutional Network

Abstract

With the rapid development of society, economy, politics and science, a vast number of news reports are collected daily. How to detect news events and discover the underlying event evolution pattern has become an urgent problem. Many existing works address this problem, but most only use TF-IDF or LDA features to extract limited semantic information, and the structural information of documents has yet to be exploited. In this paper, we propose a novel Graph Convolutional Network based event detection model for news streams, named NED-GCN. The proposed model utilizes ConceptGraph to represent a document and takes both the semantic information and the structural information of a document fully into account. Further, a Siamese Graph Convolutional Network (SiamGCN) is presented to calculate the similarity between a document pair via shared weights for document embedding learning, and finally the learned document embeddings are clustered to generate events. Experimental evaluation on two real datasets shows that our method outperforms the state-of-the-art approaches in event detection.

Pengpeng Zhou, Baoli Zhang, Bin Wu, Yao Luo, Nianwen Ning, Jiaying Gong

Backmatter

Title: Web Information Systems Engineering
Edited by: Leong Hou U
Prof. Jian Yang
Yi Cai
Kamalakar Karlapalem
An Liu
Xin Huang
Publisher: Springer Singapore
Electronic ISBN: 978-981-15-3281-8
Print ISBN: 978-981-15-3280-1
DOI: https://doi.org/10.1007/978-981-15-3281-8