Abstract
The large volume of personal data now being generated gives data centers ample resources to mine individual characteristics from private records. User modeling has long been a fundamental task whose goal is to capture the latent characteristics of users from their behaviors. However, centralized user modeling on collected data has raised concerns about data misuse and privacy leakage. As a result, federated user modeling has come into favor, since it aims to provide secure multi-client collaboration for user modeling through federated learning. Unfortunately, to the best of our knowledge, existing federated learning methods ignore the inconsistency among clients and therefore cannot be applied directly to practical user modeling scenarios, where they face the following critical challenges: 1) Statistical heterogeneity. User data in different clients are not always independent and identically distributed (IID), so each client is unique and needs personalized information; 2) Privacy heterogeneity. User data contains both public and private information with different levels of privacy, so we must balance which information is shared and which is protected; 3) Model heterogeneity. The local user models trained on client records are heterogeneous and thus require flexible aggregation on the server; 4) Quality heterogeneity. Low-quality information from inconsistent clients undermines the reliability of user models and offsets the benefit of high-quality information, so the high-quality information should be augmented during the process. To address these challenges, we first propose a novel client-server framework, Hierarchical Personalized Federated Learning (HPFL), whose primary goal is to support federated learning for user modeling across inconsistent clients.
More specifically, each client trains its local user model and delivers it through hierarchical components, which carry hierarchical information reflecting privacy heterogeneity, in order to join the federated collaboration. Moreover, each client updates its personalized user model with a fine-grained personalized update strategy that addresses statistical heterogeneity. Correspondingly, the server flexibly aggregates the hierarchical components from heterogeneous user models, handling privacy and model heterogeneity with a differentiated component aggregation strategy. To augment high-quality information and generate high-quality user models, we extend HPFL to the Augmented-HPFL (AHPFL) framework by incorporating augmented mechanisms, which filter out low-quality information such as noise, sparse information, and redundant information. Specifically, we construct two implementations of AHPFL, AHPFL-SVD and AHPFL-AE, in which the augmented mechanisms are based on SVD (singular value decomposition) and AE (autoencoder), respectively. Finally, we conduct extensive experiments on real-world datasets, which demonstrate the effectiveness of both the HPFL and AHPFL frameworks.
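To make the idea of an SVD-based filter concrete, the following is a minimal sketch of generic truncated-SVD denoising, not the AHPFL-SVD mechanism itself, whose details the abstract does not give. The function name `svd_filter`, the choice of `rank`, and the synthetic matrix standing in for a client's component are all assumptions made for illustration.

```python
import numpy as np


def svd_filter(matrix, rank):
    """Return the best rank-`rank` approximation of `matrix` (truncated SVD).

    Hypothetical helper: keeping only the largest singular values discards
    the low-energy directions, which is one common way to suppress noise
    and redundancy in a matrix.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(rank, s.size)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Synthetic stand-in for a client component: a rank-2 signal plus small noise.
rng = np.random.default_rng(0)
clean = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 8))
noisy = clean + 0.01 * rng.normal(size=clean.shape)

denoised = svd_filter(noisy, rank=2)
print(denoised.shape, np.linalg.matrix_rank(denoised))
```

The rank is the main design choice in such a filter: too small a rank discards useful signal, while too large a rank keeps the noise. How AHPFL-SVD selects it is not stated in the abstract.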
Index Terms
- Federated User Modeling from Hierarchical Information