2016 | OriginalPaper | Chapter

Do Your Social Profiles Reveal What Languages You Speak? Language Inference from Social Media Profiles

Authors: Yu Xu, M. Rami Ghorab, Zhongqing Wang, Dong Zhou, Séamus Lawless

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

On the multilingual World Wide Web, it is critical for Web applications, such as multilingual search engines and targeted international advertising, to know which languages a user understands. However, online users are often unwilling to make the effort to provide this information explicitly. Additionally, language identification techniques struggle when a user does not use all the languages they know when interacting directly with an application. This work proposes a method of inferring the language(s) that online users comprehend by analyzing their social profiles. It is mainly based on the intuition that a user's experiences could imply what languages they know. This is nontrivial, however: social profiles are usually incomplete, and languages that are regionally related or similar in vocabulary may share common features, so the signals that help to infer language are scarce and noisy. To address these challenges, this work proposes a factor graph model based on language and social relations. The model draws on external resources to bring in more evidential signals, and exploits the dependency relations between languages as well as the social relations between profiles. Experiments are conducted on a large-scale dataset. The results demonstrate the success of the proposed approach in language inference and show that the proposed framework outperforms several alternative methods.
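
As a rough illustration of the general idea of a factor graph that combines profile evidence, language dependencies, and social relations, the following is a minimal toy sketch in Python. It is not the model from the chapter: the users, languages, evidence scores, coupling weights, and factor definitions are all hypothetical values chosen for illustration, and the marginals are computed by brute-force enumeration, which is only feasible for a handful of variables.

```python
import itertools
import math

# Hypothetical toy setup (not from the chapter): 2 users x 2 languages,
# one binary variable per (user, language) pair: 1 = "user knows language".
users = ["u1", "u2"]
languages = ["es", "pt"]
variables = [(u, l) for u in users for l in languages]

# Assumed unary evidence scores derived from profile attributes.
evidence = {
    ("u1", "es"): 2.0, ("u1", "pt"): 0.0,
    ("u2", "es"): 0.0, ("u2", "pt"): -0.5,
}

LANG_WEIGHT = 0.8    # assumed coupling between related languages of one user
SOCIAL_WEIGHT = 0.6  # assumed coupling between socially connected users
related_languages = [("es", "pt")]
social_edges = [("u1", "u2")]


def log_score(assignment):
    """Unnormalized log-probability: the sum of all log-factors."""
    total = 0.0
    # Attribute factors: profile evidence for each variable.
    for var, value in assignment.items():
        total += evidence[var] * value
    # Language-dependency factors: reward agreement on related languages.
    for u in users:
        for a, b in related_languages:
            if assignment[(u, a)] == assignment[(u, b)]:
                total += LANG_WEIGHT
    # Social factors: reward agreement between connected users.
    for u, v in social_edges:
        for l in languages:
            if assignment[(u, l)] == assignment[(v, l)]:
                total += SOCIAL_WEIGHT
    return total


def marginals():
    """Exact P(variable = 1), enumerating every joint assignment."""
    z = 0.0
    mass = {var: 0.0 for var in variables}
    for values in itertools.product([0, 1], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = math.exp(log_score(assignment))
        z += weight
        for var, value in assignment.items():
            if value:
                mass[var] += weight
    return {var: mass[var] / z for var in variables}


result = marginals()
for var, p in sorted(result.items()):
    print(var, round(p, 3))
```

In this sketch the pairwise factors let evidence for one variable shift the marginals of its neighbours, which is the intuition behind using language and social relations when direct signals are scarce. A realistic system would learn the weights from data and use approximate inference instead of enumeration.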

Metadata
Title
Do Your Social Profiles Reveal What Languages You Speak? Language Inference from Social Media Profiles
Authors
Yu Xu
M. Rami Ghorab
Zhongqing Wang
Dong Zhou
Séamus Lawless
Copyright Year
2016
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-30671-1_41