Published in: Journal of Intelligent Information Systems 2/2009

01.04.2009

An adaptive personalized news dissemination system

Written by: Ioannis Katakis, Grigorios Tsoumakas, Evangelos Banos, Nick Bassiliades, Ioannis Vlahavas

Abstract

With the explosive growth of the World Wide Web, information overload has become a crucial concern. In a data-rich, information-poor environment like the Web, separating useful or desirable information from large volumes of mostly worthless data has become a tedious task. The role of machine learning in tackling this problem is thoroughly discussed in the literature, but few systems are available for public use. In this work, we bridge theory and practice by implementing a web-based news reader enhanced with a specifically designed machine learning framework for dynamic content personalization. This lets us examine applicability and implementation issues and discuss the effectiveness of machine learning methods for the classification of real-world text streams. The main features of our system, named PersoNews, are: (a) the aggregation of many different news sources that offer an RSS version of their content, (b) incremental filtering, which personalizes content dynamically not only per user but also per feed that a user is subscribed to, and (c) the ability for every user to follow a more abstract topic of interest by filtering through a taxonomy of topics. PersoNews is freely available for public use on the WWW (http://news.csd.auth.gr).
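
To make feature (b) concrete, the following is a minimal sketch of incremental filtering with one classifier per (user, feed) pair. It is not the PersoNews framework: it substitutes a plain multinomial Naive Bayes whose vocabulary grows as documents arrive, so it can start with no initial training set. The class name, the helper functions, the tokenizer and the labels "interesting" and "uninteresting" are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes updated one labelled document at a time.

    Illustrative stand-in only; not the learning framework of the paper.
    """

    def __init__(self):
        self.doc_counts = Counter()              # label -> documents seen
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocabulary = set()                  # grows with every update

    @staticmethod
    def tokenize(text):
        # Hypothetical tokenizer: lowercase alphabetic runs only.
        return re.findall(r"[a-z]+", text.lower())

    def update(self, text, label):
        """Absorb one piece of user feedback without retraining from scratch."""
        words = self.tokenize(text)
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the most probable label, or None before any feedback."""
        if not self.doc_counts:
            return None
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = [w for w in self.tokenize(text) if w in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus Laplace-smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# One independent classifier per (user, feed) pair.
classifiers = defaultdict(IncrementalNaiveBayes)


def give_feedback(user, feed, text, label):
    classifiers[(user, feed)].update(text, label)


def classify(user, feed, text):
    return classifiers[(user, feed)].predict(text)


give_feedback("alice", "tech", "new machine learning paper released", "interesting")
give_feedback("alice", "tech", "celebrity gossip and fashion news", "uninteresting")
print(classify("alice", "tech", "machine learning workshop"))
print(classify("bob", "tech", "machine learning workshop"))
```

Keying the classifiers by (user, feed) means feedback that one user gives on one feed never affects another user or another feed, which is the per-user, per-feed behaviour the abstract describes. A classifier that has received no feedback yet simply abstains.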

Footnotes
1
The Apache SpamAssassin Project: http://spamassassin.apache.org/
 
2
SpamBayes: Bayesian Anti-Spam Classifier: http://spambayes.sourceforge.net/
 
18
Note that all IFS-enhanced methods can be applied with no initial training set. Unfortunately, the three baseline methods described in the section need a set of training documents in order to construct the feature space that they use.
 
19
The respective figures for the spam corpus are similar.
 
33
We treat the classification of a message as uninteresting as the positive class.
 
Metadata
Title
An adaptive personalized news dissemination system
Authors
Ioannis Katakis
Grigorios Tsoumakas
Evangelos Banos
Nick Bassiliades
Ioannis Vlahavas
Publication date
01.04.2009
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2009
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-008-0053-8
