2017 | OriginalPaper | Book chapter

An Unsupervised Approach for Low-Quality Answer Detection in Community Question-Answering

Authors: Haocheng Wu, Zuohui Tian, Wei Wu, Enhong Chen

Published in: Database Systems for Advanced Applications

Publisher: Springer International Publishing

Abstract

Community Question Answering (CQA) sites such as Yahoo! Answers provide rich knowledge for people to access. However, the quality of answers posted to CQA sites varies widely, from precise and useful to irrelevant and useless. Automatic detection of low-quality answers therefore helps site managers organize the accumulated knowledge efficiently and provide high-quality content to users. In this paper, we propose a novel unsupervised approach to detecting low-quality answers on a CQA site. The key ideas in our model are: (1) most answers are normal; (2) a low-quality answer can be found by checking it against its “peer” answers under the same question; (3) different questions have different answer quality criteria. Based on these ideas, we devise an unsupervised learning algorithm that assigns soft labels to answers as quality scores. Experiments show that our model significantly outperforms other state-of-the-art models on answer quality prediction.
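The three key ideas can be illustrated with a small sketch. This is not the algorithm from the chapter, whose details are not given in the abstract; it is a generic iterative reweighting scheme chosen only to make the ideas concrete. It assumes each answer has already been turned into a numeric feature vector. The function names, the Gaussian-style scoring rule, and the `width`, `tol`, and `max_iter` parameters are all hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def weighted_centroid(vectors: List[Vector], weights: List[float]) -> Vector:
    """Weighted mean of the vectors; the weights are the current soft labels."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def soft_quality_scores(answers: List[Vector], tol: float = 1e-6,
                        max_iter: int = 100, width: float = 3.0) -> List[float]:
    """Return one soft score in [0, 1] per answer of a single question.

    Illustrative assumption, not the chapter's method: an answer is scored
    by how close it lies to the weighted centroid of its peers.
    """
    n = len(answers)
    if n < 2:
        return [1.0] * n  # no peers to compare against
    scores = [1.0] * n  # idea (1): start by treating every answer as normal
    for _ in range(max_iter):
        # Idea (2): the peers under the same question define what is normal.
        centroid = weighted_centroid(answers, scores)
        dists = [euclidean(a, centroid) for a in answers]
        # Idea (3): the tolerance is set per question, from its own spread.
        # `width` (an arbitrary illustrative choice) times the median distance.
        scale = width * sorted(dists)[n // 2]
        if scale == 0.0:
            scale = 1.0  # at least half of the answers coincide with the centroid
        new_scores = [math.exp(-(d / scale) ** 2) for d in dists]
        change = max(abs(a - b) for a, b in zip(scores, new_scores))
        scores = new_scores
        if change < tol:
            break
    return scores


def score_questions(questions: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Score every question independently, so each gets its own criteria."""
    return {qid: soft_quality_scores(vecs) for qid, vecs in questions.items()}


if __name__ == "__main__":
    # Four similar answers and one that lies far from its peers.
    demo = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [8.0, -5.0]]
    for vec, score in zip(demo, soft_quality_scores(demo)):
        print(vec, round(score, 3))
```

In the demo, the answer that lies far from its peers ends up with a score near 0, while the four clustered answers keep scores close to 1. Because the scale is recomputed for each question, a question whose answers are naturally diverse tolerates larger deviations than one whose answers are nearly identical.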

https://answers.yahoo.com/question/index?qid=20090408172834AArbCtu.

We set \(\epsilon =0.00001\) and \(N=200\) in our experiments.

http://alt.qcri.org/semeval2015/task3.

The Q_u_other and A_u_other features in Table 2 are traceable only in the Yahoo dataset.

http://developer.yahoo.com/answers.

http://lucene.apache.org/.

http://www.ranks.nl/stopwords.

http://gibbslda.sourceforge.net/.

https://code.google.com/archive/p/word2vec/.

http://www.statmt.org/moses/giza/GIZA++.html.

http://nlp.stanford.edu/software/tagger.shtml.

http://nlp.stanford.edu/software/lex-parser.shtml.

http://www.cs.cmu.edu/~alavie/METEOR/.

c: trade-off between training error and margin. j: cost factor by which training errors on positive examples outweigh errors on negative examples. b: whether to use a biased hyperplane.

To save space, we report results only on the Qatar dataset. Results on the Fatwa and Yahoo datasets show similar trends.

“Non-English” and “Other” answers are categorized as “Irrelevant” answers.

Title: An Unsupervised Approach for Low-Quality Answer Detection in Community Question-Answering
Authors: Haocheng Wu, Zuohui Tian, Wei Wu, Enhong Chen
Publisher: Springer International Publishing
Book: Database Systems for Advanced Applications
Print ISBN: 978-3-319-55698-7
Electronic ISBN: 978-3-319-55699-4
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-55699-4_6
