A New Model for Text Coherence Evaluation Using Statistical Characteristics

Authors

1 Kharazmi International Campus, Shahrood University, Shahrood, Iran.

2 Kharazmi International Campus, Shahrood University, Shahrood, Iran.

Abstract

Evaluating discourse coherence is a critical but challenging task for content analysis in Natural Language Processing subfields such as text summarization, question answering, text generation, and machine translation. Existing methods, such as entity-based and graph-based models, rely on the semantic and linguistic concepts of a text; as a result, they cannot solve the problem well, since they are restricted to the word co-occurrence information available in sequential sentences within a short portion of a text. A major limitation of these methods is their poor performance on long documents: they are suitable only for documents with a small number of sentences. Our proposed method addresses both local and global coherence. It can assess the local topic integrity of a text at the paragraph level, independently of word meaning and handcrafted rules. Global coherence is evaluated through sequential paragraph dependency. Building on word embeddings and statistical approaches, the presented method incorporates external word-correlation knowledge into short and long stories to assess local and global coherence simultaneously. By combining word2vec vectors with the most likely n-grams, we show that the proposed method is independent of the language and its semantic concepts. The results indicate that the proposed method achieves higher accuracy than the other algorithms on long documents with a large number of sentences.
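To illustrate the statistical, meaning-free flavor of the approach described above, the following is a minimal sketch, not the authors' actual model: sentence vectors are formed by averaging word embeddings, and local coherence is approximated as the mean cosine similarity between consecutive sentence vectors. The tiny hand-made embedding table is a stand-in for real word2vec vectors, and all function names are illustrative only.

```python
import math

# Placeholder embeddings; a real system would load trained word2vec vectors.
TOY_VECTORS = {
    "the": [0.1, 0.2, 0.1], "cat": [0.9, 0.1, 0.0], "sat": [0.8, 0.2, 0.1],
    "mat": [0.7, 0.1, 0.2], "on": [0.1, 0.1, 0.2], "it": [0.2, 0.2, 0.2],
    "purred": [0.85, 0.15, 0.05], "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8], "sharply": [0.05, 0.1, 0.85],
}

def sentence_vector(sentence):
    """Average the embeddings of the known words in a whitespace-tokenized sentence."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def local_coherence(sentences):
    """Mean cosine similarity between consecutive sentence vectors."""
    vecs = [sentence_vector(s) for s in sentences]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0

# A topically continuous pair scores higher than an abrupt topic jump.
coherent = local_coherence(["the cat sat on the mat", "it purred"])
incoherent = local_coherence(["the cat sat on the mat", "stocks fell sharply"])
assert coherent > incoherent
```

Note that this proxy uses no semantic rules or entity annotations, only vector statistics, which is the property the abstract emphasizes; the full model additionally weights sentences by their most likely n-grams and models paragraph-level dependency for global coherence.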
