Journal of Electrical and Computer Engineering Innovations (JECEI)

Natural Language Processing

Deep Reinforcement Learning for Efficient Multilingual Dialogue Management

M. J. Nasri-Lowshani; J. Salimi Sartakhti; H. Ebrahimpour-Komole

Articles in Press, Accepted Manuscript, Available Online from 04 May 2025

https://doi.org/10.22061/jecei.2025.11348.814

Abstract

Background and Objectives: Developing efficient task-oriented dialogue systems capable of handling multilingual interactions is a growing area of research in natural language processing (NLP). In this paper, we propose SenSimpleDS, a deep reinforcement learning-based joint task-oriented dialogue system, ... Read More Background and Objectives: Developing efficient task-oriented dialogue systems capable of handling multilingual interactions is a growing area of research in natural language processing (NLP). In this paper, we propose SenSimpleDS, a deep reinforcement learning-based joint task-oriented dialogue system, designed for multilingual conversations.Methods: The system utilizes a deep Q-network and the SBERT model to represent the dialogue environment. We introduce two variants, SenSimpleDS+ and SenSimpleDS-NSP, which incorporate modifications in the ε-greedy method and leverage next sequence prediction (NSP) using BERT to refine the reward function. These methods are evaluated on datasets in English, Persian, Spanish, and German, and compared with baseline methods such as SimpleDS and SCGSimpleDS.Results: Our experimental results demonstrate that the proposed methods outperform the baselines in terms of average collected rewards, requiring fewer learning steps to achieve optimal dialogue policies. Notably, the incorporation of NSP significantly improves performance by optimizing reward collection. The multilingual SenSimpleDS further showcases the system’s ability to function across languages using a random forest classifier for language detection and MPNet for environment construction. In addition to system evaluations, we introduce a new Persian dataset for task-oriented dialogue in the restaurant domain, expanding the resources available for developing dialogue systems in low-resource languages.Conclusion: SenSimpleDS, a deep reinforcement learning-based joint task-oriented dialogue system, demonstrates superior performance over baseline methods by leveraging deep Q-networks, SBERT. The integration of next sequence prediction (NSP) significantly enhances reward optimization, enabling faster convergence to optimal dialogue policies. This work establishes a foundation for future research in multilingual dialogue systems, with potential applications across diverse service domains.

View Article

Natural Language Processing

Persian Slang Text Conversion to Formal and Deep Learning of Persian Short Texts on Social Media for Sentiment Classification

M. Khazeni; M. Heydari; A. Albadvi

Volume 13, Issue 1 , January 2025, , Pages 27-42

https://doi.org/10.22061/jecei.2024.10745.731

Abstract

Background and Objectives: The lack of a suitable tool for the analysis of conversational texts in Persian language has made various analyzes of these texts, including Sentiment Analysis, difficult. In this research, it has we tried to make the understanding of these texts easier for the machine by providing ... Read More

Natural Language Processing

Actor Double Critic Architecture for Dialogue System

Y. Saffari; J. Salimi Sartakhti

Volume 11, Issue 2 , July 2023, , Pages 363-372

https://doi.org/10.22061/jecei.2023.9346.614

Abstract

Background and Objectives: Most of the recent dialogue policy learning ‎methods are based on reinforcement learning (RL). However, the basic RL ‎algorithms like deep Q-network, have drawbacks in environments with ‎large state and action spaces such as dialogue systems. Most of the ‎policy-based ... Read More Background and Objectives: Most of the recent dialogue policy learning ‎methods are based on reinforcement learning (RL). However, the basic RL ‎algorithms like deep Q-network, have drawbacks in environments with ‎large state and action spaces such as dialogue systems. Most of the ‎policy-based methods are slow, cause of the estimating of the action value ‎using the computation of the sum of the discounted rewards for each ‎action. In value-based RL methods, function approximation errors lead to ‎overestimation in value estimation and finally suboptimal policies. There ‎are works that try to resolve the mentioned problems using combining RL ‎methods, but most of them were applied in the game environments, or ‎they just focused on combining DQN variants. This paper for the first time ‎presents a new method that combines actor-critic and double DQN named ‎Double Actor-Critic (DAC), in the dialogue system, which significantly ‎improves the stability, speed, and performance of dialogue policy learning. ‎Methods: In the actor critic to overcome the slow learning of normal DQN, ‎the critic unit approximates the value function and evaluates the quality ‎of the policy used by the actor, which means that the actor can learn the ‎policy faster. Moreover, to overcome the overestimation issue of DQN, ‎double DQN is employed. Finally, to have a smoother update, a heuristic ‎loss is introduced that chooses the minimum loss of actor-critic and ‎double DQN. ‎Results: Experiments in a movie ticket booking task show that the ‎proposed method has more stable learning without drop after ‎overestimation and can reach the threshold of learning in fewer episodes ‎of learning. ‎Conclusion: Unlike previous works that mostly focused on just proposing ‎a combination of DQN variants, this study combines DQN variants with ‎actor-critic to benefit from both policy-based and value-based RL methods ‎and overcome two main issues of both of them, slow learning and ‎overestimation. Experimental results show that the proposed method can ‎make a more accurate conversation with a user as a dialogue policy ‎learner.‎

Journal of Electrical and Computer Engineering Innovations (JECEI)

Articles in Press

Current Issue

Volume 13 (2025)

Volume 12 (2024)

Volume 11 (2023)

Volume 10 (2022)

Volume 9 (2021)

Volume 8 (2020)

Volume 7 (2019)

Volume 6 (2018)

Volume 5 (2017)

Volume 4 (2016)

Volume 3 (2015)

Volume 2 (2014)

Volume 1 (2013)

Main Subjects = Natural Language Processing

Deep Reinforcement Learning for Efficient Multilingual Dialogue Management

Abstract

Persian Slang Text Conversion to Formal and Deep Learning of Persian Short Texts on Social Media for Sentiment Classification

Abstract

Actor Double Critic Architecture for Dialogue System

Abstract