Original Research Paper
Artificial Intelligence
S. H. Zahiri; R. Iranpoor; N. Mehrshad
Abstract
Background and Objectives: Person re-identification is an important application in computer vision, enabling the recognition of individuals across non-overlapping camera views. However, the large number of pedestrians with varying appearances, poses, and environmental conditions makes this task particularly challenging. To address these challenges, various learning approaches have been employed. Achieving a balance between speed and accuracy is a key focus of this research. Recently introduced transformer-based models have made significant strides in machine vision, though they have limitations in terms of time and input data. This research aims to balance these models by reducing the input information, focusing attention solely on features extracted from a convolutional neural network model. Methods: This research integrates convolutional neural network (CNN) and Transformer architectures. A CNN extracts the important features of a person in an image, and these features are then processed by the attention mechanism of a Transformer model. The primary objective of this work is to enhance computational speed and accuracy in Transformer architectures. Results: The results obtained demonstrate an improvement in the performance of the architectures under consistent conditions. In summary, for the Market-1501 dataset, the mAP metric increased from approximately 30% in the downsized Transformer model to around 74% after applying the desired modifications. Similarly, the Rank-1 metric improved from 48% to approximately 89%. Conclusion: Although it still has limitations compared to larger Transformer models, the downsized Transformer architecture has proven to be much more computationally efficient. Applying similar modifications to larger models could also yield positive effects. Balancing computational costs while improving detection accuracy remains a relative goal, dependent on specific domains and priorities. 
Choosing the appropriate method may emphasize one aspect over another.
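The CNN-plus-attention pipeline this abstract describes can be illustrated with a minimal NumPy sketch: pooled CNN feature vectors are treated as tokens and passed through a single attention step. The identity Q/K/V projections, token count, and feature width are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feats):
    """Single-head self-attention over a set of feature vectors.

    feats: (n, d) array, e.g. spatially pooled CNN feature vectors.
    Identity Q/K/V projections keep the sketch minimal.
    """
    n, d = feats.shape
    scores = feats @ feats.T / np.sqrt(d)  # (n, n) scaled dot products
    weights = softmax(scores, axis=-1)     # each row sums to 1
    return weights @ feats                 # attention-weighted features

rng = np.random.default_rng(0)
cnn_feats = rng.standard_normal((6, 16))   # 6 spatial tokens, 16 channels
out = self_attention(cnn_feats)
```

Restricting attention to a small set of CNN-derived tokens, rather than raw image patches, is what keeps the attention cost low in this kind of hybrid.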
Original Research Paper
Wireless Networks
F. Rahdari; M. Sheikh-Hosseini; M. Jamshidi
Abstract
Background and Objectives: This research addresses the performance drop of edge users in downlink non-orthogonal multiple access (NOMA) systems. The challenging issue is pairing the users, which becomes more critical in the case of edge users due to poor signal quality as well as the similarity of users' channel gains. Methods: To study this issue, the capabilities of intelligent reflecting surface (IRS) technology are investigated to enhance system performance by modifying the propagation environment through intelligently adjusting the IRS components. In doing so, an optimization problem is formulated to determine the optimal user powers and phase shifts of the IRS elements. The objective is to maximize the system sum rate by considering the channel gain difference constraint. Additionally, the study addresses the effect of the IRS location in the cell on system performance. Results: The proposed approach is evaluated for various scenarios and compared with benchmarks in terms of average bit error rate (BER) and sum rate. The numerical results show that IRS-assisted NOMA improves the performance of edge users and distributes resources more fairly compared to conventional NOMA. Conclusion: Simulation results demonstrate that using IRS-assisted NOMA can effectively address the issue of edge users. By modifying the channel between the BS and the edge users using the IRS, the channel gain difference of the users is increased, thereby enhancing the overall system performance. In particular, the proposed IRS-NOMA system offers a gain of about 4 dB at a BER of 0.01 and 3 dB at a sum rate of 0.1 bps/Hz compared to conventional NOMA. In addition, it was observed that the location of the IRS in the cell affects the system's performance.
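Why the edge user benefits from a larger channel gain can be seen in a toy two-user downlink NOMA rate calculation. The gains, power split, and the way the IRS is mimicked here (simply scaling the edge user's effective gain) are illustrative assumptions, not the paper's IRS optimization.

```python
import numpy as np

def noma_sum_rate(g_near, g_far, p_near, p_far, noise=1.0):
    """Two-user downlink NOMA sum rate (bps/Hz), illustrative model.

    The far (edge) user decodes its signal treating the near user's
    signal as interference; the near user removes the far user's
    signal via successive interference cancellation (SIC).
    """
    r_far = np.log2(1 + p_far * g_far / (p_near * g_far + noise))
    r_near = np.log2(1 + p_near * g_near / noise)
    return r_near + r_far

# An IRS can be thought of as raising the edge user's effective channel
# gain; here that effect is mimicked by simply scaling g_far.
base = noma_sum_rate(g_near=10.0, g_far=0.5, p_near=0.2, p_far=0.8)
irs = noma_sum_rate(g_near=10.0, g_far=2.0, p_near=0.2, p_far=0.8)
```

Increasing the gap between the paired users' gains improves both the edge user's rate and the SIC decoding order, which is the intuition behind the channel gain difference constraint.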
Original Research Paper
Electronics
E. Shafigh Fard; M. A. Jabraeil Jamali; M. Masdari; K. Majidzadeh
Abstract
Background and Objectives: A Network-on-Chip (NoC) is a scalable communication framework that supports several cores. In some cases, while designing a customized Network-on-Chip, the communication needs across IP cores are often uneven, resulting in imbalanced loads on the input ports of a router. The arbitration unit plays a crucial role in the design of the NoC micro-router architecture, as it substantially influences the performance, chip occupancy, and power consumption of the NoC. Methods: This article presents a router arbitration architecture that utilizes a mix of variable priority arbitration and round-robin methods. Within this architectural framework, the arbitration process evaluates other channels' requests using the round-robin index. A novel approach was suggested to integrate a network router unit onto a single chip, offering several benefits compared to earlier methods. The most significant advantage is its variable priority feature, which allows inputs to be assigned different priority levels regardless of the design circuit. The system is meant to prioritize fairness across all requests by sequentially executing them. The second and primary benefit of the developed circuit is its ability to retain the previously assigned virtual channel ID. This feature preserves the provided virtual channel ID and reduces the time required to verify the requested virtual channels in the subsequent cycle. Results: The evaluation process occurs after the flit has requested to leave the virtual channel and the availability of the corresponding virtual channel has been verified. The simulation findings demonstrate that the RR-SFVP arbitration unit's design is 12.1% more compact in area than the standard RR approach, offering a promising solution for space-constrained designs. It exhibits 4.3% lower power consumption, a significant improvement in energy efficiency, and 55.1% reduced critical path time, enhancing the system's overall performance. 
Conclusion: The RR-SFVP technique incorporates all favorable elements in the design of the arbitration unit circuit, such as variable priority and equitable arbitration. Its clear benefits make a strong case for its superiority in the field.
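The fairness property that round-robin arbitration provides can be sketched behaviorally: the grant pointer rotates past the last winner, so no requester is starved. This is an illustration of rotating priority only, not the RR-SFVP circuit itself.

```python
class RoundRobinArbiter:
    """Round-robin arbiter sketch: the grant pointer advances past the
    last winner so every requester is served in turn (fairness)."""

    def __init__(self, n_ports):
        self.n = n_ports
        self.ptr = 0  # port index that currently has highest priority

    def grant(self, requests):
        """requests: list of bools, one per input port.
        Returns the granted port index, or None if nothing is requested."""
        for i in range(self.n):
            port = (self.ptr + i) % self.n
            if requests[port]:
                self.ptr = (port + 1) % self.n  # rotate priority
                return port
        return None

arb = RoundRobinArbiter(4)
grants = [arb.grant([True, True, False, True]) for _ in range(3)]
```

With ports 0, 1, and 3 requesting continuously, the arbiter grants 0, then 1, then 3 in successive cycles; a variable-priority scheme like the one in the paper would additionally let designated ports jump this rotation.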
Original Research Paper
Microwave Filters
S. Barzegar-Parizi
Abstract
Background and Objectives: The design of circuit analog absorbers, comprising resistive and conductive patterns on a dielectric substrate placed above the ground plane with an air spacer, is of interest to researchers in the microwave regime. A broad absorption band can be achieved by appropriately designing the structure parameters so that the input impedance of the structure matches the impedance of free space over a wide operating band. In this study, a wideband circuit analog absorber including a double layer of resistive frequency selective surfaces (FSS) is proposed. Methods: The proposed structure is composed of two layers of periodic arrays of strips loaded with lumped resistors, deposited on dielectric substrates and separated by an air spacer. The strips of each layer are orthogonal to each other. The structure is placed on a metallic back reflector with an air spacer. The bottom resistive FSS, including resistor-loaded strips directed in the x-direction, plays the effective role of producing the resonant frequencies when TM polarization waves are excited and leads to a wide high-frequency absorption band, while the top resistive FSS, including resistor-loaded strips directed in the y-direction, plays the effective role in exciting the resonances for TE polarization that can produce a broad low-frequency absorption band. Indeed, in each polarization, one of the resistive FSSs acts as a resonator while the other acts as a transparent layer and transmits the wave. A circuit model for characterizing the proposed structure is presented for both TE and TM polarizations in the subwavelength regime, which shows good agreement with the full-wave simulations. Results: The results demonstrate that reflectivity below −10 dB (absorption above 90%) is obtained from 3.55 to 9.82 GHz (fractional bandwidth of 93%) under normal incidence for TE polarization, while with TM incident wave excitation, absorption above 90% from 9.44 to 20.85 GHz (fractional bandwidth of 75%) can be achieved. Conclusion: The proposed structure leads to a wideband absorber with different bandwidths corresponding to the excitation of TE and TM incident waves. Most of the structures proposed in the literature produce similar bandwidths for both polarizations; therefore, a polarization-controlled wideband absorber is designed in this work.
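The transmission-line circuit-model idea behind such absorbers can be illustrated with the simplest case: a single resistive sheet over a grounded air spacer (a Salisbury-screen-style model). The sheet resistance and spacer thickness below are illustrative values, and this single-layer sketch is far simpler than the paper's double-layer FSS model.

```python
import numpy as np

ETA0 = 376.73  # free-space wave impedance (ohm)

def reflectivity_db(freq_hz, sheet_r=376.73, spacer_m=0.01):
    """|Gamma| in dB for a resistive sheet over a grounded air spacer,
    computed from the transmission-line circuit model."""
    k = 2 * np.pi * freq_hz / 3e8                 # free-space wavenumber
    z_short = 1j * ETA0 * np.tan(k * spacer_m)    # shorted-stub impedance
    z_in = (sheet_r * z_short) / (sheet_r + z_short)  # sheet || stub
    gamma = (z_in - ETA0) / (z_in + ETA0)         # mismatch to free space
    return 20 * np.log10(np.abs(gamma))

f = np.linspace(1e9, 15e9, 500)
r_db = reflectivity_db(f)
```

With a 10 mm spacer the quarter-wave resonance sits near 7.5 GHz, where the input impedance approaches the sheet resistance and reflection drops sharply; stacking two orthogonal resistive FSS layers, as in the paper, is what decouples the TE and TM bands.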
Original Research Paper
Communications
H. Alizadeh Ghazijahani; M. Atashbar
Abstract
Background and Objectives: The combination of multiple-input-multiple-output (MIMO) with a visible light communication (VLC) system leads to a higher speed of data transmission, named the MIMO-VLC system. In multi-user (MU) MIMO-VLC, an LED array transmits signals to users. These signals are categorized as signals of private information for each user and signals of public information for all users. Methods: In this research, we design an omnidirectional precoding to transmit the signals of public information in the MU-MIMO-VLC network. We aim to maximize the achievable rate, which leads to maximizing the received mean power at the possible locations of the users. Besides maximizing the achievable rate, we consider equal mean transmission power constraints on all LEDs to achieve higher power efficiency of the power amplifiers used in the LED array. Based on this, we formulate an optimization problem in which the constraint is in the form of a manifold, and utilize a gradient method projected on the manifold to solve the problem. Results: Simulation results indicate that the proposed omnidirectional precoding can achieve higher received mean power along with more than a tenfold bit error rate reduction compared to the classical form without precoding. Conclusion: In this research, we proposed an omnidirectional precoding for transmitting the public signals in the MU-MIMO-VLC system. The proposed optimization problem maximizes the received mean power subject to equal mean transmission power of the LEDs in the array.
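The projected-gradient-on-a-manifold idea can be sketched for a generic precoder with an equal per-element power (fixed-modulus) constraint: take a gradient ascent step on the received power, then project each entry back onto its circle. The complex-valued formulation, step size, and dimensions are illustrative assumptions, not the paper's VLC signal model.

```python
import numpy as np

def projected_gradient_precoder(H, iters=200, step=0.05, seed=1):
    """Maximize the received power ||H w||^2 subject to equal
    per-element power |w_i| = 1/sqrt(n) (a product-of-circles
    manifold), via gradient ascent followed by projection."""
    m, n = H.shape
    rng = np.random.default_rng(seed)
    w = np.exp(1j * rng.uniform(0, 2 * np.pi, n)) / np.sqrt(n)
    G = H.conj().T @ H
    for _ in range(iters):
        w = w + step * (G @ w)  # ascent direction of w^H G w
        # project back onto the manifold: restore each entry's modulus
        w = w / (np.sqrt(n) * np.maximum(np.abs(w), 1e-12))
    return w

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 8))  # illustrative channel matrix
w = projected_gradient_precoder(H)
```

The equal-modulus constraint is what lets every power amplifier in the array run at the same mean power, which is the efficiency argument made in the abstract.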
Original Research Paper
Artificial Intelligence
S. Kabiri Rad; V. Afshin; S. H. Zahiri
Abstract
Background and Objectives: When dealing with high-volume and high-dimensional datasets, the distribution of samples becomes sparse, and issues such as feature redundancy or irrelevance arise. Dimensionality reduction techniques aim to exploit the correlation between features and map the original features into a lower-dimensional space. This usually reduces the computational burden and increases performance. In this paper, we study the problem of predicting heart disease in a situation where the dataset is large and (or) the proportion of instances belonging to one class compared to the others is significantly low. Methods: We investigated three prominent dimensionality reduction techniques, namely Principal Component Analysis (PCA), Information Bottleneck (IB), and Variational Autoencoder (VAE), on popular classification algorithms. To have adequate samples in all classes to properly feed the classifier, an efficient data balancing technique is used to compensate for fewer positives than negatives. Among the data balancing methods, a SMOTE-based method is selected, which generates new samples at the boundary of the sample distribution and avoids synthesizing noisy and redundant data. Results: The experimental results show that the VAE-based method outperforms the other dimensionality reduction algorithms in the performance measures. The proposed hybrid method improves accuracy to 97.1% and sensitivity to 99.2%. Conclusion: Finally, it can be concluded that the combination of VAE with oversampling algorithms can significantly enhance system performance as well as reduce computational time.
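The interpolation idea behind SMOTE-family balancers can be sketched in NumPy: each synthetic sample lies on the segment between a minority sample and one of its minority-class neighbours. This is plain SMOTE-style interpolation, not the specific boundary-aware variant the paper selects.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Generate synthetic minority samples by interpolating each
    chosen sample toward one of its k nearest minority neighbours
    (the core idea behind SMOTE-family balancers)."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]   # k nearest, skipping the sample itself
        j = rng.choice(nbrs)
        lam = rng.random()              # random interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

rng = np.random.default_rng(1)
minority = rng.standard_normal((10, 4))   # illustrative minority class
synthetic = smote_like_oversample(minority, n_new=20)
```

Because every synthetic point is a convex combination of two real minority samples, the new data stays inside the minority region rather than introducing outliers, which is the noise-avoidance property the abstract highlights.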
Original Research Paper
Deep Learning
Z. Raisi; V. M. Nazarzehi Had; E. Sarani; R. Damani
Abstract
Background and Objectives: Research on right-to-left scripts, particularly Persian text recognition in wild images, is limited due to the lack of a comprehensive benchmark dataset. Applying state-of-the-art (SOTA) techniques to existing Latin or multilingual datasets often results in poor recognition performance for Persian scripts. This study aims to bridge this gap by introducing a comprehensive dataset for Persian text recognition and evaluating SOTA models on it. Methods: We propose a Farsi (Persian) text recognition (FATR) dataset, which includes challenging images captured in various indoor and outdoor environments. Additionally, we introduce FATR-Synth, the largest synthetic Persian text dataset, containing over 200,000 cropped word images designed for pre-training scene text recognition models. We evaluate five SOTA deep learning-based scene text recognition models using the standard word recognition accuracy (WRA) metric on the proposed datasets. We compare the performance of these recent architectures qualitatively on challenging sample images of the FATR dataset. Results: Our experiments demonstrate that SOTA recognition models' performance declines significantly when tested on the FATR dataset. However, when trained on synthetic and real-world Persian text datasets, these models demonstrate improved performance on Persian scripts. Conclusion: Introducing the FATR dataset enhances the resources available for Persian text recognition, improving model performance. The proposed datasets, trained models, and code are available at https://github.com/zobeirraisi/FATDR.
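The word recognition accuracy (WRA) metric used for evaluation is simple to state in code: the fraction of word images whose predicted string exactly matches the ground truth. The Persian sample words below are illustrative, not taken from the dataset.

```python
def word_recognition_accuracy(predictions, ground_truth):
    """WRA: fraction of word images whose predicted string exactly
    matches the ground-truth transcription."""
    assert len(predictions) == len(ground_truth)
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)

# Illustrative: two of three Persian words recognized exactly.
wra = word_recognition_accuracy(["کتاب", "درب", "نان"], ["کتاب", "در", "نان"])
```

Exact string matching is unforgiving for right-to-left scripts, where a single misjoined letter form counts as a full word error, which is part of why Latin-trained models score so low on FATR.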
Original Research Paper
Classification
M. Rohani; H. Farsi; S. Mohamadzadeh
Abstract
Background and Objectives: Recent advancements in race classification from facial images have been significantly propelled by deep learning techniques. Despite these advancements, many existing methodologies rely on intricate models that entail substantial computational costs and exhibit slow processing speeds. This study aims to introduce an efficient and robust approach for race classification by utilizing transfer learning alongside a modified Efficient-Net model that incorporates attention-based learning. Methods: In this research, Efficient-Net is employed as the base model, applying transfer learning and attention mechanisms to enhance its efficacy in race classification tasks. The classifier component of Efficient-Net was strategically modified to minimize the parameter count, thereby enhancing processing speed without compromising classification accuracy. To address dataset imbalance, we implemented extensive data augmentation and random oversampling techniques. The modified model was rigorously trained and evaluated on a comprehensive dataset, with performance assessed through accuracy, precision, recall, and F1-score metrics. Results: The modified Efficient-Net model exhibited remarkable classification accuracy while significantly reducing computational demands on the UTK-Face and FairFace datasets. Specifically, the model achieved an accuracy of 88.19% on UTK-Face and 66% on FairFace, reflecting a 2% enhancement over the base model. Additionally, it demonstrated a 9-14% reduction in memory consumption and parameter count. Real-time evaluations revealed a processing speed 14% faster than the base model, alongside achieving the highest F1-score results, which underscores its effectiveness for practical applications. Furthermore, the proposed method enhanced test accuracy by about 5% in classes with approximately 50% fewer training samples. Conclusion: This study presents an efficient race classification model grounded in a modified Efficient-Net that utilizes transfer learning and attention-based learning to attain state-of-the-art performance. The proposed approach not only sustains high accuracy but also ensures rapid processing speeds, rendering it ideal for real-time applications. The findings indicate that this lightweight model can effectively rival more complex and computationally intensive recent methods, providing a valuable asset for practical race classification endeavors.
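The per-class precision/recall/F1 evaluation mentioned above can be computed directly from label lists; the tiny lists below are illustrative, not the study's data.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall and F1 from parallel label lists."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = ["a", "a", "b", "b", "b"]
y_pred = ["a", "b", "b", "b", "a"]
p, r, f1 = precision_recall_f1(y_true, y_pred, positive="b")
```

Reporting F1 per class, rather than overall accuracy alone, is what exposes gains on under-represented classes like the ones with 50% fewer training samples.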
Original Research Paper
Power Systems
M. Moradi; R. Havangi
Abstract
Rail vehicle dynamics are greatly impacted by the different forces present at the wheel-rail contact region. Since the wheel-rail adhesion force is the key element in maintaining high braking and acceleration performance of rail vehicles, continuous monitoring of this factor is very important. Damage to infrastructure and reduced transportation convenience are the most obvious consequences of improper levels of adhesion force. Adhesion modeling is a time-consuming task. In addition, due to the difficulties in directly measuring the adhesion force, the use of alternative methods in which state observers are adopted has attracted much attention. The selection of the primary model of the studied system and the ability of the selected estimator have a significant effect on the success of the proposed approach in estimating the variables. In this paper, the dynamics of the wheelset are simulated in the presence of irregularities that can be encountered on the railroad. Estimation of the wheel-rail adhesion force is done indirectly by nonlinear filters acting as estimators, and their estimation accuracies are compared to identify the better one. Meanwhile, the outputs of inertial sensors (accelerometer and gyroscope) are used as the measurement matrix and employed to simulate the actual situation and evaluate the estimators' performances. To check the accuracy and ability of the estimators in estimating states and variables, the proposed approach is implemented in MATLAB. This study introduces an advantageous approach that uses the longitudinal, lateral, and torsional dynamics to estimate the wheel-rail adhesion force under variable conditions. Experimental results showed high precision, fast convergence, and low error values in variable estimation. Real-time knowledge of the contact condition results in proper traction and braking performance. The results of the proposed method can lead to decreased wheel deterioration and operational costs, minimizing high creep levels, maximizing the use of already-existing adhesion, and improving the frequency of service. It is worth noting that the proposed method is beneficial for both conventional railway transport and automated driverless trains.
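The observer-based idea of estimating a quantity that cannot be measured directly can be illustrated with a scalar Kalman filter tracking a slowly varying value from noisy sensor readings. This toy random-walk model is a stand-in only; the study's wheelset dynamics and nonlinear filters are far richer.

```python
import numpy as np

def scalar_kalman(measurements, q=1e-4, r=0.04, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed in noise.

    q: process-noise variance (how fast the true value may drift)
    r: measurement-noise variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                 # predict: uncertainty grows
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # update with the innovation
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(2)
true_adhesion = 0.3               # illustrative constant "adhesion" level
z = true_adhesion + 0.2 * rng.standard_normal(400)   # noisy sensor stream
est = scalar_kalman(z)
```

The same predict/update structure, extended to a nonlinear vector model and linearized or sampled at each step, is what EKF/UKF-style estimators apply to the wheelset states.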
Original Research Paper
Power System Operation and Analysis
M. Setareh; A. Moradi Birgani
Abstract
Background and Objectives: This paper proposes a novel formula to calculate the sensitivities of electromechanical modes with respect to changes in generators' active power. The generic ZIP load model is considered, and the effect of various types of loads on the best paradigm of generator redispatch (GR) is investigated. Furthermore, transmission line resistances are modeled in the proposed formulae, and the best GR schemes to improve power system damping with and without considering transmission line resistance are compared. Methods: Four energy functions are defined and the quadratic eigenvalue problem is applied to construct the framework of the proposed formula. The dynamic equations of the classical model of synchronous generators, along with the algebraic equations of the power network considering transmission line resistance and the ZIP model of power system loads, are written in a systematic manner using the partial derivatives of the energy functions. Then, the set of equations of the power system is linearized and the sensitivity factors are calculated using power system model parameters and power flow variables, which can be either obtained via state estimation or measured directly by phasor measurement units. Results: The 39-bus New England power system is used to calculate the sensitivities. The sensitivity factor values with and without considering transmission line resistance are compared, and then the best GR plan to improve critical damping is determined. If all the loads are assumed to be in constant power mode, then for the two cases with and without considering transmission line resistance, generator pairs (9,1) and (5,2) are the best redispatch plans to damp oscillations. However, if all the loads are assumed to be in constant current mode, the best generator pair for the case without transmission line resistance does not change, although it changes to generator pair (5,1) for the case with transmission line resistance. Conclusion: Using the classical model of synchronous generators does not give information about the damping ratio of the inter-area mode and only estimates its frequency well. Moreover, considering the load model and the resistance of transmission lines changes the best paradigm of GR to damp oscillations.
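The eigenvalue-sensitivity computation at the heart of such formulas can be sketched with the standard left/right-eigenvector expression dλ/dp = yᴴ(∂A/∂p)x / (yᴴx). The random state matrix below is illustrative only, not a power system model.

```python
import numpy as np

def eigenvalue_sensitivity(A, dA, index):
    """Sensitivity dλ/dp of one eigenvalue of A to a parameter p,
    where dA = ∂A/∂p, via left and right eigenvectors."""
    evals, right = np.linalg.eig(A)
    lvals, left = np.linalg.eig(A.conj().T)   # eigenvectors of A^H
    lam = evals[index]
    # pair the left eigenvector with the same eigenvalue (A^H has
    # the conjugate spectrum of A)
    j = np.argmin(np.abs(lvals.conj() - lam))
    x, y = right[:, index], left[:, j]
    return (y.conj() @ dA @ x) / (y.conj() @ x)

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))    # illustrative state matrix
dA = rng.standard_normal((4, 4))   # illustrative ∂A/∂p direction
s = eigenvalue_sensitivity(A, dA, index=0)
```

In a redispatch study, p is a generator's active power and the sign of the real part of s indicates whether increasing that generator's output improves or degrades the damping of the mode.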
Original Research Paper
Compressed Sensing
A. Vakili; M. Shams Esfand Abadi; M. Kalantari
Abstract
Background and Objectives: In the realm of compressed sensing, most greedy sparse recovery algorithms require prior information about the signal's sparsity level, which may not be available in practical conditions. To address this, methods based on the Sparsity Adaptive Matching Pursuit (SAMP) algorithm have been developed to self-determine this parameter and recover the signal using only the sampling matrix and measurements. Determining a suitable initial value for the algorithm can greatly affect its performance. Methods: One of the latest sparsity-adaptive methods is Correlation Calculation SAMP (CCSAMP), which relies on correlation calculations between the signals recovered from the support set and the candidate set. In this paper, we present a modified version of CCSAMP that incorporates a pre-estimation phase for determining the initial value of the sparsity level, as well as a modified acceptance criterion that considers the noise variance. Results: To validate the efficiency of the proposed algorithm over the previous approaches, random sparse test signals with various sparsity levels were generated, sampled at a compression ratio of 50%, and recovered with the proposed and previous methods. The results indicate that the suggested method needs, on average, 5 to 6 fewer iterations compared to the previous methods, owing to the pre-estimation of the initial guess for the sparsity level. Furthermore, since the least-squares technique is integrated into parts of the algorithm, in the presence of noise the modified acceptance criterion significantly improves the success rate while achieving a lower mean squared error (MSE) in the recovery process. Conclusion: The pre-estimation process makes it possible to recover the signal with fewer iterations while keeping the recovery quality as before. The fewer the number of iterations, the faster the algorithm. By incorporating the noise variance into the acceptance criterion, the method achieves a higher success rate and a lower MSE in the recovery process.
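The greedy support-growing recovery that SAMP-family methods build on can be illustrated with a compact Orthogonal Matching Pursuit (OMP): pick the atom most correlated with the residual, re-fit by least squares, and stop when the residual is small. OMP stands in here for the more elaborate CCSAMP procedure; the matrix size and sparsity level are illustrative.

```python
import numpy as np

def omp(Phi, y, max_iter=20, tol=1e-6):
    """Orthogonal Matching Pursuit: greedily grow the support, re-fit
    the coefficients by least squares, stop on a small residual."""
    m, n = Phi.shape
    support, residual = [], y.copy()
    for _ in range(max_iter):
        j = int(np.argmax(np.abs(Phi.T @ residual)))  # best-matching atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x = np.zeros(n)
    x[support] = coef
    return x

rng = np.random.default_rng(4)
Phi = rng.standard_normal((50, 100)) / np.sqrt(50)   # sampling matrix
x_true = np.zeros(100)
x_true[[5, 30, 77]] = [1.5, -2.0, 0.8]               # 3-sparse signal
x_hat = omp(Phi, Phi @ x_true)
```

A good pre-estimate of the sparsity level, as the paper proposes, lets such a loop start with a support of roughly the right size instead of growing it one atom at a time, which is where the saved iterations come from.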
Original Research Paper
Machine Learning
K. Gorgani Firouzjah; J. Ghasemi
Abstract
Background and Objectives: Power transformer (PT) health assessment is crucial for ensuring the reliability of power systems. Dissolved Gas Analysis (DGA) is a widely used technique for this purpose, but traditional DGA interpretation methods have limitations. This study aims to develop a more accurate and reliable PT health assessment method using an ensemble learning approach with DGA. Methods: The proposed method utilizes 11 key parameters obtained from real PT samples. In addition, synthetic data are generated using statistical simulation to enhance the model's robustness. Twelve different classifiers are initially trained and evaluated on the combined dataset. Two novel indices (a risk index and an unnecessary cost index) are introduced to assess the classifiers' performance alongside traditional metrics such as accuracy, precision, and the confusion matrix. An ensemble learning method is then constructed by selecting the classifiers with the lowest risk and cost indices. Results: The ensemble learning approach demonstrated superior performance compared to the individual classifiers. The learning algorithm achieved high accuracy (99%, 92%, and 86% for the three health classes), a low unnecessary cost index (6%), and a low misclassification risk (16%). This result indicates the effectiveness of the ensemble approach in accurately detecting PT health conditions. Conclusion: The proposed ensemble learning method provides a reliable and accurate assessment of PT health using DGA data. This approach effectively optimizes maintenance strategies and enhances the overall reliability of power systems by minimizing misclassification risks and unnecessary costs.
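The select-then-combine step can be sketched with a majority vote over the chosen classifiers, with ties resolved toward the more severe health class (a conservative, low-risk choice). The class names and severity ordering below are hypothetical, not the paper's labels.

```python
from collections import Counter

SEVERITY = {"healthy": 0, "suspect": 1, "faulty": 2}  # hypothetical classes

def ensemble_vote(votes):
    """Majority vote across selected classifiers; ties resolve to the
    more severe class to keep the misclassification risk low."""
    counts = Counter(votes)
    best = max(counts.values())
    tied = [c for c, n in counts.items() if n == best]
    return max(tied, key=lambda c: SEVERITY[c])

label = ensemble_vote(["healthy", "faulty", "faulty", "suspect"])
tie = ensemble_vote(["healthy", "faulty"])
```

Breaking ties toward the severe class trades a slightly higher unnecessary-cost index for a lower risk index, mirroring the two indices the paper uses to pick its ensemble members.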
Original Research Paper
Artificial Intelligence
L. Hafezi; S. Zarifzadeh; M. R. Pajoohan
Abstract
Background and Objectives: Detecting multiple entities within financial texts and accurately analyzing the sentiment associated with each is a challenging yet critical task. Traditional models often struggle to capture the nuanced relationships between multiple entities, especially when sentiments are context-dependent and spread across different levels of a document. Addressing these complexities requires advanced models that can not only identify multiple entities but also distinguish their individual sentiments within a broader context. This study aims to introduce and evaluate two novel methods, ENT-HAN and SNT-HAN, built upon Hierarchical Attention Networks and specifically designed to enhance the accuracy of both entity extraction and sentiment analysis in complex financial documents. Methods: In this study, we design the ENT-HAN and SNT-HAN methods to address the tasks of multi-entity detection and sentiment analysis within financial texts. The first method focuses on entity extraction and captures hierarchical relationships between words and sentences. By utilizing word-level attention, the model identifies the most relevant tokens for recognizing entities, while sentence-level attention helps refine the context in which these entities appear, allowing the model to detect multiple entities with precision. The second method is applied to sentiment analysis, aiming to classify sentiments into positive, negative, or neutral categories. The sentiment analysis model employs hierarchical attention to identify the most important words and sentences that convey sentiment about each entity. This approach ensures that the model not only focuses on the overall sentiment of the text but also accounts for context-specific variations in sentiment across different entities. Both methods were evaluated on the FinEntity dataset, and the results demonstrate their effectiveness, significantly improving the accuracy of both entity extraction and sentiment classification. Results: ENT-HAN and SNT-HAN demonstrated strong performance in both entity extraction and sentiment analysis, outperforming the methods they were compared against. For entity extraction, ENT-HAN was evaluated against RNN and BERT models, showing superior accuracy in identifying multiple entities within complex texts. In sentiment analysis, SNT-HAN was compared to the best-performing method previously applied to the FinEntity dataset. Despite the good performance of the existing methods, SNT-HAN demonstrated superior results, achieving better accuracy. Conclusion: The outcome of this research highlights the potential of ENT-HAN and SNT-HAN for improving entity extraction and sentiment analysis accuracy in financial documents. Their ability to model attention at multiple levels allows for a more nuanced understanding of text, establishing them as a valuable resource for complex tasks in financial text analysis.
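The hierarchical attention pooling both methods rely on, word-level attention producing sentence vectors and sentence-level attention producing a document vector, can be sketched in NumPy. The random word vectors and context vectors below are illustrative stand-ins for learned embeddings and parameters.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def attention_pool(vectors, context):
    """Score each vector against a context vector, then return the
    attention-weighted average and the weights themselves."""
    scores = vectors @ context
    w = softmax(scores)
    return w @ vectors, w

rng = np.random.default_rng(5)
d = 8
# three "sentences" with 4, 6 and 3 word vectors each
sentences = [rng.standard_normal((n_words, d)) for n_words in (4, 6, 3)]
word_ctx, sent_ctx = rng.standard_normal(d), rng.standard_normal(d)

# word level: pool each sentence's word vectors into a sentence vector
sent_vecs = np.stack([attention_pool(s, word_ctx)[0] for s in sentences])
# sentence level: pool sentence vectors into a document vector
doc_vec, sent_weights = attention_pool(sent_vecs, sent_ctx)
```

The two levels of weights are what let such a model point to the specific words and sentences driving the sentiment toward each entity, rather than one document-wide score.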
Original Research Paper
Data Mining
J. Salimi Sartakhti; A. Beirnvand; M. Sarhadi
Abstract
Background and Objectives: Large Language Models have demonstrated exceptional performance across various NLP tasks, especially when fine-tuned for specific applications. Full fine-tuning of large language models requires extensive computational resources, which are often ...
Read More
Background and Objectives: Large Language Models (LLMs) have demonstrated exceptional performance across various NLP tasks, especially when fine-tuned for specific applications. However, full fine-tuning of LLMs requires extensive computational resources, which are often unavailable in real-world settings. While Low-Rank Adaptation (LoRA) has emerged as a promising solution to mitigate these challenges, its potential remains largely untapped in multi-task scenarios. This study addresses this gap by introducing a novel hybrid fine-tuning approach that combines LoRA with an attention-based mechanism for multi-task text classification, enabling inter-task knowledge sharing to improve generalization and efficiency. Methods: We propose a hybrid fine-tuning method that utilizes LoRA to fine-tune LLMs across multiple tasks simultaneously. An attention mechanism integrates the outputs of the task-specific models, facilitating cross-task knowledge sharing: the attention layer dynamically prioritizes relevant information from different tasks, enabling the model to benefit from complementary insights. Results: The hybrid fine-tuning approach demonstrated significant improvements in accuracy across multiple text classification tasks. Across different NLP tasks, the model showed superior generalization and precision compared to conventional single-task LoRA fine-tuning. The model also exhibited better scalability and computational efficiency, requiring fewer resources to achieve comparable or better performance. 
Cross-task knowledge sharing through the attention mechanism was found to be a critical factor in achieving these performance gains. Conclusion: The proposed hybrid fine-tuning method enhances the accuracy and efficiency of LLMs in multi-task settings by enabling effective knowledge sharing between tasks. This approach offers a scalable and resource-efficient solution for real-world applications requiring multi-task learning, paving the way for more robust and generalized NLP models.
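The two ingredients of the method, a LoRA-style low-rank update on a frozen weight matrix and an attention layer that fuses task-specific outputs, can be sketched as follows. The plain-list linear algebra, the function names, and the scalar task scores are illustrative assumptions for a toy example, not the paper's code; in LoRA, the update `B·A·x` is scaled by `alpha / r`, where `r` is the rank.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def lora_forward(x, W, A, B, alpha):
    # frozen base projection: W @ x
    base = [sum(W_ij * x_j for W_ij, x_j in zip(row, x)) for row in W]
    # trainable low-rank update: B @ (A @ x), scaled by alpha / r
    r = len(A)
    Ax = [sum(A_kj * x_j for A_kj, x_j in zip(row, x)) for row in A]
    BAx = [sum(B_ik * Ax_k for B_ik, Ax_k in zip(row, Ax)) for row in B]
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, BAx)]

def fuse_task_outputs(task_outputs, task_scores):
    # attention over task-specific outputs: the softmax weights decide
    # how much each task's representation contributes to the prediction
    weights = softmax(task_scores)
    dim = len(task_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, task_outputs)) for d in range(dim)]
```

Because only `A` and `B` are trained while `W` stays frozen, the per-task parameter count stays small, which is what makes fine-tuning several tasks simultaneously affordable.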
Original Research Paper
Power Electronics
M. Nabizadeh; P. Hamedani; B. Mirzaeian Dehkordi
Abstract
Background and Objectives: Due to the disadvantages of the traditional AC-DC-AC converters, especially in electric drive applications, Matrix Converters (MCs) have been widely researched. MCs are well-known structures that remove the DC-Link capacitor and provide bidirectional power flow, while also ...
Read More
Background and Objectives: Due to the disadvantages of the traditional AC-DC-AC converters, especially in electric drive applications, Matrix Converters (MCs) have been widely researched. MCs are well-known structures that remove the DC-Link capacitor and provide bidirectional power flow, while also giving the ability to control reactive power flow, which the AC-DC-AC converter lacks. Methods: In this work, Model Predictive Current Control (MPCC) is utilized in conjunction with the MC to provide more versatility and controllability than traditional control methods. The work endeavors to investigate the current control of the MC utilizing the finite control set Model Predictive Control (MPC) approach. Results: Current tracking performance, reactive power control, and switching frequency minimization have been included in the objective function of the controller. Moreover, the results have been compared to the traditional AC-DC-AC converters under similar circumstances. The MC can reduce the switching frequency by 40% compared to the AC-DC-AC converter while maintaining the same current THD value. Additionally, it achieves a 58% reduction in current THD compared to the AC-DC-AC converter at the same average switching frequency. However, in the MC, the mitigation of reactive power and the reduction in switching frequency have opposing effects on the current tracking performance. Conclusion: This work proposes an MPCC method for the MC with an RL load, effectively controlling load current and reactive power. The reduction of switching commutations was also evaluated using different weighting factors in the prediction strategy for both the MC and AC-DC-AC converters. Simulation results demonstrate that the MC outperforms the AC-DC-AC converter in dynamic response and reactive power control.
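The finite-control-set loop described above, predicting the load current one step ahead for every admissible switching state and applying the state with minimal cost, can be sketched for a single-phase RL load. This is a simplified, hypothetical illustration: the reactive-power term of the paper's objective function is omitted for brevity, the candidate list stands in for the MC's valid switching states, and `lam_sw` plays the role of the weighting factor on switching commutations.

```python
def predict_current(i_k, v, R, L, Ts):
    # forward-Euler discretization of an RL load: L di/dt = v - R i
    return i_k + (Ts / L) * (v - R * i_k)

def mpcc_select(i_k, i_ref, candidates, prev_state, R, L, Ts, lam_sw):
    # evaluate the cost of every switching state in the finite control
    # set and return the one that minimizes it; lam_sw penalizes
    # changing state, i.e. it trades tracking error for fewer commutations
    best_state, best_cost = None, None
    for state, v in candidates:
        i_pred = predict_current(i_k, v, R, L, Ts)
        cost = (i_ref - i_pred) ** 2 + lam_sw * (state != prev_state)
        if best_cost is None or cost < best_cost:
            best_state, best_cost = state, cost
    return best_state
```

Raising `lam_sw` lowers the average switching frequency at the cost of tracking accuracy, which mirrors the opposing effects on current tracking reported in the abstract.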