1 MSc student in Computer Engineering (Software), Pooyandegan Danesh Institution of Higher Education, Chalus, Iran

2 Full-time faculty member, Islamic Azad University of Chalus, Chalus, Iran


Background and Objectives: Data mining is now one of the most significant research topics. It combines computer science and statistics, and it has expanded considerably with the increase in digital data and the growth in the computational power of computers. One of its application domains is software cost estimation.
Methods: This article presents machine learning classification techniques and the COCOMO model, the most common software cost estimation model. It then presents the principal component analysis (PCA) method.
Results: This article presents a method for improving the performance of software cost estimation. The method reduces the original data set and transforms it into a new data set, from which the best features are extracted. Several classification algorithms are evaluated with this method. Finally, evidence is presented to support the claimed increase in the accuracy of software cost estimation.
Conclusion: The results showed that the suggested method can have a significant influence on decision tree, naïve Bayes, and nearest neighbor models by reducing the dimension of the input data and transforming it into new data.
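
The idea summarized above, reducing the input dimension with PCA before classification, can be illustrated with a minimal sketch. It is not the authors' implementation: the project attributes, the "low"/"high" cost labels, and the helper names `fit_pca`, `project`, and `nearest_label` are all hypothetical, the data are made up for illustration, and power iteration with deflation stands in for a library PCA routine. Only a 1-nearest-neighbor classifier is shown, as one of the three model families named in the abstract.

```python
import math
import random


def fit_pca(rows, k, iters=200, seed=0):
    """Approximate the top-k principal components by power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for v.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        # Deflate: remove the found direction so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
    return means, components


def project(row, means, components):
    """Map one attribute vector onto the reduced component space."""
    return [sum((row[j] - means[j]) * c[j] for j in range(len(row)))
            for c in components]


def nearest_label(train_points, train_labels, query):
    """1-nearest-neighbor classification by squared Euclidean distance."""
    best = min(range(len(train_points)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(train_points[i], query)))
    return train_labels[best]


# Made-up projects: four hypothetical attributes and a hypothetical cost class.
projects = [
    ([10, 2, 21, 1], "low"),
    ([12, 3, 25, 1], "low"),
    ([15, 2, 29, 2], "low"),
    ([60, 8, 118, 5], "high"),
    ([70, 9, 142, 6], "high"),
    ([65, 7, 131, 5], "high"),
]
features = [f for f, _ in projects]
labels = [y for _, y in projects]

# Reduce four attributes to two components, then classify in the reduced space.
means, components = fit_pca(features, k=2)
reduced = [project(f, means, components) for f in features]

pred_small = nearest_label(reduced, labels, project([14, 3, 27, 1], means, components))
pred_large = nearest_label(reduced, labels, project([68, 8, 135, 6], means, components))
print(pred_small, pred_large)
```

The classifier only ever sees the two-dimensional projections, which is the sense in which the data set is "reduced and transformed into a new data set"; a decision tree or naïve Bayes model could be trained on `reduced` in the same way.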


