Text Summarization versus CHI for Feature Selection

R. S. Jabri *

Department of Computer Science, University of Jordan, Jordan.

E. Al-Thwaib

Department of Computer Science, University of Jordan, Jordan.

*Author to whom correspondence should be addressed.


Abstract

Text classification is an important technique for handling the large and growing volume of text documents on the web. A central problem in text classification is feature selection. Many techniques, such as chi-square (CHI), have been used to address it. Instead of these techniques, this paper proposes a feature selection method based on text summarization and demonstrates it on Arabic text documents. A Support Vector Machine (SVM) is then used to classify both the summarized documents and the documents processed by CHI. The classification indicators (precision, recall, and accuracy) achieved with text summarization are higher than those achieved with CHI, although text summarization has a negligibly higher execution time.

Keywords: Text classification, text summarization, feature selection, chi-square (CHI).
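
For readers unfamiliar with the CHI baseline named in the abstract, the following is a minimal sketch of chi-square term scoring for a single class, using the standard 2x2 contingency form of the statistic. It is not the authors' implementation: the function names, the toy English tokens, and the top-k cutoff are all assumptions made for illustration, and the abstract does not specify the preprocessing or CHI variant used in the paper. The SVM classification step is omitted.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by its chi-square statistic with respect to one class.

    docs   : list of token lists (hypothetical toy input)
    labels : list of class labels aligned with docs
    target : the class that terms are scored against
    """
    n = len(docs)
    n_target = sum(1 for y in labels if y == target)
    df_all = Counter()     # number of documents containing the term
    df_target = Counter()  # number of target-class documents containing the term
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if y == target:
                df_target[term] += 1

    scores = {}
    for term, df in df_all.items():
        a = df_target[term]   # in class, contains term
        b = df - a            # not in class, contains term
        c = n_target - a      # in class, lacks term
        d = n - n_target - b  # not in class, lacks term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, not data from the paper.
docs = [["goal", "match", "team"], ["team", "goal"],
        ["market", "stock"], ["stock", "team"]]
labels = ["sport", "sport", "economy", "economy"]

scores = chi_square_scores(docs, labels, "sport")
selected = select_top_k(scores, 2)
```

Note that the statistic is symmetric: a term that occurs only outside the target class scores as highly as one that occurs only inside it, because both are strongly dependent on the class.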


How to Cite

Jabri, R. S., and E. Al-Thwaib. 2017. “Text Summarization Versus CHI for Feature Selection”. Journal of Advances in Mathematics and Computer Science 22 (4):1-8. https://doi.org/10.9734/BJMCS/2017/33615.
