A survey on textual semantic classification algorithms

Main Authors: Zubir, W.M.A.M., Aziz, I.A., Jaafar, J.
Format: Article
Institution: Universiti Teknologi Petronas
Record Id: utp-eprints.21772
Published: Institute of Electrical and Electronics Engineers Inc. 2018
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85047420059&doi=10.1109%2fICBDAA.2017.8284098&partnerID=40&md5=8693fde163723518787fff06ac204563
http://eprints.utp.edu.my/21772/
Summary: This paper provides a broad overview of three popular textual semantic classification algorithms used in both industry and the scientific community: TF-IDF, Latent Semantic Analysis and Latent Dirichlet Allocation. We selected these three algorithms because they form the foundation of semantic classification and are still widely used in the field. First, this paper presents the inner workings of each algorithm, covering both the original authors' intuition and the mathematical model used. Next, we discuss the advantages of each algorithm based on recent and credible research papers and articles. We also critically examine the limitations of each algorithm. Lastly, we offer a general argument on the way forward for improving the algorithms. This paper aims to give a general understanding of these algorithms, which we hope will spur more research toward improving the field of semantic classification. © 2017 IEEE.