An Effective Approach for Web Document Classification using the Concept of Association Analysis of Data Mining

Published 21 Jun 2014 in cs.IR | (1406.5616v1)

Abstract: The exponential growth of the web has increased the importance of web document classification and data mining. Obtaining exact information about which classes a web document belongs to is expensive. Automatic classification of web documents is of great use to search engines because it provides this information at low cost. In this paper, we propose an approach for classifying web documents using the frequent word itemsets generated by Frequent Pattern (FP) Growth, an association analysis technique from data mining. These sets of associated words act as the feature set. The final classification is obtained by applying a Naïve Bayes classifier to the feature set. For the experimental work, we use the Gensim package, as it is simple and robust. Results show that our approach can effectively classify web documents.
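The pipeline described in the abstract (frequent word sets as features, then Naïve Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library rather than Gensim, it replaces FP-Growth with exhaustive enumeration (which finds the same frequent itemsets up to the chosen size, only less efficiently), and it assumes a multinomial Naïve Bayes with add-one smoothing. The toy corpus, the labels, and all function names and parameters (`frequent_word_sets`, `featurize`, `train_naive_bayes`, `predict`, `min_support`, `max_size`) are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def frequent_word_sets(docs, min_support, max_size=2):
    """Return word sets that occur in at least `min_support` documents.

    Assumption: brute-force enumeration stands in for FP-Growth here.
    """
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))  # sorted so each set has one tuple form
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return [frozenset(s) for s, c in counts.items() if c >= min_support]


def featurize(words, word_sets):
    """Indices of the frequent word sets fully contained in a document."""
    present = set(words)
    return [i for i, s in enumerate(word_sets) if s <= present]


def train_naive_bayes(feature_lists, labels, n_features):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_docs = Counter(labels)
    feat_counts = defaultdict(Counter)
    for feats, label in zip(feature_lists, labels):
        feat_counts[label].update(feats)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(feat_counts[label].values())
        log_prior = math.log(n_docs / len(labels))
        log_like = [
            math.log((feat_counts[label][i] + 1) / (total + n_features))
            for i in range(n_features)
        ]
        model[label] = (log_prior, log_like)
    return model


def predict(model, feats):
    """Return the class with the highest log posterior score."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[i] for i in feats)
    return max(model, key=score)


# Hypothetical toy corpus: already tokenized "web documents" with labels.
docs = [
    ["stock", "market", "shares", "trading"],
    ["stock", "market", "bank", "profit"],
    ["market", "shares", "profit"],
    ["football", "match", "goal", "league"],
    ["football", "league", "goal", "team"],
    ["match", "goal", "team"],
]
labels = ["finance", "finance", "finance", "sports", "sports", "sports"]

word_sets = frequent_word_sets(docs, min_support=2)
train_features = [featurize(d, word_sets) for d in docs]
model = train_naive_bayes(train_features, labels, len(word_sets))

print(predict(model, featurize(["stock", "market", "news"], word_sets)))
print(predict(model, featurize(["goal", "team", "fans"], word_sets)))
```

In this sketch, words that do not belong to any frequent word set (such as "news" or "fans") contribute no features, so the classifier only sees associations that recur across the training documents. On a realistic corpus, a proper FP-Growth implementation would be needed for the mining step to stay tractable.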

Citations (9)


Authors (2)
