2000 character limit reached
Learning Taxonomy for Text Segmentation by Formal Concept Analysis (1010.2384v1)
Published 12 Oct 2010 in cs.CL
Abstract: In this paper the problems of deriving a taxonomy from a text and concept-oriented text segmentation are approached. Formal Concept Analysis (FCA) method is applied to solve both of these linguistic problems. The proposed segmentation method offers a conceptual view for text segmentation, using a context-driven clustering of sentences. The Concept-oriented Clustering Segmentation algorithm (COCS) is based on k-means linear clustering of the sentences. Experimental results obtained using COCS algorithm are presented.