Creating Knowledge Base from Automatically Extracted Information

Beata Nachyła


In this article we present a self-learning method for discovering the domain-specific knowledge contained in a set of text documents. The method assumes that domain-relevant information in the input documents has been tagged with labels from a prespecified set. It counts the co-occurrences of various label sequences within a sentence and represents them in the form of a data structure called a Prefix Label Tree. To extract knowledge from a given document, we use a hierarchical clustering method to group the labels contained in the document’s content. To calculate the similarity of clusters during the clustering process, we also propose a measure called the Relation Possibility (RP).
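
The abstract does not give the exact construction of the Prefix Label Tree, so the following is only a minimal sketch of one plausible reading: a counting prefix tree (trie) in which each node records how often a contiguous label sequence occurs within a sentence. The names `Node`, `insert`, `build_tree`, `sequence_count` and the labels `PERSON`, `ORG`, `DATE` are hypothetical and are not taken from the paper. The Relation Possibility measure is not defined in the abstract and is not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a counting prefix tree over label sequences (assumed layout)."""
    count: int = 0
    children: dict = field(default_factory=dict)


def insert(root, labels):
    """Add one label sequence, incrementing the count of each of its prefixes."""
    node = root
    for label in labels:
        node = node.children.setdefault(label, Node())
        node.count += 1


def build_tree(sentences):
    """Insert every suffix of every sentence, so that each contiguous
    label sequence in a sentence is counted once per occurrence."""
    root = Node()
    for labels in sentences:
        for start in range(len(labels)):
            insert(root, labels[start:])
    return root


def sequence_count(root, labels):
    """Return how often the contiguous label sequence was seen (0 if never)."""
    node = root
    for label in labels:
        node = node.children.get(label)
        if node is None:
            return 0
    return node.count


# Hypothetical input: each sentence is reduced to its ordered list of labels.
sentences = [
    ["PERSON", "ORG", "DATE"],
    ["PERSON", "ORG"],
    ["ORG", "DATE"],
]
tree = build_tree(sentences)
print(sequence_count(tree, ["PERSON", "ORG"]))
print(sequence_count(tree, ["ORG", "DATE"]))
```

Counts of this kind could then feed a similarity measure between label clusters during hierarchical clustering; how the paper actually derives RP from the tree is not stated in this record.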
Author: Beata Nachyła (FEIT / IN), The Institute of Computer Science
Book: Pan Jeng-Shyang, Polycarpou Marios M., Woźniak Michał, de Carvalho André C. P. L. F., Quintián Héctor, Corchado Emilio (eds.): Hybrid Artificial Intelligent Systems, 8th International Conference, HAIS 2013. Proceedings, Lecture Notes in Artificial Intelligence, vol. 8073, 2013, Heidelberg New York Dordrecht London, Springer Berlin Heidelberg, ISBN 978-3-642-40845-8, [978-3-642-40846-5], 689 p., DOI:10.1007/978-3-642-40846-5
Project: Establishment of the universal, open, hosting and communication, repository platform for network resources of knowledge to be used by science, education and open knowledge society. Project leader: Kryszkiewicz Marzena, Phone: +48 22 234 7701, start date 16-08-2010, planned end date 16-08-2013, end date 31-10-2013, WEiTI/2012/PS/1, Completed
BG PW Projects financed by NCRD [Projekty finansowane przez NCBiR (NCBR)]
Language: en (English)
Score (nominal): 15
Score source: conferenceIndex
Score: Ministerial score = 10.0, 18-05-2020, BookChapterSeriesAndMatConfByIndicator
Ministerial score (2013-2016) = 15.0, 18-05-2020, BookChapterSeriesAndMatConfByIndicator