Below the first draft of the abstract of my paper. It doesn’t yet include the results/conclusion. Word count: 127

Semantic annotation uses human knowledge formalized in ontologies to enrich texts, by providing structured and machine-understandable information of its content. This paper proposes an approach for automatically annotating texts of the Cyttron Scientific Image Database, using the NCI Thesaurus ontology. Several frequency-based keyword extraction algorithms, aiming to extract core concepts and exclude less relevant concepts, were implemented and evaluated. Furthermore, text classification algorithms were applied to identify important concepts which do not occur in the text. The algorithms were evaluated by comparing them to annotations provided by experts. Semantic networks were generated from these annotations and an ontology-based similarity metric was used to cross-compare them. Finally the networks were visualized to provide further insights into the differences of the semantic structure generated by humans, and the algorithms.

Tags: Semantic annotation, ontology-based semantic similarity, semantic networks, keyword extraction, text classification, network visualization, text mining

Not dead (yet)

While I haven’t been as active and hard working on my graduation project as I would have liked to be, I am not dead (nor the project). Earlier this week I presented my project to the Bio-imaging group of Leiden University, which helped me a lot. I was able to present my project pretty much as-is, since I’m mostly done with the technical parts. I received valuable feedback and got good insights into what I should explain more thoroughly in the presentation. Continue reading “Not dead (yet)”