Theses

Fast recursive biomedical event extraction

Abstract : The Internet, together with all modern media of communication, information and entertainment, has brought a massive increase in the quantity of digital data. Automatically processing and understanding this mass of data makes it possible to build large knowledge bases, search more efficiently, support social media research, and so on. Natural language processing research concerns the design and development of algorithms that allow computers to automatically process natural language in text, audio, images or video for specific tasks. Due to the complexity of human language, natural language processing of text can be divided into four levels: morphology, syntax, semantics and pragmatics. Current natural language processing technologies have achieved great success on tasks at the first two levels, which has led to success in many commercial applications such as search. However, an advanced structured search engine would require computers to understand language more deeply than at the morphological and syntactic levels. Information extraction is designed to extract meaningful structured information from unannotated or semi-annotated resources, to enable advanced search and to automatically create knowledge bases for further use. This thesis studies the problem of information extraction in the specific domain of biomedical event extraction. We propose an efficient solution that is a trade-off between the two main trends among the methods proposed in previous work. This solution strikes a good balance between performance and speed, which makes it suitable for processing large-scale data. It achieves performance competitive with the best models at a much lower computational complexity. While designing this model, we also studied the effects of different classifiers that are usually proposed to solve the multi-class classification problem. We also tested two simple methods for integrating word vector representations learned by deep learning methods into our model. Although neither the different classifiers nor the integration of word vectors greatly improves performance, we believe that these research directions hold promise for improving information extraction.

Cited literature : 93 references

https://tel.archives-ouvertes.fr/tel-01690893
Contributor : Abes Star
Submitted on : Tuesday, January 23, 2018 - 3:11:07 PM
Last modification on : Friday, May 17, 2019 - 11:41:42 AM
Long-term archiving on : Thursday, May 24, 2018 - 12:09:32 PM

File

These_UTC_Xiao_Liu.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01690893, version 1

Citation

Xiao Liu. Fast recursive biomedical event extraction. Artificial Intelligence [cs.AI]. Université de Technologie de Compiègne, 2014. English. ⟨NNT : 2014COMP1963⟩. ⟨tel-01690893⟩

Metrics

Record views : 289
File downloads : 117