Evaluating Cross-lingual Semantic Annotation for Medical Forms

Authors

Y.C. Lin, V. Christen, A. Gross, T. Kirsten, S.D. Cardoso, C. Pruski, M. Da Silveira, and E. Rahm

Reference

In Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies, Vol. 5: HEALTHINF, pp. 145-155, ISBN: 978-989-758-398-8, 2020

Description

Annotating documents or datasets with concepts from biomedical ontologies has become increasingly important. Such ontology-based semantic annotations can improve interoperability and the quality of data integration in health care practice and biomedical research. However, annotating non-English documents is even more challenging, because non-English ontologies have limited coverage and annotators of a quality comparable to those for English are lacking. In this paper, we aim to annotate medical forms written in German. We present a parallel corpus in which every medical form is available in both German and English. We use three annotators to generate annotations automatically, and these annotations are manually verified to construct an English Silver Standard Corpus (SSC). Based on the parallel corpus and the SSC, we evaluate the quality of different annotation approaches, mainly 1) direct annotation of the German corpus with German ontologies and 2) using machine translators to translate the German corpus and annotating the translated corpus with English ontologies. The results show that using German ontologies alone produces very limited results, whereas translation achieves better annotation quality and retains almost 70% of the annotations.
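
As a rough illustration of how annotations from either approach could be scored against a silver standard, the sketch below computes precision, recall, and F1 over sets of (form item, concept) pairs. It is not the authors' evaluation code. The pair representation, the function name `score_annotations`, and the placeholder identifiers are all assumptions made for this example. Reading "retained" annotations as recall against the SSC is also an assumption.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical representation: (form item id, ontology concept id).
Annotation = Tuple[str, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_annotations(predicted: Set[Annotation], silver: Set[Annotation]) -> Scores:
    """Compare predicted annotations with a silver standard by exact pair match."""
    true_positives = len(predicted & silver)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(silver) if silver else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return Scores(precision, recall, f1)


# Made-up placeholder data, not taken from the paper.
silver = {("item1", "CONCEPT_A"), ("item2", "CONCEPT_B"), ("item3", "CONCEPT_C")}
translated = {("item1", "CONCEPT_A"), ("item2", "CONCEPT_B"), ("item3", "CONCEPT_X")}

scores = score_annotations(translated, silver)
print(f"precision={scores.precision:.2f} recall={scores.recall:.2f} f1={scores.f1:.2f}")
```

Exact matching on pairs is the simplest choice. A real evaluation might also credit partial matches, such as a parent or synonym concept, which this sketch does not attempt.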

Link

doi:10.5220/0008979901450155
