A classification approach for detecting cross-lingual biomedical term translations.
| Title: | A classification approach for detecting cross-lingual biomedical term translations. |
|---|---|
| Authors: | HAKAMI, H. (hoda.h@tu.edu.sa); BOLLEGALA, D. (danushka.bollegala@liverpool.ac.uk) |
| Source: | Natural Language Engineering. Jan 2017, Vol. 23, Issue 1, pp. 31-51 (21 pp.). |
| Subjects: | Medical language, Machine translating, Bilingualism, N-gram models (Computational linguistics), Accuracy of information |
| Abstract: | Finding translations for technical terms is an important problem in machine translation. In particular, in highly specialized domains such as biology or medicine, it is difficult to find bilingual experts to annotate sufficient cross-lingual texts in order to train machine translation systems. Moreover, new terms are constantly being generated in the biomedical community, which makes it difficult to keep the translation dictionaries up to date for all language pairs of interest. Given a biomedical term in one language (source language), we propose a method for detecting its translations in a different language (target language). Specifically, we train a binary classifier to determine whether two biomedical terms written in two languages are translations. Training such a classifier is often complicated due to the lack of common features between the source and target languages. We propose several feature space concatenation methods to successfully overcome this problem. Moreover, we study the effectiveness of contextual and character n-gram features for detecting term translations. Experiments conducted using a standard dataset for biomedical term translation show that the proposed method outperforms several competitive baseline methods in terms of mean average precision and top-k translation accuracy. [ABSTRACT FROM PUBLISHER] |
| Copyright of Natural Language Engineering is the property of Cambridge University Press and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) | |
| Database: | Engineering Source |
| DOI: | 10.1017/S1351324915000431 |
| ISSN: | 1351-3249 (print) |
| Language: | English |
| Publication Type: | Academic Journal |
| Accession Number: | 120262036 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=120262036 |
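
The abstract mentions character n-gram features, feature space concatenation, mean average precision, and top-k translation accuracy without giving details. The following is a minimal sketch of how such pieces are commonly built, not the authors' implementation. All function names, the `src:`/`tgt:` prefixes, the boundary markers, and the example term pair are assumptions made for illustration; the paper's actual concatenation methods and evaluation protocol may differ.

```python
from collections import Counter


def char_ngrams(term, n=3):
    """Count character n-grams of a term, padded with boundary markers (assumed)."""
    padded = "^" + term.lower() + "$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def concat_pair_features(source_term, target_term, n=3):
    """One simple way to concatenate two feature spaces: prefix each feature
    with the side it came from, so source and target features never collide.
    This is an illustrative choice, not necessarily one the paper proposes."""
    features = {}
    for gram, count in char_ngrams(source_term, n).items():
        features["src:" + gram] = count
    for gram, count in char_ngrams(target_term, n).items():
        features["tgt:" + gram] = count
    return features


def average_precision(ranked, gold):
    """Average precision of one ranked candidate list against a gold set,
    using the common definition that normalizes by the size of the gold set."""
    hits, total = 0, 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0


def mean_average_precision(rankings, gold_sets):
    """Mean of the per-query average precision values."""
    scores = [average_precision(r, g) for r, g in zip(rankings, gold_sets)]
    return sum(scores) / len(scores)


def top_k_accuracy(rankings, gold_sets, k):
    """Fraction of source terms with at least one correct translation in the top k."""
    correct = sum(
        1 for r, g in zip(rankings, gold_sets) if any(c in g for c in r[:k])
    )
    return correct / len(rankings)


# Hypothetical usage: features for one candidate pair (illustrative terms only).
pair = concat_pair_features("cardiology", "cardiologie")
```

In a setup like this, the dictionary returned by `concat_pair_features` would be vectorized and passed to a binary classifier, and the classifier's scores would rank the target-language candidates that the two metrics then evaluate.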