Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries.
| Title: | Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries. |
|---|---|
| Authors: | Savic, Nenad (1; nenad.savic@unisante.ch); Bovio, Nicolas (1); Gilbert, Fabien (2); Paz, José (1); Canu, Irina Guseva (1) |
| Source: | Annals of Work Exposures & Health. Jan 2022, Vol. 66, Issue 1, pp. 113-118 (6 pages). |
| Subjects: | Job classification, Experimental design, Evaluation of human services programs, Research methodology, Machine learning, Acquisition of data, Occupational exposure, Industrial hygiene, Algorithms |
| Abstract: | Procode is a free-of-charge web tool that allows automatic coding of occupational data (free-texts) by implementing Complement Naïve Bayes (CNB) as a machine-learning technique. The paper describes the algorithm, performance evaluation, and future goals regarding the tool's development. Almost 30 000 free-texts with manually assigned classification codes of the French classification of occupations (PCS) and the French classification of activities (NAF) were used to train CNB. A 5-fold cross-validation found that Procode predicts correct classification codes in 57–81% and 63–83% of cases for PCS and NAF, respectively. Procode also integrates recoding between two classifications. In the first version of Procode, however, this operation is only a simple search function of recoding links in existing crosswalks. Future work will focus on collecting data to support automatic coding to other classifications and on establishing a more advanced method for recoding. [ABSTRACT FROM AUTHOR] |
| Copyright of Annals of Work Exposures & Health is the property of Oxford University Press / USA and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) | |
| Database: | Engineering Source |
| DOI: | 10.1093/annweh/wxab037 |
| ISSN (print): | 2398-7308 |
| Language: | English |
| Publication type: | Academic Journal |
| Accession number: | 154714203 |
| Permalink: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=154714203 |
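
The abstract names two operations: coding free-texts with Complement Naïve Bayes, and recoding through a lookup in existing crosswalks. The sketch below illustrates both ideas in plain Python. It is not Procode's implementation: the tokenizer, the smoothing value `alpha`, the training texts, and the codes `OCC-A`, `OCC-B`, `OCC-C`, `ALT-1`, `ALT-2`, `ALT-3` are all hypothetical placeholders, not real PCS or NAF codes, and the 5-fold cross-validation mentioned in the abstract is left out.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data; the codes are placeholders, not PCS/NAF codes.
TEXTS = ["primary school teacher", "secondary school teacher",
         "bus driver", "truck driver", "hospital nurse", "night nurse"]
CODES = ["OCC-A", "OCC-A", "OCC-B", "OCC-B", "OCC-C", "OCC-C"]

# Hypothetical crosswalk: one source code may link to several target codes.
CROSSWALK = {"OCC-A": ["ALT-1"], "OCC-B": ["ALT-2", "ALT-3"]}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_cnb(texts, codes, alpha=1.0):
    """Estimate, for each code, smoothed log word frequencies over all
    training texts that do NOT carry that code (its complement)."""
    per_code = defaultdict(Counter)
    for text, code in zip(texts, codes):
        per_code[code].update(tokenize(text))
    total = Counter()
    for counts in per_code.values():
        total.update(counts)
    vocab = sorted(total)
    grand_total = sum(total.values())
    weights = {}
    for code, counts in per_code.items():
        complement_total = grand_total - sum(counts.values())
        denom = complement_total + alpha * len(vocab)
        weights[code] = {
            word: math.log((total[word] - counts[word] + alpha) / denom)
            for word in vocab
        }
    return weights


def predict_cnb(weights, text):
    """Return the code whose complement matches the text worst.
    Words unseen in training contribute nothing to any score."""
    tokens = tokenize(text)

    def complement_score(code):
        return sum(weights[code].get(tok, 0.0) for tok in tokens)

    return min(sorted(weights), key=complement_score)


def recode(code, crosswalk=CROSSWALK):
    """Look up recoding links for a code; empty list if none are known."""
    return crosswalk.get(code, [])


weights = train_cnb(TEXTS, CODES)
predicted = predict_cnb(weights, "taxi driver")
print(predicted, recode(predicted))
```

Scoring against the complement of each class, rather than the class itself, is the defining choice of Complement Naïve Bayes and tends to help when some codes have far fewer training texts than others. A production setup would more likely use an established library implementation together with a proper cross-validation routine.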