Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries.

Bibliographic Details
Title: Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries.
Authors: Savic, Nenad¹ (nenad.savic@unisante.ch); Bovio, Nicolas¹; Gilbert, Fabien²; Paz, José¹; Canu, Irina Guseva¹
Source: Annals of Work Exposures & Health. Jan 2022, Vol. 66, Issue 1, pp. 113-118. 6p.
Subjects: Job classification, Experimental design, Evaluation of human services programs, Research methodology, Machine learning, Acquisition of data, Occupational exposure, Industrial hygiene, Algorithms
Abstract: Procode is a free-of-charge web tool that automatically codes occupational data (free texts) using Complement Naïve Bayes (CNB), a machine-learning technique. The paper describes the algorithm, its performance evaluation, and future goals for the tool's development. Almost 30 000 free texts with manually assigned codes from the French classification of occupations (PCS) and the French classification of activities (NAF) were used to train CNB. A 5-fold cross-validation found that Procode predicts the correct classification code in 57–81% of cases for PCS and 63–83% of cases for NAF. Procode also integrates recoding between two classifications. In the first version of Procode, however, this operation is only a simple search for recoding links in existing crosswalks. Future work will focus on collecting data to support automatic coding to other classifications and on establishing a more advanced method for recoding. [ABSTRACT FROM AUTHOR]
Copyright of Annals of Work Exposures & Health is the property of Oxford University Press / USA and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
ISSN: 2398-7308
DOI: 10.1093/annweh/wxab037
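
Illustrative note: the abstract names three ingredients: a Complement Naïve Bayes classifier trained on free texts with manually assigned codes, k-fold cross-validation, and recoding as a lookup in a crosswalk. The sketch below shows one way these pieces could fit together. It is not Procode's implementation. The toy texts, the labels (`HEALTH`, `TRANSPORT`, `EDUCATION`), the crosswalk entries, and all function names are hypothetical, and the real PCS and NAF codes are not reproduced here. The classifier follows the basic complement-count formulation of CNB without weight normalization.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokens; a real tool would do more cleaning.
    return text.lower().split()

def train_cnb(texts, codes, alpha=1.0):
    """Learn one weight per (class, token) from counts in the *other* classes."""
    vocab, total = set(), Counter()
    per_class = defaultdict(Counter)
    for text, code in zip(texts, codes):
        tokens = tokenize(text)
        per_class[code].update(tokens)
        total.update(tokens)
        vocab.update(tokens)
    weights = {}
    for code, counts in per_class.items():
        # Complement counts: occurrences of each token outside this class.
        comp = {t: total[t] - counts[t] for t in vocab}
        denom = sum(comp.values()) + alpha * len(vocab)
        weights[code] = {t: math.log((comp[t] + alpha) / denom) for t in vocab}
    return weights

def predict(weights, text):
    """Pick the class whose complement explains the text worst (lowest score)."""
    tokens = tokenize(text)
    scores = {code: sum(w.get(t, 0.0) for t in tokens)  # unseen tokens are ignored
              for code, w in weights.items()}
    return min(scores, key=scores.get)

def k_fold_accuracy(texts, codes, k=5):
    """Plain k-fold accuracy with an interleaved (unshuffled) split."""
    hits = 0
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        test = [i for i in range(len(texts)) if i % k == fold]
        w = train_cnb([texts[i] for i in train], [codes[i] for i in train])
        hits += sum(predict(w, texts[i]) == codes[i] for i in test)
    return hits / len(texts)

def recode(code, crosswalk):
    """Recoding as a simple search: return every linked target code."""
    return crosswalk.get(code, [])

# Hypothetical toy data standing in for coded occupational free texts.
TEXTS = ["hospital nurse", "night nurse hospital", "truck driver",
         "delivery truck driver", "primary school teacher", "school teacher maths"]
CODES = ["HEALTH", "HEALTH", "TRANSPORT", "TRANSPORT", "EDUCATION", "EDUCATION"]
CROSSWALK = {"HEALTH": ["Q-86"], "TRANSPORT": ["H-49"]}  # made-up links

weights = train_cnb(TEXTS, CODES)
print(predict(weights, "nurse"))
print(recode(predict(weights, "truck driver"), CROSSWALK))
print(k_fold_accuracy(TEXTS, CODES, k=3))
```

With only six toy texts, the cross-validated accuracy printed here says nothing about the tool; the ranges reported in the abstract come from the authors' own data of almost 30 000 coded free texts.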