dc.contributor.author: Bolívar Gómez, Sergio
dc.contributor.author: Nieto Reyes, Alicia
dc.contributor.author: Rogers, Heather L.
dc.contributor.other: Universidad de Cantabria [es_ES]
dc.date.accessioned: 2023-01-16T15:24:12Z
dc.date.available: 2023-01-16T15:24:12Z
dc.date.issued: 2022
dc.identifier.issn: 2227-7390
dc.identifier.other: MTM2017-86061-C2-2-P [es_ES]
dc.identifier.other: MCIN/AEI/10.13039/501100011033 [es_ES]
dc.identifier.uri: https://hdl.handle.net/10902/27216
dc.description.abstract: Achieving a good success rate in supervised classification analysis of a text dataset, where the relationship between the text and its label can be extracted from the context but not from isolated words in the text, remains an important challenge for the fields of statistics and machine learning. For this purpose, we present a novel mathematical framework. We then conduct a comparative study between established classification methods for the case where the relationship between the text and the corresponding label is clearly depicted by specific words in the text. In particular, we use logistic LASSO, artificial neural networks, support vector machines, and decision-tree-like procedures. This methodology is applied to a real case study involving mapping Consolidated Framework for Implementation Research (CFIR) constructs to health-related text data and achieves a prediction success rate of over 80% when the first 55% of the text, or more, is used for training and the remainder for testing. The results indicate that the methodology can be useful to accelerate the CFIR coding process. [es_ES]
dc.description.sponsorship: A.N.-R. is supported by Grant MTM2017-86061-C2-2-P funded by “ERDF A way of making Europe” and MCIN/AEI/10.13039/501100011033. For H.L.R., this study was funded by Instituto de Salud Carlos III through the project “PI17/02070” (co-funded by the European Regional Development Fund/European Social Fund “A way to make Europe”/“Investing in your future”) and the Basque Government Department of Health project “2017111086”. The funding bodies had no role in the design of the study; the collection, analysis, or interpretation of data; or the writing of the manuscript. The APC was paid by PI17/02070. [es_ES]
dc.format.extent: 31 p. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: MDPI [es_ES]
dc.rights: © 2022 by the authors [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.source: Mathematics, 2022, 10(12), 2005 [es_ES]
dc.subject.other: Artificial Neural Networks [es_ES]
dc.subject.other: Decision Tree [es_ES]
dc.subject.other: Logistic LASSO [es_ES]
dc.subject.other: Natural Language Processing [es_ES]
dc.subject.other: Qualitative Data [es_ES]
dc.subject.other: Supervised Classification [es_ES]
dc.subject.other: Support Vector Machines [es_ES]
dc.subject.other: Text Data Analysis [es_ES]
dc.title: Supervised Classification of Healthcare Text Data Based on Context-Defined Categories [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.relation.publisherVersion: https://doi.org/10.3390/math10122005 [es_ES]
dc.rights.accessRights: openAccess [es_ES]
dc.identifier.DOI: 10.3390/math10122005
dc.type.version: publishedVersion [es_ES]



© 2022 by the authors. Except where otherwise noted, this item's license is described as © 2022 by the authors.