Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/6124
Full metadata record
DC Field | Value | Language
dc.contributor.author | Haruna, Charles Roland | -
dc.contributor.author | Hou, Mengshu | -
dc.contributor.author | Xi, Rui | -
dc.contributor.author | Eghan, Moses Ojo | -
dc.contributor.author | Kpiebaareh, Michael | -
dc.contributor.author | Tandoh, Lawrence | -
dc.contributor.author | Eghan-Yartel, Barbie | -
dc.contributor.author | Asante-Mensah, Maame G. | -
dc.date.accessioned | 2021-10-04T13:55:41Z | -
dc.date.available | 2021-10-04T13:55:41Z | -
dc.date.issued | 2019-06-04 | -
dc.identifier.issn | 23105496 | -
dc.identifier.uri | http://hdl.handle.net/123456789/6124 | -
dc.description | 10p:, ill. | en_US
dc.description.abstract | In this paper, we extend a hybrid-based deduplication technique for entity reconciliation (ER) in two ways: first, by proposing an algorithm that builds clusters given a pre-specified number K of clusters, and second, by developing a crowd-based procedure for refining the clusters produced by the cluster generation phases. With the clusters refined, we aim to minimize the cost metric 30(R) of the solitary and compound cluster generation algorithms, to obtain an improved and more efficient deduplication method, to increase accuracy in identifying duplicate records, and to further reduce the crowdsourcing overheads incurred. In the experiments, we used three datasets commonly used in hybrid-based deduplication: paper, product, and restaurant. The performance results and evaluations demonstrate clear superiority over the compared methods: our work offers low crowdsourcing cost and high deduplication accuracy, as well as better deduplication efficiency due to the clusters being refined. | en_US
dc.language.iso | en | en_US
dc.publisher | University of Cape Coast | en_US
dc.subject | Cluster refinement | en_US
dc.subject | Minimization approach | en_US
dc.subject | Triangular split and merger operations | en_US
dc.subject | Entity reconciliation | en_US
dc.subject | Crowdsourcing | en_US
dc.title | Applying cluster refinement to improve crowd-based data duplicate detection approach | en_US
dc.type | Article | en_US
Appears in Collections: Department of Physics

Files in This Item:
File | Description | Size | Format
Applying Cluster Refinement to Improve.pdf | Article | 8.25 MB | Adobe PDF

