Classification of file duplication by hierarchical clustering based on similarity relations

Phankokkruad, Manop (2017) Classification of file duplication by hierarchical clustering based on similarity relations. In: 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), 2017-07-29, Guilin.

Abstract

This paper proposes classifying duplicate files by measuring the similarity score between pairs of files. The distance between each pair of files is computed with the Smith-Waterman algorithm. In addition, a Euclidean distance matrix is used to identify relationships between people who often copy files from each other. Because duplication occurs regularly, hierarchical clustering can be applied to determine how close people are to one another and to group those positioned closely together. The results show that the Smith-Waterman algorithm measures the similarity between files effectively. The approach also analyzes the relationships among people, identifying those positioned closely together and the most closely related members of each group. Finally, the work reports the number of times each person duplicated files.
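
The pipeline outlined in the abstract (pairwise Smith-Waterman scores, a Euclidean distance matrix, then hierarchical clustering) can be illustrated with a minimal sketch. This record gives no implementation details, so everything specific here is an assumption: the scoring values (match 2, mismatch -1, gap -1), the normalisation of scores to [0, 1], the choice of single linkage, and the helper names `similarity`, `euclidean_matrix` and `single_linkage`. The sample owners and file contents are hypothetical, and this is not the author's implementation.

```python
from itertools import combinations
from math import sqrt


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a, b, match=2):
    """Assumed normalisation: divide by the best score the shorter input could reach."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return smith_waterman(a, b, match=match) / (match * shortest)


def euclidean_matrix(rows):
    """Euclidean distance between every pair of row vectors."""
    return [[sqrt(sum((x - y) ** 2 for x, y in zip(r, s))) for s in rows]
            for r in rows]


def single_linkage(dist):
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""
    clusters = [(i,) for i in range(len(dist))]
    merges = []

    def link(p, q):
        # Single linkage: distance between the two closest members.
        return min(dist[i][j] for i in p for j in q)

    while len(clusters) > 1:
        p, q = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        merges.append((p, q, link(p, q)))
        clusters = [c for c in clusters if c not in (p, q)] + [p + q]
    return merges


# Hypothetical files, keyed by the person who owns them.
files = {
    "alice": "def add(a, b): return a + b",
    "bob": "def add(x, y): return x + y",
    "carol": "SELECT name FROM users WHERE id = 1",
}
names = list(files)
# Each person's row of similarity scores serves as that person's feature vector.
sim = [[similarity(files[p], files[q]) for q in names] for p in names]
dist = euclidean_matrix(sim)
merges = single_linkage(dist)
for left, right, d in merges:
    print([names[i] for i in left], [names[i] for i in right], round(d, 3))
```

Each printed line is one merge step, closest pair first, so people whose files resemble each other are joined early. Treating each row of the similarity matrix as a feature vector is one plausible reading of "Euclidean distance matrix" in the abstract; the paper itself may construct it differently.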

Item Type:

Conference or Workshop Item (Paper)

Identification Number (DOI):

Deposited by:

Automated system

Date Deposited:

2021-09-09 23:53:44

Last Modified:

2021-09-28 12:03:50
