Son Güncelleme:

21/08/2020 - 15:23

Üniversitemiz öğretim üyelerinden Dr. Öğr. Üyesi Mustafa Kemal Tural ve Prof. Dr. Nur Evin Özdemirel’in yazarları arasında bulunduğu “Centroid based Tree-Structured Data Clustering Using Vertex/Edge Overlap and Graph Edit Distance” başlıklı makale Annals of Operations Research’te yayınlandı.

We consider a clustering problem in which the data objects are rooted m-ary trees with known node correspondence. We assume that the nodes of the trees are unweighted, but the edges can be unweighted or weighted. We measure the similarity and distance between two trees using vertex/edge overlap (VEO) and graph edit distance (GED), respectively. For both measures, we first study the problem of finding a centroid tree of a given cluster of trees in both the unweighted and weighted edge cases. We compute the optimal centroid tree of a given cluster for all measures except the weighted VEO for which a heuristic is developed. We then propose k-means based algorithms that repeat cluster assignment and centroid update steps until convergence. The initial centroid trees are constructed based on the properties of the data. The assignment steps utilize unweighted or weighted versions of VEO or GED to assign each tree to the most similar centroid tree. In the update steps, each centroid tree is updated by considering the trees assigned to it. The proposed algorithms are compared with the traditional k-modes and k-means on randomly generated datasets and shown to be more effective and robust (to outliers) in separating trees into clusters. We also apply our algorithms on a real world brain artery data and show that the previously observed age and sex effects on brain artery structures can be revealed better by means of clustering with our algorithms than the traditional k-modes and k-means.


Dinler, D., Tural, M. K., & Ozdemirel, N. E. (2020). Centroid based tree-structured data clustering using Vertex/Edge overlap and graph edit distance. Annals of Operations Research, 289(1), 85-122. doi:10.1007/s10479-019-03505-7

 

Makaleye erişim için: https://link.springer.com/article/10.1007/s10479-019-03505-7


ODTÜ Yazarları

Dr. Öğr. Üyesi Mustafa Kemal Tural

tural@metu.edu.tr Scopus Yazar Kimliği: 55879785700
Yazar Hakkında ORCID: 0000-0002-7173-7646

Prof. Dr. Nur Evin Özdemirel

nurevin@metu.edu.tr Scopus Yazar Kimliği: 35612115000
Yazar Hakkında ORCID: 0000-0001-5905-3375

Etiketler/Anahtar sözcükler:

Brain artery data, Centroid, Clustering, Graph edit distance, k-means, Tree-structured data objects, Vertex/edge overlap


Diğer Yazarlar:
Dinler, D.


Ek Bilgiler:
Derya Dinler was partially supported by the Scientific and Technological Research Council of Turkey under Grant 2211. Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.