Algorithm

Introduction

The main methodological contribution is an algorithm that heuristically solves the weighted n-cluster editing problem. The algorithms is motivated by the well-known physics-inspired graph layout algorithm of Fruchterman and Reingold. This algorithms seeks to arrange all nodes of a graph in a two-dimensional plane such that "similar" nodes are located more close to each other than to others. n-CluE, afterwards, assignes the nodes from each "dense" part of the graph layout to one n-cluster by single-linkage clustering (SLC) or k-means clustering based on the Euclidean distances. This can be done iteratively for different parameters and the solution with minimal n-cluster editing costs is picked.

The algorithm is carried out in a three-step procedure:

  1. Layout generation
  2. n-cluster partition
  3. Post-processing

I. Layout generation

In this stage, the coordinates of nodes are generated and re-arranged in a way that the nodes with higher similarities are located next to each other, and far away from those that are dissimilar. The layout is like a "self-arranging" network: the similar nodes attract each other and the dissimilar nodes repulse each other.
n-CluE starts the initial layout by locating all points "almost" evenly on a 3D sphere (or a 2D circle, if preferred by the user). The attracting/repulsing forces are computed by the following formula: biclust-editing The final layout is generated in a iterative manner. In each iteration, the "movement" of each nodes is the joint force of the effects coming from all other nodes. See Fig. 1 and 2 for an illustration.

II. Partition phase

In this phase, the nodes are partitioned into different clusters by Single Linkage Clustering or K-means Clustering (user choice). This clustering is repeated for different parameters and the clustering that minimizes the cluster editing costs is selected.

III. Post-processing

In post-processing, the n-CluE tries to further reduce the editing costs in two steps.

  1. N-cluster merging
  2. Random singleton node moving

n-Clue

Fig. 1. The layout generation of n-CluE in 3D for some iterations.

n-Clue

Fig. 2. The layout generation of n-CluE in 2D plain for some iterations.