Networks of clusters

This analytical approach, developed at our company, chains several techniques to identify highly homogeneous groups within a large population of elements.

In the first stage, an algorithm is applied to the entire population to determine, for each element, whether each variable is present or absent. It assigns one binary value per variable: 0 when the variable is present and 1 when it is not. Each element is thus represented by a string of zeros and ones, with one position per variable.
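As an illustration of this encoding step, here is a minimal Python sketch. The variable names, the toy records, and the `encode` helper are hypothetical and not taken from our production code; the sketch also assumes that a variable counts as "present" when the record holds a non-empty value for it.

```python
def encode(record, variables):
    """Return a binary string with one position per variable.

    Follows the convention stated in the text: "0" when the variable
    is present in the record, "1" when it is absent.
    """
    return "".join(
        "0" if record.get(v) is not None else "1" for v in variables
    )


# Hypothetical toy data: each element is a dict of the variables it reports.
variables = ["age", "weight", "dose", "species"]
elements = {
    "s1": {"age": 5, "dose": 10},
    "s2": {"age": 7, "dose": 20},
    "s3": {"weight": 3.2, "species": "rat"},
}

# One binary string per element, in the fixed order given by `variables`.
signatures = {name: encode(rec, variables) for name, rec in elements.items()}
```

The order of `variables` has to be the same for every element, otherwise positions in different strings would not refer to the same variable.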

Elements with identical strings are grouped into the same cluster. Each cluster is then treated as a node in a network, where the strength of the connection between two nodes depends on the number of positions at which their strings hold the same binary value. In this way, the edges of the network reflect the degree of similarity between clusters.
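The grouping and edge weighting can be sketched as follows. This is an illustrative example under stated assumptions: the `build_cluster_network` helper and the toy strings are hypothetical, the edge weight is taken to be the plain count of matching positions, and every pair of clusters gets an edge (including weight 0). The actual weighting may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_cluster_network(signatures):
    """Group elements by binary string and weight every pair of clusters."""
    # Elements with identical strings fall into the same cluster.
    clusters = defaultdict(list)
    for name, sig in signatures.items():
        clusters[sig].append(name)

    # Edge weight = number of positions where both strings agree
    # (string length minus the Hamming distance).
    edges = {}
    for a, b in combinations(sorted(clusters), 2):
        edges[(a, b)] = sum(x == y for x, y in zip(a, b))
    return dict(clusters), edges


# Hypothetical binary strings for four elements.
signatures = {"s1": "0101", "s2": "0101", "s3": "0111", "s4": "1010"}
clusters, edges = build_cluster_network(signatures)
```

Here `s1` and `s2` form a single node, and that node is tied more strongly to the `0111` cluster (three matching positions) than to the `1010` cluster (none).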

The result is a network of clusters to which community detection techniques are applied. Once the communities are defined, the elements within each of them are examined. This procedure makes it possible to identify sets of elements that share most of the variables required for analysis, which is particularly useful in extensive databases where not all variables (or even most of them) are present in every record.
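The text does not name a specific community detection technique, so the sketch below uses a simple weighted label propagation as a stand-in, not as the method actually used. The `label_propagation` helper, the toy edge weights, and the cluster contents are all hypothetical.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=100):
    """Split nodes into communities by weighted label propagation."""
    # Build a weighted adjacency map, ignoring zero-weight edges.
    neighbours = defaultdict(dict)
    for (a, b), w in edges.items():
        if w > 0:
            neighbours[a][b] = w
            neighbours[b][a] = w

    # Every node starts in its own community.
    labels = {node: node for node in neighbours}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbours):
            score = defaultdict(int)
            for other, w in neighbours[node].items():
                score[labels[other]] += w
            # Adopt the heaviest label; break ties by the smallest label.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    communities = defaultdict(set)
    for node, lab in labels.items():
        communities[lab].add(node)
    return sorted(sorted(c) for c in communities.values())


# Hypothetical network: weights are counts of matching string positions.
edges = {
    ("0000", "0001"): 3, ("0000", "1110"): 1, ("0000", "1111"): 0,
    ("0001", "1110"): 0, ("0001", "1111"): 1, ("1110", "1111"): 3,
}
communities = label_propagation(edges)

# Map each community of clusters back to the elements it contains.
clusters = {"0000": ["s1"], "0001": ["s2", "s3"], "1110": ["s4"], "1111": ["s5"]}
members = [sorted(e for sig in c for e in clusters[sig]) for c in communities]
```

In a dense network where almost every pair of clusters shares some values, it may be necessary to drop weak edges before running any community detection, otherwise the nodes can collapse into a single community.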

At our company, we have applied this approach to classify different types of studies within the Vict3r project database, where studies cover very diverse variables from multiple domains. With this method, we were able to detect highly homogeneous groups of studies, each characterized by the presence of the same variables.

