Figures

Download figures from the paper and the supplementary material in PDF / JPG format.

Figure
Caption
Format
Figure 1
Examples of clusters in three different data sets.
Figure 3A
Expression profiles of the 868 genes in the ESR data across the 173 microarray stress experiments. Data taken from Gasch et al.
Figure 3B
Mutual information relations (in bits) among the ESR genes.
Figure 4A
Tradeoff curves for the ESR data with 1/T=5,10,15,20,25.
Figure 4B
Tradeoff curves for the Standard and Poor's 500 data with 1/T=15,20,25,30,35.
Figure 4C
Tradeoff curves for the EachMovie data with 1/T=20,25,30,35,40.
Figure 5
Comparison of coherence results of the Iclust algorithm (yellow) with conventional clustering algorithms: K-means (green); K-medians (blue); Hierarchical (red). For the hierarchical algorithms, four different variants are tried: complete, average, centroid, and single linkage, respectively from left to right. For every algorithm, three different similarity measures are applied: Pearson correlation (left); absolute value of Pearson correlation (middle); Euclidean distance (right). The white bars in the ESR data correspond to applying the algorithm over the log2 transformation of the expression ratios. In all cases, the results are averaged over all the different numbers of clusters that we tried: Nc = 5,10,15,20. For the ESR data the results are further averaged over the three GOs.
Figure 6
Relations between the optimal solutions with Nc = 5,10,15,20 clusters at 1/T = 25 for the ESR data. Every cluster is connected to the cluster in the next (less detailed) partition that absorbs its most significant portion. The edge type indicates the level of inclusion. At the upper level the clusters are sorted as in Fig. 3. The number above every cluster indicates the number of genes in it, and the text title corresponds to the most enriched GO biological-process annotation in this cluster. The titles of the five clusters at the lower level are given by their most enriched GO cellular-component annotation. Most clusters were enriched with more than one annotation, hence the short titles are sometimes too concise. Red and green clusters represent clusters with a clear majority of stress-induced or stress-repressed genes, respectively.
Supplementary Figure 9
Mutual information relations (in bits) among the Standard and Poor's 500 companies.
Supplementary Figure 10
Relations between the optimal solutions with Nc = 5,10,15,20 clusters at 1/T = 35 for the Standard and Poor's 500 data. At the upper level the clusters are sorted as in Supplementary Figure 9. The number above every cluster indicates the number of companies in this cluster. The title of each cluster corresponds to the most enriched annotation in the cluster. Text boxes of similar color indicate that the corresponding annotations belong to the same major sector of the economy. Note that most clusters were enriched with more than one annotation, hence the short titles might be too concise in some cases.
Supplementary Figure 14
Relations between the optimal solutions with Nc = 5,10,15,20 clusters at 1/T = 40 for the EachMovie data. The number above every cluster indicates the number of movies in this cluster. The title of each cluster corresponds to the enriched "genre" annotation in the cluster.