CLUSTER


The CLUSTER function classifies the samples of an m-column, n-row array, where m is the number of variables and n is the number of observations (samples). Each sample is assigned to a cluster based on a cluster analysis of sample-based distances.
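The idea of classifying samples by their distance to a set of cluster centers can be sketched outside IDL. The Python below is a minimal illustration only, not a transcription of CLUSTER: the function name `assign_clusters` and the data are hypothetical, and Euclidean distance is an assumption, since this page does not state which distance measure CLUSTER uses.

```python
def assign_clusters(samples, centers):
    """Return, for each sample, the index of its nearest center.

    Assumption: squared Euclidean distance. Ties go to the
    lowest-numbered center.
    """
    labels = []
    for sample in samples:
        # Squared distance from this sample to every center.
        distances = [
            sum((s - c) ** 2 for s, c in zip(sample, center))
            for center in centers
        ]
        labels.append(distances.index(min(distances)))
    return labels


# Made-up data: three samples of two variables each, and two centers.
samples = [[1.0, 1.0], [9.0, 8.0], [2.0, 0.0]]
centers = [[0.0, 0.0], [10.0, 10.0]]
labels = assign_clusters(samples, centers)
print(labels)  # prints [0, 1, 0]
```

Each row of `centers` plays the role of one row of the Weights argument, and each entry of `labels` plays the role of one element of Result.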

This routine is written in the IDL language. Its source code can be found in the file cluster.pro in the lib subdirectory of the IDL distribution.

For more information on cluster analysis, see:

Everitt, Brian S. Cluster Analysis. New York: Halsted Press, 1993. ISBN 0-470-22043-0

Syntax

Result = CLUSTER( Array, Weights [, /DOUBLE] [, N_CLUSTERS=value] )

Return Value

The result is a 1-column, n-row array of cluster number assignments, one for each sample.

Arguments

Array

An m-column, n-row array of type float or double.

Weights

An array of weights (the cluster centers) computed using the CLUST_WTS function. The array has one row per cluster, so its row dimension depends on the N_CLUSTERS keyword passed to CLUST_WTS.

Keywords

DOUBLE

Set this keyword to force the computation to be done in double-precision arithmetic.

N_CLUSTERS

Set this keyword equal to the number of clusters. The default is based upon the row dimension of the Weights array.

Examples

; Define an array with 4 variables and 10 observations: 
array = $ 
[[ 1.5, 43.1, 29.1,  1.9], $ 
 [24.7, 49.8, 28.2, 22.8], $ 
 [30.7, 51.9,  7.0, 18.7], $ 
 [ 9.8,  4.3, 31.1,  0.1], $ 
 [19.1, 42.2,  0.9, 12.9], $ 
 [25.6, 13.9,  3.7, 21.7], $ 
 [ 1.4, 58.5, 27.6,  7.1], $ 
 [ 7.9,  2.1, 30.6,  5.4], $ 
 [22.1, 49.9,  3.2, 21.3], $ 
 [ 5.5, 53.5,  4.8, 19.3]] 
 
; Compute the cluster weights, using two distinct clusters: 
weights = CLUST_WTS(array, N_CLUSTERS=2) 
 
; Compute the classification of each sample: 
result = CLUSTER(array, weights, N_CLUSTERS=2) 
 
; Print each sample (each row) of the array and its corresponding 
; cluster assignment: 
FOR k = 0, N_ELEMENTS(result)-1 DO PRINT, $
   array[*,k], result[k], FORMAT = '(4(f4.1, 2x), 5x, i1)'

IDL prints:

 1.5  43.1  29.1   1.9       1 
24.7  49.8  28.2  22.8       0 
30.7  51.9   7.0  18.7       0 
 9.8   4.3  31.1   0.1       1 
19.1  42.2   0.9  12.9       0 
25.6  13.9   3.7  21.7       0 
 1.4  58.5  27.6   7.1       1 
 7.9   2.1  30.6   5.4       1 
22.1  49.9   3.2  21.3       0 
 5.5  53.5   4.8  19.3       0 
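
A result like the one above is often summarized by counting the samples in each cluster. The following is a minimal sketch in Python rather than IDL; the assignments are copied from the printed output above, and the variable names are illustrative.

```python
from collections import Counter

# Cluster assignments copied from the example output, in sample order.
result = [1, 0, 0, 1, 0, 0, 1, 1, 0, 0]

# Count how many samples fall in each cluster.
sizes = Counter(result)
for cluster in sorted(sizes):
    print(f"cluster {cluster}: {sizes[cluster]} samples")
# prints:
# cluster 0: 6 samples
# cluster 1: 4 samples
```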

Version History

Introduced: 5.0

See Also

CLUST_WTS, PCOMP, STANDARDIZE, and "Multivariate Analysis" in the Using IDL manual.

