The CLUSTER function computes the classification of an m-column, n-row array, where m is the number of variables and n is the number of observations or samples. The classification is based upon a cluster analysis of sample-based distances.
This routine is written in the IDL language. Its source code can be found in the file cluster.pro in the lib subdirectory of the IDL distribution.
For more information on cluster analysis, see:
Everitt, Brian S. Cluster Analysis. New York: Halsted Press, 1993. ISBN 0-470-22043-0
Result = CLUSTER( Array, Weights [, /DOUBLE] [, N_CLUSTERS=value] )
Returns a 1-column, n-row array of cluster number assignments, one for each sample.
Array: An m-column, n-row array of type float or double.
Weights: An array of weights (the cluster centers) computed using the CLUST_WTS function. The dimensions of this array vary according to keyword values.
DOUBLE: Set this keyword to force the computation to be done in double-precision arithmetic.
N_CLUSTERS: Set this keyword equal to the number of clusters. The default is based upon the row dimension of the Weights array.
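The keyword behavior described above can be illustrated with a minimal sketch. The variable names data, weights, and labels and the four sample values are made up for illustration. The call to CLUSTER omits N_CLUSTERS, so the number of clusters is assumed to follow from the row dimension of weights, as stated above.

```idl
; Illustrative data only: 2 variables (columns), 4 observations (rows).
data = [[1.0, 1.0], $
        [1.2, 0.9], $
        [8.0, 8.1], $
        [7.9, 8.3]]

; Compute two cluster centers in double precision.
weights = CLUST_WTS(data, N_CLUSTERS=2, /DOUBLE)

; N_CLUSTERS is omitted here, so the default (based on the row
; dimension of weights) applies.
labels = CLUSTER(data, weights, /DOUBLE)

; Inspect the type and dimensions of both arrays.
HELP, weights, labels
```

HELP is used rather than PRINT because the point of the sketch is the type and dimensions of the arrays, not the particular assignments.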
; Define an array with 4 variables and 10 observations:
array = $
  [[ 1.5, 43.1, 29.1,  1.9], $
   [24.7, 49.8, 28.2, 22.8], $
   [30.7, 51.9,  7.0, 18.7], $
   [ 9.8,  4.3, 31.1,  0.1], $
   [19.1, 42.2,  0.9, 12.9], $
   [25.6, 13.9,  3.7, 21.7], $
   [ 1.4, 58.5, 27.6,  7.1], $
   [ 7.9,  2.1, 30.6,  5.4], $
   [22.1, 49.9,  3.2, 21.3], $
   [ 5.5, 53.5,  4.8, 19.3]]

; Compute the cluster weights, using two distinct clusters:
weights = CLUST_WTS(array, N_CLUSTERS=2)

; Compute the classification of each sample:
result = CLUSTER(array, weights, N_CLUSTERS=2)

; Print each sample (each row) of the array and its corresponding
; cluster assignment:
FOR k = 0, N_ELEMENTS(result)-1 DO PRINT, $
  array[*,k], result[k], FORMAT = '(4(f4.1, 2x), 5x, i1)'
IDL prints:
 1.5  43.1  29.1   1.9       1
24.7  49.8  28.2  22.8       0
30.7  51.9   7.0  18.7       0
 9.8   4.3  31.1   0.1       1
19.1  42.2   0.9  12.9       0
25.6  13.9   3.7  21.7       0
 1.4  58.5  27.6   7.1       1
 7.9   2.1  30.6   5.4       1
22.1  49.9   3.2  21.3       0
 5.5  53.5   4.8  19.3       0
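To make the idea of a classification based on sample-to-center distances concrete, the following is a minimal sketch of a nearest-center assignment written by hand. The function name NEAREST_CENTER is hypothetical and not part of IDL, and squared Euclidean distance is an assumption: this page does not state which distance measure CLUSTER uses, so the results are not guaranteed to match those of CLUSTER. The sketch also assumes that weights is an m-column array with one row per cluster and at least two clusters.

```idl
; Hypothetical helper, not part of the IDL distribution.
; Assigns each sample (row of array) to the nearest cluster center
; (row of weights), using squared Euclidean distance (an assumption).
FUNCTION nearest_center, array, weights
  dims  = SIZE(array, /DIMENSIONS)     ; [m, n]
  wdims = SIZE(weights, /DIMENSIONS)   ; [m, number of clusters]
  n = dims[1]
  n_clusters = wdims[1]

  result = LONARR(1, n)                ; 1-column, n-row result
  FOR k = 0L, n-1 DO BEGIN
    dist = FLTARR(n_clusters)
    ; Distance from sample k to each cluster center:
    FOR c = 0L, n_clusters-1 DO $
      dist[c] = TOTAL((array[*,k] - weights[*,c])^2)
    ; MIN returns the smallest distance; index receives its subscript.
    smallest = MIN(dist, index)
    result[0,k] = index
  ENDFOR

  RETURN, result
END
```

Once the function is compiled (for example, saved as nearest_center.pro on the IDL path), calling manual = NEAREST_CENTER(array, weights) with the array and weights from the example above produces an array with the same shape as the result of CLUSTER, so the two can be compared element by element.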
Introduced: 5.0
See also: CLUST_WTS, PCOMP, STANDARDIZE, and "Multivariate Analysis" in the Using IDL manual.