Fuzzy c-means clustering algorithm v.0.3 for Multidimensional Data
Overview
The new version supports clustering of multidimensional data, which means that objects can have more than two characteristics. Let's look at how the existing code was changed to make this possible.
ClusterCentroid class was excluded
This class was an exact copy of the ClusterPoint class, so I excluded it from the solution to make the code clearer.
ClusterPoint class changes
The Coords property was added for storing any number of object properties:
public List<double> Coords { get; set; }
I kept the X and Y properties for convenience but changed them accordingly:
public double X { get { return this.Coords[0]; } }
public double Y { get { return this.Coords[1]; } }
I also added a Dimention property (spelled this way in the code) that returns the number of coordinates:
public int Dimention { get { return this.Coords.Count; } }
New constructors:
public ClusterPoint(List<double> coords)
    : this(coords, null)
{
}
public ClusterPoint(List<double> coords, object tag)
{
    this.Coords = coords;
    this.Tag = tag;
    this.ClusterIndex = -1; // not assigned to any cluster yet
}
CMeansAlgorithm class changes
First of all, the distance calculation has to work for any number of dimensions:
private double CalculateEulerDistance(ClusterPoint point, ClusterPoint centroid)
{
    // Previous 2-D version:
    // return Math.Sqrt(Math.Pow(point.X - centroid.X, 2) + Math.Pow(point.Y - centroid.Y, 2));
    double sum = 0.0;
    // Sum the squared differences over every coordinate.
    for (int i = 0; i < point.Dimention; i++)
    {
        sum += Math.Pow(point.Coords[i] - centroid.Coords[i], 2);
    }
    return Math.Sqrt(sum);
}
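To try the generalized distance without the rest of the solution, here is a minimal standalone sketch. The class name DistanceSketch and the method name Euclidean are made up for this illustration and are not part of the project; it also assumes both points have the same number of coordinates and throws otherwise.

```csharp
using System;
using System.Collections.Generic;

public static class DistanceSketch
{
    // Hypothetical helper: Euclidean distance between two points
    // that have the same number of coordinates.
    public static double Euclidean(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Points must have the same dimension.");

        double sum = 0.0;
        for (int i = 0; i < a.Count; i++)
        {
            double diff = a[i] - b[i];
            sum += diff * diff; // squared difference for coordinate i
        }
        return Math.Sqrt(sum);
    }

    public static void Main()
    {
        var p = new List<double> { 0, 0, 0 };
        var q = new List<double> { 1, 2, 2 };
        Console.WriteLine(Euclidean(p, q)); // sqrt(1 + 4 + 4), prints 3
    }
}
```

Multiplying the difference by itself gives the same result as Math.Pow(diff, 2) and avoids a general-purpose power call inside the loop.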
The most important change is in the CalculateClusterCenters method. The following code:
double uX = 0.0;
double uY = 0.0;
uX += uu * c.X;
uY += uu * c.Y;
was replaced with:
double[] uC = new double[c.Dimention];
for (int k = 0; k < c.Dimention; k++) {
    uC[k] += uu * c.Coords[k];
}
and
c.X = ((int)(uX / l));
c.Y = ((int)(uY / l));
was replaced with:
for (int k = 0; k < c.Dimention; k++) {
    c.Coords[k] = ((int)(uC[k] / l));
}
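The two replacements above together compute a weighted mean per coordinate. The standalone sketch below shows that idea. It assumes that the divisor l in the excerpt is the sum of the weights uu, which the excerpt does not show. The names CentroidSketch and WeightedCentroid are hypothetical. Unlike the excerpt, this sketch keeps the result as double instead of casting it to int.

```csharp
using System;
using System.Collections.Generic;

public static class CentroidSketch
{
    // Hypothetical helper: per-coordinate weighted mean, sum(w_i * x_i) / sum(w_i).
    // weights[i] plays the role of "uu" for point i (an assumption of this sketch).
    public static double[] WeightedCentroid(IReadOnlyList<double[]> points, IReadOnlyList<double> weights)
    {
        if (points.Count == 0 || points.Count != weights.Count)
            throw new ArgumentException("Need one weight per point and at least one point.");

        int dim = points[0].Length;
        var sums = new double[dim]; // accumulators, one per coordinate
        double total = 0.0;         // sum of weights (assumed meaning of "l")

        for (int i = 0; i < points.Count; i++)
        {
            if (points[i].Length != dim)
                throw new ArgumentException("All points must have the same dimension.");
            for (int k = 0; k < dim; k++)
                sums[k] += weights[i] * points[i][k];
            total += weights[i];
        }

        if (total == 0.0)
            throw new ArgumentException("Weights must not sum to zero.");

        for (int k = 0; k < dim; k++)
            sums[k] /= total; // kept as double, no truncation

        return sums;
    }

    public static void Main()
    {
        var points = new List<double[]> { new double[] { 0, 0, 0 }, new double[] { 4, 8, 12 } };
        var weights = new List<double> { 1, 3 };
        Console.WriteLine(string.Join(", ", WeightedCentroid(points, weights))); // prints 3, 6, 9
    }
}
```

Whether to keep the int cast depends on the data: it suits pixel coordinates, but for general real-valued features it discards the fractional part of every centroid coordinate.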
That's all! Sample code that uses the new version of the algorithm looks like this:
var points = new List<ClusterPoint>();
points.Add(new ClusterPoint(0,0));
points.Add(new ClusterPoint(100,100));
var clusters = new List<ClusterPoint>();
clusters.Add(points[0]);
clusters.Add(points[1]);
CMeansAlgorithm alg = new CMeansAlgorithm(points, clusters);
alg.Run();
Console.Write(alg.Log);
Source Code
The source code can be found at http://datamining.codeplex.com/ (Fuzzy c-means clustering v.0.3 for multidimensional data).
Happy coding!