My Thoughts: The miscellaneous category (in a casebase)

Monday, October 1, 2012

The miscellaneous category (in a casebase)

If a case is "far enough" away from every existing cluster (category), and fewer than N other cases have been found that are "near" to it, then add this case to a "miscellaneous" category.
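A rough sketch of that rule, in Python. Everything named here is my own assumption for illustration: cases are numeric tuples, "far" and "near" are Euclidean distance thresholds (`far_threshold`, `near_threshold`), each cluster is represented by a centroid, and the "other cases near to it" are looked for among the cases already sitting in miscellaneous.

```python
import math

def assign_case(case, centroids, misc, far_threshold, near_threshold, n):
    """Place `case` and return the name of the category it went to.

    centroids: dict mapping cluster label -> centroid tuple (assumed representation)
    misc: list of cases currently in "miscellaneous"; appended to in place
    """
    # Close enough to an existing cluster? Then it simply belongs there.
    if centroids:
        label = min(centroids, key=lambda lbl: math.dist(case, centroids[lbl]))
        if math.dist(case, centroids[label]) <= far_threshold:
            return label

    # Otherwise count how many miscellaneous cases are "near" this one.
    neighbours = [m for m in misc if math.dist(case, m) <= near_threshold]
    if len(neighbours) < n:
        misc.append(case)
        return "miscellaneous"

    # N or more similar cases already exist: a candidate for a new cluster.
    return "promote"
```

The third outcome, `"promote"`, is just a signal to the caller that enough similar cases have piled up; what to do about it is the promotion step described further down.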

Search and use the miscellaneous category as if it were a cluster of its own. (But do not define a mean and standard deviations for it. It is not compressible.)
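One way to read this: a normal cluster can be matched against its summary (its mean), while miscellaneous has no summary and has to be matched case by case. A minimal sketch under that reading, where the function name, the use of plain Euclidean distance, and the representation of clusters by their means are all assumptions of mine:

```python
import math

def nearest_category(query, cluster_means, misc):
    """Return (label, distance) of the category closest to `query`.

    cluster_means: dict mapping label -> mean tuple (the compressed form of a cluster)
    misc: list of raw cases; there is no mean to compare against, so every
          stored case is checked individually.
    """
    best_label, best_dist = None, math.inf

    # Ordinary clusters: one comparison each, against the mean.
    for label, mean in cluster_means.items():
        d = math.dist(query, mean)
        if d < best_dist:
            best_label, best_dist = label, d

    # Miscellaneous: compare against each stored case directly.
    for case in misc:
        d = math.dist(query, case)
        if d < best_dist:
            best_label, best_dist = "miscellaneous", d

    return best_label, best_dist
```

The cost of searching miscellaneous grows with the number of cases in it, which is one practical reason to keep it from getting much larger than the other categories.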

Remove a case from "miscellaneous" and form a new cluster if and when N is exceeded (i.e., when enough cases like this one have been discovered). A reasonable value for N might be estimated from the number of cases in each of the other categories: take the mean and standard deviation of those counts, and set N a few standard deviations below the mean. The size of the "miscellaneous" category should also be kept similar to the size of the other categories.
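A sketch of both halves of this, estimating N and then promoting. The choices here are my own assumptions, not settled parts of the idea: `k` stands in for "a few" standard deviations, `floor` stops N from dropping to zero or below when cluster sizes vary a lot, the population standard deviation is used, and "cases like this one" means miscellaneous cases within a Euclidean `near_threshold` of a seed case.

```python
import math
import statistics

def estimate_n(cluster_sizes, k=2.0, floor=2):
    """N = mean cluster size minus k standard deviations, never below `floor`."""
    mean = statistics.mean(cluster_sizes)
    std = statistics.pstdev(cluster_sizes)
    return max(floor, math.floor(mean - k * std))

def promote_if_dense(seed, misc, near_threshold, n):
    """If more than n other miscellaneous cases lie near `seed`, pull the whole
    group out of `misc` (in place) and return it as a new cluster; else None.

    `seed` is assumed to be a case that is already stored in `misc`.
    """
    group = [m for m in misc if math.dist(seed, m) <= near_threshold]
    others = len(group) - 1  # the group includes the seed itself
    if others <= n:
        return None
    # Keep only the cases that were not promoted.
    misc[:] = [m for m in misc if math.dist(seed, m) > near_threshold]
    return group
```

In use, `estimate_n` would be recomputed from the current cluster sizes every so often, and `promote_if_dense` called on a case whenever a new neighbour lands near it in miscellaneous.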
