- Clustering divides large data sets into coherent subsets that can be studied more easily
- Given an event report, case-based reasoning (CBR) will:
  - go through all event reports in the database
  - compute the similarity between each of them and the given report
  - find all reports within a certain distance or similarity (defined by the user)
- These reports form a cluster
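The retrieval steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular CBR system: each report is reduced to a hypothetical set of concept tags, similarity is measured with the Jaccard index (an assumed stand-in for whatever measure a real system uses), and the names `jaccard`, `cluster_around`, and the sample reports are invented for the example.

```python
from typing import Callable, Dict, List, Set

# Assumption: a report is represented as a set of concept tags.
Report = Set[str]


def jaccard(a: Report, b: Report) -> float:
    """Similarity in [0, 1]: shared tags divided by all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_around(
    query: Report,
    database: Dict[str, Report],
    threshold: float,
    similarity: Callable[[Report, Report], float] = jaccard,
) -> List[str]:
    """Return the IDs of all reports at least `threshold` similar to `query`."""
    cluster = []
    for report_id, report in database.items():   # go through all reports
        score = similarity(query, report)         # compute similarity
        if score >= threshold:                    # user-defined cutoff
            cluster.append(report_id)
    return cluster


# Hypothetical sample data, invented for illustration only.
database = {
    "R1": {"medication", "wrong_dose", "pediatric"},
    "R2": {"medication", "wrong_dose", "adult"},
    "R3": {"fall", "bedside", "night_shift"},
}
query = {"medication", "wrong_dose", "infant"}

cluster = cluster_around(query, database, threshold=0.4)
print(cluster)
```

Raising the threshold shrinks the cluster and lowering it widens the cluster, which is the sense in which the distance is "defined by the user."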
There are many algorithms used to create clusters.
Here we will discuss two: statistical clustering and case-based reasoning.
As an overgeneralization, all clustering algorithms basically do what was described in the previous slides: they divide the data into subsets based on some criterion of "distance." The two techniques presented here use different definitions of "distance." Statistical clustering uses numerical distance, while case-based reasoning uses distance between semantic concepts.
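To make the two notions of "distance" concrete, here is a minimal Python sketch. It assumes that numerical distance means Euclidean distance between feature vectors and that semantic distance means the number of links separating two concepts in a hierarchy; both are common choices but are assumptions here, not definitions taken from the presentation. The `PARENT` hierarchy and its concept names are hypothetical.

```python
import math
from typing import List, Sequence


def numeric_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two numeric feature vectors."""
    return math.dist(a, b)


# Hypothetical concept hierarchy: each concept maps to its parent.
PARENT = {
    "wrong_dose": "medication_error",
    "wrong_drug": "medication_error",
    "medication_error": "event",
    "patient_fall": "event",
}


def ancestors(concept: str) -> List[str]:
    """Path from a concept up to the root, starting with the concept itself."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def semantic_distance(a: str, b: str) -> int:
    """Number of links joining a and b through their nearest shared ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(path_a):
        if node in path_b:
            return steps_a + path_b.index(node)
    raise ValueError("concepts share no ancestor")


print(numeric_distance((0.0, 0.0), (3.0, 4.0)))        # 5.0
print(semantic_distance("wrong_dose", "wrong_drug"))    # 2
print(semantic_distance("wrong_dose", "patient_fall"))  # 3
```

In this sketch two medication errors are closer to each other than either is to a patient fall, even though none of the concepts has a numeric value.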