The research path of clustering

Hello, I'm happy to join this platform. I'm a graduate student and my research interest is machine learning. I'm working on subspace clustering and related topics. This is my first article on Bytes.

Unsupervised Learning

The core of Artificial Intelligence is machine learning (ML), whose main task is to identify and distinguish between things. ML is commonly divided into two categories: supervised learning and unsupervised learning. The main task of supervised learning is classification, i.e., assigning new data to known classes with the help of a large amount of labeled data. The main task of unsupervised learning is clustering, i.e., partitioning data into groups without manual intervention.
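To make the distinction concrete, here is a minimal sketch of the supervised case. It uses a nearest-centroid classifier, chosen only for illustration; the data and the names `fit_centroids` and `classify` are hypothetical. The classifier averages the labeled points of each class and assigns a new point to the closest class average.

```python
from collections import defaultdict
from math import dist


def fit_centroids(points, labels):
    """Supervised step: average the points that share a label."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }


def classify(centroids, x):
    """Assign a new point to the class whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical toy data: every training point comes with a label.
train_points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
train_labels = ["small", "small", "large", "large"]

centroids = fit_centroids(train_points, train_labels)
prediction = classify(centroids, (2.0, 1.0))
print(prediction)
```

An unsupervised method would receive only `train_points`, without `train_labels`, and would have to discover the two groups by itself.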

It should be made clear that unsupervised learning is more difficult than supervised learning, and that far fewer researchers work on it. As a result, progress in unsupervised learning has been relatively slow. Nevertheless, scholars have explored the field for decades and have produced many results, such as the k-means algorithm. In recent years especially, as the importance of unsupervised learning has been recognized, more scholars have devoted themselves to this field and have achieved breakthroughs.

Clustering is one of the most important problems in unsupervised learning. It is employed in many real-world applications, such as image segmentation, bioinformatics, and financial fraud detection. Clustering groups data that have no labels and thereby discovers the natural structure of the data. It is usually applied for three purposes:
1. finding the latent structure of data
2. grouping data naturally
3. compressing data
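The three uses listed above can be illustrated with k-means, which the article mentions earlier. The following is a minimal sketch of Lloyd's iteration in plain Python, not a production implementation. The toy data, the function names, and the choice to pass the initial centroids explicitly (which keeps the run deterministic) are assumptions made for this example.

```python
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's iteration: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its old centroid
        if updated == centroids:  # nothing moved, so the iteration has converged
            break
        centroids = updated
    # Recompute labels so they always match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled toy data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, init_centroids=[points[0], points[3]])

# Compression: each point is replaced by the centroid of its cluster.
compressed = [centroids[lab] for lab in labels]
print(labels)
```

Here `labels` gives the grouping, `centroids` summarizes the structure that was found, and `compressed` stores six points using only two distinct values.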

Thousands of clustering algorithms have been published. These algorithms can be divided into partition-based algorithms, hierarchy-based algorithms, density-based algorithms, etc.
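As an illustration of the density-based family, here is a simplified sketch in the spirit of DBSCAN (Ester et al., listed under the key findings). It is not the authors' reference implementation: it uses a brute-force neighbor search, and the parameter names `eps` and `min_pts` as well as the toy data are assumptions for this example. A point with at least `min_pts` neighbors within distance `eps` starts or extends a cluster, and points that no cluster reaches are marked as noise.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster id per point; NOISE marks points in no cluster."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        labels[i] = cluster
        frontier = [j for j in seeds if j != i]
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                frontier.extend(reachable)
        cluster += 1
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike a partition-based method such as k-means, this approach does not need the number of clusters in advance, and the isolated point is reported as noise instead of being forced into a group.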

Research on clustering can be divided into three areas.
1. technology-centered research
2. data-centered research
3. clustering-derived-centered research


Some key research findings in the field of clustering

Hartigan, J. A., and M. A. Wong. A K-Means Clustering Algorithm. Applied Statistics, 1979, 28(1).

Luxburg, U. von. A Tutorial on Spectral Clustering. Statistics and Computing, 2007, 17(4):395-416.

Comaniciu, D., and P. Meer. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2002, 24(5):603-619.

Zhang, T., R. Ramakrishnan, and M. Livny. BIRCH: An Efficient Data Clustering Method for Very Large Databases. ACM SIGMOD Record, 1996, 25(2).

Frey, B. J., and D. Dueck. Clustering by passing messages between data points. Science, 2007.

Ester, M., H. P. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, OR, AAAI Press, 1996, 226-231.

Koga, H., T. Ishibashi, and T. Watanabe. Fast agglomerative hierarchical clustering algorithm using Locality-Sensitive Hashing. Knowledge and Information Systems, 2007, 12(1):25-53.

Elhamifar, E., and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory, and Applications. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 35(11):2765-2781.

Rodriguez, A., and A. Laio. Clustering by fast search and find of density peaks. Science, 2014, 344(6191):1492.
Oct 12 '21 #1