The research path of clustering

Hello, I'm happy to join this platform. I'm a graduate student and my research interest is machine learning. I'm working on subspace clustering and related problems. This is my first article on Bytes.

Unsupervised Learning

Machine learning (ML) is at the core of artificial intelligence, and one of its main tasks is to identify and distinguish between things. ML is commonly divided into two categories: supervised learning and unsupervised learning. The main task of supervised learning is classification, i.e., using a large amount of labeled data to assign labels to new data. The main task of unsupervised learning is clustering, i.e., separating data into groups without manual labeling.
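To make the role of labels concrete, here is a minimal sketch of a supervised classifier in Python. It is only an illustration: the nearest-centroid rule, the function names, and the toy data are my own assumptions, not a method taken from the literature discussed in this article.

```python
import math

def nearest_centroid_fit(points, labels):
    """Supervised step: compute one centroid per known label."""
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def nearest_centroid_predict(centroids, point):
    """Give a new point the label of the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))

# Hypothetical toy data: every training point comes with a human-provided label.
train_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["small", "small", "large", "large"]

centroids = nearest_centroid_fit(train_points, train_labels)
prediction = nearest_centroid_predict(centroids, (4.8, 5.1))
print(prediction)
```

An unsupervised method would receive only train_points, without train_labels, and would have to discover the two groups by itself.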

It is important to recognize that unsupervised learning is more difficult than supervised learning, and that far fewer researchers work on it. As a result, progress in unsupervised learning has been relatively slow. Nevertheless, scholars have explored the field for decades and produced many results, such as the k-means algorithm. In recent years especially, as the importance of unsupervised learning has been recognized, more scholars have devoted themselves to this field and achieved breakthroughs.

Clustering is one of the most important problems in unsupervised learning. It is employed in many real-world applications, such as image segmentation, bioinformatics, and financial fraud detection. Clustering can group unlabeled data, thereby discovering the natural structure of the data. It is typically applied for three purposes:
1. finding the latent structure of data
2. grouping data naturally
3. compressing data
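
The grouping and compression uses can be illustrated with a small k-means sketch in plain Python. This is a simplified version under my own assumptions: the initial centroids are simply the first k points, the data are invented 2-D points, and the function name kmeans is hypothetical. It is not the exact procedure of any particular paper.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd-style k-means on 2-D points.

    Assumption: the first k points serve as initial centroids, which keeps
    the example reproducible; practical implementations seed more carefully.
    """
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignment

# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.9, 1.1), (5.1, 4.9), (4.8, 5.2)]
centroids, assignment = kmeans(points, k=2)

# Grouping: one cluster index per point.
print(assignment)

# Compression: store k centroids plus one small index per point,
# and approximate each point by the centroid of its cluster.
compressed = [centroids[a] for a in assignment]
print(compressed[0])
```

The cluster indices give the grouping, while the pair (centroids, assignment) is a lossy compressed form of the data, which is the idea behind vector quantization.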

Thousands of clustering algorithms have been published. They can be divided into partition-based, hierarchy-based, and density-based algorithms, among others.
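
As one illustration of how these families differ, below is a minimal density-based sketch in the style of DBSCAN. Unlike a partition-based method, it does not need the number of clusters in advance and it can mark isolated points as noise. The parameter names eps and min_pts, the NOISE marker, and the toy data are assumptions chosen for this example; the code is a simplified sketch, not a reference implementation.

```python
import math

NOISE = -1  # label used here for points that belong to no cluster

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN-style clustering; returns one label per point."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: expand from it
                queue.extend(j_neighbors)
        cluster += 1
    return labels

# Hypothetical data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (4.5, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

A partition-based method such as k-means would be forced to assign the isolated point to one of the groups, whereas the density-based sketch leaves it labeled as noise.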

Research on clustering can be divided into three areas:
1. technology-centered research
2. data-centered research
3. clustering-derived-centered research

Some key research findings in the field of clustering

Hartigan, J. A., and M. A. Wong. A K-Means Clustering Algorithm. Applied Statistics, 1979, 28(1):100-108.

von Luxburg, U. A Tutorial on Spectral Clustering. Statistics and Computing, 2007, 17(4):395-416.

Comaniciu, D., and P. Meer. Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(5):603-619.

Zhang, T., R. Ramakrishnan, and M. Livny. BIRCH: An Efficient Data Clustering Method for Very Large Databases. ACM SIGMOD Record, 1996, 25(2):103-114.

Frey, B. J., and D. Dueck. Clustering by Passing Messages Between Data Points. Science, 2007, 315(5814):972-976.

Ester, M., H. P. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, OR, AAAI Press, 1996, 226-231.

Koga, H., T. Ishibashi, and T. Watanabe. Fast Agglomerative Hierarchical Clustering Algorithm Using Locality-Sensitive Hashing. Knowledge and Information Systems, 2007, 12(1):25-53.

Elhamifar, E., and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory, and Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(11):2765-2781.

Rodriguez, A., and A. Laio. Clustering by Fast Search and Find of Density Peaks. Science, 2014, 344(6191):1492-1496.

Oct 12 '21 #1
