High dimensional data partitioning with an adaptive ensemble construction and analysis scheme

Citation metadata

Date: Mar. 2017
Publisher: American-Eurasian Network for Scientific Information
Document Type: Report
Length: 3,828 words
Lexile Measure: 1390L

Abstract:

Clustering techniques are applied to partition transaction data values. Support for high dimensional data, use of prior knowledge and equal priority for all ensemble members are the key issues in the traditional cluster ensemble approach. The Incremental Semi Supervised Cluster Ensemble (ISSCE) approach is built to address the limitations of conventional cluster ensemble approaches. ISSCE combines the random subspace technique, the constraint propagation approach, the incremental ensemble member selection process and the normalized cut algorithm to perform high dimensional data clustering. The random subspace technique is effective for handling high dimensional data. The constraint propagation approach is useful for incorporating prior knowledge. The incremental ensemble member selection process is applied to judiciously remove redundant ensemble members based on a local cost function and a global cost function. The normalized cut algorithm is adopted as the consensus function to provide more stable, robust and accurate results. A measure that quantifies the similarity between two sets of attributes is used to compute the local cost function in ISSCE. The ISSCE framework is enhanced to support a structure based parameter selection process. Dataset complexity is also integrated into the parameter selection process. A membership rearrangement mechanism is adapted to handle the incremental membership selection process. A member and ensemble weight measure is also applied to determine the importance of the cluster ensembles. The cluster ensemble model is integrated with the Partition Around Medoids (PAM) clustering scheme. The system also increases clustering accuracy and scalability.

KEYWORDS: clustering, Incremental Semi Supervised Cluster Ensemble, algorithm, microcluster, Partition Around Medoids.
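The random subspace step and the redundancy-driven member selection described in the abstract can be illustrated with a minimal sketch. It is not the ISSCE implementation. The abstract does not define the attribute-set similarity measure, so Jaccard similarity is used here purely as an assumed stand-in. The function names, the `max_similarity` threshold and the greedy selection rule are hypothetical. The global cost function, constraint propagation and the normalized cut consensus step are omitted.

```python
import random


def random_subspaces(n_features, n_members, subspace_size, seed=0):
    """Draw one random attribute subset (feature indices) per ensemble member."""
    rng = random.Random(seed)
    return [
        frozenset(rng.sample(range(n_features), subspace_size))
        for _ in range(n_members)
    ]


def jaccard(a, b):
    """Similarity between two attribute sets.

    Assumption: Jaccard similarity stands in for the measure the
    abstract mentions but does not define.
    """
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_members(subspaces, max_similarity):
    """Greedy incremental selection (hypothetical rule).

    A candidate is kept only if its highest similarity to any member
    already kept, used here as a stand-in local cost, does not exceed
    max_similarity.
    """
    selected = []
    for candidate in subspaces:
        local_cost = max(
            (jaccard(candidate, kept) for kept in selected), default=0.0
        )
        if local_cost <= max_similarity:
            selected.append(candidate)
    return selected


# Example: 20 candidate members over 50 attributes, 10 attributes each.
subspaces = random_subspaces(n_features=50, n_members=20, subspace_size=10)
kept = select_members(subspaces, max_similarity=0.25)
print(f"kept {len(kept)} of {len(subspaces)} ensemble members")
```

In this sketch, a lower `max_similarity` removes more members as redundant, which trades ensemble size for diversity among the retained attribute subsets.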

Source Citation

Gale Document Number: GALE|A498845491