Temple University
SBW04
TITLE : A Dimensionality Reduction Technique for Classification and Similarity Searches of Region Data in Spatial Databases

Talk by : Despina Kontos, Ph.D. student, CIS Department - Temple University
Talk by : Dr. Marcus J. Sobel, Statistics Department - Temple University

Abstract: In most attempts to characterize data (images, signals, text, etc.), the prime concern is to extract descriptive features that provide significant information. One characterization approach is to map the data to points in a k-dimensional space, where k is the number of features extracted. Dimensionality reduction can then be used to select the most discriminative features, improving classification, indexing and retrieval. Here, we focus on characterizing spatial Regions of Interest (ROIs). We propose a novel statistical approach, based on a supervised framework, for reducing the dimensionality of the feature space when distinct classes of data are present. The method employs a Markov Chain Monte Carlo (MCMC) algorithm designed to select the most informative features according to their discriminative power across the classes. This reduces the dimensionality of the initial feature space and also improves the classification of the ROIs, since attributes that provide irrelevant information with respect to class membership are discarded. We extend this effect by also introducing a weighted Euclidean distance designed to classify the ROIs effectively. We demonstrate the effectiveness of the proposed technique by applying it to 2D and 3D spatial ROIs, testing its scalability on large datasets, and performing similarity searches. Finally, we compare the proposed approach with other dimensionality reduction techniques (Singular Value Decomposition, Karhunen-Loève transform) and present classification performance using Neural Networks, Decision Trees and Euclidean Distance measurements.

We will also discuss clustering methods (e.g., AdaBoost) that minimize clustering bias and variance when the number and characterization of region features are known. Additionally, we will discuss the relevance of random trees, multiscale analysis, wavelets and fractal methodologies to properly choosing the number and characterization of region features when these are not known.
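To make the feature selection idea in the abstract concrete, here is a minimal Python sketch. It is not the speakers' algorithm, whose details the abstract does not give. It assumes a Metropolis-style random search over 0/1 feature masks, a nearest-class-mean classifier, and a score equal to training accuracy minus a per-feature penalty. The names `weighted_euclidean`, `class_means`, `accuracy` and `mcmc_select`, the parameters `penalty` and `temperature`, and the toy data are all hypothetical. The mask doubles as the weight vector of the weighted Euclidean distance, so a weight of 0 discards a feature.

```python
import math
import random


def weighted_euclidean(x, y, w):
    """Return sqrt(sum_i w[i] * (x[i] - y[i])**2); w[i] = 0 ignores feature i."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def class_means(X, labels):
    """Per-class mean feature vector, keyed by label in first-seen order."""
    sums, counts = {}, {}
    for x, c in zip(X, labels):
        acc = sums.setdefault(c, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def accuracy(X, labels, mask):
    """Training accuracy of a nearest-class-mean classifier under the mask."""
    means = class_means(X, labels)
    hits = 0
    for x, c in zip(X, labels):
        pred = min(means, key=lambda k: weighted_euclidean(x, means[k], mask))
        hits += pred == c
    return hits / len(X)


def mcmc_select(X, labels, steps=500, penalty=0.05, temperature=0.05, seed=0):
    """Assumed Metropolis search: flip one feature per step, keep the best mask."""
    rng = random.Random(seed)
    k = len(X[0])

    def score(m):
        # Reward accuracy, charge a fixed (hypothetical) cost per kept feature.
        return accuracy(X, labels, m) - penalty * sum(m)

    mask = [1] * k                      # start with every feature selected
    cur = score(mask)
    best, best_score = mask[:], cur
    for _ in range(steps):
        prop = mask[:]
        j = rng.randrange(k)
        prop[j] = 1 - prop[j]           # propose toggling feature j
        new = score(prop)
        # Always accept improvements; accept worse masks with small probability.
        if new >= cur or rng.random() < math.exp((new - cur) / temperature):
            mask, cur = prop, new
            if cur > best_score:
                best, best_score = mask[:], cur
    return best


# Toy data (made up): feature 0 separates the classes, features 1 and 2 do not.
X = [[0.0, 0.3, 0.9], [0.2, 0.8, 0.1], [0.1, 0.5, 0.5],
     [10.0, 0.3, 0.9], [10.2, 0.8, 0.1], [10.1, 0.5, 0.5]]
labels = ["a", "a", "a", "b", "b", "b"]
print(mcmc_select(X, labels))
```

In this sketch the penalty term is what drives the reduction: a feature that does not change classification accuracy only costs score, so the search tends to drop it. Real-valued weights, a held-out validation set, and a proper posterior over feature subsets would all be natural refinements, but they go beyond what the abstract specifies.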


© Center for IST, Temple University 2002.