Data mining for features using scale-sensitive gated experts
Abstract
This article introduces a new tool for exploratory data analysis and data mining called Scale-Sensitive Gated Experts (SSGE), which can partition a complex nonlinear regression surface into a set of simpler surfaces (which we call features). The set of simpler surfaces has the property that each element of the set can be efficiently modeled by a single feedforward neural network. The degree to which the regression surface is partitioned is controlled by an external scale parameter. The SSGE consists of a nonlinear gating network and several competing nonlinear experts. Although the SSGE is similar to the mixture of experts model of Jacobs et al., the mixture of experts model gives only one partitioning of the input-output space, and thus a single set of features, whereas the SSGE gives the user the capability to discover families of features: one obtains a new member of the family of features for each setting of the scale parameter. In this paper, we derive the Scale-Sensitive Gated Experts model and demonstrate its performance on a time series segmentation problem. The main results are: 1) the scale parameter controls the granularity of the features of the regression surface, 2) similar features are modeled by the same expert and different kinds of features are modeled by different experts, and 3) for the time series problem, the SSGE finds different regimes of behavior, each with a specific and interesting interpretation.
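The abstract describes the architecture only at a high level. Purely as an illustration, the sketch below shows one way a gating network over competing experts can be made scale-sensitive; it is not the paper's formulation, and the choice of a softmax temperature as the scale parameter, the network sizes, and all names (TinyMLP, GatedExperts, scale) are assumptions made for this example.

```python
# Illustrative sketch only: a gated mixture of experts whose soft partition of
# input space is controlled by an external "scale" parameter acting as a
# softmax temperature on the gate (an assumed mechanism, not the paper's).
import numpy as np

rng = np.random.default_rng(0)


def softmax(z, scale):
    # Smaller scale -> sharper (finer-grained) partition; larger -> smoother.
    z = z / scale
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class TinyMLP:
    """One-hidden-layer network, used here for both the experts and the gate."""

    def __init__(self, n_in, n_hidden, n_out):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return h @ self.W2 + self.b2


class GatedExperts:
    def __init__(self, n_in, n_experts=3, n_hidden=8, scale=1.0):
        self.experts = [TinyMLP(n_in, n_hidden, 1) for _ in range(n_experts)]
        self.gate = TinyMLP(n_in, n_hidden, n_experts)
        self.scale = scale  # external scale parameter

    def forward(self, x):
        g = softmax(self.gate.forward(x), self.scale)               # (N, n_experts)
        y = np.concatenate([e.forward(x) for e in self.experts], 1)  # (N, n_experts)
        return (g * y).sum(axis=1, keepdims=True), g                 # blend, gate


# Usage: the same (untrained) model queried at two scales. A small scale makes
# the gate nearly hard-assign each input to one expert (fine-grained features);
# a large scale blends experts (coarse features), as seen in the gate entropy.
x = rng.normal(size=(5, 2))
model = GatedExperts(n_in=2, scale=0.1)
_, g_fine = model.forward(x)
model.scale = 5.0
_, g_coarse = model.forward(x)
print("gate entropy, fine  :", -(g_fine * np.log(g_fine + 1e-12)).sum(1).mean())
print("gate entropy, coarse:", -(g_coarse * np.log(g_coarse + 1e-12)).sum(1).mean())
```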