Helping predictive analytics interpretation using regression trees and clustering perturbation
O. Parisot, Y. Didry, T. Tamisier, and B. Otjacques
Journal of Decision Systems, pp. 1-18, 2015
Regression trees are helpful tools for decision support and predictive analytics, due to their simple structure and the ease with which they can be obtained from data. Nonetheless, when applied to non-trivial datasets, they tend to grow according to the complexity of the data, becoming difficult to interpret. This difficulty can be overcome by clustering the dataset and representing the regression tree of each cluster independently. In order to help create predictive models that are more comprehensible, we propose in this work a clustering perturbation method to reduce the size of the regression tree obtained from each cluster. A prototype has been developed and tested on several regression datasets.
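The baseline idea described in the abstract, clustering the dataset and fitting one regression tree per cluster, can be illustrated with a minimal sketch. This is not the authors' clustering perturbation method or their prototype: the toy data, the simple k-means routine, the greedy variance-reduction tree, and every function name (`build_tree`, `count_nodes`, `kmeans`) are assumptions made purely for illustration.

```python
from statistics import mean

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def build_tree(X, y, min_leaf=2, min_gain=1e-9):
    """Greedy variance-reduction regression tree; each leaf predicts the mean."""
    base, best = sse(y), None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = base - sse([y[i] for i in left]) - sse([y[i] for i in right])
            if gain > min_gain and (best is None or gain > best[0]):
                best = (gain, j, t, left, right)
    if best is None:
        return {"value": mean(y)}  # no useful split: make a leaf
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], min_leaf, min_gain),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], min_leaf, min_gain),
    }

def count_nodes(tree):
    """Tree size, counting internal nodes and leaves."""
    if "value" in tree:
        return 1
    return 1 + count_nodes(tree["left"]) + count_nodes(tree["right"])

def kmeans(X, k, iters=10):
    """Plain k-means with a deterministic, evenly spaced initialisation."""
    centers = [X[i * len(X) // k] for i in range(k)]
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
            for row in X
        ]
        for c in range(k):
            members = [row for row, lab in zip(X, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels

# Toy data (invented): two well-separated groups along the first feature,
# each with its own step in the target along the second feature.
X = [[x1, x2] for x1 in (0.0, 1.0, 10.0, 11.0) for x2 in (0.0, 0.25, 0.75, 1.0)]
y = [(1.0 if x2 < 0.5 else 2.0) if x1 < 5 else (5.0 if x2 < 0.5 else 7.0) for x1, x2 in X]

# One tree on the whole dataset versus one tree per cluster.
global_size = count_nodes(build_tree(X, y))
labels = kmeans(X, k=2)
cluster_sizes = []
for c in range(2):
    idx = [i for i, lab in enumerate(labels) if lab == c]
    cluster_sizes.append(count_nodes(build_tree([X[i] for i in idx], [y[i] for i in idx])))

print("global tree size:", global_size)
print("per-cluster tree sizes:", cluster_sizes)
```

On this toy data the global tree has 7 nodes, while each per-cluster tree has 3. A perturbation step of the kind the abstract proposes would go further by adjusting the clustering itself to shrink the per-cluster trees; that step is not attempted here.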