Welcome to Forest-Guided Clustering’s documentation#

Forest-Guided Clustering (FGC) is an explainability method for Random Forest models. Standard explainability methods (e.g. feature importance) assume that model features are independent and are therefore poorly suited to data with correlated features. FGC does not make this assumption: it computes feature importance from subgroups of instances that follow similar decision rules within the Random Forest model. This makes the method well suited to cases with high correlation among model features.
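The idea of grouping instances that follow similar decision rules can be sketched with scikit-learn and SciPy alone. The snippet below is a minimal illustration, not the fgclustering API and not necessarily its exact algorithm. It assumes that "similar decision rules" can be approximated by how often two samples end up in the same leaf across the trees, and it uses average-linkage hierarchical clustering as a stand-in for the clustering step. The synthetic dataset, the number of clusters, and all variable names are made up for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with deliberately redundant (correlated) features.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, n_redundant=3, random_state=0
)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# leaves[i, t] is the index of the leaf that sample i reaches in tree t.
leaves = rf.apply(X)

# Proximity: fraction of trees in which two samples end in the same leaf.
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
distance = 1.0 - proximity

# Group samples on that distance (illustrative choice: average linkage, 3 clusters).
Z = linkage(squareform(distance, checks=False), method="average")
clusters = fcluster(Z, t=3, criterion="maxclust")

# Compare per-cluster feature means to see which features separate the groups.
for c in np.unique(clusters):
    print(c, X[clusters == c].mean(axis=0).round(2))
```

Features whose values differ clearly between the resulting groups are the ones the forest relies on to separate them, which is the intuition behind cluster-based feature importance.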


For a detailed comparison of FGC and Permutation Feature Importance, see the notebook “Introduction to FGC: Comparison of Forest-Guided Clustering and Feature Importance”.


Contributing#

Contributions are more than welcome! Everything from code to notebooks to examples and documentation is equally valuable, so please don’t feel you can’t contribute. To contribute, fork the project, make your changes, and submit a pull request. We will do our best to work through any issues with you and get your code merged into the main branch.

How to cite#

If Forest-Guided Clustering is useful for your research, consider citing the package:

@software{lisa_sousa_2022_6445529,
   author       = {Lisa Barros de Andrade e Sousa and
                     Helena Pelin and
                     Dominik Thalmeier and
                     Marie Piraud},
   title        = {{Forest-Guided Clustering - Explainability for Random Forest Models}},
   month        = apr,
   year         = 2022,
   publisher    = {Zenodo},
   version      = {v0.2.0},
   doi          = {10.5281/zenodo.7085465},
   url          = {https://doi.org/10.5281/zenodo.7085465}
}

License#

fgclustering is released under the MIT license. See the LICENSE file for details.