Lots of Data & Root Cause Investigation

This post describes how root cause analysis differs when you are given only multi-dimensional categorical variables (chi-square tests, multinomial logistic regression, random forests/decision trees) versus only multi-dimensional continuous variables (multiple linear regression, ANOVA, PCA, clustering). It also covers what to do when you are given both categorical and continuous variables (combination analysis, one-hot encoding, AUC-ROC, etc.), and provides snippets of Python code that you can take and adapt to your own application.

4/13/2024

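As a starting point for the categorical case, here is a minimal sketch of a chi-square test of independence, written with only the Python standard library so the arithmetic is visible. The data, the `machine_A`/`machine_B` labels, and the helper name `chi_square_independence` are all hypothetical and exist only for illustration; swap in your own factor and outcome columns.

```python
from collections import Counter


def chi_square_independence(pairs):
    """Return (statistic, dof) for a chi-square test of independence.

    `pairs` is an iterable of (factor_level, outcome) tuples, one per observation.
    """
    pairs = list(pairs)
    n = len(pairs)

    # Observed counts for each (level, outcome) cell plus the marginal totals.
    observed = Counter(pairs)
    row_totals = Counter(level for level, _ in pairs)
    col_totals = Counter(outcome for _, outcome in pairs)

    statistic = 0.0
    for level, row_total in row_totals.items():
        for outcome, col_total in col_totals.items():
            # Count expected in this cell if factor and outcome were independent.
            expected = row_total * col_total / n
            statistic += (observed[(level, outcome)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical data: which machine produced each unit, and whether it failed.
records = (
    [("machine_A", "fail")] * 30 + [("machine_A", "pass")] * 70
    + [("machine_B", "fail")] * 10 + [("machine_B", "pass")] * 90
)

statistic, dof = chi_square_independence(records)
print(f"chi-square = {statistic:.2f}, dof = {dof}")  # chi-square = 12.50, dof = 1
```

With one degree of freedom, the 5% critical value is about 3.84, so a statistic of 12.5 suggests that failure rate is not independent of machine in this made-up data set. In practice you would likely reach for `scipy.stats.chi2_contingency`, which also returns the p-value and the expected counts.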