Lots of Data & Root Cause Investigation
This post describes how root cause analysis differs when the data are multi-dimensional categorical variables (chi-square tests, multinomial logistic regression, random forests/decision trees) versus multi-dimensional continuous variables (multiple linear regression, ANOVA, PCA, clustering). It also covers how to handle datasets that mix categorical and continuous variables (combination analysis, one-hot encoding, AUC-ROC, etc.) and provides Python snippets that readers can adapt to their own applications.
4/13/2024 · 1 min read


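As a starting point for the categorical case, here is a minimal sketch of ranking candidate factors by their Pearson chi-square statistic against a defect outcome. It uses only the standard library. The records, the column names (`machine`, `shift`, `defect`), and the helper names `chi_square_statistic` and `rank_factors` are all hypothetical, made up for illustration. Treat it as a template to adapt, not a finished analysis.

```python
from collections import Counter


def chi_square_statistic(factor, outcome):
    """Return (statistic, degrees_of_freedom) for two categorical columns."""
    n = len(factor)
    if n == 0 or n != len(outcome):
        raise ValueError("columns must be non-empty and the same length")

    observed = Counter(zip(factor, outcome))  # cell counts of the contingency table
    row_totals = Counter(factor)
    col_totals = Counter(outcome)

    stat = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            # Expected count if the factor and the outcome were independent
            expected = row_total * col_total / n
            stat += (observed[(row, col)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def rank_factors(records, factors, outcome_key):
    """Sort candidate factors by chi-square statistic, strongest association first."""
    outcome = [r[outcome_key] for r in records]
    results = []
    for name in factors:
        stat, dof = chi_square_statistic([r[name] for r in records], outcome)
        results.append((name, stat, dof))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical example data: defects track the machine, not the shift.
records = [
    {"machine": "A", "shift": "day", "defect": "yes"},
    {"machine": "A", "shift": "night", "defect": "yes"},
    {"machine": "A", "shift": "day", "defect": "yes"},
    {"machine": "A", "shift": "night", "defect": "yes"},
    {"machine": "B", "shift": "day", "defect": "no"},
    {"machine": "B", "shift": "night", "defect": "no"},
    {"machine": "B", "shift": "day", "defect": "no"},
    {"machine": "B", "shift": "night", "defect": "no"},
]

ranking = rank_factors(records, ["machine", "shift"], "defect")
for name, stat, dof in ranking:
    print(f"{name}: chi-square = {stat:.2f}, dof = {dof}")
```

A larger statistic (relative to its degrees of freedom) suggests a stronger association with the outcome, which makes that factor a better candidate for follow-up. With counts as small as the ones in this toy dataset, the chi-square approximation is unreliable, so real use needs more data and a proper p-value from a statistics library.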
Jared Vogler
industrial engineer
Location
Charleston, South Carolina