Box-Cox Transformation with SigmaXL

Box-Cox Transformation

Data transformations are usually applied so that the data more closely meet the assumptions of the statistical inference model being used, or to improve the interpretability or appearance of graphs.
A power transformation belongs to a class of transformation functions that raise the response to some power. For example, a square root transformation converts X to X^(1/2).
The Box-Cox transformation is a popular power transformation method developed by George E. P. Box and David Cox.
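To make the idea of a power transformation concrete, here is a minimal Python sketch of the standard one-parameter Box-Cox form: (y^λ − 1)/λ for λ ≠ 0, and ln(y) for λ = 0. The function name `box_cox` and the sample values are illustrative assumptions. SigmaXL's own implementation may use a differently scaled variant.

```python
import math

def box_cox(values, lam):
    """Return the one-parameter Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # Limiting case of (y**lam - 1) / lam as lam approaches 0
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

# Illustrative values only; these are not taken from "Sample Data.xlsx".
sample = [1.0, 4.0, 9.0]
print(box_cox(sample, 0.5))  # a shifted and scaled square root: 2 * (sqrt(y) - 1)
print(box_cox(sample, 0))    # natural logarithm
```

With λ = 0.5 the result is a linear rescaling of the square root, so it has the same effect on the shape of the distribution as the plain square root transformation.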

Use SigmaXL to Perform a Box-Cox Transformation

SigmaXL finds the Box-Cox transformation with the optimal λ, the value that minimizes the model SSE (sum of squared errors). The following example transforms a non-normally distributed response into approximately normal data using the Box-Cox method.
Data File: “Box-Cox” tab in “Sample Data.xlsx”

Step 1: Test the normality of the original data set.

Select the entire range of “Y” in column H
Click SigmaXL -> Graphical Tools -> Histograms & Descriptive Statistics
A new window named “Histograms & Descriptive” pops up and the selected range automatically appears in the box below “Please select your data”.
Click “Next >>”
A new window named “Histograms & Descriptive Statistics” pops up.
Select “Y” as “Numeric Data Variables (Y)”
Click “OK>>”
The analysis results are shown automatically in the new spreadsheet “Hist Descript(1)”

Normality Test:

H₀: The data are normally distributed.
H₁: The data are not normally distributed.

If the p-value is greater than the alpha level (0.05), we fail to reject the null hypothesis; otherwise, we reject it. In this example, the p-value is 0.029, which is less than the alpha level (0.05), so we reject the null hypothesis and conclude that the data are not normally distributed.
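The decision rule can be written as a small Python helper. This is only a sketch: the function name `normality_decision` is hypothetical, and the p-value is the one reported in this example.

```python
def normality_decision(p_value, alpha=0.05):
    """Fail to reject H0 (normality) when p-value > alpha; otherwise reject H0."""
    if p_value > alpha:
        return "fail to reject H0"
    return "reject H0"

# p-value reported for the original Y in this example
print(normality_decision(0.029))
```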

Step 2: Run the Box-Cox transformation.

Select the entire range of Y in column H
Click SigmaXL -> Process Capability -> Nonnormal -> Box-Cox Transformation
A new window named “Box-Cox Transformation” pops up and the selected range appears automatically in the box under “Please select your data”
Click “Next >>”
A new window also named “Box-Cox Transformation” pops up.
Select “Y” as “Numeric Data Variables (Y)”
Click “OK>>”
The analysis results are shown automatically in the new spreadsheet “Box-Cox (1)”

The software searches for the optimal value of lambda, the one that minimizes the SSE (sum of squared errors). In this case, the minimum occurs at a lambda of 0.12. The transformed Y can be saved in another column and is also listed in column G of the newly generated tab “Box-Cox (1)”.
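The search for lambda can be sketched as a simple grid search in Python. The source does not spell out how SigmaXL computes the SSE, so this sketch makes an assumption. It uses the geometric-mean-normalized Box-Cox transform, a common choice that makes the SSE comparable across lambda values, and takes the SSE as the sum of squared deviations of the transformed data from their mean. The function names, the search range of -5 to 5, and the step of 0.01 are all illustrative.

```python
import math

def normalized_box_cox(values, lam):
    """Box-Cox transform scaled by the geometric mean (assumed normalization)."""
    gm = math.exp(sum(math.log(v) for v in values) / len(values))
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in values]
    scale = lam * gm ** (lam - 1.0)
    return [(v ** lam - 1.0) / scale for v in values]

def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_lambda(values, low=-5.0, high=5.0, step=0.01):
    """Return the grid value of lambda with the smallest SSE."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    n_steps = int(round((high - low) / step))
    candidates = [round(low + i * step, 10) for i in range(n_steps + 1)]
    return min(candidates, key=lambda lam: sse(normalized_box_cox(values, lam)))

# Illustrative right-skewed values; these are not taken from "Sample Data.xlsx".
sample = [0.8, 1.1, 1.5, 2.3, 3.9, 7.4, 12.6]
print(best_lambda(sample))
```

A grid search is slow but easy to follow. A numerical optimizer would find the same minimum with fewer evaluations.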
Use the Anderson–Darling test to check the normality of the transformed data.

H₀: The data are normally distributed.
H₁: The data are not normally distributed.

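If the p-value is greater than the alpha level (0.05), we fail to reject the null hypothesis; otherwise, we reject it. In this example, the p-value is 0.327, which is greater than the alpha level (0.05), so we fail to reject the null hypothesis: there is no evidence that the transformed data depart from normality.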
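For readers who want to see roughly what an Anderson–Darling normality test computes, here is a minimal Python sketch using only the standard library. It estimates the mean and standard deviation from the data, applies the small-sample adjustment to A², and converts it to an approximate p-value with the piecewise formulas attributed to D'Agostino and Stephens. Whether SigmaXL uses exactly this approximation is an assumption, and the function name and test data are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(values):
    """Return (adjusted A-squared, approximate p-value) for a normality test."""
    x = sorted(values)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    # Clamp the CDF away from 0 and 1 so the logarithms stay finite
    cdf = [min(max(fitted.cdf(v), 1e-300), 1.0 - 1e-16) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_adj = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1.0 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1.0 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2_adj, min(p, 1.0)

# Illustrative data: a strongly right-skewed set and its logarithm
scores = [NormalDist().inv_cdf((i + 0.5) / 40) for i in range(40)]
skewed = [math.exp(1.5 * z) for z in scores]
print(anderson_darling_normal(skewed))
print(anderson_darling_normal([math.log(v) for v in skewed]))
```

The second call shows the same pattern as the worked example: data that fail the test on the original scale can pass after a suitable transformation.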
