Chapter 5: One Factor Designs
As explained in Simple Linear Regression Analysis and Multiple Linear Regression Analysis, the analysis of observational studies involves the use of regression models, while the analysis of experimental studies involves the use of analysis of variance (ANOVA) models. For a comparison of the two models, see Fitting ANOVA Models. In single factor experiments, ANOVA models are used to compare the mean response values at different levels of the factor. Each level of the factor is investigated to see if the response is significantly different from the response at the other levels. The analysis of single factor experiments is often referred to as one-way ANOVA. To illustrate the use of ANOVA models in the analysis of experiments, consider a single factor experiment where the analyst wants to see if the surface finish of certain parts is affected by the speed of a lathe. Data is collected for three speeds (i.e., three treatments), and each treatment is replicated four times, so the experiment design is balanced. Surface finish values recorded using randomization are shown in the following table.
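The randomized run order mentioned above can be sketched in a few lines. The speed levels below are hypothetical placeholders (the text does not give the actual rpm values); the point is only that all 12 runs (3 treatments × 4 replicates) are executed in a random order:

```python
import random

# Sketch of generating a randomized run order for the balanced design
# described above: 3 lathe speeds (treatments), 4 replicates each.
speeds = [50, 60, 70]  # hypothetical speed levels, not from the text
runs = [(s, r) for s in speeds for r in range(1, 5)]  # 12 runs total

random.seed(0)    # fixed seed so the order is reproducible
random.shuffle(runs)  # run the experiment in this random order
print(len(runs))  # 12 (speed, replicate) pairs in random order
```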
The means model can also be written using [math]{{\mu }_{i}}=\mu +{{\tau }_{i}}\,\![/math], where [math]\mu \,\![/math] represents the overall mean and [math]{{\tau }_{i}}\,\![/math] represents the effect due to the [math]i\,\![/math]th treatment. This yields the effects model: [math]{{Y}_{ij}}=\mu +{{\tau }_{i}}+{{\epsilon }_{ij}}\,\![/math]
Fitting ANOVA Models

To fit ANOVA models and carry out hypothesis testing in single factor experiments, it is convenient to express the effects model in the form [math]y=X\beta +\epsilon \,\![/math] (the form used for multiple linear regression models in Multiple Linear Regression Analysis). This can be done as shown next. Using the effects model, the ANOVA model for the single factor experiment in the first table can be expressed as: [math]{{Y}_{ij}}=\mu +{{\tau }_{i}}+{{\epsilon }_{ij}}\,\![/math]
The coefficients of the treatment effects [math]{{\tau }_{1}}\,\![/math] and [math]{{\tau }_{2}}\,\![/math] can be expressed using two indicator variables, [math]{{x}_{1}}\,\![/math] and [math]{{x}_{2}}\,\![/math], as follows: [math]\begin{align} \text{Treatment Effect }{{\tau }_{1}}: & {{x}_{1}}=1,\text{ }{{x}_{2}}=0 \\ \text{Treatment Effect }{{\tau }_{2}}: & {{x}_{1}}=0,\text{ }{{x}_{2}}=1\text{ } \\ \text{Treatment Effect }{{\tau }_{3}}: & {{x}_{1}}=-1,\text{ }{{x}_{2}}=-1\text{ } \end{align}\,\![/math]
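The sum-to-zero (effects) coding above can be sketched as a small helper. The function name is our own for illustration; the coding rule is exactly the one shown: each of the first two levels gets its own indicator column, and the last level gets −1 in every column so the treatment effects sum to zero:

```python
# Sketch of the effects (sum-to-zero) coding shown above for a factor
# with three levels, represented by two indicator variables (x1, x2).
def effects_code(level, n_levels=3):
    """Return the indicator-variable row for a 1-based treatment level."""
    if level < n_levels:
        row = [0] * (n_levels - 1)
        row[level - 1] = 1
        return row
    return [-1] * (n_levels - 1)  # last level: -1 in every column

print(effects_code(1))  # [1, 0]
print(effects_code(2))  # [0, 1]
print(effects_code(3))  # [-1, -1]
```

Because the rows sum to zero column-wise, the implied constraint is [math]{{\tau }_{1}}+{{\tau }_{2}}+{{\tau }_{3}}=0\,\![/math].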
Treat Numerical Factors as Qualitative or Quantitative?

It can be seen from the equation given above that in an ANOVA model each factor is treated as a qualitative factor. In the present example the factor, lathe speed, is a quantitative factor with three levels, but the ANOVA model treats it as a qualitative factor with three levels. Therefore, two indicator variables, [math]{{x}_{1}}\,\![/math] and [math]{{x}_{2}}\,\![/math], are required to represent this factor. Note that in a regression model a variable can be treated as either a quantitative or a qualitative variable. In a regression model, the factor, lathe speed, would be used as a quantitative factor and represented with a single predictor variable. For example, if a first order model were to be fitted to the data in the first table, the regression model would take the form [math]{{Y}_{ij}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{i1}}+{{\epsilon }_{ij}}\,\![/math]. If a second order regression model were to be fitted, the regression model would be [math]{{Y}_{ij}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{i1}}+{{\beta }_{2}}x_{i1}^{2}+{{\epsilon }_{ij}}\,\![/math]. Notice that, unlike these regression models, the regression version of the ANOVA model does not make any assumption about the nature of the relationship between the response and the factor being investigated. The choice of treating a particular factor as a quantitative or qualitative variable depends on the objective of the experimenter. In the case of the data of the first table, the objective is to compare the levels of the factor to see if a change in level leads to a significant change in the response, not to make predictions of the response for a given level of the factor. Therefore, the factor is treated as a qualitative factor in this case.
If the objective of the experimenter were prediction or optimization, the experimenter would focus on aspects such as the nature of the relationship between the factor, lathe speed, and the response, surface finish. In that case the factor would be modeled as a quantitative factor so that accurate predictions could be made.

Expression of the ANOVA Model as y = Xβ + ε

The regression version of the ANOVA model can be expanded for the three treatments and four replicates of the data in the first table as follows: [math]\begin{align} {{Y}_{11}}= & 6=\mu +1\cdot {{\tau }_{1}}+0\cdot {{\tau }_{2}}+{{\epsilon }_{11}}\text{ Level 1, Replicate 1} \\ {{Y}_{21}}= & 13=\mu +0\cdot {{\tau }_{1}}+1\cdot {{\tau }_{2}}+{{\epsilon }_{21}}\text{ Level 2, Replicate 1} \\ {{Y}_{31}}= & 23=\mu -1\cdot {{\tau }_{1}}-1\cdot {{\tau }_{2}}+{{\epsilon }_{31}}\text{ Level 3, Replicate 1} \\ {{Y}_{12}}= & 13=\mu +1\cdot {{\tau }_{1}}+0\cdot {{\tau }_{2}}+{{\epsilon }_{12}}\text{ Level 1, Replicate 2} \\ {{Y}_{22}}= & 16=\mu +0\cdot {{\tau }_{1}}+1\cdot {{\tau }_{2}}+{{\epsilon }_{22}}\text{ Level 2, Replicate 2} \\ {{Y}_{32}}= & 20=\mu -1\cdot {{\tau }_{1}}-1\cdot {{\tau }_{2}}+{{\epsilon }_{32}}\text{ Level 3, Replicate 2} \\ & & ... \\ {{Y}_{34}}= & 18=\mu -1\cdot {{\tau }_{1}}-1\cdot {{\tau }_{2}}+{{\epsilon }_{34}}\text{ Level 3, Replicate 4} \end{align}\,\![/math]
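The expansion above can be assembled into the matrix form and fit by least squares. Since the text shows only some of the 12 response values, the vector below uses illustrative numbers (not the table from the text); the structure of [math]X\,\![/math], however, follows the effects coding exactly:

```python
import numpy as np

# Sketch of y = X*beta + epsilon for the effects-coded ANOVA model.
# Response values are illustrative, not the table from the text.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
codes = {0: [1, 0], 1: [0, 1], 2: [-1, -1]}  # sum-to-zero coding

rows, y = [], []
for i, g in enumerate(groups):
    for obs in g:
        rows.append([1] + codes[i])  # intercept column = overall mean mu
        y.append(obs)
X = np.array(rows, dtype=float)
y = np.array(y, dtype=float)

# Least-squares estimates beta_hat = [mu, tau1, tau2]
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # mu = grand mean; tau_i = group mean - grand mean
```

In a balanced design, the estimate of [math]\mu \,\![/math] is the grand mean and each [math]{{\hat{\tau }}_{i}}\,\![/math] is the deviation of the treatment average from it.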
Hypothesis Test in Single Factor Experiments

The hypothesis test in single factor experiments examines the ANOVA model to see if the response at any level of the investigated factor is significantly different from that at the other levels. If the response at all levels is not significantly different, then it can be concluded that the investigated factor does not affect the response. The test on the ANOVA model is carried out by checking to see if any of the treatment effects, [math]{{\tau }_{i}}\,\![/math], are non-zero. The test is similar to the test of significance of regression mentioned in Simple Linear Regression Analysis and Multiple Linear Regression Analysis in the context of regression models. The hypothesis statements for this test are: [math]\begin{align} & {{H}_{0}}: & {{\tau }_{1}}={{\tau }_{2}}=...={{\tau }_{{{n}_{a}}}}=0 \\ & {{H}_{1}}: & {{\tau }_{i}}\ne 0\text{ for at least one }i \end{align}\,\![/math]
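This hypothesis can be tested with a standard one-way ANOVA routine. The sketch below uses `scipy.stats.f_oneway` on illustrative data (not the table from the text) and rejects the null hypothesis when the p-value falls below the chosen significance level:

```python
from scipy import stats

# Test of H0: tau_1 = tau_2 = tau_3 = 0 via one-way ANOVA.
# The data values are illustrative, not the table from the text.
t1 = [6, 13, 7, 8]      # treatment 1 replicates
t2 = [13, 16, 12, 15]   # treatment 2 replicates
t3 = [23, 20, 19, 18]   # treatment 3 replicates

f0, p_value = stats.f_oneway(t1, t2, t3)
reject_h0 = p_value < 0.05  # at least one treatment effect is non-zero
print(f0, p_value, reject_h0)
```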
Calculation of the Statistic [math]{{F}_{0}}\,\![/math]

The sum of squares to obtain the statistic [math]{{F}_{0}}\,\![/math] can be calculated as explained in Multiple Linear Regression Analysis. Using the data in the first table, the model sum of squares, [math]S{{S}_{TR}}\,\![/math], can be calculated as: [math]\begin{align} S{{S}_{TR}}= & {{y}^{\prime }}[H-(\frac{1}{{{n}_{a}}\cdot m})J]y \\ = & {{\left[ \begin{matrix} 6 \\ 13 \\ . \\ . \\ 18 \\ \end{matrix} \right]}^{\prime }}\left[ \begin{matrix} 0.1667 & -0.0833 & . & . & -0.0833 \\ -0.0833 & 0.1667 & . & . & -0.0833 \\ . & . & . & . & . \\ . & . & . & . & . \\ -0.0833 & -0.0833 & . & . & 0.1667 \\ \end{matrix} \right]\left[ \begin{matrix} 6 \\ 13 \\ . \\ . \\ 18 \\ \end{matrix} \right] \\ = & 232.1667 \end{align}\,\![/math]
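The projection-matrix form [math]S{{S}_{TR}}={{y}^{\prime }}[H-(\tfrac{1}{{{n}_{a}}\cdot m})J]y\,\![/math] can be sketched directly. The response vector below is illustrative (not the table from the text), so the resulting value differs from the 232.1667 computed for the actual data; the matrix construction is the same:

```python
import numpy as np

# Sketch of SS_TR = y'[H - (1/(n_a*m))J]y with an effects-coded X.
# Response values are illustrative, not the table from the text.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
n_a, m = 3, 4
y = np.array([v for g in groups for v in g], dtype=float)

# Design matrix: intercept plus two sum-to-zero indicator columns
codes = [[1, 0], [0, 1], [-1, -1]]
X = np.array([[1] + codes[i] for i in range(n_a) for _ in range(m)],
             dtype=float)

H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat (projection) matrix
J = np.ones((n_a * m, n_a * m))       # matrix of ones

ss_tr = y @ (H - J / (n_a * m)) @ y   # between-treatment sum of squares
print(round(ss_tr, 4))
```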
Confidence Interval on the ith Treatment Mean

The response at each treatment of a single factor experiment can be assumed to be a normal population with a mean of [math]{{\mu }_{i}}\,\![/math] and variance of [math]{{\sigma }^{2}}\,\![/math] provided that the error terms can be assumed to be normally distributed. A point estimator of [math]{{\mu }_{i}}\,\![/math] is the average response at each treatment, [math]{{\bar{y}}_{i\cdot }}\,\![/math]. Since this is a sample average, the associated variance is [math]{{\sigma }^{2}}/{{m}_{i}}\,\![/math], where [math]{{m}_{i}}\,\![/math] is the number of replicates at the [math]i\,\![/math]th treatment. Therefore, the confidence interval on [math]{{\mu }_{i}}\,\![/math] is based on the [math]t\,\![/math] distribution. Recall from Statistical Background on DOE (inference on population mean when variance is unknown) that: [math]\begin{align} {{T}_{0}}= & \frac{{{{\bar{y}}}_{i\cdot }}-{{\mu }_{i}}}{\sqrt{{{{\hat{\sigma }}}^{2}}/{{m}_{i}}}} \\ = & \frac{{{{\bar{y}}}_{i\cdot }}-{{\mu }_{i}}}{\sqrt{M{{S}_{E}}/{{m}_{i}}}} \end{align}\,\![/math]
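A two-sided interval based on this statistic is [math]{{\bar{y}}_{i\cdot }}\pm {{t}_{\alpha /2}}\sqrt{M{{S}_{E}}/{{m}_{i}}}\,\![/math] with the error degrees of freedom. A sketch, using illustrative data rather than the table from the text:

```python
from scipy import stats

# Sketch of a 95% confidence interval on the i-th treatment mean,
# using MS_E from the full ANOVA. Data values are illustrative.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
n_a, m = 3, 4
dof_error = n_a * m - n_a  # error degrees of freedom

means = [sum(g) / m for g in groups]
ms_e = sum((y - means[i]) ** 2
           for i, g in enumerate(groups) for y in g) / dof_error

i = 0                                   # first treatment
t_crit = stats.t.ppf(0.975, dof_error)  # two-sided, alpha = 0.05
half_width = t_crit * (ms_e / m) ** 0.5
ci = (means[i] - half_width, means[i] + half_width)
print(ci)
```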
Confidence Interval on the Difference in Two Treatment Means

The confidence interval on the difference in two treatment means, [math]{{\mu }_{i}}-{{\mu }_{j}}\,\![/math], is used to compare two levels of the factor at a given significance. If the confidence interval does not include the value of zero, it is concluded that the two levels of the factor are significantly different. The point estimator of [math]{{\mu }_{i}}-{{\mu }_{j}}\,\![/math] is [math]{{\bar{y}}_{i\cdot }}-{{\bar{y}}_{j\cdot }}\,\![/math]. The variance for [math]{{\bar{y}}_{i\cdot }}-{{\bar{y}}_{j\cdot }}\,\![/math] is: [math]\begin{align} var({{{\bar{y}}}_{i\cdot }}-{{{\bar{y}}}_{j\cdot }})= & var({{{\bar{y}}}_{i\cdot }})+var({{{\bar{y}}}_{j\cdot }}) \\ = & {{\sigma }^{2}}/{{m}_{i}}+{{\sigma }^{2}}/{{m}_{j}} \end{align}\,\![/math]
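Because the variance of the difference is the sum of the two variances, the interval is [math]({{\bar{y}}_{i\cdot }}-{{\bar{y}}_{j\cdot }})\pm {{t}_{\alpha /2}}\sqrt{M{{S}_{E}}/{{m}_{i}}+M{{S}_{E}}/{{m}_{j}}}\,\![/math]. A sketch with illustrative data (not the table from the text):

```python
from scipy import stats

# Sketch of a 95% CI on mu_i - mu_j; if the interval excludes zero,
# the two factor levels differ significantly. Illustrative data only.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
n_a, m = 3, 4
dof_error = n_a * m - n_a

means = [sum(g) / m for g in groups]
ms_e = sum((y - means[i]) ** 2
           for i, g in enumerate(groups) for y in g) / dof_error

i, j = 0, 2                        # compare treatments 1 and 3
diff = means[i] - means[j]
se = (ms_e / m + ms_e / m) ** 0.5  # variances of the two means add
t_crit = stats.t.ppf(0.975, dof_error)
ci = (diff - t_crit * se, diff + t_crit * se)
significant = not (ci[0] <= 0 <= ci[1])
print(ci, significant)
```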
Residual Analysis

Plots of residuals, [math]{{e}_{ij}}\,\![/math], similar to the ones discussed in the previous chapters on regression, are used to ensure that the assumptions associated with the ANOVA model are not violated. The ANOVA model assumes that the random error terms, [math]{{\epsilon }_{ij}}\,\![/math], are normally and independently distributed with the same variance for each treatment. The normality assumption can be checked by obtaining a normal probability plot of the residuals.
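For a one-way ANOVA the residuals are simply the deviations of each observation from its treatment average, [math]{{e}_{ij}}={{y}_{ij}}-{{\bar{y}}_{i\cdot }}\,\![/math]. A sketch of computing them (with illustrative data, not the table from the text), ready to feed into a normal probability plot:

```python
# Sketch of computing residuals e_ij = y_ij - ybar_i for residual
# plots. Data values are illustrative, not the table from the text.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
means = [sum(g) / len(g) for g in groups]

residuals = [y - means[i] for i, g in enumerate(groups) for y in g]

# Residuals within each treatment sum to zero by construction; a
# normal probability plot of these values checks normality.
print(residuals)
```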
Box-Cox Method

Transformations on the response may be used when residual plots for an experiment show a pattern. This indicates that the equality of variance does not hold for the residuals of the given model. The Box-Cox method can be used to automatically identify a suitable power transformation for the data based on the following relationship: [math]{{Y}^{*}}={{Y}^{\lambda }}\,\![/math]
Example

To illustrate the Box-Cox method, consider the experiment given in the first table. Transformed response values for various values of [math]\lambda \,\![/math] can be calculated using the equation for [math]{Y}^{\lambda}\,\![/math] given in Box-Cox Method. Knowing the hat matrix, [math]H\,\![/math], [math]S{{S}_{E}}\,\![/math] values corresponding to each of these [math]\lambda \,\![/math] values can easily be obtained using [math]{{y}^{\lambda \prime }}[I-H]{{y}^{\lambda }}\,\![/math]. [math]S{{S}_{E}}\,\![/math] values calculated for [math]\lambda \,\![/math] values between [math]-5\,\![/math] and [math]5\,\![/math] for the given data are shown below: [math]\begin{matrix} \lambda & {} & S{{S}_{E}} & \ln S{{S}_{E}} \\ {} & {} & {} & {} \\ -5 & {} & 5947.8 & 8.6908 \\ -4 & {} & 1946.4 & 7.5737 \\ -3 & {} & 696.5 & 6.5461 \\ -2 & {} & 282.2 & 5.6425 \\ -1 & {} & 135.8 & 4.9114 \\ 0 & {} & 83.9 & 4.4299 \\ 1 & {} & 74.5 & 4.3108 \\ 2 & {} & 101.0 & 4.6154 \\ 3 & {} & 190.4 & 5.2491 \\ 4 & {} & 429.5 & 6.0627 \\ 5 & {} & 1057.6 & 6.9638 \\ \end{matrix}\,\![/math]
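The scan over [math]\lambda \,\![/math] can be sketched as follows. Note one assumption: to keep [math]S{{S}_{E}}\,\![/math] values comparable across [math]\lambda \,\![/math], this sketch uses the standard normalized form of the Box-Cox transform (dividing by a power of the geometric mean, with [math]\lambda =0\,\![/math] handled by the log), rather than the raw [math]{{Y}^{\lambda }}\,\![/math]; the data values are also illustrative, not the table from the text:

```python
import math

# Sketch of a Box-Cox scan: compute SS_E of the transformed response
# over a grid of lambda values and keep the lambda with the smallest
# SS_E. Normalized transform; illustrative data, not the text's table.
groups = [[6, 13, 7, 8], [13, 16, 12, 15], [23, 20, 19, 18]]
flat = [y for g in groups for y in g]
gm = math.exp(sum(math.log(y) for y in flat) / len(flat))  # geometric mean

def transform(y, lam):
    if lam == 0:
        return gm * math.log(y)            # limiting case at lambda = 0
    return (y ** lam - 1) / (lam * gm ** (lam - 1))

def ss_e(t_groups):
    means = [sum(g) / len(g) for g in t_groups]
    return sum((y - means[i]) ** 2
               for i, g in enumerate(t_groups) for y in g)

results = {lam: ss_e([[transform(y, lam) for y in g] for g in groups])
           for lam in range(-5, 6)}
best = min(results, key=results.get)
print(best)  # lambda with the smallest error sum of squares
```

At [math]\lambda =1\,\![/math] the normalized transform is just a shift, so [math]S{{S}_{E}}\,\![/math] equals that of the untransformed data, as in the table above.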