Application of Minimum Bayes Factor to A Balanced Two-Way ANOVA With Random Effects

It is common practice in statistical analysis to draw conclusions based on significance. P-values often reflect the probability of incorrectly concluding that a null hypothesized model is true; they do not provide information about other types of error that are also important for interpreting statistical results. Standard model selection criteria and test procedures are often inappropriate for comparing models with different numbers of random effects, due to constraints on the parameter space of the variance components. In this paper, we focus on the minimum Bayes factor proposed by Held and Ott (2018) and apply it to a balanced two-way analysis of variance (ANOVA) with random effects under three cases: Case 1, both factors are fixed; Case 2, both factors are random; Case 3, factor A is fixed and factor B is random. In all three cases considered, the Bayes factor indicates weak evidence against the null hypothesis of zero variability in the effects of the levels of the factors as well as in the interactions. This result is due to the conservative nature of the minimum Bayes factor.

INTRODUCTION
A typical fixed-effect analysis of an experiment involves comparing treatment means to one another in an attempt to detect differences. In a random-effects model, the treatment levels themselves are a random sample drawn from a larger population of treatment levels. In the latter, the objective of the researcher is to extend the conclusions based on a sample of treatment levels to all treatment levels in the population. In fact, the null hypothesis of random-effects ANOVA is quite different from its fixed-effects counterpart.
Statistical analysis is often used to reason about scientific questions based on a data sample, with the goal of determining "which parameter values are supported by the data and which are not" (Hoenig and Heisey, 2001). P-values, being the most commonly used tool to measure evidence against a hypothesis or hypothesized model, are often incorrectly viewed as an error probability for rejection of the hypothesis or, even worse, as the posterior probability that the hypothesis is true. This difficulty in the interpretation of p-values has been highlighted in many articles (see Sellke, Bayarri and Berger, 2001). The p-value quantifies the discrepancy between the data and a null hypothesis of interest, usually the assumption of no difference or no effect. A Bayesian approach allows the calibration of p-values by transforming them to direct measures of the evidence against the null hypothesis; it also gives insight into how the data support the alternative hypothesis.

LITERATURE REVIEW
A p-value is computed under the assumption that the null hypothesis is true, so it is conditional on the null hypothesis. It does not allow for conclusions about the probability of the null hypothesis given the data, which is usually of primary interest. The Bayes factor directly quantifies whether the data have increased or decreased the odds of the null hypothesis. A better approach than categorizing a p-value is thus to transform a p-value to a Bayes factor or a lower bound on a Bayes factor, a so-called minimum Bayes factor (Goodman, 1999). However, many ways of calibrating p-values have been proposed, and there is currently no consensus on how p-values should be transformed to Bayes factors.
The Bayes factor (or its logarithm) is therefore often referred to as the strength of evidence or weight of evidence (Bernardo and Smith, 2000). Several Bayes factors have been proposed over time for the Student t test and Analysis of Variance (ANOVA), among others. Some of the proposed Bayes factors are based on the Bayesian Information Criterion (BIC) (Wagenmakers, 2007; Masson, 2011; Faulkenberry, 2018), while others are p-value based (Sellke, Bayarri and Berger, 2001; Held and Ott, 2018).
In this article, we focus on the Bayes factor proposed by Held and Ott (2018) and apply it to a balanced two-way ANOVA with random effects. Evidence against a point null hypothesis is provided by small Bayes factors, so that Bayes factors lie in the same range as p-values, which facilitates comparisons. To categorize such Bayes factors, Held and Ott (2016) provided a six-grade scale (see Table 2.1), which was proposed as a compromise between the grades proposed by Jeffreys (1961) and Goodman (1999). The Held and Ott (2018) minimum Bayes factor is smaller than all the other minimum Bayes factors, even smaller than the Goodman bound.
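The grading of Bayes factors can be sketched in a few lines of Python. Table 2.1 is not reproduced in this text, so the cut-points below (1/3, 1/10, 1/30, 1/100, 1/300) and the grade labels are an assumption based on the scale as it is commonly reported for Held and Ott; the function name `evidence_grade` and the handling of values that fall exactly on a cut-point are illustrative choices only.

```python
def evidence_grade(bf):
    """Map a Bayes factor in (0, 1] to a grade of evidence against H0.

    The cut-points are assumed (see the text above), not taken from Table 2.1.
    """
    if not 0.0 < bf <= 1.0:
        raise ValueError("expected a Bayes factor in (0, 1]")
    # (lower cut-point, label) pairs, ordered from weakest to strongest evidence
    scale = [
        (1 / 3, "weak"),
        (1 / 10, "moderate"),
        (1 / 30, "substantial"),
        (1 / 100, "strong"),
        (1 / 300, "very strong"),
    ]
    for lower, label in scale:
        if bf > lower:
            return label
    return "decisive"


if __name__ == "__main__":
    # Bayes factors reported later in this paper
    for bf in (0.5777, 0.3480, 1.0):
        print(bf, evidence_grade(bf))
```

Under these assumed cut-points, all Bayes factors reported in this paper (0.5777, 0.3480 and 1) fall in the weakest grade, which agrees with the wording used in the results.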

METHODOLOGY
Two Way ANOVA With Random Effects
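The balanced ANOVA model with two random effects is given by:

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad (2.1)$$

where $\mu$ is the overall mean, $\alpha_i$ is the effect of the $i$-th level of factor A, $\beta_j$ is the effect of the $j$-th level of factor B, $(\alpha\beta)_{ij}$ is the interaction effect, and $\varepsilon_{ijk}$ is the error term associated with the (residual) effects.
The following assumptions are made: the random effects and the error terms are mutually independent and normally distributed with zero means and variances $\sigma^2_\alpha$, $\sigma^2_\beta$, $\sigma^2_{\alpha\beta}$ and $\sigma^2$, respectively. The covariance structure of the response is determined by these variance components. In such a balanced variance components model (2.1), Bayesians are often interested in evaluating whether the random effects should be included; this is equivalent to testing the following null hypotheses: $H_0: \sigma^2_\alpha = 0$, $H_0: \sigma^2_\beta = 0$ and $H_0: \sigma^2_{\alpha\beta} = 0$.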

Two Way ANOVA Model With Two Random Effects
When models include random effects, the expected mean squares often differ from those of the same model with fixed effects. In most cases, this affects how the F-tests are performed, as well as the distribution of the F-statistic and its degrees of freedom (James, 2013).
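To illustrate how the expected mean squares change the F-tests, the sketch below computes F-ratios from a set of mean squares under the textbook rules for a balanced two-way design: with both factors fixed, every term is tested against the error mean square, whereas with both factors random, the main effects are tested against the interaction mean square. The function name `f_ratios`, the dictionary keys and the example mean squares are hypothetical; the mixed case is left out because its denominators depend on the convention adopted for the model.

```python
def f_ratios(ms, case):
    """Return F-ratios for factor A, factor B and the A x B interaction.

    ms   : dict of mean squares with keys "A", "B", "AB" and "E" (error)
    case : "fixed"  -> all terms are tested against the error mean square
           "random" -> main effects are tested against the interaction mean square
    """
    if case == "fixed":
        denominator = {"A": "E", "B": "E", "AB": "E"}
    elif case == "random":
        denominator = {"A": "AB", "B": "AB", "AB": "E"}
    else:
        raise ValueError("case must be 'fixed' or 'random'")
    return {term: ms[term] / ms[den] for term, den in denominator.items()}


if __name__ == "__main__":
    # Hypothetical mean squares, used only to show the effect of the denominators
    example = {"A": 4.0, "B": 2.0, "AB": 2.0, "E": 1.0}
    print(f_ratios(example, "fixed"))
    print(f_ratios(example, "random"))
```

The same mean squares give different F-ratios, and therefore different p-values, for the main effects in the two cases, while the interaction test is unchanged.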

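Two Way ANOVA Model With Fixed and Random Effects
The balanced two-way ANOVA model with fixed and random effects has the same form as model (2.1), with factor A fixed and factor B random. The covariance structure of the response then depends only on the variance components of factor B, the interaction and the error term.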

P-Value-Based Bayes Factor Proposed by Held and Ott (2018)
The commonly used $-e\,p\log p$ calibration, proposed by Sellke et al. (2001), is

$$\min BF(p) = \begin{cases} -e\,p\log p & \text{for } p < 1/e, \\ 1 & \text{otherwise.} \end{cases} \qquad (2.18)$$

Sellke et al. (2001) also presented an alternative derivation of Equation (2.18), in which one does not have to assume the beta class for the p-value under the alternative hypothesis ($H_1$). Held (2016) noted that Equation (2.18) can also be derived as a test-based Bayes factor under the g-prior if the parameter of interest has dimension 2 and the sample size is large. The calibration given by Equation (2.18) is always smaller than the local normal alternatives bound (see Held and Ott, 2018) and approximately equal to the lower bound in the more general class of all local alternatives (Sellke et al., 2001). The beta distribution $Be(\xi, 1)$ with $\xi \le 1$ has prior sample size $\xi + 1 \le 2$, so it is always quite uninformative. Therefore, the density $f(p \mid H_1, \xi)$ will be relatively flat, and the minimum Bayes factor will be relatively large. However, this is not the only class of beta priors with monotonically decreasing density functions. An alternative, which to our knowledge has not yet been discussed in the literature, is the class of beta distributions $Be(1, \kappa)$ with $\kappa \ge 1$. A beta distribution from this class has prior sample size $1 + \kappa \ge 2$, so the likelihood under the alternative can take larger values than for the above $Be(\xi, 1)$ prior. Calculus shows that in this setting, with $q = 1 - p$,

$$\min BF(p) = \begin{cases} -e\,q\log q & \text{for } p < 1 - 1/e, \\ 1 & \text{otherwise,} \end{cases} \qquad (2.22)$$

which for small p-values is smaller than the $-e\,p\log p$ calibration of Sellke et al. (2001). For p-values less than 0.1, it is even smaller than the Goodman approach (Goodman, 2016). This is due to a large (and unbounded) prior sample size for small $p$, in contrast to the prior sample size of the $-e\,p\log p$ calibration, which cannot be larger than 2. However, Equation (2.22) provides a sharp lower bound on Bayes factors based on g-priors of any dimension d, even if the sample size is very small. For reasonably large sample sizes, however, the $-e\,q\log q$ calibration will be too conservative (Held and Ott, 2018).
The study was carried out in the following steps:
1. For each case (1, 2 and 3), data were simulated for the chosen combination of m, n and p.
2. The frequentist two-way ANOVA table was computed from the simulated data for each case, and the corresponding p-values were obtained.
3. The minimum Bayes factor proposed by Held and Ott (2018) was computed from the p-values obtained in step 2.
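The final step of the procedure can be sketched in Python as follows. The two functions implement the $-e\,p\log p$ calibration of Sellke et al. (2001) and the $-e\,q\log q$ calibration (with $q = 1 - p$) described above; the function names are illustrative and are not taken from any published software.

```python
import math


def min_bf_eplogp(p):
    """-e p log p calibration: -e*p*log(p) for p < 1/e, and 1 otherwise."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p < 1.0 / math.e:
        return -math.e * p * math.log(p)
    return 1.0


def min_bf_eqlogq(p):
    """-e q log q calibration with q = 1 - p: used for p < 1 - 1/e, and 1 otherwise."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p < 1.0 - 1.0 / math.e:
        # log1p(-p) equals log(1 - p) and stays accurate for very small p
        return -math.e * (1.0 - p) * math.log1p(-p)
    return 1.0


if __name__ == "__main__":
    # 0.05 is a conventional threshold; 0.8170 and 0.8074 are p-values reported in this paper
    for p in (0.05, 0.8170, 0.8074):
        print(p, round(min_bf_eplogp(p), 4), round(min_bf_eqlogq(p), 4))
```

Any p-value above $1 - e^{-1} \approx 0.6321$ is mapped to a minimum Bayes factor of 1 by the second function, which is why large p-values yield a Bayes factor of exactly 1.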

Data Presentation and Analysis
Data Presentation
Data sets were simulated using the native functions implemented in the R software for statistical computing (version 3.4.0 for Windows, R Core Team, 2017) from a standard normal population ($\mu = 0$, $\sigma = 1$). Fixed random seeds were used to simplify replication. The random sample generated contains 125 random numbers arranged in 25 cells (5 rows and 5 columns), with 5 random numbers in each cell. This design corresponds to a regular two-way ANOVA with five replicates per cell. Table 3.1 below was generated from the process above.
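A minimal sketch of this design is given below. It is written in Python with the standard library rather than R, so the numbers it produces will not match Table 3.1; the seed value and the function names are hypothetical. It draws 125 standard normal values into a 5 x 5 layout with 5 replicates per cell and computes the sums of squares, degrees of freedom and mean squares of the balanced two-way ANOVA table.

```python
import random


def simulate_cells(a=5, b=5, n=5, seed=2018):
    """Draw n standard normal replicates for each of the a x b cells."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return [[[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(b)]
            for _ in range(a)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def anova_table(y):
    """Sums of squares, degrees of freedom and mean squares for y[i][j][k]."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = mean(v for row in y for cell in row for v in cell)
    a_means = [mean(v for cell in row for v in cell) for row in y]
    b_means = [mean(v for i in range(a) for v in y[i][j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in y]
    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in a_means),
        "B": a * n * sum((m - grand) ** 2 for m in b_means),
        "AB": n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                      for i in range(a) for j in range(b)),
        "E": sum((v - cell_means[i][j]) ** 2
                 for i in range(a) for j in range(b) for v in y[i][j]),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
    return {t: {"ss": ss[t], "df": df[t], "ms": ss[t] / df[t]} for t in ss}


if __name__ == "__main__":
    for term, row in anova_table(simulate_cells()).items():
        print(term, row)
```

The mean squares returned here are the inputs to the F-tests; which mean square serves as the denominator for each term depends on whether the factors are treated as fixed or random.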

Hypothesis Testing for Case 1: Both factors (i.e. Factors A and B) are Fixed
To test the hypotheses of interest, we obtained the minimum Bayes factor for each hypothesis using the p-values already computed in the table above. The Bayes factor $BF_{10} = 0.5777$ indicates that the data provide weak evidence against the null hypothesis of no variability in the five levels of factor A stated in equation (3.1). This can be seen in Table 7. The Bayes factor $BF_{10} = 1$ indicates that the data provide weak evidence against the null hypothesis of no variability in the five levels of factor B stated in equation (3.2). This can be seen in Table 3.3.
The Bayes factor $BF_{10} = 1$ indicates that the data provide weak evidence against the null hypothesis of no variability in the interaction effects of factors A and B stated in equation (3.3). This can be seen in Table 7.

RESULTS AND DISCUSSION
Discussion For Case 1 (Both Factors Fixed)
For all three hypotheses examined above, the data indicated weak evidence against the null hypotheses stated in equations (3.1), (3.2) and (3.3). The data are therefore consistent with the null hypothesis of no variability in the effects of the levels of the factors and in the interaction effects. This result corresponds to the frequentist conclusion based on the p-values, in that all the p-values are greater than the 0.05 significance level. This shows that, for a two-way model with fixed effects, the Bayesian and the frequentist conclusions do not differ.

Case 2: Both Factors Are Random (i.e. two random effects)
The Bayes factor $BF_{10} = 0.3480$ indicates that the data provide weak evidence against the null hypothesis of no variability in the five levels of factor A stated in equation (3.4). This can be seen in Table 9. For the two (2) remaining hypotheses, the Bayes factor $BF_{10} = 1$ indicates that the data provide weak evidence against the null hypothesis of no variability in the levels of factor B (stated in equation (3.5)) and in the interaction effects of factors A and B (stated in equation (3.6)). This can be seen in Table 9 above.

Discussion For Case 2 (Two Random Effects)
In testing the three (3) hypotheses stated in equations (3.4), (3.5) and (3.6), we found that the data show weak evidence against the null hypothesis of no variability in the effects of the levels of factors A and B as well as in the effects of their interactions. A side-by-side examination of the frequentist conclusion from the p-values at the 5% significance level indicates that the null hypotheses of zero treatment effect stated in equations (3.4), (3.5) and (3.6) should not be rejected. Hence, the Bayesian and the frequentist conclusions do not differ.

Case 3: Factor A Is Fixed and Factor B Is Random
The Bayes factor $BF_{10} = 0.3480$ indicates that the data provide weak evidence against the null hypothesis of no variability in the five levels of factor A stated in equation (3.7). This can be seen in Table 11. For the two (2) remaining hypotheses, we can see from Table 3.4 that the respective p-values (0.8170 and 0.8074) are both greater than $1 - e^{-1} = 1 - (2.718)^{-1} = 0.6321$. Hence, $\min BF(p) = 1$ for both hypotheses. The Bayes factor $BF_{10} = 1$ indicates that the data provide weak evidence against the null hypothesis of no variability in the levels of factor B (stated in equation (3.8)) and in the interaction effects of factors A and B (stated in equation (3.9)). This can be seen in Table 11 above.

Discussion For Case 3 (Factor A Is Fixed and Factor B is Random)
In testing the three (3) hypotheses stated in equations (3.7), (3.8) and (3.9), we found that the data show weak evidence against the null hypothesis of no variability in the effects of the levels of factors A and B as well as in the effects of their interactions. A side-by-side examination of the frequentist conclusion from the p-values at the 5% significance level indicates that the null hypotheses of zero treatment effect stated in equations (3.7), (3.8) and (3.9) should not be rejected. Hence, the Bayesian and the frequentist conclusions do not differ.

CONCLUSION
In all three cases studied in this paper, it is evident that the Held and Ott (2018) minimum Bayes factor is conservative. In fact, at no point did the data exhibit strong evidence against the null hypothesized model of zero variability. We found that the calibrated p-value based minimum Bayes factors do not differ in their conclusions from those of the frequentist analysis. When the same data were subjected to the Faulkenberry (2018) Bayes factor, the conclusions for the fixed effects differed significantly from those for the random effects (see Egburonu and Abidoye, 2018); the conservative nature of the minimum Bayes factor overcame this difference. The minimum Bayes factor yields smaller values than the prior-sensitive and BIC-based Bayes factors proposed by Wang and Sun (2011) and Faulkenberry (2018), respectively.