ImaGEO implements two of the main approaches to gene expression meta-analysis: meta-analysis based on effect size combination and meta-analysis based on p-value combination.
Effect size is a quantitative measure of the strength of a phenomenon that can be compared across studies. In our case, we calculate the standardized mean difference between two groups (e.g. case and control, or case1 and case2), using the approximation of Hedges’ \(g\) as its estimator:
\[g = J \times d = J \times \frac{\overline{x}_{1} - \overline{x}_{2}}{S}\]
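where \(d\) is Cohen’s \(d\), \(\overline{x}_{1}\) and \(\overline{x}_{2}\) are the means of the two groups, \(S\) is the pooled standard deviation of the two groups, and \(J\) is the small-sample correction factor, commonly approximated as \(J = 1 - \frac{3}{4(n_{1} + n_{2} - 2) - 1}\), with \(n_{1}\) and \(n_{2}\) the group sizes.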
The variance of this estimator is:
\[V_{g} = J^{2} \times V_{d} = J^{2} \times \left( \frac{n_{1} + n_{2}}{n_{1}n_{2}} + \frac{d^{2}}{2(n_{1} + n_{2})} \right)\]
where \(V_{d}\) is the variance of the Cohen’s \(d\) estimator.
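The calculation above can be sketched for a single gene as follows. This is a minimal illustration, not ImaGEO's own code: the function name `hedges_g` is hypothetical, and it assumes that \(S\) is the pooled standard deviation and that \(J\) takes the common approximation \(1 - 3/(4\,df - 1)\) with \(df = n_{1} + n_{2} - 2\).

```python
import math
from statistics import mean, variance


def hedges_g(group1, group2):
    """Return (g, V_g) for one gene measured in two groups of samples.

    Each group needs at least two values so that a sample variance exists.
    """
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2

    # Pooled standard deviation S (assumed definition of S in the formula).
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    s = math.sqrt(pooled_var)

    # Cohen's d: standardized mean difference.
    d = (mean(group1) - mean(group2)) / s

    # Small-sample correction factor J (assumed approximation).
    j = 1 - 3 / (4 * df - 1)

    # Variance of d, then of g = J * d.
    v_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * v_d


g, v_g = hedges_g([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(f"g = {g:.4f}, V_g = {v_g:.4f}")
```

Note that the whole of \(V_{d}\), not only its first term, is multiplied by \(J^{2}\).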
To combine the effect sizes from different studies, we consider two methods: the fixed-effects model (FEM) and the random-effects model (REM).
The FEM is a linear model that assumes the different studies share a common true effect size. The combined effect size is calculated as a weighted mean of the individual effect sizes:
\[\overline{M} = \frac{\sum_{i=1}^{k} \omega_{i} Y_{i}}{\sum_{i=1}^{k} \omega_{i}}\]
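where \(Y_{i}\) is the effect size of study \(i\), \(k\) is the number of studies, and \(\omega_{i} = \frac{1}{V(Y_{i})}\) is the weight of study \(i\), the inverse of its within-study variance.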
The variance of this combined effect is calculated as:
\[V(\overline{M}) = \frac{1}{\sum_{i=1}^{k} \omega_{i}}\]
Under the null hypothesis of no effect, the standardized combined effect follows a standard normal distribution, \(N(0,1)\):
\[Z = \frac{\overline{M}}{\sqrt{V(\overline{M})}}\]
Therefore, we obtain a two-tailed p-value:
\[P\text{-value} = 2\left[1 - \Phi(|Z|)\right]\]
where \(\Phi\) is the standard normal cumulative distribution function.
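As an illustration, the fixed-effects combination can be sketched in a few lines. This is a hedged example rather than the ImaGEO implementation: the function name `fixed_effects` is hypothetical, and the weights are assumed to be the inverse within-study variances, \(\omega_{i} = 1/V(Y_{i})\).

```python
import math
from statistics import NormalDist


def fixed_effects(effects, variances):
    """Combine per-study effect sizes Y_i under a fixed-effects model.

    Returns (combined effect, its variance, Z, two-tailed p-value).
    """
    # Inverse-variance weights (assumed form of omega_i).
    weights = [1 / v for v in variances]
    total_weight = sum(weights)

    combined = sum(w * y for w, y in zip(weights, effects)) / total_weight
    combined_var = 1 / total_weight

    z = combined / math.sqrt(combined_var)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return combined, combined_var, z, p_value


m, v_m, z, p = fixed_effects([0.5, 0.3], [0.1, 0.1])
print(f"M = {m:.3f}, V(M) = {v_m:.3f}, Z = {z:.3f}, p = {p:.4f}")
```

With equal variances the weighted mean reduces to the plain average of the effect sizes; studies with smaller variance pull the combined effect towards their own estimate.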
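Unlike the FEM, the random-effects model (REM) assumes that the true effect can vary from one study to another. In this case, the combined effect size represents the average of the true effects. In practice, the weights of the weighted mean account for two sources of error: the within-study variance (as in the FEM) and the between-study variance (\(\tau^{2}\)). To calculate \(\tau^{2}\), we use the method of moments (DerSimonian and Laird):
\[\tau^{2} = \max\left(0, \frac{Q - df}{C}\right)\]
where \(Q = \sum_{i=1}^{k} \omega_{i}(Y_{i} - \overline{M})^{2}\) is the heterogeneity statistic, computed with the fixed-effects weights \(\omega_{i} = \frac{1}{V(Y_{i})}\) and the fixed-effects combined effect \(\overline{M}\), \(df = k - 1\) is its degrees of freedom, and \(C = \sum_{i=1}^{k} \omega_{i} - \frac{\sum_{i=1}^{k} \omega_{i}^{2}}{\sum_{i=1}^{k} \omega_{i}}\).
In this way, the weight of study \(i\) is: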
\[\omega_{i}^{*} = \frac{1}{V(Y_{i}) + \tau^{2}}\]
Therefore, similarly to the FEM, the combined effect size for the REM is calculated as:
\[\overline{M^{*}} = \frac{\sum_{i=1}^{k} \omega_{i}^{*} Y_{i}}{\sum_{i=1}^{k} \omega_{i}^{*}}\]
And similarly:
\[V(\overline{M^{*}}) = \frac{1}{\sum_{i=1}^{k} \omega_{i}^{*}}\]
\[Z^{*} = \frac{\overline{M^{*}}}{\sqrt{V(\overline{M^{*}})}}\]
\[P\text{-value} = 2\left[1 - \Phi(|Z^{*}|)\right]\]
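The random-effects steps can be sketched as below. This is an illustrative example, not ImaGEO's code: the function name `random_effects` is hypothetical, and \(Q\), \(df\) and \(C\) are assumed to follow the standard DerSimonian and Laird definitions (\(Q\) is the weighted sum of squared deviations from the fixed-effects mean, \(df = k - 1\), and \(C = \sum \omega_{i} - \sum \omega_{i}^{2} / \sum \omega_{i}\)).

```python
import math
from statistics import NormalDist


def random_effects(effects, variances):
    """Combine effect sizes under a random-effects model (DerSimonian-Laird).

    Returns (tau2, combined effect, its variance, Z, two-tailed p-value).
    """
    k = len(effects)

    # Fixed-effects quantities needed for the heterogeneity statistic Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    m_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - m_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Method-of-moments estimate of the between-study variance.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    combined = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    combined_var = 1 / sum_w_star

    z = combined / math.sqrt(combined_var)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return tau2, combined, combined_var, z, p_value


tau2, m, v_m, z, p = random_effects([0.0, 1.0], [0.1, 0.1])
print(f"tau2 = {tau2:.3f}, M* = {m:.3f}, V(M*) = {v_m:.3f}, p = {p:.4f}")
```

When the studies agree closely, \(Q\) falls below \(df\), \(\tau^{2}\) is truncated to zero, and the result coincides with the fixed-effects estimate.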
FEM should only be used when the studies included in the analysis are functionally identical (not independently conducted) and the results do not need to be generalized to other studies. In the case of meta-analysis of differential expression, it’s challenging to fulfill these conditions. Therefore, we recommend always applying a random-effects model unless the researcher is entirely certain that the conditions for applying a fixed-effects model are met.
P-value combination techniques aim to integrate the P-values of the individual analyses into a single combined P-value.
Fisher’s method uses as its statistic the sum of the logarithms of the P-values:
\[- 2 \times \sum_{i=1}^{k} \ln(p_{i}) \sim \chi^{2}_{2 \times k} \; under \; H_{0}\]
where \(k\) is the number of studies and \(p_{i}\) is the P-value of study \(i\).
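A minimal sketch of this log-sum statistic is shown below. The function name `fisher_combine` is hypothetical. To stay within the standard library, the upper tail of the chi-square distribution is computed with the closed form that holds for an even number of degrees of freedom, \(P(\chi^{2}_{2k} > x) = e^{-x/2} \sum_{j=0}^{k-1} \frac{(x/2)^{j}}{j!}\).

```python
import math


def fisher_combine(p_values):
    """Combine P-values with the -2 * sum(ln p) statistic.

    Returns (statistic, combined P-value). All P-values must be > 0.
    """
    k = len(p_values)
    statistic = -2 * sum(math.log(p) for p in p_values)

    # Upper tail of a chi-square with 2k degrees of freedom (even df closed form).
    half = statistic / 2
    term = 1.0
    series = 1.0
    for j in range(1, k):
        term *= half / j
        series += term
    return statistic, math.exp(-half) * series


stat, p_comb = fisher_combine([0.05, 0.05])
print(f"statistic = {stat:.3f}, combined p = {p_comb:.4f}")
```

With a single study the combined P-value equals the input P-value, which is a quick sanity check on the tail computation.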
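Stouffer’s method first transforms the P-value of each study into a Z-value:
\[Z_{i} = \Phi^{-1}(1 - p_{i})\]
where \(\Phi\) is the standard normal cumulative distribution function and \(\Phi^{-1}\) is its inverse.
The statistic used in this method is the sum of the Z-values scaled by \(\sqrt{k}\):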
\[\frac{\sum_{i=1}^{k} Z_{i}}{\sqrt{k}} \sim N(0,1) \; under \; H_{0}\]
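The Z-value combination can be sketched as follows. The function name `stouffer_combine` is hypothetical, and the combined P-value is taken here as the upper tail \(1 - \Phi(Z)\), which is an assumption consistent with the transformation \(\Phi^{-1}(1 - p)\) but is not stated in the text.

```python
import math
from statistics import NormalDist


def stouffer_combine(p_values):
    """Combine P-values through their Z-values.

    Returns (combined Z, combined P-value). P-values must lie strictly
    between 0 and 1 for the inverse normal CDF to be defined.
    """
    normal = NormalDist()
    z_values = [normal.inv_cdf(1 - p) for p in p_values]
    z_combined = sum(z_values) / math.sqrt(len(z_values))

    # Upper-tail P-value of the combined Z (assumed convention).
    return z_combined, 1 - normal.cdf(z_combined)


z_comb, p_comb = stouffer_combine([0.01, 0.01])
print(f"Z = {z_comb:.3f}, combined p = {p_comb:.5f}")
```

Two studies that each reach \(p = 0.01\) yield a combined P-value well below 0.01, because consistent evidence accumulates across studies.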
The minimum P-value method (minP) uses as its statistic the minimum of the P-values of all studies:
\[\min(p_{1}, \ldots, p_{i}, \ldots, p_{k}) \sim Beta(1,k) \; under \; H_{0}\]
The maximum P-value method (maxP) uses as its statistic the maximum of the P-values of all studies:
\[\max(p_{1}, \ldots, p_{i}, \ldots, p_{k}) \sim Beta(k,1) \; under \; H_{0}\]
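Both order-statistic methods have simple closed forms, because the cumulative distribution function of \(Beta(1,k)\) is \(1 - (1 - x)^{k}\) and that of \(Beta(k,1)\) is \(x^{k}\). The sketch below uses these identities; the function names `min_p_combine` and `max_p_combine` are hypothetical.

```python
def min_p_combine(p_values):
    """Return (min P-value, combined P-value) using the Beta(1, k) null."""
    k = len(p_values)
    statistic = min(p_values)
    # P(min <= statistic) when all k P-values are uniform under H0.
    return statistic, 1 - (1 - statistic) ** k


def max_p_combine(p_values):
    """Return (max P-value, combined P-value) using the Beta(k, 1) null."""
    k = len(p_values)
    statistic = max(p_values)
    # P(max <= statistic) when all k P-values are uniform under H0.
    return statistic, statistic ** k


p_values = [0.01, 0.2, 0.5]
print(min_p_combine(p_values))
print(max_p_combine(p_values))
```

The minimum-based statistic is significant when at least one study shows a strong signal, whereas the maximum-based statistic is small only when every study has a small P-value.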
One notable aspect of these methods is that they treat all studies uniformly, irrespective of their size, because they directly combine the individually obtained P-values. They are also easier to apply than effect size combination when the studies come from diverse platforms or conditions, and they can directly combine the results of disparate analyses. However, P-value combination methods have a significant drawback: they lose the directional information about the expression pattern.