One-sample MR is the standard implementation of MR, performed in a single data set that contains data on the SNPs, exposure, and outcome for all participants. With individual participant data, the causal effect of the exposure on the outcome can be estimated by using 2-stage least-squares (2SLS) regression, a method originally developed in the field of econometrics (55). In the first stage, the exposure of interest is regressed on the genetic instrument, which can be a single SNP, multiple SNPs, or an allele score based on multiple SNPs (e.g., the sum of the number of exposure-increasing alleles). The predicted values of the exposure are taken from the first-stage regression model. In the second stage, the outcome of interest is regressed on the predicted values of the exposure by using either linear or logistic regression, depending on whether the outcome is a continuous or binary variable. The β-coefficient from the second stage can be interpreted as the change in the outcome (for logistic regression, the log OR for disease) per unit increase in the genetically predicted exposure. The rationale for 2SLS is given in Figure 2.
Figure 2: Rationale for 2SLS. Although the association between the observed exposure (X) and the outcome (Y) may be confounded, the genetically predicted component of X is not subject to this confounding.
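The two stages described above can be illustrated with a minimal sketch on simulated data, assuming a continuous outcome and a single allele-count instrument. The helper names (`ols`, `two_stage_least_squares`), the effect sizes, and the simulated confounder are illustrative assumptions rather than part of any published analysis. The sketch returns only a point estimate: standard errors taken directly from a manually fitted second stage are not valid, so dedicated instrumental-variable routines are generally preferable in practice.

```python
import numpy as np


def ols(y, x):
    """Return (intercept, slope) from a least-squares fit of y on x."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]


def two_stage_least_squares(g, x, y):
    """Point estimate of the effect of x on y, using g as the instrument."""
    # Stage 1: regress the exposure on the genetic instrument
    # and take the predicted values of the exposure.
    intercept, slope = ols(x, g)
    x_hat = intercept + slope * g
    # Stage 2: regress the outcome on the predicted exposure.
    _, beta = ols(y, x_hat)
    return beta


# Simulated data (all values below are assumptions for illustration).
rng = np.random.default_rng(0)
n = 50_000
true_effect = 0.5

g = rng.binomial(2, 0.3, size=n).astype(float)    # allele count: 0, 1, or 2
u = rng.normal(size=n)                            # unmeasured confounder
x = 0.4 * g + u + rng.normal(size=n)              # exposure
y = true_effect * x + u + rng.normal(size=n)      # continuous outcome

beta_2sls = two_stage_least_squares(g, x, y)
_, beta_ols = ols(y, x)                           # conventional regression of y on x

print(f"2SLS estimate: {beta_2sls:.3f}")
print(f"OLS estimate:  {beta_ols:.3f}")
```

In this simulation the confounder raises both the exposure and the outcome, so the conventional regression estimate is biased upward, whereas the 2SLS estimate should lie close to the simulated effect of 0.5.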
When implementing 2SLS, it is important that the samples used for discovering the genetic instrument or instruments are independent of the samples used for the MR analysis.