USMLE Step 3 Biostats Notes



Odds Ratio


  • Calculated in case-control studies
  • (A/C)/(B/D) or AD/BC
  • Case-control studies compare the exposure of participants with the disease (cases) to that of participants without the disease (controls)
  • Remember: odds of exposure in the cases divided by odds of exposure in the controls (because you are looking back in time). This differs from RR, where you look forward at the risk of disease/outcome in the exposed vs. the non-exposed
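
As a quick check of the AD/BC shortcut, here is a minimal sketch. It assumes the usual 2 x 2 layout (A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls); the function name and the counts are made up for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when B or C is zero")
    # (A/C) / (B/D) simplifies to AD / BC
    return (a * d) / (b * c)


# Hypothetical case-control counts, not taken from any real study.
a, b, c, d = 40, 20, 60, 80
print("OR via AD/BC:      ", odds_ratio(a, b, c, d))
print("OR via (A/C)/(B/D):", (a / c) / (b / d))
```

Both lines should print the same value (up to floating-point rounding), since the two forms are algebraically identical.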

Relative Risk


  • Calculated in cohort studies
  • [A/(A+B)] / [C/(C+D)]
  • Remember: In RR you are looking forward at the risk of a disease/outcome in the exposed vs. the non-exposed group
  • RR comparing groups
    • If the RR of an outcome in group A as compared to group B is x then the RR in group B as compared to group A is 1/x
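
The formula and the 1/x rule above can be checked with a short sketch. It assumes the same 2 x 2 layout (A, B = exposed with and without the outcome; C, D = unexposed with and without the outcome); the function name and cohort counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30/100 exposed and 10/100 unexposed develop the outcome.
rr = relative_risk(30, 70, 10, 90)
# Swapping which group is the reference gives the reciprocal.
rr_reversed = relative_risk(10, 90, 30, 70)

print("RR exposed vs unexposed:", round(rr, 3))
print("RR unexposed vs exposed:", round(rr_reversed, 3))
print("Product of the two:     ", round(rr * rr_reversed, 3))
```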

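Calculate Sensitivity


  • TP/(TP+FN)
  • probability that a pt with the disease tests positive

Calculate Specificity


  • TN/(TN+FP)
  • probability that a pt without the disease tests negative

Positive Predictive Value


  • TP/(TP+FP)
  • probability that a pt with a positive test has the disease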

Negative Predictive Value (NPV)


  • TN/(TN+FN)
  • probability that a pt with a negative test does not have the disease
  • Higher sensitivity means fewer FN, so NPV increases as well
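
A minimal sketch of how the four test characteristics come out of raw counts, and how cutting false negatives raises both sensitivity and NPV. The function name and all counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2 x 2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical test: 100 diseased and 200 non-diseased patients.
baseline = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
# Same population, but a more sensitive test misses fewer cases (FN 10 -> 5).
more_sensitive = diagnostic_metrics(tp=95, fp=30, fn=5, tn=170)

for name, m in [("baseline", baseline), ("more sensitive", more_sensitive)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```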

Likelihood Ratio


  • LR+ = sensitivity / (1-specificity)
  • LR- = (1-sensitivity) / specificity

Number Needed to Treat


  • used to measure the efficacy of a therapy: number of pts who need to be treated to prevent 1 additional bad outcome (the counterpart for adverse events is NNH)
  • NNT= 1/ARR

Number Needed to Harm


  • NNH = 1/ARI
  • number of pts who need to be exposed to a risk factor over a certain period of time before a harmful event occurs in 1 patient
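
A short worked sketch of NNT and NNH from event rates. The function names and rates are hypothetical; the raw values are returned unrounded, and by convention NNT is usually reported rounded up to the next whole patient.

```python
def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT = 1 / ARR, where ARR = control rate - treated rate."""
    arr = control_event_rate - treated_event_rate
    if arr <= 0:
        raise ValueError("no absolute risk reduction, NNT is not defined")
    return 1 / arr


def number_needed_to_harm(exposed_event_rate, control_event_rate):
    """NNH = 1 / ARI, where ARI = exposed rate - control rate."""
    ari = exposed_event_rate - control_event_rate
    if ari <= 0:
        raise ValueError("no absolute risk increase, NNH is not defined")
    return 1 / ari


# Hypothetical trial: the event occurs in 20% of controls and 15% of treated pts.
nnt = number_needed_to_treat(0.20, 0.15)
# Hypothetical adverse event: 8% with the drug vs 3% with placebo.
nnh = number_needed_to_harm(0.08, 0.03)

print("NNT is about", round(nnt, 1))
print("NNH is about", round(nnh, 1))
```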

Risk Reduction



Relative Risk Reduction (RRR)


  • ARR/CER, i.e. (CER-EER)/CER
  • or also: 1-RR

Absolute risk reduction calculation (ARR)


  • risk in placebo (control) group minus risk in tx group, i.e. CER-EER

Attributable Risk Percent (ARP)


  • (risk in exposed - risk in unexposed)/risk in exposed
  • OR
  • (RR-1)/RR
  • Measure of excess risk: the proportion of disease in the exposed group that is attributable to the exposure
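
The two formulas above give the same number, which the sketch below checks. The function names and risks are hypothetical.

```python
def attributable_fraction_exposed(risk_exposed, risk_unexposed):
    """(risk in exposed - risk in unexposed) / risk in exposed."""
    return (risk_exposed - risk_unexposed) / risk_exposed


def attributable_fraction_from_rr(rr):
    """Equivalent form using the relative risk: (RR - 1) / RR."""
    return (rr - 1) / rr


# Hypothetical risks: 30% in the exposed, 10% in the unexposed.
risk_exposed, risk_unexposed = 0.30, 0.10
rr = risk_exposed / risk_unexposed

from_risks = attributable_fraction_exposed(risk_exposed, risk_unexposed)
from_rr = attributable_fraction_from_rr(rr)

print("From risks:", round(from_risks, 3))
print("From RR:   ", round(from_rr, 3))
```

With these numbers, about two thirds of the risk in the exposed group is attributable to the exposure.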

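Type I Error (false positive)


  • Occurs when study rejects null hypothesis when it is true
  • Probability of a type I error = alpha (the significance level)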

Type II Error (false negative)


  • Occurs when study fails to reject null hypothesis when it is false
  • Related to the power of a study
  • Type II error rate (beta) = 1-power, or power = 1-beta
  • (both range from 0 to 1)

Confounding Bias


  • occurs when the exposure-disease relationship is distorted by the effect of an extraneous factor that is associated with both the exposure and the disease.
  • Randomization helps to remove the effects of both known and unknown confounders

Lead time bias


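  • the overestimation of survival due to early diagnosis: survival is counted from an earlier starting point (e.g. detection by screening) even though the time of death is unchanged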
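
A small arithmetic sketch of lead time bias. The ages are invented for illustration: the same patient dies at the same age whether or not screening is done, yet measured survival looks longer with screening.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Survival time as it is usually measured: from diagnosis to death."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient who dies at 70 regardless of when the disease is found.
age_at_death = 70
age_dx_by_symptoms = 67   # diagnosed when symptoms appear
age_dx_by_screening = 63  # diagnosed earlier by a screening test

survival_without_screening = survival_from_diagnosis(age_dx_by_symptoms, age_at_death)
survival_with_screening = survival_from_diagnosis(age_dx_by_screening, age_at_death)
lead_time = age_dx_by_symptoms - age_dx_by_screening

print("Survival without screening:", survival_without_screening, "years")
print("Survival with screening:   ", survival_with_screening, "years")
print("Lead time:                 ", lead_time, "years")
```

The extra 4 years of "survival" in this example is entirely lead time; the patient does not live any longer.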

Observer Bias


  • when observers misclassify data due to individual differences in interpretation or preconceived expectations regarding treatment

Measures of Central Tendency


  • In left/negatively skewed distributions: mean<median<mode
  • In right/positively skewed distributions: mode<median<mean
  • In strongly skewed distributions, median is a better measure of central tendency than mean
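
The ordering of mean, median and mode can be checked on toy data. In the sketch below, a right (positively) skewed sample has a long tail of large values and a left (negatively) skewed sample has a long tail of small values; both samples are invented for illustration.

```python
import statistics

# Hypothetical right (positively) skewed sample: long tail toward large values.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 20]
# Mirror image of the same sample: left (negatively) skewed.
left_skewed = [21 - x for x in right_skewed]


def summarize(data):
    """Return (mean, median, mode) for a list of numbers."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)


r_mean, r_median, r_mode = summarize(right_skewed)
l_mean, l_median, l_mode = summarize(left_skewed)

print("Right skew: mode", r_mode, "< median", r_median, "< mean", r_mean)
print("Left skew:  mean", l_mean, "< median", l_median, "< mode", l_mode)
```

In both samples the mean is pulled furthest toward the tail, which is why the median is preferred for strongly skewed data.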

Power of a study


  • The ability of a study to detect a difference between two groups when one truly exists. Increasing sample size increases power and narrows the confidence interval
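
To see the effect of sample size, here is a rough sketch using a simplified model that is not part of these notes: a two-sided, two-sample z-test comparing means with a known common standard deviation. The function names, effect size and sample sizes are all hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (Python 3.8+)


def standard_error(sigma, n_per_group):
    """Standard error of the difference between two group means."""
    return sigma * (2 / n_per_group) ** 0.5


def approx_power(effect, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test (known sigma assumed)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect / standard_error(sigma, n_per_group)
    # Probability that the test statistic lands in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def ci_half_width(sigma, n_per_group, alpha=0.05):
    """Half-width of the confidence interval for the difference in means."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return z_crit * standard_error(sigma, n_per_group)


# Hypothetical true difference of 0.5 standard deviations.
for n in (16, 64, 256):
    print(
        "n per group:", n,
        "| power:", round(approx_power(0.5, 1.0, n), 3),
        "| CI half-width:", round(ci_half_width(1.0, n), 3),
    )
```

Under these assumptions, each increase in n raises the power and shrinks the interval, and with no true difference the "power" equals alpha (the type I error rate).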

Biostatistics Formula


2 x 2 table (rows: exposure or test result; columns: disease status in reality)

                        Disease +    Disease -
Exposed / Test +            A            B
Unexposed / Test -          C            D

Name                                  Formula
Sensitivity                           A/(A+C) or TP/(TP+FN)
Specificity                           D/(D+B) or TN/(TN+FP)
PPV                                   A/(A+B) or TP/(TP+FP)
NPV                                   D/(D+C) or TN/(TN+FN)
Relative Risk                         [A/(A+B)] / [C/(C+D)]; use in cohort studies
Odds Ratio                            (A/C)/(B/D) or AD/BC; use in case-control studies
Absolute Risk Reduction (ARR)         Event rate in control - event rate in exposed: CER - EER
Absolute Risk Increase (ARI)          Event rate in exposed - event rate in control: EER - CER
Relative Risk Reduction (RRR)         ARR/CER = (CER - EER)/CER, or also 1 - RR
NNT                                   1/ARR
NNH                                   1/ARI
Positive Likelihood Ratio (LR+)       sensitivity / (1 - specificity)
Negative Likelihood Ratio (LR-)       (1 - sensitivity) / specificity
Standardized Mortality Ratio (SMR)    Observed number of deaths / expected number of deaths
Standardized Incidence Ratio (SIR)    Observed number of cases / expected number of cases
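
As a worked instance of the likelihood ratio formulas, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, here is a minimal sketch. The function name and the test characteristics are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    if not (0 < specificity < 1):
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical test with 90% sensitivity and 80% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.80)
print("LR+ is about", round(lr_pos, 3))
print("LR- is about", round(lr_neg, 3))
```

Neither ratio uses prevalence, which is why likelihood ratios carry over between populations while predictive values do not.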

High Yield Notes:


  • (+) Likelihood ratio = sensitivity / (1-specificity) = how much more likely a positive result is in a pt with the disease than in a pt without it. Unlike PPV, it does not depend on prevalence.
  • (-) Likelihood ratio = (1-sensitivity) / specificity = how much more likely a negative result is in a pt with the disease than in a pt without it. Unlike NPV, it does not depend on prevalence.
  • PPV increases with increased specificity. NPV increases with increased sensitivity. Therefore, at a given prevalence, the test with the highest specificity tends to have the highest PPV, and the test with the highest sensitivity tends to have the highest NPV.
  • Higher prevalence increases PPV and decreases NPV. Lower prevalence decreases PPV and increases NPV.
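  • Nominal data has categories with no inherent order (e.g., blood type). Dichotomous data is the special case with only two categories (e.g., male vs female).
  • Ordinal data has ranked categories, but the distance between ranks has no numerical meaning (e.g., freshman, sophomore, junior, senior).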
  • The median is a better indicator of central tendency (vs mean) in data with a highly skewed distribution.
  • Hazard Ratio:
    • Ratio of the event rate (hazard) in one group, e.g. the treatment group, to that in the comparison group; a measure of how much effect the treatment or exposure had.
    • Value of 1.00 means there is no difference between the two groups.
    • A ratio < 1 indicates a protective effect, and > 1 indicates a detrimental effect.
    • If the confidence interval of the hazard ratio includes 1.00 (null value), the effect is not statistically significant.
    • If the interval does not include 1.00, the difference is significant.
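
Returning to the prevalence note in the list above: the sketch below holds sensitivity and specificity fixed and varies only prevalence, to show PPV rising and NPV falling. The function name and all values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fraction of the population in each cell of the 2 x 2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical test: 90% sensitive and 90% specific at three prevalences.
results = {}
for prevalence in (0.01, 0.10, 0.50):
    results[prevalence] = predictive_values(0.90, 0.90, prevalence)
    ppv, npv = results[prevalence]
    print("prevalence", prevalence, "| PPV", round(ppv, 3), "| NPV", round(npv, 3))
```

The test itself never changes here; only the population it is applied to does.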