College Board AP Statistics

Master Review

20 Expert-Verified Concept Questions · All Major Units · Detailed Explanations

Core Concepts

Study the concept, memorize the key formula, then try the example before the real test.

U1 Q1 Describing Distributions: Shape, Center, Spread
Concept
When describing a distribution, always comment on SOCS: Shape (symmetric, skewed left/right, bimodal), Outliers, Center (mean or median), and Spread (range, IQR, standard deviation). For skewed distributions, the median is the preferred measure of center; the IQR is the preferred measure of spread.
★ Memorize
Skewed RIGHT → mean > median (pulled toward tail)
Skewed LEFT → mean < median
IQR outlier rule: a value is an outlier if it is below Q1 − 1.5·IQR or above Q3 + 1.5·IQR
Example
A dataset of household incomes is skewed right. Which measures should describe center and spread?
Answer: Median (center) and IQR (spread) — resistant to outliers/skew
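The 1.5·IQR rule from this card can be sketched in a few lines of Python. This is only an illustration: the function name `outlier_fences` and the quartile and income values (in thousands of dollars) are hypothetical and do not come from the review itself.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences of the 1.5*IQR outlier rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical quartiles for household income, in thousands of dollars.
lower, upper = outlier_fences(40.0, 70.0)

# A value is flagged as an outlier if it falls outside the fences.
income = 130.0
is_outlier = income < lower or income > upper
print(lower, upper, is_outlier)  # -5.0 115.0 True
```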
U1 Q2 Normal Distribution & Z-Scores
Concept
A z-score standardizes an observation by measuring how many standard deviations it lies from the mean. Positive z = above mean, negative z = below mean. The standard normal distribution N(0,1) allows probability lookup via tables or calculators.
★ Memorize
z = (x − μ) / σ
68-95-99.7 Rule: within 1σ: 68%, 2σ: 95%, 3σ: 99.7%
Example
Mean = 70, SD = 10. What z-score corresponds to x = 85?
z = (85 − 70) / 10 = 1.5
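As a rough sketch, the z-score above and the matching normal probability can be checked with Python's standard `statistics.NormalDist`. The assumption that the scores are normally distributed is added here for illustration; the example itself only asks for the z-score.

```python
from statistics import NormalDist

mean, sd, x = 70, 10, 85

# Standardize: how many standard deviations x lies from the mean.
z = (x - mean) / sd

# Proportion of a Normal(70, 10) population that falls below x
# (assumes the population really is normal).
below = NormalDist(mu=mean, sigma=sd).cdf(x)

# Check the "68" part of the 68-95-99.7 rule on the standard normal.
within_1sd = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(z, round(below, 4), round(within_1sd, 4))  # 1.5 0.9332 0.6827
```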
U2 Q3 Correlation & Causation
Concept
The correlation coefficient r measures the strength and direction of a linear relationship (−1 ≤ r ≤ 1). Correlation does NOT imply causation. Lurking variables can create spurious correlations. Only controlled experiments can establish cause-and-effect.
★ Memorize
r close to ±1 → strong linear; r ≈ 0 → weak/no linear
r is unitless and is not affected by switching x and y or by changing units of measurement (rescaling by a negative multiplier flips only the sign of r)
Causation requires a controlled experiment (random assignment)
Example
r = −0.87 between study hours and test errors. What does this tell us?
Strong negative linear association; more study hours → fewer errors (but NOT proven causal)
U2 Q4 Least-Squares Regression Line
Concept
The LSRL minimizes the sum of squared residuals. The line always passes through (x̄, ȳ). The slope b₁ tells you predicted change in ŷ per unit increase in x. R² (coefficient of determination) tells the proportion of variation in y explained by the linear model with x.
★ Memorize
ŷ = b₀ + b₁x
b₁ = r · (s_y / s_x)
b₀ = ȳ − b₁·x̄
Residual = actual − predicted = y − ŷ
Example
ŷ = 5 + 3x. Actual y = 20, x = 4. What is the residual?
ŷ = 5 + 3(4) = 17; Residual = 20 − 17 = 3
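The formulas on this card can be tied together in a short sketch. The summary statistics below (r, s_x, s_y, x̄, ȳ) are hypothetical values chosen so that they reproduce the example line ŷ = 5 + 3x; the function names are illustrative only.

```python
def lsrl_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Return (b0, b1) for the least-squares line from summary statistics."""
    b1 = r * (s_y / s_x)      # slope
    b0 = y_bar - b1 * x_bar   # intercept, so the line passes through (x_bar, y_bar)
    return b0, b1


def residual(b0, b1, x, y):
    """Residual = actual - predicted."""
    return y - (b0 + b1 * x)


# Hypothetical summary statistics (not from the review).
b0, b1 = lsrl_from_summary(r=0.6, s_x=2.0, s_y=10.0, x_bar=5.0, y_bar=20.0)

# The worked example: y-hat = 5 + 3x, observed point (4, 20).
res = residual(5, 3, x=4, y=20)
print(round(b0, 6), round(b1, 6), res)  # 5.0 3.0 3
```

A positive residual means the line underpredicts that observation.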
U3 Q5 Sampling Methods & Bias
Concept
A simple random sample (SRS) gives every individual, and every group of that size, an equal chance of selection. Common sources of bias are voluntary response bias (self-selection overrepresents strong opinions), nonresponse bias, and undercoverage. Increasing the sample size reduces variability but does NOT reduce bias.
★ Memorize
Bias = systematic error → larger n does NOT fix it
Variability = random error → larger n reduces it
Convenience sample → often biased; SRS → avoids selection bias (nonresponse can still bias results)
Example
A radio station asks listeners to call in about a policy. What type of bias is this?
Voluntary response bias — strong opinions overrepresented
U3 Q6 Experimental Design Principles
Concept
A well-designed experiment uses: Control (compare treatments to a baseline), Randomization (assign subjects at random so that confounding variables balance out across groups), Replication (use many subjects to reduce chance variation). Blocking reduces variability due to known nuisance variables. Blinding guards against placebo and experimenter effects.
★ Memorize
3 Principles: Control · Randomization · Replication
Block = group similar subjects BEFORE randomizing
Double-blind: neither subject NOR evaluator knows treatment
Example
Researchers split 60 subjects by gender before randomly assigning to drug/placebo. What design is used?
Randomized block design (gender is the blocking variable)
U4 Q7 Basic Probability Rules
Concept
The probability of any event A satisfies 0 ≤ P(A) ≤ 1. The key rules are the complement rule, the addition rule for mutually exclusive events (P(A∪B) = P(A) + P(B)), and the general addition rule. Independence: A and B are independent if P(A∩B) = P(A)·P(B).
★ Memorize
P(Aᶜ) = 1 − P(A)
P(A∪B) = P(A) + P(B) − P(A∩B)
If independent: P(A∩B) = P(A)·P(B)
P(A|B) = P(A∩B) / P(B)
Example
P(A) = 0.4, P(B) = 0.3, A and B independent. Find P(A∩B).
P(A∩B) = 0.4 × 0.3 = 0.12
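Continuing the example, the other rules on this card can be applied to the same two events. This is a small sketch; the variable names are illustrative.

```python
p_a, p_b = 0.4, 0.3

# Multiplication rule, valid here because A and B are independent.
p_a_and_b = p_a * p_b

# General addition rule.
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability; for independent events it equals P(A).
p_a_given_b = p_a_and_b / p_b

# Complement rule.
p_not_a = 1 - p_a

print(round(p_a_and_b, 2), round(p_a_or_b, 2), round(p_a_given_b, 2), round(p_not_a, 2))
# 0.12 0.58 0.4 0.6
```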
U4 Q8 Discrete Random Variables: Mean & SD
Concept
For a discrete random variable X with probability distribution, the mean (expected value) is the weighted average of outcomes. Variance is the weighted average of squared deviations from the mean. For independent RVs: means add, variances add (never standard deviations).
★ Memorize
μ_X = Σ x·P(x)
σ²_X = Σ (x − μ)²·P(x)
If X,Y independent: σ²_(X±Y) = σ²_X + σ²_Y
NEVER add SDs directly
Example
σ_X = 3, σ_Y = 4, independent. Find σ_(X+Y).
σ²_(X+Y) = 9 + 16 = 25; σ_(X+Y) = 5
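A minimal sketch of the "add variances, never SDs" rule, assuming X and Y are independent as in the example. The helper name is hypothetical.

```python
import math


def sd_of_sum_or_difference(sd_x, sd_y):
    """SD of X + Y (or X - Y) for independent X and Y: add variances, then take the root."""
    return math.sqrt(sd_x**2 + sd_y**2)


combined = sd_of_sum_or_difference(3, 4)
naive = 3 + 4  # the common mistake: adding SDs directly

print(combined, naive)  # 5.0 7
```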
U4 Q9 Binomial Distribution
Concept
BINS conditions: Fixed number of trials n, Binary outcomes (success/failure), Independent trials, Same probability of success p each trial. X ~ B(n, p).
★ Memorize
P(X=k) = C(n,k) · pᵏ · (1−p)^(n−k)
μ = np
σ = √(np(1−p))
10% condition: n ≤ 0.10·N for near-independence
Example
X~B(10, 0.3). Find μ and σ.
μ = 10(0.3) = 3; σ = √(10·0.3·0.7) = √2.1 ≈ 1.449
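The binomial formulas above can be checked with the standard library. The sketch below uses `math.comb` for C(n, k); the choice of k = 3 is an illustrative addition, not part of the example.

```python
import math


def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3
mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Probability of exactly 3 successes (k = 3 chosen for illustration).
p_exactly_3 = binomial_pmf(n, p, 3)

# Sanity check: the probabilities over k = 0..n sum to 1.
total = sum(binomial_pmf(n, p, k) for k in range(n + 1))

print(round(mean, 3), round(sd, 3), round(p_exactly_3, 4))  # 3.0 1.449 0.2668
```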
U5 Q10 Sampling Distributions & Central Limit Theorem
Concept
The sampling distribution of x̄ has mean μ and standard deviation σ/√n (often called the standard error). By the Central Limit Theorem, for large enough n (a common rule of thumb is n ≥ 30), x̄ is approximately normal regardless of the population shape. For proportions, p̂ is approximately normal if np ≥ 10 and n(1−p) ≥ 10.
★ Memorize
x̄ is approximately N(μ, σ/√n) when n ≥ 30 (CLT)
SE of x̄ = σ/√n
SE of p̂ = √(p(1−p)/n)
Larger n → smaller SE → more precise estimates
Example
Population: μ=50, σ=12. Sample n=36. Find SE of x̄.
SE = 12/√36 = 12/6 = 2
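The two standard error formulas can be sketched as small helpers. The function names, the second sample size (144), and the proportion example (p = 0.5, n = 100) are illustrative assumptions added here.

```python
import math


def se_mean(sigma, n):
    """Standard error of the sample mean."""
    return sigma / math.sqrt(n)


def se_proportion(p, n):
    """Standard error of the sample proportion."""
    return math.sqrt(p * (1 - p) / n)


se_36 = se_mean(12, 36)    # the worked example
se_144 = se_mean(12, 144)  # quadrupling n halves the SE
se_p = se_proportion(0.5, 100)

print(se_36, se_144, round(se_p, 4))  # 2.0 1.0 0.05
```

Because n sits under a square root, cutting the SE in half takes four times as many observations.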
U6 Q11 Confidence Intervals: Concept & Interpretation
Concept
A C% confidence interval means: if we repeated the sampling process many times, C% of the intervals constructed would capture the true parameter. It does NOT mean a C% probability the parameter is in this specific interval (the parameter is fixed, not random).
★ Memorize
CI = statistic ± critical value · SE
Margin of error = critical value · SE
Wider CI ↔ higher confidence OR smaller n
Correct interpretation: "about 95% of intervals built this way capture μ"
Example
95% CI for μ is (42, 58). Correct interpretation?
"We are 95% confident the true population mean is between 42 and 58."
U6 Q12 One-Sample t-Test for a Mean
Concept
When σ is unknown, use the t-distribution with df = n − 1. Conditions: Random (SRS or random assignment), Normal population or n ≥ 30 (CLT), Independent (10% condition). The p-value is the probability, assuming H₀ is true, of getting a statistic at least as extreme as the one observed.
★ Memorize
t = (x̄ − μ₀) / (s/√n), df = n − 1
If p-value < α: reject H₀
Small p → strong evidence AGAINST H₀
Failing to reject H₀ ≠ proving H₀ is true
Example
x̄=52, μ₀=50, s=8, n=25. Find t.
t = (52 − 50)/(8/√25) = 2/1.6 = 1.25, df = 24
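A short sketch of the test statistic, using the numbers from the example. The function name is hypothetical; the p-value is left to a t-table or calculator because Python's standard library has no t-distribution.

```python
import math


def one_sample_t(x_bar, mu_0, s, n):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu_0) / standard_error, n - 1


t, df = one_sample_t(x_bar=52, mu_0=50, s=8, n=25)
print(round(t, 3), df)  # 1.25 24
```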
U6 Q13 One-Sample z-Test & CI for a Proportion
Concept
For inference about a population proportion p, use the z-test when np₀ ≥ 10 and n(1−p₀) ≥ 10 (for tests) or np̂ ≥ 10 and n(1−p̂) ≥ 10 (for intervals). Use p₀ (hypothesized value) in the SE for tests; use p̂ for CI.
★ Memorize
Test: z = (p̂ − p₀) / √(p₀(1−p₀)/n)
CI: p̂ ± z* · √(p̂(1−p̂)/n)
Conditions: Random, np₀≥10 & n(1−p₀)≥10, 10% rule
Example
n=200, p̂=0.55, H₀: p=0.50. Find z.
z = (0.55−0.50)/√(0.50·0.50/200) = 0.05/0.0354 ≈ 1.41
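The sketch below reproduces the z statistic and then goes one step further. Two assumptions are added for illustration and are not stated in the example: the alternative is two-sided (Hₐ: p ≠ 0.50), and the interval uses 95% confidence. It also shows the card's point that the test uses p₀ in the SE while the interval uses p̂.

```python
import math
from statistics import NormalDist

n, p_hat, p_0 = 200, 0.55, 0.50

# Test statistic: the SE is built from the hypothesized value p_0.
z = (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)

# Two-sided p-value (assumed alternative: p != 0.50).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval (assumed level): the SE is built from p_hat.
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(round(z, 2), round(p_value, 4), tuple(round(v, 3) for v in ci))
```

Under these assumptions the p-value is about 0.157, so H₀ would not be rejected at α = 0.05, and the interval consistently contains 0.50.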
U6 Q14 Type I & Type II Errors, Power
Concept
Type I error (α): Rejecting a true H₀ (false positive). Type II error (β): Failing to reject a false H₀ (false negative). Power = 1 − β = probability of correctly rejecting a false H₀. Increasing α, n, or effect size all increase Power.
★ Memorize
Type I: reject true H₀ (probability = α)
Type II: fail to reject false H₀ (probability = β)
Power = 1 − β
Power ↑ when: n↑, α↑, effect size↑
Example
A test has α=0.05, β=0.20. What is the power?
Power = 1 − 0.20 = 0.80 (80%)
U7 Q15 Two-Sample t-Procedures
Concept
These procedures compare two independent population means. Use the two-sample t-test (do NOT assume equal variances unless stated). For paired data, take differences and use a one-sample t procedure on the differences.
★ Memorize
t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)
df: use calculator (Welch) or min(n₁−1, n₂−1) conservatively
Paired data → use d̄ = mean difference, t = d̄ / (s_d/√n)
Example
n₁=n₂=20, s₁=s₂=5, x̄₁−x̄₂=3. Find t.
t = 3 / √(25/20 + 25/20) = 3 / √2.5 ≈ 1.897
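A sketch of the two-sample statistic with both degrees-of-freedom options from this card. The Welch value uses the standard Welch-Satterthwaite approximation, which the card mentions only as "calculator (Welch)"; the function name is hypothetical.

```python
import math


def two_sample_t(diff, s1, n1, s2, n2):
    """Return (t, conservative df, Welch df) for an unpooled two-sample t-test."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = diff / math.sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (what calculators typically report).
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df_conservative, df_welch


t, df_conservative, df_welch = two_sample_t(diff=3, s1=5, n1=20, s2=5, n2=20)
print(round(t, 3), df_conservative, round(df_welch, 1))  # 1.897 19 38.0
```

The conservative df is never larger than the Welch df, so it gives a slightly larger p-value.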
U8 Q16 Chi-Square Test for Independence
Concept
Used for two-way tables to test whether two categorical variables are associated. Expected counts must be at least 5 in each cell. df = (rows − 1)(columns − 1). H₀: variables are independent.
★ Memorize
Expected = (row total × col total) / grand total
χ² = Σ (Observed − Expected)² / Expected
df = (r−1)(c−1)
All expected counts ≥ 5 required
Example
2×3 table. What are the degrees of freedom?
df = (2−1)(3−1) = 1×2 = 2
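To make the expected-count and χ² formulas concrete, here is a sketch on a 2×3 table. The observed counts are hypothetical, invented for illustration, and the function name is not from the review.

```python
def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, df) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected = (row total * column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df


# Hypothetical 2x3 table of observed counts.
observed = [[20, 30, 50], [30, 30, 40]]
expected, chi2, df = chi_square_independence(observed)

print(expected)            # [[25.0, 30.0, 45.0], [25.0, 30.0, 45.0]]
print(round(chi2, 3), df)  # 3.111 2
```

All expected counts here are at least 5, so the condition on this card is met.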
U8 Q17 Chi-Square Goodness of Fit
Concept
Tests whether observed frequency data matches a claimed distribution. H₀: the population follows the specified distribution. df = k − 1 where k = number of categories. Same expected count condition (≥ 5) applies.
★ Memorize
df = k − 1 (k = number of categories)
Expected count = n · p_i (claimed proportion)
χ² test is always right-tailed
Large χ² → evidence against H₀
Example
Testing if a die is fair (6 sides). df = ?
df = 6 − 1 = 5
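The fair-die test can be sketched end to end. The 60 rolls below are hypothetical counts made up for illustration; only the "fair die, 6 sides" setup comes from the example.

```python
def chi_square_gof(observed, proportions):
    """Return (expected counts, chi-square statistic, df) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]  # expected count = n * p_i
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2, len(observed) - 1


# Hypothetical counts for faces 1..6 in 60 rolls.
observed = [8, 12, 9, 11, 10, 10]
expected, chi2, df = chi_square_gof(observed, [1 / 6] * 6)

print(round(chi2, 3), df)  # 1.0 5
```

A statistic this small relative to df = 5 would give a large right-tail p-value, so these made-up rolls would give no evidence against a fair die.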
U9 Q18 Inference for Slope of Regression Line
Concept
Test H₀: β₁ = 0 (no linear relationship). Uses t-test with df = n − 2. The standard error of the slope (SE_b₁) comes from computer output. A small p-value gives evidence of a statistically significant linear relationship.
★ Memorize
t = b₁ / SE_b₁ , df = n − 2
Conditions: LINEAR, INDEPENDENT, NORMAL residuals, EQUAL variance, RANDOM
Acronym: LINE R
CI for slope: b₁ ± t* · SE_b₁
Example
b₁=2.3, SE_b₁=0.8, n=22. Find t.
t = 2.3/0.8 = 2.875, df = 22−2 = 20
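A sketch of the slope test and interval together. The 95% confidence level and the critical value t* = 2.086 (a t-table value for df = 20) are assumptions added here; the example itself only asks for t.

```python
def slope_t_and_ci(b1, se_b1, n, t_star):
    """Return (t statistic, df, confidence interval) for a regression slope."""
    t = b1 / se_b1
    df = n - 2
    margin = t_star * se_b1
    return t, df, (b1 - margin, b1 + margin)


# t_star = 2.086 is the assumed 95% critical value for df = 20, read from a t-table.
t, df, ci = slope_t_and_ci(b1=2.3, se_b1=0.8, n=22, t_star=2.086)

print(round(t, 3), df, tuple(round(v, 4) for v in ci))  # 2.875 20 (0.6312, 3.9688)
```

Under these assumptions the interval excludes 0, which agrees with rejecting H₀: β₁ = 0 at the 5% level.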
U4 Q19 Geometric Distribution
Concept
The geometric distribution counts the number of trials until the first success. Conditions are the same as binomial (BINS) except the number of trials is not fixed — it's the variable of interest. P(X=k) = (1−p)^(k−1)·p.
★ Memorize
P(X=k) = (1−p)^(k−1) · p
μ = 1/p
σ = √((1−p)/p²)
P(X>k) = (1−p)^k
Example
p=0.25. Find the expected number of trials until first success.
μ = 1/0.25 = 4 trials
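The four geometric formulas can be checked with p = 0.25 from the example. The choice of k = 3 is an illustrative addition.

```python
import math

p = 0.25

mean = 1 / p                   # expected trials until the first success
sd = math.sqrt((1 - p) / p**2)

k = 3                          # illustrative value, not from the example
p_first_success_on_k = (1 - p) ** (k - 1) * p
p_more_than_k = (1 - p) ** k   # the first k trials are all failures

print(mean, round(sd, 3), p_first_success_on_k, p_more_than_k)
# 4.0 3.464 0.140625 0.421875
```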
U6 Q20 Two-Proportion z-Test & CI
Concept
Compare two independent population proportions. For the test, use the pooled p̂ (combined successes / combined n) in the SE formula. For the CI, use separate p̂₁ and p̂₂.
★ Memorize
Test: z = (p̂₁ − p̂₂) / √(p̂_c(1−p̂_c)(1/n₁ + 1/n₂))
p̂_c = (X₁ + X₂)/(n₁ + n₂)
CI: (p̂₁−p̂₂) ± z*·√(p̂₁(1−p̂₁)/n₁ + p̂₂(1−p̂₂)/n₂)
Example
Group 1: 40/100 successes; Group 2: 30/100 successes. Find p̂_c.
p̂_c = (40+30)/(100+100) = 70/200 = 0.35
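Continuing the example, a short sketch shows where the pooled proportion goes next: into the SE for the test, while the interval uses the unpooled SE. Variable names are illustrative.

```python
import math

x1, n1 = 40, 100
x2, n2 = 30, 100

p1, p2 = x1 / n1, x2 / n2
p_c = (x1 + x2) / (n1 + n2)  # pooled proportion

# Test statistic: pooled SE, because H0 assumes p1 = p2.
se_pooled = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_pooled

# Confidence interval: unpooled SE built from the separate sample proportions.
se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print(round(p_c, 2), round(z, 3), round(se_unpooled, 4))  # 0.35 1.483 0.0671
```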
