20 carefully crafted problems covering every core unit — with memory anchors to help you explain concepts out loud.
20 Problems · 6 Units · MCQ Format
Unit 01 · Exploring & Describing Data
🧠
Memory Anchor
SOCS — Shape · Outliers · Center · Spread |
Mean = pulled by outliers · Median = resistant |
IQR = Q3 − Q1 · Outlier if > Q3 + 1.5·IQR or < Q1 − 1.5·IQR
01
Measures of Center
A dataset has values: 3, 7, 7, 9, 10, 12, 50. Which statement BEST describes the relationship between the mean and median?
✅ Correct! Answer: B
Mean = (3+7+7+9+10+12+50)/7 = 98/7 = 14, but Median = 9 (the 4th of the 7 sorted values). The extreme value 50 drags the mean far above the median. This is the classic signature of a right-skewed distribution: the tail stretches to the right. Median is "resistant" to outliers; mean is not.
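To see the "resistant" idea in action, here is a small sketch using Python's standard `statistics` module. It recomputes both measures with and without the extreme value 50; the variable names are just for illustration.

```python
from statistics import mean, median

data = [3, 7, 7, 9, 10, 12, 50]

avg = mean(data)      # 98 / 7
mid = median(data)    # 4th of the 7 sorted values

# Drop the extreme value to see how each statistic reacts.
trimmed = [x for x in data if x != 50]
avg_trimmed = mean(trimmed)    # 48 / 6
mid_trimmed = median(trimmed)  # average of the two middle values, 7 and 9

print(f"with 50:    mean={avg}, median={mid}")
print(f"without 50: mean={avg_trimmed}, median={mid_trimmed}")
```

Removing one value moves the mean from 14 to 8, while the median only shifts from 9 to 8.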
02
Outlier Detection · IQR Rule
A dataset has Q1 = 18 and Q3 = 34. Using the 1.5 × IQR rule, which value is an outlier?
$$\text{IQR} = Q3 - Q1 \qquad \text{Outlier if } x < Q1 - 1.5\cdot\text{IQR} \;\text{ or }\; x > Q3 + 1.5\cdot\text{IQR}$$
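As a quick check of the rule for Q1 = 18 and Q3 = 34, the sketch below computes the two fences. The answer choices are not reproduced here, so the values passed to the hypothetical helper `is_outlier` are only illustrations.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1=18, q3=34):
    """True if x lies strictly outside the 1.5 * IQR fences."""
    low, high = iqr_fences(q1, q3)
    return x < low or x > high


low, high = iqr_fences(18, 34)  # IQR = 16, so 1.5 * IQR = 24
print(low, high)
print(is_outlier(60), is_outlier(58), is_outlier(0))
```

Any value below −6 or above 58 counts as an outlier; a value sitting exactly on a fence does not.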
A statistics teacher says: "We should report the median and IQR rather than the mean and standard deviation." Under what condition is this advice MOST appropriate?
✅ Correct! Answer: A
Median and IQR are resistant statistics — outliers don't move them much. Mean and standard deviation are sensitive to extreme values. So when data is skewed (e.g., income, home prices), use median + IQR. When data is roughly symmetric, mean + SD is preferred.
Unit 02 · Normal Distribution & z-Scores
🧠
Memory Anchor
z = (x − μ) / σ — how many SDs away from mean |
68-95-99.7 Rule: within 1, 2, 3 SDs |
Positive z = above mean · Negative z = below mean
04
z-Score Calculation
SAT scores are normally distributed with μ = 1060 and σ = 195. A student scores 1450. What is their z-score, and what does it mean?
$$z = \frac{x - \mu}{\sigma}$$
✅ Correct! Answer: C
z = (1450 − 1060) / 195 = 390 / 195 = 2.0. A positive z-score means the value is above the mean. z = 2 means the student scored exactly 2 standard deviations above average — better than approximately 97.7% of test-takers.
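The same calculation can be sketched with `statistics.NormalDist` from the Python standard library, which also gives the percentile without a z-table.

```python
from statistics import NormalDist

mu, sigma, x = 1060, 195, 1450

z = (x - mu) / sigma               # standard deviations above the mean
percentile = NormalDist().cdf(z)   # area to the left of z under N(0, 1)

print(f"z = {z:.1f}, percentile = {percentile:.1%}")
```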
05
68-95-99.7 Rule · Word Problem
Heights of adult males are N(μ = 70 in, σ = 3 in). Approximately what percentage of men are shorter than 64 inches?
✅ Correct! Answer: B (2.5%)
64 in is 70 − 2(3), i.e. 2 SDs below the mean. By the "95" part of the 68-95-99.7 rule, 95% of data falls within 2 SDs, so 5% falls outside. Since the normal distribution is symmetric, half of that 5% = 2.5% falls below z = −2. Don't confuse "more than 2 SDs below the mean" (2.5%) with "more than 1 SD below the mean" (16%)!
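The 2.5% figure is the empirical-rule approximation. As a sanity check, this sketch compares it with the exact normal area, again using `statistics.NormalDist`.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)

exact = heights.cdf(64)           # exact area below 64 in (z = -2)
rule_of_thumb = (1 - 0.95) / 2    # half of the 5% outside 2 SDs

print(f"exact = {exact:.2%}, 68-95-99.7 rule = {rule_of_thumb:.1%}")
```

The exact area is about 2.28%, so the rule's 2.5% is a slight overestimate but close enough for "approximately" questions.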
06
Percentile from z-Score · Tricky
Which student performed BETTER relative to their class?
• Ava: scored 82 on a test where μ = 75, σ = 7
• Ben: scored 78 on a test where μ = 68, σ = 4
✅ Correct! Answer: D
Never compare raw scores across different distributions! Always use z-scores.
• Ava: z = (82 − 75)/7 = 7/7 = 1.0
• Ben: z = (78 − 68)/4 = 10/4 = 2.5
Ben's z = 2.5 is much higher → he performed better relative to his classmates. Raw score comparisons are meaningless across different tests.
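A minimal sketch of the comparison: standardize each score against its own class, then compare the z-scores. The helper name `z_score` is just for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


scores = {
    "Ava": z_score(82, mu=75, sigma=7),
    "Ben": z_score(78, mu=68, sigma=4),
}

best = max(scores, key=scores.get)  # student with the highest z-score
print(scores, "->", best)
```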
Unit 03 · Probability & Counting Rules
🧠
Memory Anchor
P(A or B) = P(A) + P(B) − P(A and B) — Addition Rule |
P(A and B) = P(A) × P(B|A) — Multiplication Rule |
Independent if P(B|A) = P(B) |
Mutually Exclusive → P(A and B) = 0
07
Addition Rule — "Or" Problems
In a class, P(plays guitar) = 0.35, P(plays piano) = 0.40, P(plays both) = 0.15. What is the probability that a randomly selected student plays guitar OR piano?
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
✅ Correct! Answer: A (0.60)
P(guitar OR piano) = 0.35 + 0.40 − 0.15 = 0.60. The key mistake students make is adding 0.35 + 0.40 = 0.75 and forgetting to subtract the overlap (students who play BOTH). If you count "both" groups separately, you double-count them — so subtract once.
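The addition rule is short enough to check directly. This sketch also computes the complement (plays neither instrument), which is a common follow-up question.

```python
p_guitar, p_piano, p_both = 0.35, 0.40, 0.15

# Subtract the overlap once so students who play both are not double-counted.
p_either = p_guitar + p_piano - p_both
p_neither = 1 - p_either

print(f"P(guitar or piano) = {p_either:.2f}")
print(f"P(neither)         = {p_neither:.2f}")
```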
08
Conditional Probability · Two-Way Table
Use the table below. A student is selected at random. Given that the student is female, what is the probability she prefers math?
| | Math | English | Total |
|---|---|---|---|
| Male | 45 | 30 | 75 |
| Female | 36 | 54 | 90 |
| Total | 81 | 84 | 165 |
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
✅ Correct! Answer: C (36/90 = 0.40)
"Given female" means we RESTRICT our sample space to the female row only (90 students). Of those 90, 36 prefer math. So P(Math | Female) = 36/90 = 0.40.
Common wrong answers: A uses the whole 165 (ignores the condition), D uses the math column total instead of female row total.
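The "restrict the sample space" idea can be sketched by storing the four cell counts and dividing by the row total of the given condition. The dictionary layout and the function name `conditional` are illustrative choices.

```python
# Cell counts from the two-way table: (sex, preferred subject) -> count
counts = {
    ("Male", "Math"): 45,
    ("Male", "English"): 30,
    ("Female", "Math"): 36,
    ("Female", "English"): 54,
}


def conditional(subject, given_sex):
    """P(subject | sex): restrict to one row, then take the cell's share."""
    row_total = sum(n for (sex, _), n in counts.items() if sex == given_sex)
    return counts[(given_sex, subject)] / row_total


p_math_given_female = conditional("Math", "Female")   # 36 / 90
grand_total = sum(counts.values())

# The two common wrong answers, for comparison:
wrong_ignores_condition = counts[("Female", "Math")] / grand_total  # 36 / 165
wrong_uses_math_column = counts[("Female", "Math")] / 81            # 36 / 81

print(round(p_math_given_female, 2))
```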
09
Independence vs. Mutual Exclusivity · Conceptual Trap
Events A and B are mutually exclusive and both have probability > 0. Which of the following MUST be true?
✅ Correct! Answer: B
This is one of the most common conceptual errors in statistics! Mutually exclusive ≠ independent.
• Mutually exclusive: P(A and B) = 0 — they can't happen together
• Independent: P(A and B) = P(A)·P(B) — knowing one tells you nothing about the other
If P(A) > 0 and P(B) > 0, then P(A)·P(B) > 0, which cannot equal P(A and B) = 0. So mutually exclusive events with positive probability are always dependent. Knowing A occurred means B definitely did NOT occur.
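For a concrete case, take one roll of a fair die with the hypothetical events A = {1, 2} and B = {5, 6}. They cannot happen together, and the sketch below confirms they fail the independence test. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {1, 2}                         # illustrative event
B = {5, 6}                         # illustrative event, disjoint from A


def prob(event):
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


p_joint = prob(A & B)              # mutually exclusive, so this is 0
p_product = prob(A) * prob(B)      # what independence would require
independent = p_joint == p_product

print(p_joint, p_product, independent)
```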
Unit 04 · Sampling Distributions & CLT
🧠
Memory Anchor
CLT: As n↑, x̄ becomes normal regardless of population shape |
SE = σ/√n — bigger n → smaller spread |
Unbiased = center at parameter |
10% Rule: n ≤ 0.10 × N for independence
10
Central Limit Theorem · SE
A population has μ = 50 and σ = 20. Random samples of size n = 100 are drawn. Describe the sampling distribution of x̄.
✅ Correct! Answer: D
By the CLT (n = 100 ≥ 30), x̄ is approximately normal regardless of population shape.
• Mean of x̄ = μ = 50 ✓
• SE = σ/√n = 20/√100 = 20/10 = 2 ✓
Choice A uses σ = 20 (the population SD, not the SE). Choice B says "unknown shape" — CLT guarantees approximate normality. The word "Approximately" in D is key!
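The center and spread of the sampling distribution can be sketched in a few lines. The second call uses n = 400, a value not in the problem, to show that quadrupling n halves the SE.

```python
from math import sqrt


def sampling_distribution_of_mean(mu, sigma, n):
    """Return (mean, standard error) of x-bar for random samples of size n."""
    return mu, sigma / sqrt(n)


center, se = sampling_distribution_of_mean(mu=50, sigma=20, n=100)
_, se_400 = sampling_distribution_of_mean(mu=50, sigma=20, n=400)

print(center, se, se_400)
```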
11
Bias & Variability · Word Problem
A researcher wants an estimator that is both unbiased and has low variability. Estimator A always overestimates by 3 units. Estimator B has results that vary widely but average exactly at the true value. Estimator C averages at the true value with small spread.
Which estimator is IDEAL?
✅ Correct! Answer: C
Think of a target: Bias = how far the center of your shots is from the bullseye. Variability = how spread out your shots are.
• A: Biased (consistently off) ✗
• B: Unbiased but high variability — you're centered on target but shots are all over the place ✗
• C: Unbiased AND low variability — tight cluster on the bullseye ✓ → Ideal!
12
Sampling Distribution of p̂ · Conditions
Before using a Normal approximation for p̂, we must check that np ≥ 10 and n(1−p) ≥ 10. In a survey, p = 0.08 and n = 90. Should we use the Normal approximation?
✅ Correct! Answer: C
BOTH conditions must be met. Here np = 90 × 0.08 = 7.2 < 10, so the first condition fails. Even though n(1−p) = 90 × 0.92 = 82.8 is fine, one failed condition means we cannot use the Normal approximation. This happens when p is very small (or very close to 1): there are not enough expected "successes" (or "failures") for the distribution to be approximately normal.
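The condition check can be sketched as a small helper. The function name `large_counts_ok` and its return shape are illustrative, not a standard API.

```python
def large_counts_ok(n, p, threshold=10):
    """Check np >= threshold and n(1 - p) >= threshold; return the counts too."""
    successes = n * p
    failures = n * (1 - p)
    ok = successes >= threshold and failures >= threshold
    return ok, successes, failures


ok, successes, failures = large_counts_ok(n=90, p=0.08)
print(ok, round(successes, 1), round(failures, 1))
```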
Unit 05 · Confidence Intervals
🧠
Memory Anchor
CI = statistic ± (critical value)(SE) |
95% CI: z* = 1.96 |
Wider CI: higher confidence OR smaller n |
NEVER say: "95% chance μ is in this interval"
13
CI Interpretation · Classic Trap
A 95% confidence interval for the mean daily coffee intake is (2.1, 3.7) cups. Which interpretation is CORRECT?
✅ Correct! Answer: B
The true μ is a fixed number, not random — it's either in the interval or it isn't. There's no "probability" about it once calculated.
• A ✗ — Wrong: μ is fixed, not a random variable
• C ✗ — A CI is about the mean, not individual data values
• D ✗ — Wrong math: 95% of 20 = 19, not 95; and we'd need infinitely many samples
Correct language: "We used a method that captures the true mean 95% of the time in repeated sampling."
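"95% of the time in repeated sampling" can be illustrated with a small simulation. This is only a sketch under assumed values: a normal population with μ = 3.0 and σ = 1.0, samples of size 30, a known-σ z-interval, and a fixed seed for repeatability. None of these numbers come from the coffee problem.

```python
import random
from math import sqrt
from statistics import mean


def coverage(trials=2000, n=30, mu=3.0, sigma=1.0, z_star=1.96, seed=1):
    """Fraction of simulated 95% z-intervals that capture the true mean."""
    rng = random.Random(seed)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage()
print(f"{rate:.1%} of intervals captured mu")
```

Each individual interval either contains μ or it does not; the 95% describes how often the method succeeds across many samples.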
14
Margin of Error · Sample Size Calculation
A researcher wants a margin of error of ±3 points at 95% confidence (z* = 1.96). The population standard deviation is σ = 15. What minimum sample size is needed?
✅ Correct! Answer: D (97)
n ≥ (1.96 × 15 / 3)² = (29.4 / 3)² = (9.8)² = 96.04
Since n must be a whole number and we need AT LEAST 96.04, we round UP to 97. Never round down in sample size problems! Rounding to 96 would give a margin of error slightly larger than 3.
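The round-up step is easy to get wrong, so here is a sketch that uses `math.ceil` and then checks the margin of error on both sides of the cutoff. The helper name `min_sample_size` is illustrative.

```python
from math import ceil, sqrt


def min_sample_size(sigma, margin, z_star=1.96):
    """Smallest whole n with z* * sigma / sqrt(n) <= margin."""
    return ceil((z_star * sigma / margin) ** 2)


n = min_sample_size(sigma=15, margin=3)

moe_at_96 = 1.96 * 15 / sqrt(96)  # slightly more than 3
moe_at_97 = 1.96 * 15 / sqrt(97)  # slightly less than 3

print(n, round(moe_at_96, 4), round(moe_at_97, 4))
```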
15
t-Interval vs z-Interval · When to Use Which
A researcher collects a random sample of n = 22 weights (in pounds) from a normally distributed population. The population standard deviation σ is unknown. Which procedure is appropriate for a confidence interval for the mean?
✅ Correct! Answer: A
Use the t-distribution when: (1) σ is unknown (so we use s) AND (2) either the population is normal OR n ≥ 30. Here n = 22 < 30, but the population is stated to be normal — so t-interval is valid. df = n − 1 = 21. Using z when σ is unknown is a classic error — the z-interval requires knowing the true σ.
Unit 06 · Hypothesis Testing
🧠
Memory Anchor
H₀ = null (no effect) · Hₐ = alternative (what you want to show) |
p-value: probability of data at least this extreme IF H₀ is true |
p < α → Reject H₀ |
Type I = false alarm · Type II = missed detection
16
p-value Interpretation
A hypothesis test yields a p-value of 0.032 with α = 0.05. Which conclusion is correct?
✅ Correct! Answer: C
Since p = 0.032 < α = 0.05, we reject H₀.
• B ✗ — The p-value is NOT the probability that H₀ is true! It's the probability of seeing data this extreme assuming H₀ is true.
• A ✗ — Wrong action for p < α
• D ✗ — We never "accept" H₀ (we only fail to reject), and statistical vs practical significance are separate concepts
Correct language: "We have convincing statistical evidence against H₀."
17
Type I & Type II Error · Real World
A medical test screens for a rare disease. The null hypothesis is H₀: patient does NOT have the disease.
A Type I Error would mean:
✅ Correct! Answer: B
Type I Error = rejecting H₀ when H₀ is actually true = "false positive."
Here H₀ = no disease, so a Type I error means rejecting H₀ (saying "disease!") when the patient is actually healthy: a false alarm.
Type II Error = failing to reject H₀ when H₀ is false = "false negative" = saying "no disease" when the disease is actually present.
P(Type I error) = α · P(Type II error) = β · Power = 1 − β
18
One-Proportion z-Test · Full Problem
A school claims 70% of its students pass the state exam. A random sample of 200 students shows 130 passed. At α = 0.05, is there evidence that the true pass rate is less than 70%?
✅ Correct! Answer: D
p̂ = 130/200 = 0.65 | p₀ = 0.70
SE = √(0.70 × 0.30 / 200) = √(0.00105) ≈ 0.03240
z = (0.65 − 0.70) / 0.03240 = −0.05 / 0.03240 ≈ −1.54
This is a one-tailed (left) test. Critical value = −1.645. Since −1.54 > −1.645, we fail to reject H₀. There is not convincing evidence that the pass rate is below 70%.
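The full test can be sketched with the standard library. Besides the critical-value comparison above, this version computes the left-tail p-value, which leads to the same decision.

```python
from math import sqrt
from statistics import NormalDist

n, passed, p0, alpha = 200, 130, 0.70, 0.05

p_hat = passed / n
se = sqrt(p0 * (1 - p0) / n)       # SE uses p0 because H0 is assumed true
z = (p_hat - p0) / se
p_value = NormalDist().cdf(z)      # left-tailed test: area below z

reject = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The p-value is a little above 0.06, so it exceeds α = 0.05 and we fail to reject H₀.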
19
Power of a Test · Conceptual
A researcher wants to increase the power of their hypothesis test. Which of the following actions would MOST directly increase power?
✅ Correct! Answer: A
Power = P(correctly rejecting H₀ when it is false) = 1 − β.
Ways to increase power: ↑ n · ↑ α · use one-tailed test · larger true effect size · smaller σ
• B ✗ — Decreasing α makes the test more conservative, which decreases power (increases β)
• C ✗ — Two-tailed splits α, which reduces power compared to one-tailed
• D ✗ — Smaller n = less information = less power
Increasing n is the most common and effective way to boost power.
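To make these effects concrete, here is a rough sketch of the power of a one-sided z-test for a mean with known σ, using power = Φ(δ√n/σ − z_α). The effect size δ = 2 and σ = 10 are assumed values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_one_sided_z(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test when the true mean is `effect` away from H0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)        # critical value
    return STD_NORMAL.cdf(effect * sqrt(n) / sigma - z_alpha)


# Assumed scenario: true effect = 2 units, sigma = 10.
power_n25 = power_one_sided_z(effect=2, sigma=10, n=25)
power_n100 = power_one_sided_z(effect=2, sigma=10, n=100)
power_n100_alpha10 = power_one_sided_z(effect=2, sigma=10, n=100, alpha=0.10)

print(f"n=25: {power_n25:.2f}, n=100: {power_n100:.2f}, "
      f"n=100 with alpha=0.10: {power_n100_alpha10:.2f}")
```

Under these assumptions, power rises when n grows from 25 to 100 and rises again when α is loosened to 0.10, matching the list above.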
20
Chi-Square Test · Goodness of Fit
A die is rolled 120 times. We expect each face 20 times. Observed counts: 1→25, 2→18, 3→22, 4→16, 5→21, 6→18. The chi-square test statistic is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
✅ Correct! Answer: C (χ² = 2.70)
Each term (O−E)²/E with E = 20:
• Face 1: (25−20)²/20 = 25/20 = 1.25
• Face 2: (18−20)²/20 = 4/20 = 0.20
• Face 3: (22−20)²/20 = 4/20 = 0.20
• Face 4: (16−20)²/20 = 16/20 = 0.80
• Face 5: (21−20)²/20 = 1/20 = 0.05
• Face 6: (18−20)²/20 = 4/20 = 0.20
χ² = 1.25 + 0.20 + 0.20 + 0.80 + 0.05 + 0.20 = 2.70
df = 6 − 1 = 5. Critical value at α = 0.05 is 11.07. Since 2.70 < 11.07, we fail to reject H₀; the data are consistent with a fair die.
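Summing six small terms by hand is error-prone, so it is worth recomputing the statistic from the observed counts. In this sketch the critical value 11.07 is taken from the text rather than computed, since the standard library has no chi-square distribution.

```python
observed = {1: 25, 2: 18, 3: 22, 4: 16, 5: 21, 6: 18}

rolls = sum(observed.values())
expected = rolls / len(observed)   # fair die: same expected count per face

# One (O - E)^2 / E term per face, then the total.
terms = {face: (o - expected) ** 2 / expected for face, o in observed.items()}
chi_sq = sum(terms.values())

df = len(observed) - 1
critical_value = 11.07             # df = 5, alpha = 0.05 (value from the text)
reject = chi_sq > critical_value

for face, term in terms.items():
    print(f"face {face}: {term:.2f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject}")
```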