Unit 1 · Exploring Data
Q 01
Mean vs. Median — Skewed Distributions
SKEW → TAIL → MEAN CHASES THE TAIL
A real estate agent collects data on home prices in a neighborhood. The distribution of prices is strongly right-skewed due to a few mansion sales at $4M–$6M. The agent reports the mean selling price to attract buyers.
Mean ≈ $1,255,000 | Median ≈ $337,500
Which statement best explains why the median is a more appropriate measure of center for this data?
Context
Dataset (in $1000s): 280, 295, 310, 320, 335, 340, 360, 410, 4100, 5800
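A minimal Python sketch that recomputes both summaries from the dataset above. The `p < 1000` cutoff used to set the two mansion sales aside is an illustrative assumption, not part of the question.

```python
from statistics import mean, median

# Home prices in $1000s, copied from the Context line above.
prices = [280, 295, 310, 320, 335, 340, 360, 410, 4100, 5800]

mean_price = mean(prices)      # pulled upward by the two mansion sales
median_price = median(prices)  # average of the 5th and 6th sorted values

print(f"mean   = ${mean_price * 1000:,.0f}")
print(f"median = ${median_price * 1000:,.0f}")

# ASSUMPTION: treat anything under $1M as a "typical" home for comparison.
typical = [p for p in prices if p < 1000]
mean_typical = mean(typical)
print(f"mean without the two mansions = ${mean_typical * 1000:,.0f}")
```

Removing only two of the ten values moves the mean from about $1.26M to about $331K, while the median stays close to where most homes actually sold.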
Q 02
IQR and Outlier Detection
OUTLIER FENCES = Q₁ − 1.5·IQR and Q₃ + 1.5·IQR
A dataset of exam scores has the following five-number summary:
Min = 42 | Q₁ = 61 | Median = 74 | Q₃ = 82 | Max = 99
A student who scored 38 claims their score is an outlier. A second student who scored 99 also claims their score is an outlier. Using the 1.5 × IQR rule, which claim(s) are correct?
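The 1.5 × IQR rule can be sketched in a few lines of Python using the quartiles from the five-number summary. The helper name `is_outlier` is hypothetical, chosen only for this illustration.

```python
# Quartiles taken from the five-number summary above.
q1, q3 = 61, 82

iqr = q3 - q1                  # interquartile range
lower_fence = q1 - 1.5 * iqr   # anything below this is flagged
upper_fence = q3 + 1.5 * iqr   # anything above this is flagged

def is_outlier(score):
    """Return True if the score falls outside either fence."""
    return score < lower_fence or score > upper_fence

for score in (38, 99):
    print(score, "outlier" if is_outlier(score) else "not an outlier")

print(f"fences: {lower_fence} to {upper_fence}")
```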
Unit 2 · Normal Distribution
Q 03
Z-Score Interpretation
z = (x − μ) / σ → DISTANCE IN STANDARD DEVIATIONS
The heights of adult males in a city are approximately normally distributed with a mean of μ = 70 inches and standard deviation σ = 3 inches. Marcus is 76 inches tall, and Devon is 64 inches tall.
Marcus says: "My z-score means I'm in the top 2.5% of heights."
Devon says: "My z-score is −2, so I'm 2 inches below average."
Which of the following is correct?
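As a quick check on both claims, the sketch below computes each z-score and converts it back into inches. The function name `z_score` is an assumption made for this example.

```python
mu, sigma = 70, 3  # mean and standard deviation of heights, in inches

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

for name, height in [("Marcus", 76), ("Devon", 64)]:
    z = z_score(height, mu, sigma)
    inches_from_mean = abs(z) * sigma  # a z-score counts SDs, not inches
    print(f"{name}: z = {z:+.1f}, {inches_from_mean:.0f} inches from the mean")
```

A z-score of −2 is a distance of two standard deviations, so the gap in inches depends on σ.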
Q 04
Empirical Rule (68–95–99.7)
68% (±1σ) → 95% (±2σ) → 99.7% (±3σ) — MEMORIZE THIS
SAT Math scores are approximately normal with μ = 500 and σ = 100. A college only admits students scoring above 700.
Approximately what percentage of all SAT Math test-takers would qualify for admission?
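A short sketch comparing the empirical-rule estimate with the normal-model tail area from Python's standard library. It assumes the scores follow the stated normal model exactly.

```python
from statistics import NormalDist

mu, sigma, cutoff = 500, 100, 700

z = (cutoff - mu) / sigma  # how many SDs the cutoff sits above the mean

# Empirical rule: about 95% lies within 2 SDs, so about 5% lies outside,
# split evenly between the two tails.
empirical_tail = (1 - 0.95) / 2

# Tail area under the normal model, for comparison.
exact_tail = 1 - NormalDist(mu, sigma).cdf(cutoff)

print(f"z = {z:.1f}")
print(f"empirical rule: {empirical_tail:.1%}")
print(f"normal model:   {exact_tail:.2%}")
```

The two figures differ slightly because 95% is a rounded value for the area within ±2σ.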
Unit 3 · Bivariate Data & Regression
Q 05
Interpreting Slope in Context
SLOPE = "For each 1-unit ↑ in x, ŷ PREDICTED to ↑/↓ by [slope]"
A researcher studies the relationship between hours of sleep (x) and reaction time in milliseconds (y) among college students. The least-squares regression line is:
ŷ = 420 − 28x
Which of the following is the correct interpretation of the slope in context?
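To see the slope in action, the sketch below evaluates the fitted line at two sleep amounts one hour apart. The values 6 and 7 hours are illustrative choices, not data from the study.

```python
def predicted_reaction_time(hours_of_sleep):
    """Predicted reaction time in ms from the line y-hat = 420 - 28x."""
    return 420 - 28 * hours_of_sleep

# ASSUMPTION: 6 and 7 hours are arbitrary example inputs.
at_six = predicted_reaction_time(6)
at_seven = predicted_reaction_time(7)

print(f"predicted at 6 h: {at_six} ms")
print(f"predicted at 7 h: {at_seven} ms")
print(f"change for one extra hour: {at_seven - at_six} ms")
```

The change is the same for any pair of x-values one hour apart, and it describes predicted reaction time, not a guaranteed change for any individual student.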
Q 06
Coefficient of Determination r²
r² = % of variation in y EXPLAINED by the linear relationship with x
A scatterplot of advertising spending (x, in $1000s) vs. monthly sales (y, in $1000s) for a company yields a correlation of r = 0.87.
Which of the following is the most accurate interpretation of r²?
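The arithmetic behind r² is a single squaring step, sketched here with the correlation from the question.

```python
r = 0.87  # correlation between advertising spending and monthly sales

r_squared = r ** 2  # coefficient of determination

print(f"r^2 = {r_squared:.4f}")
print(f"about {r_squared:.0%} of the variation in sales is accounted for "
      "by the linear relationship with advertising spending")
```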
Q 07
Residuals and Residual Plots
RESIDUAL = ACTUAL − PREDICTED (y − ŷ)
After fitting a linear model to data on engine size (liters) vs. fuel efficiency (mpg), a statistician examines the residual plot. She notices a clear curved (U-shaped) pattern in the residuals rather than a random scatter around zero.
What does this pattern indicate?
Unit 4 · Designing Studies
Q 08
Observational Study vs. Experiment — Causation
ASSOCIATION ≠ CAUSATION (only controlled experiments → causation)
A national health survey finds that people who drink 2+ cups of coffee per day have significantly lower rates of type 2 diabetes. A news headline reads: "Coffee Prevents Diabetes!"
A statistician objects to this headline. Which of the following best justifies the statistician's objection?
Q 09
Sampling Methods — Bias Recognition
VOLUNTARY RESPONSE BIAS = people choose whether to respond, and those with strong opinions respond most → NOT representative
A school newspaper wants to know students' opinions on extending lunch period. They post an online poll on the school's social media page and receive 312 responses.
Which type of sampling bias is most likely present, and why?
Unit 5 · Probability
Q 10
Conditional Probability — The Classic Trap
P(A|B) = P(A∩B) / P(B) — GIVEN means DIVIDE by the given row/column
A college surveys 400 students about their major and whether they use a planner:
Major | Uses Planner | Doesn't Use | Total
STEM | 120 | 80 | 200
Non-STEM | 60 | 140 | 200
Total | 180 | 220 | 400
What is the probability that a randomly selected student is a STEM major, given that they use a planner?
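A minimal sketch of the conditional probability formula applied to the survey counts. It also computes the reversed condition, since swapping the two is the usual mistake. The dictionary layout and variable names are assumptions made for this example.

```python
from fractions import Fraction

# Cell counts from the two-way table (major, planner use).
counts = {
    ("STEM", "planner"): 120,
    ("STEM", "no planner"): 80,
    ("Non-STEM", "planner"): 60,
    ("Non-STEM", "no planner"): 140,
}
total = sum(counts.values())

p_stem_and_planner = Fraction(counts[("STEM", "planner")], total)
p_planner = Fraction(
    sum(n for (_, use), n in counts.items() if use == "planner"), total
)
p_stem = Fraction(
    sum(n for (major, _), n in counts.items() if major == "STEM"), total
)

# P(A|B) = P(A and B) / P(B): divide by the probability of the GIVEN event.
p_stem_given_planner = p_stem_and_planner / p_planner
p_planner_given_stem = p_stem_and_planner / p_stem

print("P(STEM | planner) =", p_stem_given_planner)
print("P(planner | STEM) =", p_planner_given_stem)
```

The two results differ because the denominators differ: 180 planner users in the first case, 200 STEM majors in the second.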
Q 11
Independent vs. Mutually Exclusive Events
INDEPENDENT: P(A∩B) = P(A)·P(B) | MUTUALLY EXCLUSIVE: P(A∩B) = 0
Two events A and B have P(A) = 0.4, P(B) = 0.3, and P(A ∩ B) = 0.
A student claims: "Since P(A ∩ B) = 0, events A and B must be independent."
Is the student correct?
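Both definitions can be tested directly against the given probabilities, as in this short sketch.

```python
import math

p_a, p_b, p_a_and_b = 0.4, 0.3, 0.0

# Mutually exclusive: the events cannot happen together.
mutually_exclusive = p_a_and_b == 0

# Independent: the joint probability equals the product of the marginals.
product = p_a * p_b
independent = math.isclose(p_a_and_b, product)

print(f"P(A)P(B) = {product:.2f}, P(A and B) = {p_a_and_b:.2f}")
print("mutually exclusive:", mutually_exclusive)
print("independent:", independent)
```

When both events have nonzero probability, knowing that one occurred rules out the other, so the events carry information about each other.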
Unit 6 · Random Variables
Q 12
Combining Random Variables — Standard Deviation Rule
INDEPENDENT X, Y: σ²(X ± Y) = σ²(X) + σ²(Y) → VARIANCES ADD, not standard deviations!
Let X = daily revenue from morning coffee sales, with μ_X = $800, σ_X = $60.
Let Y = daily revenue from afternoon pastry sales, with μ_Y = $400, σ_Y = $40.
Assume X and Y are independent.
What is the standard deviation of total daily revenue, X + Y?
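A small sketch of the variance-addition rule for this setup, alongside the tempting but incorrect shortcut of adding the standard deviations. It relies on the stated independence of X and Y.

```python
import math

mean_x, sd_x = 800, 60  # morning coffee revenue
mean_y, sd_y = 400, 40  # afternoon pastry revenue

mean_total = mean_x + mean_y          # means always add
var_total = sd_x ** 2 + sd_y ** 2     # variances add (independence assumed)
sd_total = math.sqrt(var_total)       # convert back to a standard deviation

naive_sd = sd_x + sd_y                # the common mistake, for comparison

print(f"mean of X + Y = ${mean_total}")
print(f"SD of X + Y   = ${sd_total:.2f}")
print(f"adding SDs directly would give ${naive_sd}, which overstates it")
```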
Q 13
Binomial Distribution — Conditions Check
BINS: Binary · Independent · Number fixed · Same probability each trial
A call center receives calls, and 30% of callers are put on hold. An agent handles 15 randomly selected calls in one hour.
What is the probability that exactly 4 of the 15 callers are put on hold?
Which expression correctly calculates this probability?
Setup Check — BINS
B: on hold (yes/no) ✓ | I: calls independent ✓ | N: n = 15 (fixed) ✓ | S: p = 0.30 each call ✓
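Once the BINS conditions hold, the binomial formula can be evaluated as in the sketch below. The function name `binomial_pmf` is hypothetical, written here with only the standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p, k = 15, 0.30, 4

ways = comb(n, k)                # number of ways to choose which 4 calls
prob = binomial_pmf(k, n, p)

print(f"C({n}, {k}) = {ways}")
print(f"P(X = {k}) = {prob:.4f}")
```

The exponent on (1 − p) is n − k = 11, the number of callers who are not put on hold.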
Unit 7 · Sampling Distributions
Q 14
Central Limit Theorem — When Does It Apply?
CLT (rule of thumb): n ≥ 30 → sampling distribution of x̄ is approx. NORMAL regardless of population shape
The distribution of a company's daily number of customer complaints is strongly right-skewed with mean μ = 12 complaints and standard deviation σ = 8.
A manager takes random samples of n = 36 days and records the sample mean x̄.
Which of the following best describes the sampling distribution of x̄?
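A seeded simulation sketch of this setup. The question does not name the population's distribution, so a gamma distribution with mean 12 and standard deviation 8 is used here purely as an assumed stand-in for "strongly right-skewed"; the 5000 repetitions are also an arbitrary choice.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 12, 8, 36

# ASSUMPTION: gamma population with shape and scale chosen so that
# its mean is 12 and its standard deviation is 8.
shape = (MU / SIGMA) ** 2
scale = SIGMA ** 2 / MU

def one_sample_mean(n):
    """Draw n days from the assumed population and return the sample mean."""
    return sum(random.gammavariate(shape, scale) for _ in range(n)) / n

sample_means = [one_sample_mean(N) for _ in range(5000)]

theoretical_se = SIGMA / math.sqrt(N)  # sigma / sqrt(n)

print(f"mean of sample means: {mean(sample_means):.2f} (theory: {MU})")
print(f"SD of sample means:   {stdev(sample_means):.2f} "
      f"(theory: {theoretical_se:.2f})")
```

A histogram of `sample_means` would look far more symmetric than the population itself, centered near 12 with a spread near σ/√n.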
Q 15
Standard Error vs. Standard Deviation
SE of x̄ = σ/√n — BIGGER n → SMALLER SE → MORE PRECISE estimate
A polling organization wants to estimate the average amount Americans spend per week on food. In Study A, they survey n = 100 people. In Study B, they survey n = 400 people. In both studies, the population standard deviation is σ = $60.
How does the standard error of x̄ change from Study A to Study B?
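The comparison comes down to evaluating σ/√n twice, as in this short sketch. The helper name `standard_error` is an assumption for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 60
se_a = standard_error(sigma, 100)  # Study A
se_b = standard_error(sigma, 400)  # Study B

print(f"SE for Study A: ${se_a:.2f}")
print(f"SE for Study B: ${se_b:.2f}")
print(f"ratio B / A:    {se_b / se_a:.2f}")
```

Because n sits under a square root, quadrupling the sample size does not cut the standard error to a quarter.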
Unit 8 · Confidence Intervals
Q 16
Interpreting a Confidence Interval — The #1 Most Missed
"95% confident" = 95% of ALL intervals built this way will CAPTURE the true parameter
A researcher constructs a 95% confidence interval for the mean daily screen time of teenagers: (5.8 hours, 7.2 hours).
A classmate says: "There is a 95% probability that the true mean daily screen time falls between 5.8 and 7.2 hours."
Which is the correct interpretation?
Q 17
Margin of Error — Factors That Affect Width
↑ Confidence Level → ↑ Width | ↑ n → ↓ Width | ↑ σ → ↑ Width
A researcher wants to estimate the average commute time in a city. She currently has a 95% CI with a margin of error of ±8 minutes based on n = 50 commuters.
She wants to reduce the margin of error to ±4 minutes (half the original) while keeping the confidence level at 95%. Approximately how large must her new sample be?
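Since the margin of error is proportional to 1/√n, the required sample size can be sketched as below. This assumes the critical value and the standard deviation estimate stay the same between the two samples, which is an approximation when a t critical value is used. The function name `required_n` is hypothetical.

```python
import math

def required_n(n_old, me_old, me_new):
    """Sample size needed to move the margin of error from me_old to me_new.

    ASSUMPTION: same confidence level, same critical value, same SD,
    so that margin of error scales as 1 / sqrt(n).
    """
    return math.ceil(n_old * (me_old / me_new) ** 2)

new_n = required_n(n_old=50, me_old=8, me_new=4)
print(f"new sample size: {new_n}")
```

Halving the margin of error requires multiplying n by 2² = 4, not by 2.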
Unit 9 · Significance Testing
Q 18
P-value Interpretation — The Deepest Trap
P-value = P(data this extreme OR MORE | H₀ is TRUE) — NOT probability H₀ is true!
A pharmaceutical company tests whether a new drug reduces blood pressure more than the current standard. They set H₀: μ_new = μ_standard and obtain a p-value of 0.03.
A doctor says: "The p-value of 0.03 means there's only a 3% chance that the null hypothesis is true."
Is the doctor's interpretation correct?
Q 19
Type I vs. Type II Errors
TYPE I (α) = reject TRUE H₀ = "false alarm" | TYPE II (β) = fail to reject FALSE H₀ = "miss"
A quality control manager tests whether a machine produces bolts with the correct diameter (H₀: machine is working correctly). If the machine is actually broken (producing defective bolts), but the test fails to reject H₀, what has occurred?
Real-World Stakes
This means defective bolts continue to ship to customers — the broken machine goes undetected.
Q 20
Chi-Square Test — Expected Counts & Independence
Expected count = (Row Total × Column Total) / Grand Total
A researcher surveys 500 people on their preferred social media platform and their age group:
Age Group | Instagram | TikTok | Facebook | Total
18–35 | 140 | 110 | 50 | 300
36–60 | 40 | 20 | 140 | 200
Total | 180 | 130 | 190 | 500
What is the expected count for the cell (36–60, TikTok) under the null hypothesis of independence between age and platform preference?
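A minimal sketch that applies the expected-count formula to every cell of the table. The nested-dictionary layout is an assumption made for this example, and the age-group keys use plain hyphens.

```python
# Observed counts from the survey table.
observed = {
    "18-35": {"Instagram": 140, "TikTok": 110, "Facebook": 50},
    "36-60": {"Instagram": 40, "TikTok": 20, "Facebook": 140},
}

row_totals = {age: sum(cells.values()) for age, cells in observed.items()}
platforms = ["Instagram", "TikTok", "Facebook"]
col_totals = {p: sum(observed[age][p] for age in observed) for p in platforms}
grand_total = sum(row_totals.values())

# Expected count = (row total x column total) / grand total
expected = {
    age: {p: row_totals[age] * col_totals[p] / grand_total for p in platforms}
    for age in observed
}

for age in observed:
    print(age, expected[age])
```

For the requested cell, the calculation is 200 × 130 / 500. The observed count of 20 sits well below that expected value.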