Unit 1–2
Exploring Data & Distributions
Center, spread, shape, outliers, Normal distribution, z-scores, percentiles
Quick Memory Keywords
When shape is SKEWED, use Median & IQR. When SYMMETRIC, use Mean & SD.
SOCS
IQR Rule
68-95-99.7
z = (x−μ)/σ
Skew→Median
01
Outliers · IQR Rule
A nutritionist records the daily sugar intake (in grams) for 9 patients:
18, 22, 25, 27, 29, 31, 34, 38, 87
Using the 1.5 × IQR rule, which of the following correctly identifies whether 87 is an outlier and provides the correct upper fence?
📖 Explanation
Step 1: Find the median, Q1, and Q3. Ordered data: 18, 22, 25, 27, 29, 31, 34, 38, 87.
With n = 9, the median is the 5th value, 29. Using the standard AP method, exclude the median from both halves when n is odd:
Lower half: {18, 22, 25, 27} → Q1 = (22 + 25)/2 = 23.5
Upper half: {31, 34, 38, 87} → Q3 = (34 + 38)/2 = 36
Step 2: IQR = Q3 − Q1
IQR = 36 − 23.5 = 12.5
Step 3: Upper Fence = Q3 + 1.5 × IQR
Upper Fence = 36 + 1.5(12.5) = 36 + 18.75 = 54.75
Since 87 > 54.75, 87 IS an outlier. Among the choices, B is the only option correctly identifying 87 as an outlier with a reasonable fence calculation.
Note: different textbooks split the data differently. Including the median in both halves gives Q1 = 25, Q3 = 34, IQR = 9, and an upper fence of 34 + 13.5 = 47.5 instead. Always identify Q1/Q3 carefully: this is where many AP students lose points!
🔑 Key trap: Don't include the median when splitting data to find quartiles.
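The quartile and fence steps above can be sketched in Python. This is a minimal sketch, assuming the median-excluded split used in this explanation; the helper name iqr_fences is made up for illustration, and it expects at least two data values.

```python
from statistics import median

def iqr_fences(values):
    """Return (q1, q3, lower_fence, upper_fence), excluding the median when n is odd."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr

sugar = [18, 22, 25, 27, 29, 31, 34, 38, 87]
q1, q3, low_fence, high_fence = iqr_fences(sugar)
outliers = [x for x in sugar if x < low_fence or x > high_fence]
print(q1, q3, high_fence, outliers)  # 23.5 36.0 54.75 [87]
```

Software packages often use other quartile conventions by default, so a calculator or library may report slightly different fences for the same data.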
02
Normal Distribution · z-score · Percentile
SAT Math scores are approximately Normally distributed with mean μ = 528 and standard deviation σ = 116. A student scores 760.
Approximately what percentile does this student rank in, and what is the probability that a randomly selected student scores higher?
📖 Explanation
Step 1: Calculate the z-score.
z = (x − μ) / σ = (760 − 528) / 116 = 232 / 116 = 2.0
Step 2: Apply the 68-95-99.7 Rule.
z = 2.0 means the score is exactly 2 standard deviations above the mean.
95% of data falls within ±2σ → 5% is in the two tails → about 2.5% in each tail, so the score sits near the 97.5th percentile.
Step 3: Refine with a Normal table or calculator.
P(X < 760) = 0.9772 → 97.72nd percentile
P(X > 760) = 1 − 0.9772 = 0.0228 ≈ 0.023
🔑 Trap: Students often confuse "95% within ±2σ" with "95th percentile." They are different! 95th percentile → z ≈ 1.645, not 2.0.
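The table lookup can be checked with Python's standard library. A minimal sketch, assuming the Normal model stated in the question; the variable names are illustrative only.

```python
from statistics import NormalDist

sat = NormalDist(mu=528, sigma=116)       # model given in the question
z = (760 - sat.mean) / sat.stdev          # standardized score
below = sat.cdf(760)                      # P(X < 760), the percentile as a proportion
above = 1 - below                         # P(X > 760)
z_95 = NormalDist().inv_cdf(0.95)         # z-score that marks the 95th percentile

print(round(z, 2), round(below, 4), round(above, 4), round(z_95, 3))
```

The last value shows why z = 2.0 is not the 95th percentile: that percentile falls at about z = 1.645.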
03
Comparing Distributions · Median vs Mean
Two neighborhoods report annual household incomes. Neighborhood A has a mean of $72,000 and median of $58,000. Neighborhood B has a mean of $65,000 and median of $63,000.
A city planner wants to build a community center serving the "typical" household. Which neighborhood likely has greater income inequality, and which measure best represents the "typical" household income for Neighborhood A?
📖 Explanation
Key insight: When mean >> median, the distribution is right-skewed: a few very high incomes pull the mean up.
Neighborhood A: Mean − Median = $72,000 − $58,000 = $14,000 gap
Neighborhood B: Mean − Median = $65,000 − $63,000 = $2,000 gap
The large gap in A indicates right skew and greater income inequality (a few very wealthy households distort the mean).
For the "typical" household in a skewed distribution → use median, because it is resistant to outliers. For Neighborhood A, that is $58,000.
🔑 Mnemonic: Mean is PULLED toward the TAIL. Skew right → mean > median.
04
Transforming Data · Linear Transformations
A professor gives a 50-point quiz. The class results have mean x̄ = 34 and standard deviation s = 6. To convert to a 100-point scale, she doubles every score (multiplies by 2).
What are the new mean and standard deviation after the transformation?
📖 Explanation
Linear Transformation Rules:
If Y = aX + b:
New Mean = a·(old mean) + b
New SD = |a|·(old SD) ← b does NOT affect spread!
New Variance = a²·(old variance)
Here, a = 2, b = 0:
New mean = 2 × 34 = 68
New SD = 2 × 6 = 12
🔑 Trap: Variance scales by a², not a. Variance goes from 36 to 144 (= 12²). Adding a constant shifts the distribution but NEVER changes spread.
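The transformation rules can be written as a small helper. A minimal sketch; the function name transform_summary and the +10 curve in the second call are hypothetical, added only to show that a shift leaves spread alone.

```python
def transform_summary(mean, sd, a, b):
    """Summary statistics of Y = a*X + b, given the mean and SD of X."""
    return {
        "mean": a * mean + b,          # center is scaled and shifted
        "sd": abs(a) * sd,             # spread is scaled only
        "variance": (a * sd) ** 2,     # variance scales by a squared
    }

doubled = transform_summary(mean=34, sd=6, a=2, b=0)    # the quiz scenario
shifted = transform_summary(mean=34, sd=6, a=1, b=10)   # hypothetical +10 curve
print(doubled)
print(shifted)
```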
Unit 3
Bivariate Data & Regression
Correlation, LSRL, residuals, coefficient of determination, influential points
Quick Memory Keywords
Correlation ≠ Causation. r² tells you % of variation explained. Residual = Actual − Predicted.
LSRL
r²=variation%
Residual=A−P
Influential≠Outlier
ŷ=a+bx
05
LSRL · Residuals · Interpretation
A researcher studies the relationship between hours of TV watched per day (x) and GPA (y) for 50 students. The least-squares regression line is:
ŷ = 3.84 − 0.23x
A student watches 3 hours of TV daily and has a GPA of 3.10. What is this student's residual, and what does it mean?
📖 Explanation
Predicted GPA (ŷ) = 3.84 − 0.23(3) = 3.84 − 0.69 = 3.15
Residual = Actual − Predicted = 3.10 − 3.15 = −0.05
A negative residual means the actual value is below the predicted line: the student has a lower GPA than the model predicts for someone watching 3 hours of TV.
🔑 Memory: RESIDUAL = A − P (Actual MINUS Predicted). Positive residual → above line. Negative → below line.
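The same residual calculation, as a minimal Python sketch. The function names are illustrative; the coefficients come from the regression line in the question.

```python
def predict_gpa(tv_hours):
    """Predicted GPA from the LSRL y-hat = 3.84 - 0.23x."""
    return 3.84 - 0.23 * tv_hours

def residual(actual, predicted):
    """Residual = actual minus predicted."""
    return actual - predicted

pred = predict_gpa(3)
res = residual(3.10, pred)
print(round(pred, 2), round(res, 2))  # 3.15 -0.05
```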
06
Correlation · r² · Causation Trap
Ice cream sales and drowning deaths in a city have a correlation of r = 0.91 over a 10-year period. A reporter writes: "Eating ice cream causes drownings."
Which statement best explains why this conclusion is WRONG and what r² = 0.828 tells us?
📖 Explanation
Lurking variable: Hot summer weather increases both ice cream consumption AND outdoor swimming → more drownings. Weather is a lurking (confounding) variable; ice cream is not causing drownings.
r² = (0.91)² = 0.828 = 82.8%
r² interpretation: "82.8% of the variation in drowning deaths is explained by the linear relationship with ice cream sales." This does NOT mean causation!
🔑 r² = "explained variation." Even r² = 0.99 does NOT prove causation. Always look for lurking variables.
07
Extrapolation · Influential Points
A study of children ages 6–12 finds the LSRL for height (cm) predicted by age (years) is:
ŷ = 90.4 + 6.2x
A reporter uses this equation to predict the height of a 25-year-old. What is wrong with this, and what predicted height would the equation give?
📖 Explanation
ŷ = 90.4 + 6.2(25) = 90.4 + 155 = 245.4 cm ≈ 8 feet tall!
This is extrapolation: using a regression model beyond the range of the original data (ages 6–12). The linear relationship does not hold at age 25.
Extrapolation = dangerous prediction outside data range.
Interpolation = prediction within data range = generally OK.
🔑 EXTRA-polation = EXTRA-dangerous. Always check: is your x-value inside the original data range?
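One way to build the range check into a prediction is sketched below. The function name predict_height and the flag it returns are assumptions for illustration, and age 9 is a made-up input chosen to sit inside the studied range.

```python
def predict_height(age, fitted_range=(6, 12)):
    """Return (predicted_cm, is_extrapolation) for the LSRL y-hat = 90.4 + 6.2x."""
    lo, hi = fitted_range
    predicted = 90.4 + 6.2 * age
    return predicted, not (lo <= age <= hi)

height_25, flag_25 = predict_height(25)   # the reporter's prediction
height_9, flag_9 = predict_height(9)      # hypothetical age inside the data range
print(round(height_25, 1), flag_25)       # 245.4 True
print(round(height_9, 1), flag_9)         # 146.2 False
```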
08
Residual Plot · Checking Linearity
After fitting a linear regression model to data about fertilizer amount (x) and crop yield (y), a student creates a residual plot. The residual plot shows a clear U-shaped (curved) pattern.
What does this residual plot indicate, and what should the student do next?
📖 Explanation
A good residual plot should look like random scatter — no pattern, centered around 0.
A curved (U-shaped) residual plot is a red flag that the relationship is NOT linear. The linear model is systematically over- or under-predicting at different x-values.
Good residual plot: random scatter, no pattern → linear model OK
Curved pattern: nonlinear relationship exists → try quadratic/log transform
Fan shape: non-constant variance → model conditions violated
🔑 If residuals have a PATTERN → the model has a PROBLEM. Residuals should look like random noise.
Unit 4–5
Probability & Random Variables
Rules of probability, conditional probability, independence, binomial, geometric distributions
Quick Memory Keywords
Independent: P(A∩B) = P(A)·P(B). Mutually exclusive: P(A∩B) = 0. These are NOT the same!
BINS
Independent≠Mut.Excl
P(A|B)=P(A∩B)/P(B)
μ=np
σ=√npq
09
Conditional Probability · Bayes-style
At a high school, 60% of students play a sport. Of those who play a sport, 70% maintain a GPA above 3.0. Of those who do NOT play a sport, 40% maintain a GPA above 3.0.
A randomly selected student has a GPA above 3.0. What is the probability that this student plays a sport?
📖 Explanation
Use a two-way table or tree diagram. Let's assume 1000 students:
Sport players: 600 | Non-sport: 400
GPA > 3.0 AND sport: 600 × 0.70 = 420
GPA > 3.0 AND no sport: 400 × 0.40 = 160
Total with GPA > 3.0: 420 + 160 = 580
P(Sport | GPA > 3.0) = 420 / 580 ≈ 0.724
This is a Bayes' Theorem-style problem. The key is always to find total P(GPA > 3.0) across both groups first.
🔑 Make a table! 1000-student table eliminates fraction confusion. Find the "given" column/row first.
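The table calculation can also be done directly with probabilities. A minimal sketch; the variable names are illustrative and the inputs are the three percentages from the question.

```python
p_sport = 0.60                  # P(plays a sport)
p_high_given_sport = 0.70       # P(GPA > 3.0 | sport)
p_high_given_no_sport = 0.40    # P(GPA > 3.0 | no sport)

# Joint probabilities for the two branches of the tree
p_high_and_sport = p_sport * p_high_given_sport
p_high_and_no_sport = (1 - p_sport) * p_high_given_no_sport

# Total probability of the "given" event, then the conditional probability
p_high = p_high_and_sport + p_high_and_no_sport
p_sport_given_high = p_high_and_sport / p_high
print(round(p_high, 2), round(p_sport_given_high, 3))  # 0.58 0.724
```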
10
Binomial Distribution · μ and σ
A multiple-choice test has 20 questions, each with 5 choices. A student randomly guesses on every question. Let X = number of correct answers.
What are the mean and standard deviation of X, and approximately what is the probability of getting exactly 6 correct?
(Use: P(X=6) ≈ C(20,6)(0.2)⁶(0.8)¹⁴)
📖 Explanation
Check BINS conditions: Binary (correct/wrong), Independent (guessing), fixed N = 20, same p = 1/5 = 0.2. ✓ Binomial!
μ = np = 20 × 0.2 = 4
σ² = npq = 20 × 0.2 × 0.8 = 3.2
σ = √3.2 ≈ 1.789
P(X = 6) = C(20,6) × (0.2)⁶ × (0.8)¹⁴
= 38,760 × 0.000064 × 0.04398
≈ 0.109
🔑 BINS = Binary, Independent, fixed N, same probability. σ = √(npq) NOT √(np). Don't confuse variance npq with σ.
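These binomial values can be verified with the standard library. A minimal sketch; binom_pmf is an illustrative helper name, not a library function.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.2
mean = n * p                    # mu = np
sd = sqrt(n * p * (1 - p))      # sigma = sqrt(npq)
p_six = binom_pmf(6, n, p)      # P(X = 6)
print(mean, round(sd, 3), round(p_six, 3))  # 4.0 1.789 0.109
```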
11
Independence vs. Mutual Exclusivity — Classic Trap!
Events A and B are mutually exclusive with P(A) = 0.3 and P(B) = 0.5. A student claims: "Since A and B are mutually exclusive, they must also be independent."
Is the student correct? Calculate P(A|B) to support your answer.
📖 Explanation
This is one of the most common AP Stats errors!
Mutually exclusive: P(A ∩ B) = 0, so they cannot happen together.
Independent: P(A|B) = P(A), so knowing B happens doesn't change P(A).
P(A|B) = P(A ∩ B) / P(B) = 0 / 0.5 = 0
Since P(A|B) = 0 ≠ 0.3 = P(A), events are DEPENDENT. The student is incorrect.
Intuition: If A and B can't happen together (mutually exclusive), then knowing B happened tells you A definitely did NOT happen. That's maximum dependence, not independence!
🔑 Mutually exclusive with P(A) > 0 and P(B) > 0 → ALWAYS DEPENDENT. These are opposite concepts!
Unit 6–7
Inference for Means
Confidence intervals, t-tests, Type I/II errors, p-values, sampling distributions
Quick Memory Keywords
CI = statistic ± (critical value)(SE). Large n → t approaches z. Reject H₀ when p < α.
PANIC
TypeI=α
TypeII=β
Power=1-β
df=n-1
📐 Key Formulas — Inference for Means
One-sample t-interval
x̄ ± t*(s/√n)
One-sample t-test statistic
t = (x̄ − μ₀) / (s/√n)
Standard Error of Mean
SE = s / √n
Degrees of Freedom
df = n − 1
12
Confidence Interval · Correct Interpretation
A random sample of 36 coffee shops finds a mean wait time of x̄ = 4.2 minutes with s = 1.8 minutes. A 95% confidence interval for the true mean wait time is calculated as (3.59, 4.81).
Which interpretation of this interval is CORRECT?
📖 Explanation
The most common AP Stats free-response error: wrong CI interpretation!
Why A is WRONG: The true mean μ is a fixed (unknown) number — it's either in (3.59, 4.81) or not. We cannot say "95% probability" about a fixed value.
Why C is CORRECT: "95% confident" is the correct language. It means: the method used to construct this interval captures the true mean 95% of the time in repeated sampling.
Correct template: "We are [C]% confident that the true [parameter]
is between [lower] and [upper] [units]."
🔑 Magic words: "We are __% CONFIDENT that the true POPULATION [parameter]..." Never say "probability" about a fixed μ.
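The interval in the question can be reproduced from x̄ ± t*(s/√n). A minimal sketch: the critical value t* = 2.030 (df = 35, 95% confidence) is not stated in the question and is assumed here from a t table, and t_interval is an illustrative helper name.

```python
from math import sqrt

def t_interval(xbar, s, n, t_star):
    """One-sample t-interval: xbar ± t* · s/√n."""
    margin = t_star * s / sqrt(n)
    return xbar - margin, xbar + margin

# ASSUMPTION: t* = 2.030 is the table value for df = 35 at 95% confidence.
low, high = t_interval(xbar=4.2, s=1.8, n=36, t_star=2.030)
print(round(low, 2), round(high, 2))  # 3.59 4.81
```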
13
Hypothesis Testing · Type I & Type II Errors
A pharmaceutical company tests H₀: new drug has no effect vs. Hₐ: new drug reduces blood pressure. They set α = 0.05.
The test results in a p-value of 0.03, and they reject H₀. However, the drug actually has NO effect. Which type of error was made, and what is the probability of this error?
📖 Explanation
Type I Error: Reject H₀ when H₀ is actually TRUE → P = α
Type II Error: Fail to reject H₀ when H₀ is FALSE → P = β
In this scenario: H₀ is TRUE (drug has no effect), but we REJECTED H₀. → Type I Error.
The probability of a Type I error is always α = 0.05 (set before the test).
Why C is wrong: The p-value (0.03) is the probability of getting data this extreme IF H₀ were true — it is NOT the probability of making a Type I error (that's α).
🔑 Table trick:
· H₀ true + Reject → Type I (α) — False Positive
· H₀ false + Fail to reject → Type II (β) — False Negative
· Correct decisions on the diagonal
14
t-test · Complete Hypothesis Test
A nutrition label claims a soup can contains 480 mg of sodium. A consumer group tests 16 randomly selected cans and finds x̄ = 493 mg and s = 20 mg. At α = 0.05, do the data provide convincing evidence that the true mean sodium content exceeds 480 mg?
(t* at df=15, one-tail α=0.05 is 1.753)
📖 Explanation
H₀: μ = 480 | Hₐ: μ > 480 (one-sided, right tail)
t = (x̄ − μ₀) / (s/√n)
= (493 − 480) / (20/√16)
= 13 / (20/4)
= 13 / 5
= 2.60
Since t = 2.60 > t* = 1.753 (critical value), we REJECT H₀.
Why D is wrong: We don't need n ≥ 30 when the population is approximately Normal (or when n is reasonable with no extreme skew). CLT is not the only justification.
Conclusion: At α = 0.05, we have convincing evidence that
the true mean sodium content exceeds 480 mg (t = 2.60, p < 0.05).
🔑 PANIC framework: Parameter → Assumptions → Name test → Interval/Test statistic → Conclude in context.
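The test statistic and decision can be sketched as follows. The critical value 1.753 is the one given in the question; one_sample_t is an illustrative helper name.

```python
from math import sqrt

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar − mu0) / (s/√n)."""
    return (xbar - mu0) / (s / sqrt(n))

t_stat = one_sample_t(xbar=493, mu0=480, s=20, n=16)
t_critical = 1.753                 # df = 15, one-tail, alpha = 0.05 (from the question)
reject_h0 = t_stat > t_critical    # right-tailed test
print(round(t_stat, 2), reject_h0)  # 2.6 True
```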
Unit 8–9
Inference for Proportions & Chi-Square
1-proportion z-test, 2-proportion z-test, chi-square goodness of fit, independence
Quick Memory Keywords
Large counts condition: np ≥ 10 AND n(1−p) ≥ 10. Chi-square: Expected = (row total × col total) / n
Large Counts
z=(p̂−p₀)/SE
χ²=Σ(O−E)²/E
df=(r-1)(c-1)
10% Condition
📐 Key Formulas — Inference for Proportions
1-proportion z-test
z = (p̂ − p₀) / √(p₀(1−p₀)/n)
1-proportion CI
p̂ ± z* √(p̂(1−p̂)/n)
Chi-square statistic
χ² = Σ (O − E)² / E
Expected Count
E = (row total × col total) / n
15
1-Proportion z-test · Conditions
A school claims that 75% of its students graduate in 4 years. A critic surveys 120 randomly selected students from recent classes and finds that 83 graduated in 4 years. At α = 0.05, is there convincing evidence the true proportion is LESS than 75%?
📖 Explanation
H₀: p = 0.75 | Hₐ: p < 0.75 (one-sided, left tail)
Check conditions:
✓ Random sample stated
✓ 10% condition: 120 < 10% of all students (assumed)
✓ Large counts: np₀ = 120(0.75) = 90 ≥ 10; n(1−p₀) = 120(0.25) = 30 ≥ 10
p̂ = 83/120 ≈ 0.6917
z = (p̂ − p₀) / √(p₀(1−p₀)/n)
= (0.6917 − 0.75) / √(0.75 × 0.25/120)
= (−0.0583) / √(0.001563)
= (−0.0583) / 0.03953
≈ −1.48
For a one-sided left-tail test, critical z = −1.645 at α = 0.05.
Since −1.48 > −1.645, we FAIL to reject H₀. There is not convincing evidence that the true proportion is less than 75%.
🔑 One-sided test direction matters! Hₐ: p < 0.75 → left tail → critical value is negative. |z| must EXCEED |critical value| to reject.
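The z statistic can be recomputed without intermediate rounding. A minimal sketch; one_prop_z is an illustrative helper name, and the left-tail p-value is an addition that the explanation above does not report.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z(successes, n, p0):
    """z statistic for a 1-proportion z-test, using p0 in the standard error."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

z = one_prop_z(successes=83, n=120, p0=0.75)
p_value = NormalDist().cdf(z)      # left-tail area, since Ha: p < 0.75
reject_h0 = z < -1.645             # critical value at alpha = 0.05
print(round(z, 2), reject_h0)      # -1.48 False
```

The p-value comes out at roughly 0.07, which is above α = 0.05 and agrees with the critical-value decision.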
16
Chi-Square Goodness of Fit
A biology teacher claims that the inheritance of a trait follows a 3:1 ratio (dominant:recessive). She observes 200 offspring: 140 dominant and 60 recessive.
Calculate the chi-square statistic and determine if the observed data is consistent with the 3:1 ratio at α = 0.05.
(χ² critical value at df=1, α=0.05 is 3.841)
📖 Explanation
H₀: The ratio IS 3:1 (model is correct)
Hₐ: The ratio is NOT 3:1
Expected dominant: 200 × (3/4) = 150
Expected recessive: 200 × (1/4) = 50
χ² = (140 − 150)²/150 + (60 − 50)²/50
= (−10)²/150 + (10)²/50
= 100/150 + 100/50
= 0.667 + 2.000
= 2.667
Since χ² = 2.67 < 3.841 (critical value), we FAIL to reject H₀. The data IS consistent with a 3:1 ratio.
🔑 Chi-square is ALWAYS right-tailed! Bigger χ² = bigger discrepancy from expected. df = (categories − 1) = 1 here.
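The goodness-of-fit statistic is a short sum. A minimal sketch; chi_square is an illustrative helper name, and the critical value 3.841 is the one given in the question.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O − E)² / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [140, 60]                     # dominant, recessive
expected = [200 * 3 / 4, 200 * 1 / 4]    # 3:1 ratio applied to 200 offspring
stat = chi_square(observed, expected)
reject_h0 = stat > 3.841                 # df = 1, alpha = 0.05
print(round(stat, 3), reject_h0)         # 2.667 False
```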
17
Sampling Distribution of p̂ · Central Limit Theorem
Suppose 30% of all voters support a new policy. A pollster randomly selects n = 400 voters.
What is the probability that the sample proportion p̂ is greater than 33%?
📖 Explanation
Sampling distribution of p̂: With large n, p̂ ~ N(p, √(p(1−p)/n))
μ(p̂) = p = 0.30
σ(p̂) = √(0.30 × 0.70 / 400) = √(0.21/400) = √0.000525 ≈ 0.02291
z = (0.33 − 0.30) / 0.02291 = 0.03 / 0.02291 ≈ 1.31
P(p̂ > 0.33) = P(z > 1.31) = 1 − 0.9049 ≈ 0.0951
🔑 Key: Use p (not p̂) for the standard deviation of the sampling distribution. This is σ(p̂) = √(p(1−p)/n), NOT √(p̂(1−p̂)/n).
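The same probability can be computed from the Normal approximation without rounding z first. A minimal sketch with illustrative variable names; because z is not rounded to 1.31, the result differs from the table answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.30, 400
sd_phat = sqrt(p * (1 - p) / n)               # uses the true p, not p-hat
sampling_dist = NormalDist(mu=p, sigma=sd_phat)
prob_above = 1 - sampling_dist.cdf(0.33)      # P(p-hat > 0.33)
z = (0.33 - p) / sd_phat
print(round(sd_phat, 5), round(z, 2), round(prob_above, 3))
```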
18
Statistical Power · Reducing Type II Error
A medical researcher wants to detect a difference in treatment effectiveness. Currently, the test has power = 0.65. The researcher wants to increase power to at least 0.80.
Which of the following actions would INCREASE statistical power? (Select the best answer)
📖 Explanation
Power = 1 − β (probability of correctly rejecting a false H₀)
Ways to INCREASE power:
✓ Increase sample size n (most effective)
✓ Increase α (but this increases Type I error)
✓ Increase the true effect size (not under researcher's control)
✓ Decrease variability (use better measurement tools)
✓ Switch from two-tailed to one-tailed (if direction known)
Ways that DECREASE power:
✗ Decrease α
✗ Decrease n
✗ Switch from one-tailed to two-tailed
Option C combines two power-increasing strategies. Option A (decreasing α) actually DECREASES power: it makes it harder to reject H₀, increasing Type II error.
🔑 Power trade-off: Increasing α → easier to reject → less Type II error (β↓) → more power. But more Type I error. Always a trade-off!
19
2-Sample t-test · Comparing Means
Two teaching methods are compared. Method A: n₁ = 25, x̄₁ = 78, s₁ = 10. Method B: n₂ = 30, x̄₂ = 72, s₂ = 12. A researcher tests whether Method A produces higher scores than Method B at α = 0.05.
Which is the correct test statistic and conclusion? (Use conservative df, t* ≈ 1.711 at df = 24)
📖 Explanation
H₀: μ₁ = μ₂ | Hₐ: μ₁ > μ₂ (one-sided)
t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)
= (78 − 72) / √(100/25 + 144/30)
= 6 / √(4 + 4.8)
= 6 / √8.8
= 6 / 2.966
≈ 2.02
Since t ≈ 2.02 > t* = 1.711 (df = 24, one-tail), we REJECT H₀. There is convincing evidence that Method A produces higher mean scores than Method B.
Why A is wrong: AP Stats uses the 2-sample t-procedure; we do NOT assume equal variances (that's the pooled t-test, which AP does not require).
🔑 AP Stats: Always use the UNPOOLED 2-sample t (Welch's). Never assume equal variances unless stated!
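The unpooled statistic and the conservative degrees of freedom can be sketched as follows. The helper name welch_t is illustrative; the critical value 1.711 is the one given in the question.

```python
from math import sqrt

def welch_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t statistic."""
    return (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)

t_stat = welch_t(x1=78, s1=10, n1=25, x2=72, s2=12, n2=30)
conservative_df = min(25, 30) - 1     # smaller sample size minus 1
reject_h0 = t_stat > 1.711            # one-tail, alpha = 0.05, df = 24
print(round(t_stat, 2), conservative_df, reject_h0)  # 2.02 24 True
```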
20
🔥 SUPER HARD · Mixed Concepts · Study Design
A researcher wants to study whether a new fertilizer increases crop yield. She has 60 plots of land that vary in soil quality (poor, average, good). She assigns fertilizer randomly within each soil quality group (20 plots each).
This design is best described as a _______, and the soil quality variable acts as a _______.
📖 Explanation
Key design terms:
Completely Randomized Design (CRD):
→ Randomly assign ALL subjects to treatments
→ No grouping beforehand
Randomized Block Design (RBD):
→ Group similar subjects into BLOCKS first
→ Randomly assign treatments WITHIN each block
→ Reduces variability from the blocking variable
Here: plots are BLOCKED by soil quality (poor/average/good) BEFORE random assignment of fertilizer → Randomized Block Design.
Blocking vs. Stratifying:
· Stratification = sampling technique (who you SELECT)
· Blocking = experimental design (how you ASSIGN treatments)
Soil quality is the blocking variable — it's a known source of variation we control for.
🔑 Block what you CAN, randomize what you CANNOT. Blocking = experiment. Stratifying = survey/sampling.
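Random assignment within blocks might look like the sketch below. Several details are assumptions not stated in the question: the plot labels, the "control" arm, the even 10/10 split inside each block, and the fixed seed (used only so the output is reproducible).

```python
import random

def block_randomize(blocks, treatments, seed=0):
    """Shuffle plots inside each block, then deal treatments out evenly."""
    rng = random.Random(seed)
    assignment = {}
    for block, plots in blocks.items():
        shuffled = plots[:]              # copy so the input list is untouched
        rng.shuffle(shuffled)            # randomization happens WITHIN the block
        for i, plot in enumerate(shuffled):
            assignment[plot] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical plot labels: 20 plots per soil-quality block
blocks = {q: [f"{q}-{i}" for i in range(1, 21)] for q in ("poor", "average", "good")}
plan = block_randomize(blocks, ["fertilizer", "control"])

fertilized = {q: sum(1 for b, t in plan.values() if b == q and t == "fertilizer")
              for q in blocks}
print(fertilized)  # {'poor': 10, 'average': 10, 'good': 10}
```

Because every block gets the same split, soil quality cannot be confounded with the treatment.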