CONFOUNDING VARIABLE: Hidden factor that affects both variables
PLACEBO = Control group: Still a control even if they receive a fake treatment
DOUBLE-BLIND: Neither subject NOR researcher knows group assignment
UNDERCOVERAGE BIAS: Whole group is excluded from sampling frame
RESPONSE BIAS: People answer dishonestly or inaccurately
LARGER n ≠ FIX BIAS: More biased responses = more wrong, not more right
VOLUNTARY ⚠️ WORST: Self-selected = greatest bias of all methods
Section A — Sampling Methods (Advanced)
Stratified · Cluster · Systematic — The Hard Cases
Q 01
MCQ
AP
A pollster calls every 20th name in a phone directory, starting with the 7th name chosen at random. Halfway through, the directory switches from alphabetical to registered-voter order. What concern does this raise?
📖 Why B
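Systematic sampling's weakness: if the list has a repeating pattern (e.g., every 20th voter is a precinct captain), picking every 20th person can badly over-sample or completely miss that group. This is periodicity bias. Switching to registered-voter order partway through doesn't disqualify the method, but it raises this red flag, because the new ordering may contain a pattern that lines up with the sampling interval. Systematic sampling is NOT always unbiased. With a random start, every name has the same 1-in-20 chance of being picked, but equal spacing offers no protection against patterns in the list.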
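A minimal Python sketch of periodicity, assuming a hypothetical list of 400 voters in which every 20th entry is a precinct captain. The list, the labels, and the `systematic_sample` helper are illustrative and not part of the question.

```python
def systematic_sample(items, k, start):
    """Take every k-th item, beginning at 0-based index `start` (0 <= start < k)."""
    return items[start::k]

# Hypothetical list: every 20th entry is a precinct captain (5% of the list).
voters = ["captain" if i % 20 == 0 else "voter" for i in range(400)]

aligned = systematic_sample(voters, k=20, start=0)  # start lines up with the pattern
offset = systematic_sample(voters, k=20, start=7)   # start falls between captains

true_share = voters.count("captain") / len(voters)
share_aligned = aligned.count("captain") / len(aligned)
share_offset = offset.count("captain") / len(offset)

print(true_share, share_aligned, share_offset)  # 0.05 1.0 0.0
```

Depending on the starting point, the sample is either all captains or none, even though captains make up 5% of the list.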
Q 02
True / False
Medium
True or False: In stratified sampling, it is acceptable to have different sample sizes from each stratum, as long as the total sample represents the population proportionally.
📖 TRUE
In proportional stratified sampling, strata with larger populations get larger samples. Example: if 60% of a school is 9th graders and 40% is 10th graders, a proportional sample takes 60% of the total from 9th grade and 40% from 10th. Unequal sample sizes across strata are not just acceptable; they're often necessary for accurate representation.
Q 03
Fill in Blank
AP
A researcher randomly selects 4 cities from a country, then surveys every resident in those 4 cities. This is ______ sampling. If instead she had randomly selected 50 residents from each of those 4 cities, it would be ______ sampling.
Hint: think about whether ALL or SOME members of the group are included.
📖 Answer: cluster / stratified
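Cluster: entire cities are selected, and all residents in them are surveyed. Stratified: groups serve as strata, and a random sample of individuals is drawn from each. The key difference is whether you include the whole group or just a random portion of each group. One caveat: textbook stratified sampling draws from every stratum in the population, so sampling 50 residents from only the 4 randomly chosen cities is, strictly speaking, two-stage (multistage) cluster sampling. It becomes a true stratified sample only if every city in the country is sampled.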
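The "whole group versus random portion" contrast can be sketched in Python. The country below (10 cities of 200 residents each) and all variable names are made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical country: 10 cities with 200 residents each (made-up sizes).
cities = {f"city_{c}": [f"city_{c}_resident_{r}" for r in range(200)] for c in range(10)}

chosen = rng.sample(sorted(cities), 4)  # pick 4 whole cities at random

# Whole group: survey EVERY resident of each chosen city.
cluster_sample = [person for city in chosen for person in cities[city]]

# Random portion: draw 50 residents at random from each chosen city.
portion_sample = [person for city in chosen for person in rng.sample(cities[city], 50)]

print(len(cluster_sample), len(portion_sample))  # 800 200
```

The first sample contains everyone in the chosen cities; the second contains only 50 randomly drawn residents per city.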
Q 04
MCQ
AP
A company has employees in 5 departments: HR (20), Marketing (50), Engineering (200), Sales (80), Finance (30). A researcher needs a sample of 38 people using proportional stratified sampling. How many should come from Engineering?
Total employees = 380. Engineering = 200 out of 380.
📖 Why C
Proportional stratified formula: (stratum size ÷ total population) × sample size. Engineering: (200 ÷ 380) × 38 = 20 people exactly, since 38 is one tenth of 380. This ensures Engineering's proportion in the sample matches its proportion in the company (~52.6%).
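The proportional allocation formula above can be applied to all five departments at once. This is a small Python sketch using the numbers from the question; the variable names are illustrative.

```python
from fractions import Fraction

departments = {"HR": 20, "Marketing": 50, "Engineering": 200, "Sales": 80, "Finance": 30}
sample_size = 38
total = sum(departments.values())  # 380 employees

# (stratum size / total population) * sample size, kept exact with fractions.
allocation = {
    name: Fraction(size, total) * sample_size
    for name, size in departments.items()
}

# Every share is a whole number here, so converting to int loses nothing.
allocation = {name: int(share) for name, share in allocation.items()}

print(allocation)
# {'HR': 2, 'Marketing': 5, 'Engineering': 20, 'Sales': 8, 'Finance': 3}
```

The five allocations add up to the full sample of 38. With other numbers the shares may not be whole, and some rounding rule would be needed.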
Q 05
True / False
AP
True or False: A simple random sample guarantees that the sample will perfectly represent the population.
📖 FALSE
This is a classic AP trap. Simple random sampling gives every individual an equal chance — but it does not guarantee a perfect representative sample. By chance, the sample might over-represent one group. What SRS guarantees is the absence of systematic bias, not perfect representation. Larger samples reduce this random variation, but never eliminate it.
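A short Python simulation can make this random variation visible. It assumes a hypothetical population of 1,000 students, exactly half of them seniors, and compares repeated simple random samples of size 10 and size 400.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 students, exactly half of them seniors.
population = ["senior"] * 500 + ["junior"] * 500

def senior_share(n):
    """Draw one simple random sample of size n and return the share of seniors."""
    sample = rng.sample(population, n)
    return sample.count("senior") / n

small = [senior_share(10) for _ in range(200)]   # 200 samples of size 10
large = [senior_share(400) for _ in range(200)]  # 200 samples of size 400

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(spread_small, spread_large)
```

Individual samples miss the true 50% in both cases, but the estimates from the larger samples cluster much more tightly around it.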
Section B — Bias Deep Dive
Find the Flaw. Name It. Explain It.
Q 06
Short Answer
AP
A researcher surveys gym members about their weekly exercise habits. Which type of bias is MOST likely, and why?
📖 Undercoverage Bias
If the goal is to describe exercise habits in the general population, the sampling frame (gym members) excludes a huge portion of that population (people who don't go to gyms). Gym members likely exercise more than average, so the sample systematically overstates the population's exercise habits. This is undercoverage: a whole group is absent from the pool of potential participants.
Q 07
Fill in Blank
Medium
A study finds a strong positive correlation between ice cream sales and drowning rates. A student concludes: "Ice cream causes drowning." The actual explanation is that a ______ variable (hot weather) causes both. This illustrates why ______ studies cannot prove causation.
📖 Answer: confounding / observational
A confounding variable is a third variable that influences both the explanatory and the response variable, creating a false appearance of causation. Hot weather → more ice cream sales AND more swimming → more drownings. Random assignment in an experiment is what balances confounders across groups. Observational studies can only show association.
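The hot-weather explanation can be simulated. In this Python sketch, temperature drives both quantities and neither affects the other; all coefficients, noise levels, and function names are invented for illustration.

```python
import math
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def simulate_days(n, fixed_temp=None):
    """Return (ice_cream_sales, drownings); temperature drives both, neither drives the other."""
    ice, drown = [], []
    for _ in range(n):
        temp = fixed_temp if fixed_temp is not None else rng.uniform(10, 35)
        ice.append(5 * temp + rng.gauss(0, 10))     # made-up relationship
        drown.append(0.2 * temp + rng.gauss(0, 1))  # made-up relationship
    return ice, drown

r_varying = pearson(*simulate_days(1000))               # temperature varies day to day
r_fixed = pearson(*simulate_days(1000, fixed_temp=25))  # temperature held constant
print(r_varying, r_fixed)
```

When temperature varies, sales and drownings are strongly correlated; when it is held constant, the correlation nearly vanishes, which is the signature of a confounder.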
Q 08
True / False
AP
True or False: Increasing the sample size from 500 to 5,000 will eliminate undercoverage bias in a survey.
📖 FALSE — Most Important Concept
This is one of the most common AP Statistics misconceptions. Larger sample size reduces random sampling error (variability) but does NOT fix bias. If your sampling method systematically excludes a group, surveying more people with that same flawed method just gives you more biased data. The famous example: the 1936 Literary Digest poll mailed about 10 million ballots and received roughly 2.4 million responses, yet still predicted the wrong winner, because its mailing lists (drawn largely from telephone directories and automobile registrations) under-represented poorer voters.
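A minimal Python sketch of why a tenfold larger sample does not help. The electorate below is hypothetical: the group inside the sampling frame supports a candidate at 40%, the excluded group at 70%.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical electorate (made-up numbers): 1 = supports the candidate, 0 = does not.
covered = [1] * 4000 + [0] * 6000   # in the sampling frame, 40% support
excluded = [1] * 7000 + [0] * 3000  # missing from the frame, 70% support

true_support = sum(covered + excluded) / 20000  # 0.55 for the whole population

def estimate(n):
    """Estimate support from a random sample of size n drawn from the frame only."""
    return sum(rng.sample(covered, n)) / n

est_500 = estimate(500)
est_5000 = estimate(5000)
print(true_support, est_500, est_5000)
```

Both estimates land near 40%, not the true 55%. The larger sample is simply a more precise estimate of the wrong number.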
Q 09
MCQ
AP
A survey asks: "Given that the new park has improved community safety, do you support continued park funding?" This introduces which type of bias?
📖 Response Bias — Leading Question
The question embeds an unproven assumption: "the park has improved safety." Respondents who accept this premise are primed to say yes. This is response bias via leading question — the question wording influences the answer. Neutral version: "Do you support continued park funding?" No assumptions embedded.
Q 10
MCQ
AP
A health survey is administered by a patient's own doctor, who asks: "You do follow the diet I recommended, right?" Why is this problematic even if the sample was randomly selected?
📖 Social Desirability Bias
Even a perfectly random sample can suffer from response bias. When people are asked by authority figures (doctors, bosses, teachers), they tend to answer in ways they think are expected — not truthfully. This is social desirability bias. Solution: anonymous surveys administered by neutral parties. Random sampling fixes selection bias; it does NOT fix response bias.
Section C — Experiment Design
Control · Randomize · Replicate · Blind
Q 11
True / False
Medium
True or False: In a double-blind experiment, neither the participants nor the researchers analyzing the results know which group received the treatment.
📖 TRUE
Double-blind: both the subjects AND the researchers/administrators are unaware of group assignments. This prevents: (1) participants changing behavior because they know they got the real drug (placebo effect), and (2) researchers unconsciously influencing results or interpreting data favorably. Single-blind: only participants don't know. Double-blind is the gold standard.
Q 12
MCQ
AP
Researchers test a new fertilizer on crops. They use 3 plots for the new fertilizer and 3 plots for the existing fertilizer. All plots are in the same field. Which design principle does this NOT satisfy well?
📖 Replication
There IS a control group (existing fertilizer). Blinding isn't typically required for plant studies. Using plots in the same field keeps soil and weather conditions similar, which is control rather than randomization; the scenario doesn't say how plots were assigned, and ideally that would be done at random. The clearest weak point is replication: 3 plots per treatment is very few, and a single unusual plot could skew results. More replication = more confidence that observed differences are real, not random chance.
Q 13
Fill in Blank
AP
In an experiment, the group that receives the actual treatment being tested is called the ______ group. The group used for comparison that receives no treatment (or a placebo) is the ______ group.
📖 Answer: experimental / control
The experimental group (also called treatment group) receives the treatment being tested. The control group receives no treatment or a placebo — it's the baseline for comparison. Without a control group, you can't isolate whether the treatment caused any observed change.
Q 14
MCQ
AP
A drug trial gives Group A a new pill and Group B a sugar pill. Patients in Group B report feeling 30% better. Researchers conclude the drug is effective because Group A improved 60%. What phenomenon explains Group B's improvement?
📖 The Placebo Effect
The placebo effect is a real physiological/psychological improvement caused by the mere belief of receiving treatment. This is why drug trials need a placebo control group: without one, researchers can't tell if improvements are due to the treatment or just expectation. The drug is still considered effective here: 60% vs 30% points to a drug effect of about 30 percentage points beyond placebo, assuming the gap is larger than chance alone would explain.
Q 15
True / False
AP
True or False: An observational study in which researchers find that smokers have higher cancer rates proves that smoking causes cancer.
📖 FALSE — Correlation is NOT Causation
Technically, no single observational study can prove causation, and you cannot randomly assign people to smoke for ethical reasons. However, strong, consistent associations across many studies, combined with biological plausibility (we understand the mechanism), create overwhelming evidence. Statistically speaking, this one study shows a strong association, not proof. Establishing causation requires either a randomized experiment or strong converging evidence from many sources.
Section D — AP Scenario Marathon
Read. Classify. Justify. — The Hardest 5
Q 16
MCQ
AP
A university randomly assigns incoming freshmen to either a standard dorm or a learning community dorm. After one year, GPA is compared. Which is the BEST description of this study design?
📖 Randomized Experiment
The defining feature of an experiment is random assignment of a treatment. Here, students are randomly assigned to different dorm types (the treatment). Treatment doesn't have to be medical; it's any condition being manipulated. Because of random assignment, a GPA difference larger than chance alone would produce can be attributed to dorm environment, not pre-existing differences between students.
Q 17
Short Answer
AP
A researcher studies whether listening to classical music improves math test scores. She compares scores of students who voluntarily listen to classical music vs. those who don't. Why is this study fundamentally flawed, and what would be the correct fix?
📖 Confounding via Self-Selection
Students who voluntarily choose classical music may differ in many ways: higher motivation, better study habits, higher baseline achievement. These confounders make it impossible to isolate whether music caused better scores. The fix: randomly assign students to "listen to classical music" or "no music" groups. Random assignment balances confounders across the groups on average, isolating the music variable.
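A small Python sketch of the difference between self-selection and random assignment. It assumes 200 hypothetical students with motivation scores 0 to 199 and, for simplicity, that the more motivated half are the ones who choose music.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed so the sketch is repeatable

# Hypothetical motivation scores for 200 students (0 = lowest, 199 = highest).
motivation = list(range(200))

def gap(music_group, no_music_group):
    """Difference in mean motivation between the two groups."""
    return statistics.mean(music_group) - statistics.mean(no_music_group)

# Self-selection (assumed here): the more motivated half chooses classical music.
self_selected_gap = gap(motivation[100:], motivation[:100])

# Random assignment: shuffle the students, then split them in half.
shuffled = motivation[:]
rng.shuffle(shuffled)
random_gap = gap(shuffled[:100], shuffled[100:])

print(self_selected_gap, random_gap)
```

Under self-selection the groups differ by 100 motivation points before any music is played. Under random assignment the gap shrinks to a small chance difference.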
Q 18
Fill in Blank
AP
A study has ______ if it systematically produces results that misrepresent the population. A sample is biased if one group is ______ likely to be included than another, OR if a question could influence answers in some way.
📖 Answer: bias / more
Straight from your notes: "A study has bias if it systematically produces results that misrepresent a population." And: "Bias may arise if a sample is more likely to include a certain population than another." Memorize this definition word-for-word.
Q 19
MCQ
AP
A county surveys residents by randomly selecting 10 precincts and mailing surveys to every household in those precincts. 43% of surveys are returned. The response rate is low. What is the PRIMARY statistical concern?
📖 Non-Response Bias
When a large portion of a sample doesn't respond, those non-respondents may differ systematically from respondents. Here, 57% didn't respond — maybe they're busier, less engaged civically, or more satisfied/dissatisfied. This non-response bias can severely distort results. The cluster sampling design is fine; the problem is who actually returned the survey.
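Non-response bias can be sketched in Python under one assumed mechanism: dissatisfied households reply more often than satisfied ones. The household counts and response rates below are invented, chosen only so the overall return rate is close to 43%.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical households: True = dissatisfied with county services. True share is 50%.
households = [True] * 500 + [False] * 500

# Assumed response rates (made up): dissatisfied households reply more often.
RESPONSE_RATE = {True: 0.60, False: 0.26}  # averages out to 43% overall

respondents = [h for h in households if rng.random() < RESPONSE_RATE[h]]

true_share = sum(households) / len(households)
observed_share = sum(respondents) / len(respondents)
response_rate = len(respondents) / len(households)
print(true_share, observed_share, response_rate)
```

Half of all households are dissatisfied, but the returned surveys make it look like roughly two thirds are, even though every household was mailed a survey.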
Q 20
MCQ
AP — BOSS
A researcher wants to test whether a new teaching method improves reading comprehension. She selects 4 schools at random from 20, then randomly assigns half the classes in each school to the new method and half to the traditional method. At the end of the year, reading scores are compared. Which statement BEST describes this design?
This combines a sampling method with an experimental design.
📖 Multi-Stage Design — The Ultimate Boss Question
This is a two-stage design: (1) Cluster sampling selects 4 schools from 20 — entire schools are the clusters. (2) Within each school, random assignment of classes to treatment/control makes it a randomized experiment. It's NOT stratified because schools aren't strata from which individuals are sampled — they're clusters that are wholly selected. The random assignment of classes within schools is what makes it an experiment, allowing causal conclusions.
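The two stages can be sketched in Python. The question gives the number of schools but not the number of classes, so 6 classes per school is an assumption, as are all the names used here.

```python
import random

rng = random.Random(6)  # fixed seed so the sketch is repeatable

# Hypothetical setup: 20 schools, each with 6 classes (class counts are made up).
schools = {f"school_{s}": [f"s{s}_class_{c}" for c in range(6)] for s in range(20)}

selected = rng.sample(sorted(schools), 4)  # stage 1: sample 4 of the 20 schools

assignment = {}
for school in selected:
    classes = schools[school][:]
    rng.shuffle(classes)                   # stage 2: random assignment within the school
    half = len(classes) // 2
    for name in classes[:half]:
        assignment[name] = "new method"
    for name in classes[half:]:
        assignment[name] = "traditional"

counts = {m: list(assignment.values()).count(m) for m in ("new method", "traditional")}
print(selected, counts)
```

Stage 1 is a sampling step (which schools take part), while stage 2 is the experimental step (which classes get which method). Only the second supports a cause-and-effect comparison.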