Final Exam Level

Statistics
Final Exam

25 exam-level questions. Each option is designed to look plausible — read carefully, don't rush. Watch out for the trap choices.

Exam Info
Questions: 25
Sections: 6
Level: Final
Section 1 · Types of Studies
Exam Trap
The key is who assigns the treatment. If the researcher controls it → Experiment. If subjects are simply watched → Observational. If they only ask questions → Survey.
Q 01
A health researcher notices that people who drink coffee tend to have higher blood pressure. She reviews medical records without contacting any patients. What type of study is this?
⚠ Watch: "notices" and "reviews records" are key words
Explanation
No treatment is assigned — the researcher simply observes existing records. This is an observational study.
Options A and D trick you because coffee seems like a "treatment," but the researcher didn't assign anyone to drink coffee. Subjects chose on their own.
Q 02
A school randomly splits 200 students into two groups. Group A uses a new math app for 8 weeks; Group B uses textbooks only. Test scores are compared at the end. What type of study is this, and what is Group B called?
Explanation
Treatment (math app) is randomly assigned → Experiment. Group A receives the treatment (experimental group). Group B receives no new treatment → it is the control group, used for comparison.
Option D flips the group labels — Group B gets NO treatment, so it cannot be the "experimental" group.
Q 03
A city mails questionnaires to 500 randomly chosen residents asking about their satisfaction with public transportation. What type of study is this?
Explanation
A sample survey collects data by asking questions to a sample in order to draw conclusions about a larger population. No treatment, no passive observation — just asking.
Option B tempts because you might think "they're observing opinions," but observational studies involve watching behavior, not asking for responses.
Section 2 · Sampling Methods
Exam Trap
Cluster vs. Stratified — the #1 mix-up. Cluster: randomly pick whole groups and take everyone. Stratified: divide into groups, pick a few from each. If you survey all of a group → Cluster. If you sample from each group → Stratified.
Q 04
A principal divides 600 students by grade (9th, 10th, 11th, 12th), then randomly selects 15 students from each grade to take a survey. Which sampling method is this?
Explanation
Groups are formed by a shared characteristic (grade level), then a random sample is drawn from each group → Stratified.
Option A is the classic trap. Cluster also divides into groups, but in cluster sampling you randomly select entire groups and survey everyone in them. Here, only 15 per grade are chosen — not all students in each grade.
Q 05
A researcher randomly selects 4 out of 20 city blocks, then surveys every household on those 4 blocks. Which method is this?
Explanation
City blocks are the clusters. Entire clusters (blocks) are randomly selected, and every household in the selected clusters is surveyed → Cluster sampling.
Stratified looks similar but requires sampling a few people from each of ALL groups — here you only survey selected blocks, not a few from every block.
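The difference between Q04 and Q05 can be sketched in a few lines of Python. This is only an illustration: the even split of 150 students per grade, the 10 households per block, and the function names `stratified_sample` and `cluster_sample` are assumptions, not part of the questions.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 600 students, assumed 150 per grade (Q04 gives no split).
students = [(grade, i) for grade in (9, 10, 11, 12) for i in range(150)]

def stratified_sample(population, per_group, rng):
    """Divide into groups, then sample a few members from EVERY group."""
    groups = {}
    for grade, sid in population:
        groups.setdefault(grade, []).append((grade, sid))
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, per_group))
    return sample

# Hypothetical city: 20 blocks, assumed 10 households each (Q05 gives no sizes).
blocks = {b: [f"block{b}-house{h}" for h in range(10)] for b in range(20)}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random, then take EVERYONE inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return chosen, [unit for c in chosen for unit in clusters[c]]

strat = stratified_sample(students, 15, rng)       # 15 from each of 4 grades
chosen_blocks, clus = cluster_sample(blocks, 4, rng)  # all households on 4 blocks
```

Every grade shows up in `strat`, while only 4 of the 20 blocks show up in `clus`. That is the quickest way to tell the two methods apart.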
Q 06
Students are listed 1–400. A researcher picks student #7 at random, then selects every 8th student after that (7, 15, 23, 31…). How many students are selected in total?
Explanation
This is systematic sampling: start at 7, interval = 8. Sequence: 7, 15, 23, … The last one ≤ 400 is \(7 + 8 \times 49 = 7 + 392 = 399\). So indices go from \(k=0\) to \(k=49\) → 50 students.
Option A (40) is a trap that usually comes from dividing and then subtracting something that should not be subtracted. The shortcut \(400 \div 8 = 50\) happens to give the right number here, but the reliable count is \(\lfloor (400-7)/8 \rfloor + 1 = 49 + 1 = 50\).
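As a quick check of the Q06 count, here is a minimal sketch that lists the selected student numbers directly and compares the result with the floor formula. The variable names are illustrative only.

```python
start, step, last_id = 7, 8, 400  # values taken from Q06

# Every 8th student starting at #7, never going past #400.
selected = list(range(start, last_id + 1, step))

# Closed-form count: floor((400 - 7) / 8) + 1
count_formula = (last_id - start) // step + 1

print(len(selected), selected[-1], count_formula)  # 50 399 50
```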
Q 07
A news website posts a poll: "Do you think social media is harmful to teenagers?" Visitors who see the poll can choose to click and respond. What sampling method is this, and what is the main problem?
Explanation
When individuals choose whether to participate, it is a voluntary response (self-selected) sample. The problem: people who feel strongly (either very pro or very anti) are far more likely to vote, skewing results away from neutral opinions.
Convenience is tempting — but convenience refers to sampling whoever is physically nearby/easy to reach. Self-selected specifically refers to voluntary participation, which is what's happening here.
Section 3 · Bias
Q 08
A survey about gym habits is conducted by standing outside a gym at 6 AM and interviewing people entering. Which statement BEST describes the bias?
Explanation
Convenience sampling creates bias because the sample isn't representative of the whole population. People at a gym at 6 AM are already more fitness-oriented than average.
Option D tempts: "some were not interviewed" — but undercoverage means a whole group is excluded from the sampling frame entirely (e.g., non-gym members). The issue here is the location and time of sampling.
Q 09
A survey asks students: "Given how unhealthy fast food is, how often do you eat it?" What problem does this question have?
Explanation
The phrase "given how unhealthy fast food is" is a leading phrase that pressures respondents to downplay their fast food consumption. This is response bias — the question itself influences the answer.
Option D is a trap because the statement "fast food is unhealthy" may be factually true — but factual accuracy does NOT eliminate bias from loaded wording.
Q 10
Which of the following sampling methods produces results with the LEAST bias?
Explanation
A Simple Random Sample (SRS) gives every individual an equal chance of selection, eliminating systematic favoritism. It produces the least biased results.
Option C (Systematic) seems orderly and scientific — but if the population list has a hidden pattern, every-nth selection could systematically over- or under-represent a group. SRS has no such vulnerability.
Section 4 · Mean · Median · Mode · Range · IQR
Exam Trap
Always sort the data first before finding median or quartiles. For even-count data, the median is the average of the two middle values — not one of them. And remember: Q1 and Q3 are medians of the lower and upper halves (excluding the median itself for odd-count sets).
Q 11
Data set: {14, 8, 22, 8, 5, 17, 11}. Which measure of center has the greatest value?
💡 Sorted: 5, 8, 8, 11, 14, 17, 22 → Mean = 85 ÷ 7 ≈ 12.1 · Median = 11 · Mode = 8
Explanation
Sorted: 5,8,8,11,14,17,22. Mode = 8, Median = 11 (middle value), Mean = \(\frac{5+8+8+11+14+17+22}{7} = \frac{85}{7} \approx 12.14\). The larger values (17, 22) pull the mean above the median.
Many students pick the median without computing the mean. Always calculate all three before comparing.
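To compare all three measures at once, a short sketch using Python's standard `statistics` module works on the Q11 data. The dictionary name `centers` is just an illustrative choice.

```python
from statistics import mean, median, mode

data = [14, 8, 22, 8, 5, 17, 11]  # Q11 data set

centers = {
    "mean": mean(data),      # 85 / 7, about 12.14
    "median": median(data),  # middle of the sorted list: 11
    "mode": mode(data),      # most frequent value: 8
}

# Name of the measure with the greatest value.
largest = max(centers, key=centers.get)
print(largest)  # mean
```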
Q 12
Data: {3, 3, 5, 7, 7, 9, 12}. What is the mode?
Explanation
Both 3 and 7 appear twice, more than any other value. A data set can have two modes (bimodal). The mode is not limited to one value.
Options A and B both give only one mode — the classic trap. If two values appear with the same highest frequency, both are modes.
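A bimodal set like the one in Q12 can be checked with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest frequency. This is a sketch for self-checking, not part of the exam.

```python
from statistics import multimode

data = [3, 3, 5, 7, 7, 9, 12]  # Q12 data set

# multimode lists all values that share the highest frequency.
modes = multimode(data)
print(modes)  # [3, 7]
```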
Q 13
A data set has a mean of 15. A new value of 45 is added to the set. What happens to the mean?
Explanation
Adding a value greater than the current mean always pulls the mean upward. Since 45 > 15, the new mean will be between 15 and 45. The exact new mean depends on how many values were originally in the set.
Option C is a trap: the mean does NOT simply average 15 and 45. The new mean depends on the number of original values, not just the two endpoints.
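To see why the answer depends on the original count, here is a small sketch. The helper name `mean_after_adding` and the sample sizes 1, 4, and 9 are assumptions chosen for illustration; Q13 does not state how many values the set has.

```python
def mean_after_adding(old_mean, n, new_value):
    """New mean after adding one value to a set of n values with mean old_mean."""
    return (old_mean * n + new_value) / (n + 1)

# Same old mean (15) and same new value (45), different original counts.
for n in (1, 4, 9):
    print(n, mean_after_adding(15, n, 45))
# 1 30.0
# 4 21.0
# 9 18.0
```

Only when the set had a single value does the new mean equal the plain average of 15 and 45. In every case it stays between 15 and 45.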
Q 14
Data (sorted): {6, 9, 11, 14, 18, 23}. What is the median?
Explanation
6 values (even) → two middle values at positions 3 and 4: 11 and 14. Median = \(\frac{11+14}{2} = \frac{25}{2} = \mathbf{12.5}\).
Options A and B (just picking one middle value) are the most common mistakes. With an even count, you MUST average the two middle values — you never just pick one.
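The odd/even rule can be written out by hand as a short function. This is a minimal sketch with an illustrative name, `median_of`; the built-in `statistics.median` follows the same rule.

```python
def median_of(values):
    data = sorted(values)  # always sort first
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]  # odd count: the single middle value
    # even count: average the two middle values
    return (data[mid - 1] + data[mid]) / 2

print(median_of([6, 9, 11, 14, 18, 23]))  # 12.5
```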
Q 15
Consider: {2, 50, 51, 52, 53, 54}. Compare the mean and median. Which statement is TRUE?
Explanation
Mean = \(\frac{2+50+51+52+53+54}{6} = \frac{262}{6} \approx 43.7\). Median = \(\frac{51+52}{2} = 51.5\). The outlier 2 drastically lowers the mean while barely affecting the median. So Median (51.5) > Mean (≈43.7).
Option D is tempting because the high values (50–54) dominate the data, but it's the ONE low outlier (2) that disproportionately drags the mean down.
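One way to see the outlier's pull is to compute both measures with and without the value 2. The comparison set without the outlier is an illustration added here, not part of Q15.

```python
from statistics import mean, median

with_outlier = [2, 50, 51, 52, 53, 54]   # Q15 data set
without_outlier = [50, 51, 52, 53, 54]   # same data with the outlier dropped

# How far each measure moves when the outlier is removed.
mean_shift = mean(without_outlier) - mean(with_outlier)
median_shift = median(without_outlier) - median(with_outlier)

print(round(mean_shift, 2), median_shift)  # 8.33 0.5
```

The mean moves by more than 8 points while the median moves by half a point.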
Section 5 · Quartiles · IQR · Box Plots
Exam Trap
When finding Q1 and Q3 for an odd-count set, exclude the median from both halves. For an even-count set, split exactly in half. The three cuts (Q1, median, Q3) divide the data into four parts, each holding about 25% of the values.
Q 16
Data (sorted): {3, 7, 8, 10, 12, 15, 18, 20}. What is the IQR?
💡 8 values (even) → lower half: {3,7,8,10} · upper half: {12,15,18,20}
Explanation
Lower half {3,7,8,10}: Q1 = \(\frac{7+8}{2} = 7.5\). Upper half {12,15,18,20}: Q3 = \(\frac{15+18}{2} = 16.5\). IQR = \(16.5 - 7.5 = \mathbf{9}\).
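Option A (10) is a trap that does not follow from the quartile rule. One way to land on it is to take single values such as the 3rd and 7th as Q1/Q3 (\(18 - 8 = 10\)) instead of averaging the middle pair of each half. Option D (17) is the range (Max−Min = 20−3), not the IQR.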
Q 17
Data (sorted, 7 values): {4, 6, 9, 13, 17, 21, 25}. What is Q1?
⚠ Odd count — exclude the median (13) before finding Q1
Explanation
Median = 13 (position 4). Lower half (excluding median) = {4, 6, 9}. Q1 = middle of this group = 6.
Option B (9) is the trap — that's the last value of the lower half, not the middle. Option A (7.5) comes from incorrectly averaging 6 and 9, as if the lower half had an even count.
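The median-of-halves rule from this section can be sketched as a small function and run on both the Q16 and Q17 data. The name `quartiles` is illustrative. Library routines may use different interpolation rules and can give slightly different quartiles, so this sketch follows the exam's rule rather than any particular library default.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Odd count: skip the median itself; even count: split exactly in half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

q1_odd, med_odd, q3_odd = quartiles([4, 6, 9, 13, 17, 21, 25])       # Q17
q1_even, med_even, q3_even = quartiles([3, 7, 8, 10, 12, 15, 18, 20])  # Q16

print(q1_odd)             # 6
print(q3_even - q1_even)  # 9.0
```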
Q 18
A box plot has: Min = 20, Q1 = 35, Median = 55, Q3 = 70, Max = 90. What percent of the data is between 35 and 70?
Explanation
35 = Q1 and 70 = Q3. By definition, 50% of all data lies between Q1 and Q3 (the "box" in the box plot). Each quartile represents 25%, and the box spans two quartiles.
Option B (75%) is a common trap — 75% of the data lies below Q3, not between Q1 and Q3. The question asks for the range between them.
Q 19
Two data sets are shown:

Set A: Min=10, Q1=20, Med=30, Q3=40, Max=80
Set B: Min=10, Q1=15, Med=30, Q3=55, Max=80

Which set has a greater IQR, and what does that tell you?
Explanation
Set A IQR = \(40 - 20 = 20\). Set B IQR = \(55 - 15 = 40\). Set B has the greater IQR (40 > 20), meaning its middle 50% of data is spread across a wider range.
Option B is a trap — both sets have the same Range (80−10=70), but IQR measures internal spread, not overall spread. Same range ≠ same IQR.
Q 20
A dataset has Q1 = 40 and Q3 = 70. A value of 120 is added (an outlier). Which measure is most affected by this outlier?
Explanation
The mean is the most affected because it uses every value in its calculation: a large outlier directly inflates the sum and therefore the average. The median barely changes (its position in the sorted list moves by only half a step). The IQR is resistant to outliers because it only uses the middle half of the data.
Option A is the trap — IQR is specifically known for being RESISTANT to outliers. It is often preferred over range precisely because of this.
Section 6 · Histograms (Advanced)
Q 21
A histogram of test scores shows these bars:
Score Range | Frequency
50 – 59 | 3
60 – 69 | 7
70 – 79 | 12
80 – 89 | 9
90 – 99 | 4
How many students scored below 80?
Explanation
"Below 80" means scores in the 50–59, 60–69, and 70–79 bars: \(3 + 7 + 12 = \mathbf{22}\). Do NOT include the 80–89 bar.
Option B (31) includes the 80–89 bar (3+7+12+9=31). The phrase "below 80" excludes the 80s entirely — don't add that bar.
Q 22
Using the same histogram above (total = 35 students), what percent scored 80 or above? Round to the nearest whole percent.
Explanation
80 or above: 80–89 bar (9) + 90–99 bar (4) = 13 students. Total = 35. Percent = \(\frac{13}{35} \approx 0.371 = \mathbf{37\%}\).
Option C (63%) is 22/35 — that's the percent scoring BELOW 80, not above. Flipping the inequality is one of the most common exam mistakes.
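The counts in Q21 and Q22 can be double-checked with a short sketch that stores each bar as a (low, high) score range. The dictionary layout and variable names are illustrative choices.

```python
# Histogram bars from Q21: (lowest score, highest score) -> frequency
hist = {(50, 59): 3, (60, 69): 7, (70, 79): 12, (80, 89): 9, (90, 99): 4}

total = sum(hist.values())

# "Below 80": only bars whose highest score is under 80.
below_80 = sum(freq for (low, high), freq in hist.items() if high < 80)

# "80 or above": only bars whose lowest score is at least 80.
at_or_above_80 = sum(freq for (low, high), freq in hist.items() if low >= 80)

percent_above = round(100 * at_or_above_80 / total)

print(total, below_80, at_or_above_80, percent_above)  # 35 22 13 37
```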
Q 23
From the same histogram, can you determine the exact score of the highest-scoring student?
Explanation
A histogram groups data into intervals. You can tell 4 students scored in 90–99, but you cannot know if they scored 90, 95, 97, or 99 — individual values are lost in a histogram.
Option B is a trap: just because the bar ends at 99 doesn't mean anyone scored 99. The bar only tells you the range — all 4 students could have scored 91.
Q 24
A histogram has bars at 0–10, 10–20, 20–30. The 10–20 bar has height 8. A student scored exactly 10. Which bar would this student be counted in?
Explanation
By the common textbook convention, each interval is read as \([10, 20)\): inclusive on the left, exclusive on the right. So a score of exactly 10 belongs in the 10–20 bar, not the 0–10 bar.
Option A is the instinctive trap: "10 is the end of 0–10, so it belongs there." Under this convention each bar includes its left boundary and excludes its right. If a histogram states a different boundary rule, follow the stated one.
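The left-inclusive rule can be sketched as a small lookup. The function name `bin_label` is hypothetical, and the sketch assumes every bar, including the last, is closed on the left and open on the right. Some software closes the last bar on both ends, so boundary handling is worth checking in any real tool.

```python
def bin_label(score, edges):
    """Return the bar a score falls in, treating each bar as [left, right)."""
    for left, right in zip(edges, edges[1:]):
        if left <= score < right:
            return f"{left}-{right}"
    return None  # score is outside every bar under this convention

edges = [0, 10, 20, 30]  # bars 0-10, 10-20, 20-30 from Q24

print(bin_label(10, edges))   # 10-20
print(bin_label(9.9, edges))  # 0-10
```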
Q 25
A researcher wants to study whether a new study technique improves exam scores. She randomly assigns half of 80 students to use the new technique and the other half to study as normal. She gives both groups the same exam afterward. Which of the following correctly describes her study?
Explanation
This is a textbook experiment: (1) subjects are randomly assigned to treatment or control, (2) a treatment is applied, (3) results are measured. "Random" in sampling ≠ experiment; it's random assignment of treatment that defines an experiment.
Options A and C both use "randomly" — trap! Randomly selecting participants (sampling) is different from randomly assigning treatments. The second is what makes it an experiment.
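Random assignment, as opposed to random sampling, can be sketched like this: the 80 students are already in the study, and chance only decides which group each one lands in. The student IDs 1 to 80 and the fixed seed are assumptions for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

students = list(range(1, 81))  # hypothetical IDs for the 80 students in Q25

# Shuffle a copy, then split it in half: this is random ASSIGNMENT.
shuffled = students[:]
rng.shuffle(shuffled)
treatment = shuffled[:40]  # uses the new study technique
control = shuffled[40:]    # studies as normal

print(len(treatment), len(control))  # 40 40
```

No one is left out and no one is in both groups. Every student is used, which is what separates assignment from selecting a sample out of a larger population.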