Questions 1-4: Multiple choice: circle only one answer per question (1 mark per question).

1. The probability of passing a Statistics 203 final exam is 0.80. Which of the following statements

gives a valid interpretation of this probability?

a. Out of every 10 students, 8 will pass the final exam.

b. In the long run, the proportion of students passing the final exam is 0.80.

c. For any group of 10 students, at least 8 students will pass the final exam.

d. In the long run, the proportion of students passing the final exam is 0.50.

2. Suppose you have a large random sample from a population. Furthermore, suppose that the

population distribution of measurements does not follow a normal distribution. What does the

central limit theorem tell us?

a. For large samples, the distribution of the data is approximately normal.

b. For large samples, the distribution of the population mean, $\mu$, is approximately normal.

c. For large samples, the distribution of the sample mean, $\bar{X}$, is approximately normal.

d. For large samples, the sample mean, $\bar{X}$, is very close to $\mu$.

3. Hypothesis: Individuals who listen to music whilst studying for exams will achieve significantly

higher exam grades than will individuals who study in silence. A research study is conducted to

see if there is evidence in favour of the hypothesis. Thinking about this research hypothesis,

which of the below would be an appropriate summary of a statistically significant difference in this

setting?

a. The observed average exam grades for students who listen to music are about the same as

the average exam grades for students who study in silence.

b. The observed difference in average exam grade for students who listen to music from the

average exam grade for students who study in silence can be attributed to chance.

c. The observed average exam grade for students who listen to music is larger than the

average exam grade for students who study in silence.

d. The observed difference between the average exam grade for students who listen to

music and the average exam grade for students who study in silence is so large as to be

unlikely to have occurred by chance.

4. According to the US Census Bureau, the average number of children per American family is 2.2.

Which of the following most adequately describes this mean for the American population?

a. The mean of 2.2 children makes no sense because a family cannot have 0.2 children.

b. The mean of 2.2 is the long-term average number of children based on repeatedly

sampling families from the American population.

c. The mean of 2.2 children implies that American families have 1, 2 or 3 children.

d. American families have between 2 and 3 children.

5. (10 marks) Motivated students from across Canada can participate in an annual mathematics

competition. A random sample of 1,000 students is taken from each of three regions (Maritimes

and Newfoundland, Central Canada and Western Canada) to compare student performance on the

competition. The test was out of 60 marks. A boxplot of the 2013 results is shown below. Use

this plot to answer the following questions.

a. Use the above plot to compare the distributions of student scores by region.

To get full marks, must correctly compare centres and spread … some examples:

Centre: The centres of the 3 distributions appear to be different. Western Canada has the largest median,

followed by the Maritimes and then Central Canada. There is little overlap in the boxes (i.e. the middle 50%

of each distribution).

Spread: The scores in the Maritimes and Newfoundland are much less spread out than those in the other regions,

while the range and IQR for Western Canada are much larger than those of the other two regions.

Outliers: There are potential outliers shown in each population.

b. What is the interval covering the middle 50% of test scores observed in Western Canada

(explain how you determined this interval)?

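[50,54): the box in a boxplot runs from the first quartile (Q1 = 50) to the third quartile (Q3 = 54), and the box covers the middle 50% of the scores.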

c. What percentage of students from Western Canada scored higher than 50 marks?

Q1=50, so 75% of scores were larger than 50 marks.

d. Roughly, what is the score for the worst performing student from Central Canada in this

sample of students?

About 45 marks

e. Can you tell from this plot whether any of the distributions are uni-modal (explain)?

No, a box-plot does not display information about the modes (peaks). A histogram would display this

(no requirement to mention histogram)

6. (8 marks) Metabolic rate is important in studies of dieting and exercise. The lean body mass (kg)

and resting metabolic rate (cal./24hrs) for 100 men participating in a study on dieting were

recorded. The histogram of the recorded resting metabolic rates is plotted below. A scatter-plot of

the resting metabolic rates versus the lean body mass is also shown.

a. What percentage of men in the study had a resting metabolic rate larger than 1350?

There are 5 men in the [1350-1450) interval and 1 in each of the next three intervals.

(5 + 1 + 1 + 1)/100 = 8%

b. In which bin would you expect the median resting metabolic rate for this study to fall?

There are 40 men in the first bin and 20 in the second. So, the median is in the second

bin… [950-1050)

c. Suppose that the correlation between resting metabolic rate and lean body mass is

computed. What does correlation attempt to measure in this setting?

It is attempting to measure the strength and direction of the linear association between

resting metabolic rate and lean body mass

d. Is it appropriate to use the correlation to describe the relationship between resting

metabolic rate and lean body mass (why or why not)?

No, since the association is not linear.

7. (4 marks) An advice columnist asked divorced readers, via her advice column, whether they

regretted their decision to divorce. About 30,000 responses were received, of which about 23,000

were from women. Nearly 75% of respondents said that they were glad that they divorced.

a. What type of survey is this?

Voluntary response survey

b. Briefly explain why this survey is likely to be biased.

There are many good answers.

For example, people who are motivated typically will reply to voluntary surveys.

Thus those who are really happy to be divorced may have written in a response.

Could also say that roughly three-quarters of the respondents were women, and men may be more or less

happy than women about their divorce.

8. (2 marks) A university has 10,000 undergraduate and 5,000 graduate students. A survey of the

students’ opinions is conducted by first randomly selecting 100 of the 10,000 undergraduate

students and then 50 of the 5,000 graduate students. Very briefly explain why this is not a simple

random sample.

A simple random sample requires that each sample of size n have the same probability of

being selected. In this question, it is impossible to get a sample of, say, 150

undergraduates only.

Give 1 mark if only the first sentence is given.

Give one mark if only state that it is a stratified random sample.
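The difference between the two designs can be seen in a small simulation. This is only an illustrative sketch: the student IDs and the fixed random seed are hypothetical and are not part of the question.

```python
import random

# Hypothetical student IDs: "U..." for undergraduates, "G..." for graduates
undergrads = [f"U{i}" for i in range(10_000)]
grads = [f"G{i}" for i in range(5_000)]

rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable

# Design in the question: sample each group separately
stratified = rng.sample(undergrads, 100) + rng.sample(grads, 50)

# Simple random sample: any 150 of the 15,000 students
srs = rng.sample(undergrads + grads, 150)

def count_undergrads(sample):
    """Number of undergraduates in a sample of student IDs."""
    return sum(1 for student in sample if student.startswith("U"))

print(count_undergrads(stratified))  # always 100 under the stratified design
print(count_undergrads(srs))         # varies from one simple random sample to the next
```

Under the design in the question the undergraduate count is fixed at 100, so samples such as 150 undergraduates can never occur, whereas a simple random sample allows any mix.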

9. (2 marks) Average before-tax income in the City of Burnaby in 2005 for female single parent

households was $46,228. This statistic was reported in the 2006 City of Burnaby Neighborhood

Profile. Briefly explain why reporting the median income is likely to be a better measure of the

centre of the distribution of female single-parent household incomes in 2005 than reporting the average.

Incomes typically follow a right-skewed distribution (do not have to mention this). In this case, a few

well-paid single mothers will cause the average to be higher and not reflect the centre of the

distribution.

10. (8 marks) The distribution of moisture content per pound of dehydrated protein concentrate is

normally distributed with a mean of 3.5% and standard deviation of 0.6%.

a. Interpret the meaning of the standard deviation of 0.6% in this setting.

Based on repeated samples from this distribution, we would expect the average

distance of observations from the population mean (3.5%) to be roughly 0.60%.

Lose ½ mark for each missing bolded idea

b. A random sample of 36 one-pound specimens is taken and the moisture content of each is

measured. What is the distribution of the sample mean moisture content?

Mean of the distribution of the sample mean is 3.5%

Standard deviation of the sample mean is $\sigma/\sqrt{n} = 0.6/\sqrt{36} = 0.6/6 = 0.1$

So, distribution of the sample mean is Normal with a mean of 3.5% and standard deviation of 0.1%

Or N(3.5,0.1)

c. What is the probability that the sample mean of the 36 specimens in part b is larger than

3.8%?

Sample mean follows a N(3.5,0.1) distribution

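$$
\begin{aligned}
P(\bar{X} > 3.8) &= 1 - P(\bar{X} \le 3.8) \\
&= 1 - P\left(Z \le \frac{3.8 - 3.5}{0.1}\right) \\
&= 1 - P\left(Z \le \frac{0.3}{0.1}\right) \\
&= 1 - P(Z \le 3) = 1 - 0.9987 \\
&= 0.0013
\end{aligned}
$$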
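As a check on the table lookup, the same probability can be computed numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 3.5      # population mean moisture content (%)
sigma = 0.6   # population standard deviation (%)
n = 36        # number of one-pound specimens

se = sigma / sqrt(n)                   # standard deviation of the sample mean
sample_mean_dist = NormalDist(mu, se)  # distribution of the sample mean from part b

p_above = 1 - sample_mean_dist.cdf(3.8)  # P(sample mean > 3.8)
print(round(se, 4), round(p_above, 4))
```

The result agrees with the Table A answer of 0.0013 to four decimal places.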

d. Find the 99th percentile of the distribution of sample mean moisture content based on a

sample of 36 specimens as in part b.

From Table A, 99th percentile of the standard normal distribution is z=2.33

To get the 99th percentile, we set the standardized value to the 99th percentile of the standard

normal and solve for x.

$$
\begin{aligned}
2.33 &= \frac{x - \mu}{\sigma/\sqrt{n}} = \frac{x - 3.5}{0.1} \\
\Rightarrow 0.233 &= x - 3.5 \\
\Rightarrow x &= 3.5 + 0.233 = 3.733
\end{aligned}
$$

So, the 99th percentile is 3.733%
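The percentile can also be checked without the table. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library. Note that the unrounded z-value is about 2.326 rather than the table value of 2.33, so the software answer differs slightly in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.5, 0.6, 36
se = sigma / sqrt(n)  # standard deviation of the sample mean

z_99 = NormalDist().inv_cdf(0.99)              # unrounded z for the 99th percentile
x_99_exact = NormalDist(mu, se).inv_cdf(0.99)  # percentile from the unrounded z
x_99_table = mu + 2.33 * se                    # percentile from the rounded Table A value

print(round(z_99, 3), round(x_99_exact, 3), round(x_99_table, 3))
```

Either value would be read as a 99th percentile of roughly 3.73%.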

11. (6 marks) A simple game-of-chance at a high school fund-raising day used a single six-sided die

(Note: die is the singular of dice). It costs $2 to play the game. If after rolling the die the numbers

1 or 6 are showing, the player is given a brand new $5 bill. If the numbers 2-5 are showing the

player loses and has to do a silly dance.

a. What is the probability distribution for the expected monetary return of this game from

the player’s point of view?

X is the random variable denoting the gambler’s return.

X       -2     3
P(X)    4/6    2/6

To get full marks, must list outcomes and associated probabilities

b. What is the expected monetary return of this game from the player’s point of view?

$\mu_X = E(X) = \sum_{i=1}^{k} x_i \, p(x_i)$

E(X)=-2(4/6)+3(2/6)= -2/6 dollars or -0.33 dollars or -33 cents
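The expected value can be verified with exact fractions. This is a small sketch; the dictionary layout is just one convenient way to hold the distribution from part a.

```python
from fractions import Fraction

# Player's net return in dollars mapped to its probability (from part a)
distribution = {-2: Fraction(4, 6), 3: Fraction(2, 6)}

# A valid probability distribution must sum to 1
assert sum(distribution.values()) == 1

# E(X) = sum of x * p(x) over all outcomes
expected_return = sum(x * p for x, p in distribution.items())
print(expected_return)  # -1/3, the same as -2/6
```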

c. What is the minimum amount that the high school should charge to play the game if it is

to expect to make a profit?

Currently, they charge $2.

Let W be the random variable denoting the school’s profit.

Denote the amount the school charges as y.

W       y      y-5
P(W)    4/6    2/6

For the school to make a profit, E(W) must be greater than $0.

E(W)= y(4/6) +(y-5)(2/6) > 0

4y/6 + 2y/6 -10/6 >0

6y/6 -10/6 >0

6y – 10 >0

y>10/6

y>1.66666

So, the minimum amount they can charge and make a profit is $1.67 (you cannot charge a fraction

of a cent)
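The break-even price and the rounding up to a whole cent can be checked as follows. This is a sketch under the same assumptions as the solution above; the function name `school_expected_profit` is hypothetical.

```python
from fractions import Fraction
from math import floor

P_LOSE = Fraction(4, 6)  # player rolls 2-5, school keeps the fee
P_WIN = Fraction(2, 6)   # player rolls 1 or 6, school pays out $5

def school_expected_profit(price):
    """E(W) for a given entry price in dollars."""
    return price * P_LOSE + (price - 5) * P_WIN

# E(W) = price - 5 * P_WIN, so E(W) = 0 at price = 5 * P_WIN
break_even = 5 * P_WIN

# Smallest whole-cent price strictly above the break-even price
min_price_cents = floor(break_even * 100) + 1

print(break_even, min_price_cents)  # 5/3 dollars, 167 cents
```

Charging $1.66 gives a small expected loss per game, while $1.67 gives a small expected profit.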

12. (3 marks) Volunteers were given a 5×5 square puzzle to solve and the time it took them to solve it

was measured in seconds. The data recorded are listed below:

132, 141, 142, 143, 143, 147, 148, 149, 150, 158, 163

Find quartiles for these data.

1 mark each

There are n=11 observations

Q1: To get 25th percentile, compute np=11(.25)=2.75

The position of the 25th percentile in the sorted sample is 3.

Q1=142

Q2: The position of the median in the ordered sample is (n+1)/2=6

So, Q2=147

Q3: To get 75th percentile, compute np=11(.75)=8.25

The position of the 75th percentile in the sorted sample is 9.

Q3=150
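The position rule used above can be written as a short function. This is a sketch, not the only quartile convention: statistical software often interpolates and may give slightly different values. The branch for a whole-number position (averaging two neighbouring observations) is an assumption based on a common textbook convention; it is not needed for these data.

```python
from math import ceil

times = [132, 141, 142, 143, 143, 147, 148, 149, 150, 158, 163]  # already sorted

def percentile_by_position(sorted_data, p):
    """Percentile via the np rule: round n*p up to the next position."""
    n = len(sorted_data)
    position = n * p
    if position == int(position):
        # Assumed convention: average this observation and the next one
        k = int(position)
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    return sorted_data[ceil(position) - 1]  # convert 1-based position to 0-based index

q1 = percentile_by_position(times, 0.25)  # position 3
q2 = percentile_by_position(times, 0.50)  # position 6
q3 = percentile_by_position(times, 0.75)  # position 9
print(q1, q2, q3)
```

For the median, np = 5.5 rounds up to position 6, which matches the (n+1)/2 rule used in the solution.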

Formula Sheet

Descriptive Statistics
