Individual Incomes. In this assignment, you will use the 1,000-observation subset of income data from the Bureau of Labor Statistics covering over 50,000 randomly selected U.S. households. This is the same data you used in the case study exercise for Week 1. In this case study we want to illustrate sampling variability and sampling distributions by treating these 1,000 people as a population. You will use this data to generate simple random samples (SRSs), as well as the associated descriptive statistics for each sample. You will then compare the sample-based results to those that you generated in Week 1 for the entire data set. You should perform the analysis using software and prepare a report comparing results from the different samples to each other as well as to the results from the entire data set.
An IQ test. The Wechsler Adult Intelligence Scale (WAIS) is a common “IQ test” for adults. The distribution of WAIS scores for persons over 16 years of age is approximately Normal with mean 100 and standard deviation 15. In this case study you will compare the probability that an individual's IQ falls in a certain range with the probability that the average IQ of a group falls in that range.
In the optional case study you will examine portfolio returns and variability of returns.
You will be analyzing earnings for this exercise. Open the data in the software. Use the posted 1,000 observations.
A. Describe the distribution of the population of 1,000 people. Include a histogram and the five-number summary. Follow the instructions in the end-of-chapter appendix on software and the Excel documentation at the publisher’s StatsPortal Web site to generate the samples. Specifically, in Excel you can use Tools > Data Analysis > Sampling to generate SRSs; in Minitab, you can use Calc > Random data > Sample from columns.
B. Samples and sampling distributions. Choose an SRS of 30 members from this population. Make a histogram of the 30 incomes in the sample and find the five-number summary. Briefly compare the shape, center, and spread of the income distributions in the sample and in the population of 1,000 people. Then repeat the process of choosing an SRS of size 30 four more times (five in all). Does it seem reasonable to you from this small trial that an SRS of 30 people will usually produce a sample whose shape is generally representative of the population?
C. Statistical estimation. Do the medians and quartiles of the sample provide reasonable estimates of the population median and quartiles? Explain why we expect that the minimum and maximum of a sample will not satisfactorily estimate the population minimum and maximum. Now examine estimation of mean income in more detail. Use your software to produce an additional 25 (a total of 30 with the 5 from B) SRSs of size 30 from this population. Find the mean income for each sample and save these 30 sample means in a new column. Make a histogram of the distribution of the 30 sample means. How do the shape, center, and spread of this distribution of sample means compare with the distribution of individual incomes from part A? Does it appear that the sample mean from an SRS of size 30 is usually a reasonable estimator of the population mean? (The sampling distribution of the sample mean for samples of size 30 from this population is the distribution of the means of all possible samples. Your 30 samples give a rough idea of the nature of the sampling distribution.)
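For readers working outside Excel or Minitab, the sampling steps in parts B and C can be sketched in Python using only the standard library. This is a minimal sketch, not the required method: the helper names `five_number_summary` and `sample_means` are hypothetical, and the `population` list is a synthetic placeholder (lognormal values chosen only so the code runs) that you would replace with the posted 1,000 earnings. Quartile conventions differ between packages, so the quartiles here may not match Excel or Minitab exactly.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def sample_means(population, sample_size=30, n_samples=30, seed=0):
    """Draw n_samples SRSs (without replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# ASSUMPTION: synthetic, right-skewed placeholder data, NOT the course data.
# Replace this list with the 1,000 posted earnings observations.
placeholder_rng = random.Random(1)
population = [placeholder_rng.lognormvariate(10.5, 0.8) for _ in range(1000)]

means = sample_means(population)

print("Population five-number summary:", five_number_summary(population))
print("Sample-means five-number summary:", five_number_summary(means))
print("SD of individual incomes:", statistics.pstdev(population))
print("SD of the 30 sample means:", statistics.stdev(means))
```

Comparing the last two printed values gives a rough sense of how much less the sample means vary than individual incomes do; a histogram of `means` can then be made in whatever plotting tool you prefer.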
Using the distribution of WAIS scores, Normal with mean 100 and standard deviation 15, answer the following questions:
A. What is the probability that a randomly chosen individual has a WAIS score of 105 or higher?
B. What are the mean and standard deviation of the average WAIS score for an SRS of 60 people?
C. What is the probability that the average WAIS score of an SRS of 60 people is 105 or higher?
D. Would your answer to any of (A), (B), or (C) be affected if the distribution of WAIS scores in the adult population were distinctly non-Normal?
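The calculations in parts A through C can be checked with a short Python sketch, shown here as one possible approach rather than the required one. It assumes the stated Normal model and uses the standard result that the mean of an SRS of size n has standard deviation σ/√n; the constant names `MU`, `SIGMA`, and `N` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 100, 15, 60  # population mean, population SD, sample size

# Distribution of a single WAIS score.
individual = NormalDist(MU, SIGMA)

# Distribution of the average score for an SRS of N people:
# same mean, standard deviation SIGMA / sqrt(N).
sample_mean = NormalDist(MU, SIGMA / sqrt(N))

p_individual = 1 - individual.cdf(105)  # part A
p_mean = 1 - sample_mean.cdf(105)       # part C

print(f"SD of the sample mean: {sample_mean.stdev:.4f}")
print(f"P(individual score >= 105): {p_individual:.4f}")
print(f"P(sample mean >= 105): {p_mean:.4f}")
```

For part D, note which of these lines relies on the individual scores being Normal and which relies only on the mean and standard deviation, with the central limit theorem supporting the Normal approximation for the average of 60 scores.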
Prepare a business report presenting the conclusions of the analysis, incorporating software results as evidence. Summarize your results using the measures learned this week as a guide. Specifically, compare the graphic and numeric descriptive statistics for the entire population to each of the sample-based values, and compare the samples to each other. Please discuss the sampling distribution of the sample mean as asked for in part (C) of the exercise.