For this week you should read all of Chapter 10 of the textbook. When you read text that involves running R scripts, you are expected to run the code yourself on your computer, in parallel with reading it in the textbook, and compare what you get with the output presented in the textbook.
Chapter 10: Point Estimation
Performance of estimators is assessed in the context of a theoretical model for the sampling distribution of the observations. Given a criterion for optimality, an optimal estimator is an estimator that performs better than any other estimator with respect to that criterion. A robust estimator, on the other hand, is an estimator that is not sensitive to misspecification of the theoretical model. Hence, a robust estimator may be somewhat inferior to an optimal estimator in the context of an assumed model. However, if in actuality the assumed model is not a good description of reality, then the robust estimator will tend to perform better than the estimator denoted optimal.
Some say that optimal estimators should be preferred while others advocate the use of more robust estimators. What is your opinion?
When you formulate your answer to this question it may be useful to come up with an example from your own field of interest. Think of an estimation problem and possible estimators that can be used in the context of this problem. Try to identify a model that is natural to this problem and ask yourself in what ways this model may fail to describe the real situation in the estimation problem.
As an example, consider estimation of the expectation of a Uniform measurement. We demonstrated that the mid-range estimator is better than the sample average if indeed the measurements emerge from the Uniform distribution. However, if the modeling assumption is wrong then this may no longer be the case. If the distribution of the measurement, in actuality, is not symmetric, or if the distribution is more concentrated in the center than in the tails, then the performance of the mid-range estimator may deteriorate. The sample average, on the other hand, is not sensitive to the distribution not being symmetric.
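The comparison in this example can be explored with a small simulation. The sketch below is only an illustration under assumed settings: the sample size of 100, the 10^4 repetitions, the Uniform(0, 1) model, and the choice of Exponential(1) as the "wrong" skewed alternative are all assumptions made here, not values taken from the textbook. The helper names mid.range and sim.mse are hypothetical.

```r
# Compare the sample average and the mid-range as estimators of the expectation.
set.seed(1)    # arbitrary seed, used only to make the run reproducible
n <- 100       # sample size (assumed for illustration)
reps <- 10^4   # number of simulated samples (assumed for illustration)

# Mid-range: the average of the smallest and the largest observation
mid.range <- function(x) (max(x) + min(x)) / 2

# Simulate both estimators and return their estimated mean square errors.
# rdist generates a sample of a given size; mu is the true expectation.
sim.mse <- function(rdist, mu) {
  avg <- rep(0, reps)
  mid <- rep(0, reps)
  for (i in 1:reps) {
    x <- rdist(n)
    avg[i] <- mean(x)
    mid[i] <- mid.range(x)
  }
  c(average = mean((avg - mu)^2), mid.range = mean((mid - mu)^2))
}

# The model holds: Uniform(0, 1), with expectation 1/2
mse.unif <- sim.mse(function(k) runif(k, 0, 1), mu = 0.5)

# The model fails: Exponential(1), a skewed distribution with expectation 1
mse.exp <- sim.mse(function(k) rexp(k, 1), mu = 1)

mse.unif
mse.exp
```

Under the Uniform model the mid-range should show the smaller MSE, while under the skewed alternative the sample average should do better. Other alternatives (for example, a distribution concentrated in the center) can be tried by passing a different generator to sim.mse.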
In this assignment we consider data collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City in Taiwan. About every three months the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The current assignment involves data collected on a random sample of 748 donors. The data was obtained from the UCI Machine Learning Repository. This data was assembled by Prof. I-Cheng Yeh.
The file “transfusion.csv” contains the data. The file is attached below. The file contains 5 variables:
- recency = The number of months since the last donation. (numeric)
- frequency = The total number of donations. (numeric)
- monetary = Total blood donated (in c.c.). (numeric)
- time = The number of months since the first donation. (numeric)
- march2007 = An indicator. Indicates those that donated blood in March, 2007. (factor)
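A minimal sketch of loading the file and checking its structure is given here. It assumes that “transfusion.csv” has been saved in R's working directory and that its header row uses the five variable names listed above. The function name read.transfusion is hypothetical and introduced only for this illustration.

```r
# Read the data and store the indicator march2007 as a factor.
# Assumption: the file has a header row with the names
# recency, frequency, monetary, time, march2007.
read.transfusion <- function(path = "transfusion.csv") {
  dat <- read.csv(path)
  dat$march2007 <- factor(dat$march2007)
  dat
}

# Only attempt to read the file if it is found in the working directory
if (file.exists("transfusion.csv")) {
  transfusion <- read.transfusion()
  str(transfusion)      # variable names and types
  summary(transfusion)  # basic summaries of each variable
}
```

If the file is not found, getwd() shows the directory in which R is looking for it.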
In this assignment we consider the variables frequency and monetary.
Descriptive Statistics
Save the data set on your computer and read it into R. Compute the mean, the median, the interquartile range, and the standard deviation of the variable frequency, and plot its histogram. In Tasks 1-3 you are asked to describe the distribution of this variable on the basis of the computations and the plot.
Estimating Parameters
In Tasks 4-6 you are asked to estimate the expectation and standard deviation of the variable frequency. An estimator is used to estimate the expectation. This estimator has a standard deviation. You are required to estimate this standard error, which is the standard deviation of the estimator. You are required to describe which estimator was used for each estimation task.
Estimating the MSE
Consider the variable monetary. We assume that the distribution of this variable is Exponential(λ) and are interested in the estimation of the parameter λ. The proposed estimator is 1/X̄, where X̄ is the sample average. In Tasks 7-8 you are required to estimate the value of the parameter and estimate the mean square error (MSE) of the estimator.
You should estimate the MSE as follows. First, estimate the parameter λ from the given data. Then run a simulation to compute the MSE using as parameter the value of λ you have estimated.
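The two-step procedure can be sketched as follows. This is an outline only: the value assigned to lambda.hat below is a placeholder chosen for illustration, not the estimate from the data, and should be replaced by the value computed from the variable monetary. The number of repetitions is also an assumption.

```r
# Estimate the MSE of the estimator 1/(sample average) by simulation.
set.seed(1)          # arbitrary seed, used only to make the run reproducible
n <- 748             # number of donors in the sample
lambda.hat <- 0.001  # PLACEHOLDER value; replace with the estimate from the data
reps <- 10^4         # number of simulated samples (assumed for illustration)

est <- rep(0, reps)
for (i in 1:reps) {
  x <- rexp(n, lambda.hat)  # simulated sample from Exponential(lambda.hat)
  est[i] <- 1 / mean(x)     # the estimator applied to the simulated sample
}

# MSE: the average squared distance between the estimates and the
# parameter value that was used to generate the samples
mse <- mean((est - lambda.hat)^2)
mse
```

The simulation treats the estimated value as if it were the true parameter, so the resulting MSE is itself an estimate.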
Submitting the Assignment
For the assignment you should complete the following 8 tasks. Tasks 1-3 refer to the descriptive statistics problem presented above, Tasks 4-6 refer to the problem of estimating parameters, and Tasks 7-8 refer to estimating the parameter of an Exponential distribution and estimating the MSE of the estimator.
Your answers should be short and clear. We recommend that you copy and paste the tasks below into the form titled “Submit your Assignment using this Form”. You can then write your answers to the tasks in the designated positions that are marked in the text:
Tasks
Descriptive Statistics:
1. The distribution of the variable “frequency” is:
__ Skewed to the left, __ Symmetric, __ Skewed to the right.
Mark the most appropriate option and explain your selection
2. The number of outlier observations in the variable “frequency” is: _____.
Explain each step in the computation of the number of outlier observations
3. Which of the following theoretical models is most appropriate to describe the distribution of the variable “frequency”?
__ Binomial, __ Poisson, __ Uniform, __ Exponential, __ Normal.
Mark the most appropriate option and explain your selection
Estimating Parameters:
4. The estimated value of the expectation of the measurement “frequency” is: _____.
Explain your answer
5. The estimated value of the standard deviation of the measurement “frequency” is: _____.
Explain your answer
6. The estimated value of the standard deviation of the estimator that produced the estimate in 4. is: _____.
Explain your answer
Estimating the MSE:
7. The estimated value of λ for the variable “monetary” is: ____.
Attach the R code for conducting the computation
8. The estimated value of the MSE of the estimator of λ is: ____.
Attach the R code for conducting the computation
The Learning Journal should be updated regularly (on a weekly basis), as the learning journals will be assessed by your instructor as part of your Final Grade.
Your learning journal entry must be a reflective statement that considers the following questions:
1. Describe what you did. This does not mean that you copy and paste from what you have posted or the assignments you have prepared. You need to describe what you did and how you did it.
2. Describe your reactions to what you did.
3. Describe any feedback you received or any specific interactions you had. Discuss how they were helpful.
4. Describe your feelings and attitudes.
5. Describe what you learned.
Another set of questions to consider in your learning journal statement includes:
1. What surprised me or caused me to wonder?
2. What happened that felt particularly challenging? Why was it challenging to me?
3. What skills and knowledge do I recognize that I am gaining?
4. What am I realizing about myself as a learner?
5. In what ways am I able to apply the ideas and concepts gained to my own experience?
Finally, describe one important thing that you are thinking about in relation to the activity.
Your Learning Journal should be a minimum of 500 words.