HW1_240_Solutions

A study of a very large number of pregnant women in Arkansas reports that the women gained, on average, \(14\) pounds during their pregnancy and that \(18\%\) of the women smoked. Which of the following is not a variable in this study?

a. Pregnancy status

Smoking status
Weight gain

The two variables in the Arkansas study are:

both categorical variables
both quantitative variables

c. one categorical variable and one quantitative variable

The Statistical Abstract of the United States, prepared by the Census Bureau, provides the number of single-organ transplants for the year \(2010\), by organ. The next two exercises are based on the following table:

\[ \begin{array}{|c|c|} \hline \text{Heart} & 2333 \\ \hline \text{Lung} & 1770 \\ \hline \text{Liver} & 6291 \\ \hline \text{Kidney} & 16898 \\ \hline \text{Pancreas} & 350 \\ \hline \text{Intestine} & 151 \\ \hline \end{array} \]

The data on single-organ transplants can be displayed in:

a pie chart but not a bar graph
a bar graph but not a pie chart

c. either a pie chart or a bar graph

Kidney transplants represented what percent of single-organ transplants in \(2010\)?

a. Nearly \(61\%\)

One-sixth (nearly \(17\%\))
This percent cannot be calculated from the information provided in the table
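The kidney share can be checked directly from the table. The following is a minimal Python sketch; the names `transplants`, `total`, and `kidney_pct` are illustrative choices, not part of the original exercise.

```python
# Single-organ transplant counts for 2010, copied from the table above.
transplants = {
    "Heart": 2333,
    "Lung": 1770,
    "Liver": 6291,
    "Kidney": 16898,
    "Pancreas": 350,
    "Intestine": 151,
}

# The table lists every organ category, so the total is the sum of all counts.
total = sum(transplants.values())
kidney_pct = 100 * transplants["Kidney"] / total

print(f"Total single-organ transplants: {total}")
print(f"Kidney share: {kidney_pct:.1f}%")
```

Because the table covers all single-organ transplants, the percentage can be computed from the table alone, which rules out the third option.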

The graphic below shows the percent of adults in the world who are overweight or obese, by type of country of residence based on that country’s income level. The following two exercises are based on this figure:

The graph above is:

a bar graph that can be made into one pie chart

b. a bar graph that cannot be made into one pie chart

a histogram with a clear right skew

Which of the following conclusions can be reached from the graph above?

The majority of adults who are overweight or obese live in high-income countries

b. The majority of adults who live in high-income countries are overweight or obese

Both conclusions are correct

Below is a histogram of the takeoff angles of \(54\) videotaped jumps of adult hedgehog fleas, Archaeopsylla erinacei. The following two exercises are based on this histogram:

What percent of jumps have a takeoff angle of \(35\) degrees or less?

\[{10 \over 54} \approx 0.185 = 18.5\%\]

The shape of the distribution of takeoff angles in the graph above is:

a. skewed to the right

roughly symmetric
skewed to the left

Researchers examined a new treatment for advanced ovarian cancer in a mouse model. They created a nanoparticle-based delivery system for a suicide gene therapy to be delivered directly to the tumor cells. The grafted tumors were injected either with the new treatment or with only some buffer solution to serve as a comparison. The following data give the fold increase in tumor size after two weeks in \(20\) mice. A \(1\) represents no change; a \(2\) represents a doubling in volume of the tumor.

\[ \begin{array}{|c|} \hline \text{Buffer Solution}\\ \hline 9.1 \quad 8.1 \quad 7.8 \quad 7.0 \quad 6.8 \quad 5.4 \quad 5.4 \quad 4.1 \quad 3.8 \quad 3.3\\ \hline \end{array} \]

\[ \begin{array}{|c|} \hline \text{Nanoparticle-delivered gene therapy}\\ \hline 4.1 \quad 3.5 \quad 2.1 \quad 2.1 \quad 1.8 \quad 1.8 \quad 1.4 \quad 1.2 \quad 1.1 \quad 1.1\\ \hline \end{array} \]

Make two dotplots, one for each group, using the same scale on the horizontal axis for both. Describe the distribution of tumor increase in each treatment group.

The buffer solution data are approximately symmetric, widely spread (from \(3.3\) to \(9.1\)), and centered well to the right of the gene therapy data. The gene therapy data are right-skewed, have a much smaller spread (from \(1.1\) to \(4.1\)), and are concentrated close to \(1\).
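The two dotplots on a shared horizontal axis can be sketched in plain text. This is a minimal Python illustration using only the standard library; the helper names `dotplot_rows` and `axis_rows`, the axis limits \(1.0\) to \(9.5\), and the \(0.1\) column width are assumptions made for this sketch, not part of the original solution.

```python
from collections import Counter

# Fold increase in tumor size after two weeks, copied from the tables above.
buffer = [9.1, 8.1, 7.8, 7.0, 6.8, 5.4, 5.4, 4.1, 3.8, 3.3]
therapy = [4.1, 3.5, 2.1, 2.1, 1.8, 1.8, 1.4, 1.2, 1.1, 1.1]

LO, HI, STEP = 1.0, 9.5, 0.1  # shared axis: same scale for both groups


def dotplot_rows(values, lo, hi, step=STEP):
    """Return the rows (top to bottom) of a stacked text dotplot on lo..hi."""
    n_cols = round((hi - lo) / step) + 1
    # Column index of each value on the shared axis; repeats stack vertically.
    counts = Counter(round((v - lo) / step) for v in values)
    height = max(counts.values())
    return [
        "".join("o" if counts[c] >= level else " " for c in range(n_cols))
        for level in range(height, 0, -1)
    ]


def axis_rows(lo, hi, step=STEP):
    """Return a tick row and a label row; assumes lo is a whole number."""
    n_cols = round((hi - lo) / step) + 1
    per_unit = round(1 / step)
    ticks = "".join("+" if c % per_unit == 0 else "-" for c in range(n_cols))
    labels = "".join(
        str(int(lo) + c // per_unit) if c % per_unit == 0 else " "
        for c in range(n_cols)
    )
    return [ticks, labels]


for name, values in [("Buffer solution", buffer), ("Gene therapy", therapy)]:
    print(name)
    for row in dotplot_rows(values, LO, HI) + axis_rows(LO, HI):
        print(row)
    print()
```

Using the same `LO` and `HI` for both groups is what makes the plots directly comparable: the gene therapy dots cluster at the left end of the axis, while the buffer dots spread across most of it.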

Report the approximate midpoints of both groups. What are the most important differences between the two groups? What can you conclude from the study findings?

\[\text{Median Buffer}=6.1\]

\[\bar X_{\text{Buffer}}=6.08\]

\[\text{Median Trt}=1.8\]

\[\bar X_{\text{Trt}}=2.02\]

Both the dotplots and the midpoints show a clear treatment effect: tumors treated with the gene therapy grew far less than tumors given only the buffer solution. The gene therapy data are also much less spread out and are concentrated near \(1\), that is, near no change in tumor size.
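The midpoints of the two groups can be recomputed with Python's standard `statistics` module. This is a small sketch for checking the arithmetic; the variable names are illustrative.

```python
from statistics import mean, median

# Fold increase in tumor size after two weeks, copied from the tables above.
buffer = [9.1, 8.1, 7.8, 7.0, 6.8, 5.4, 5.4, 4.1, 3.8, 3.3]
therapy = [4.1, 3.5, 2.1, 2.1, 1.8, 1.8, 1.4, 1.2, 1.1, 1.1]

for name, values in [("Buffer", buffer), ("Trt", therapy)]:
    # With 10 values per group, the median is the average of the 5th and 6th
    # sorted values.
    print(f"{name}: median = {median(values):.2f}, mean = {mean(values):.2f}")
```

In a right-skewed group the mean is pulled above the median by the few large values, so comparing the two numbers within each group is a quick check on the shapes described earlier.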

Spider silk is the strongest known material, natural or man-made, on a weight basis. A study examined the mechanical properties of spider silk using \(21\) female golden orb weavers, Nephila clavipes. Here are data on silk yield stress, which represents the amount of force per unit area needed to reach permanent deformation of the silk strand. The data are expressed in megapascals (MPa):

\[ \begin{array}{|c|c|c|c|c|c|c|} \hline 164.0 & 478.7 & 251.3 & 351.7 & 173.0 & 448.9 & 300.6\\ \hline 362.0 & 272.4 & 740.2 & 329.0 & 327.2 & 270.5 & 332.1\\ \hline 288.8 & 176.1 & 282.2 & 236.1 & 358.2 & 270.5 & 290.7\\ \hline \end{array} \]

Find the mean and median yield stress. Compare these two values.

\[\bar x={1\over n}\sum_{i=1}^n x_i = {1 \over 21}(164.0 + 478.7 + \ ... \ +290.7)=319.2476\]

\[\text{Median} = 290.7\]

The mean exceeds the median (\(319.2 > 290.7\)), which indicates a right-skewed distribution: most values fall between \(200\) and \(400\) MPa, and a few large values pull the mean upward. The median better represents where the bulk of the data lies, while the mean also reflects the influence of the extreme values.
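The mean and median can be verified with a few lines of Python. This is a minimal sketch using the standard `statistics` module; `yield_stress` is an illustrative variable name.

```python
from statistics import mean, median

# Silk yield stress in MPa for the 21 spiders, copied row by row from the table.
yield_stress = [
    164.0, 478.7, 251.3, 351.7, 173.0, 448.9, 300.6,
    362.0, 272.4, 740.2, 329.0, 327.2, 270.5, 332.1,
    288.8, 176.1, 282.2, 236.1, 358.2, 270.5, 290.7,
]

x_bar = mean(yield_stress)
med = median(yield_stress)  # with n = 21, this is the 11th sorted value

# How many observations lie in the 200-400 MPa range mentioned above.
in_bulk = sum(200 <= x <= 400 for x in yield_stress)

print(f"n = {len(yield_stress)}")
print(f"mean = {x_bar:.4f} MPa, median = {med} MPa")
print(f"values between 200 and 400 MPa: {in_bulk}")
```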

Find the standard deviation in yield stress. Interpret your results in reference to your results from (11).

\[ \begin{aligned} s &= \sqrt{{1\over n-1}\sum_{i=1}^n(x_i-\bar x)^2} \\ &= \sqrt{{1\over 20}\left((164.0-319.2476)^2+(478.7-319.2476)^2 + \ ... \ + (290.7-319.2476)^2\right)} \\ &= \sqrt{{1\over 20}\cdot 312078.9}=\sqrt{15603.95}=124.9158 \end{aligned} \]

The standard deviation is large relative to the mean (\(\approx 39\%\) of the mean), which reflects how strongly the single extreme value in the data set inflates the spread. Because of this, the median is the better summary of the bulk of the data. If the mean is reported instead, it should be accompanied by a histogram so that the shape of the distribution is visible.

Fun fact: If we exclude the highest value in the data set, \(740.2\), the standard deviation drops to \(81.44\).
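The standard deviation, its size relative to the mean, and the effect of dropping the largest value can be reproduced as follows. This is a sketch using Python's standard `statistics` module, whose `stdev` function uses the sample (\(n-1\)) denominator, matching the formula above; the variable names are illustrative.

```python
from statistics import mean, stdev

# Silk yield stress in MPa for the 21 spiders, copied row by row from the table.
yield_stress = [
    164.0, 478.7, 251.3, 351.7, 173.0, 448.9, 300.6,
    362.0, 272.4, 740.2, 329.0, 327.2, 270.5, 332.1,
    288.8, 176.1, 282.2, 236.1, 358.2, 270.5, 290.7,
]

s_all = stdev(yield_stress)            # sample standard deviation, n - 1 = 20
ratio = s_all / mean(yield_stress)     # spread relative to the mean

# Drop the single largest observation (740.2) and recompute.
largest = max(yield_stress)
without_max = [x for x in yield_stress if x != largest]
s_without = stdev(without_max)

print(f"s (all 21 values)     = {s_all:.4f} MPa ({ratio:.0%} of the mean)")
print(f"s (without {largest}) = {s_without:.2f} MPa")
```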

2025-01-27