Mean Absolute Deviation

Directions: Give an example of two sets of numbers that form identical box plots (also called box-and-whisker plots) but have different mean absolute deviation values.

Hint

What values of two sets of numbers have to be the same to form the same box plot?  What values are left that could be changed?

Answer

I believe there are infinitely many answers.  One example is {2, 4, 6, 8, 20} and {2, 4, 4, 4, 6, 6, 6, 8, 8, 8, 20}.  When each quartile is taken as the median of a half that includes the overall median, both sets have the five-number summary 2, 4, 6, 8, 20 and so produce identical box plots, but the first set has a mean absolute deviation of 4.8 while the second has a mean absolute deviation of about 2.98.
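The two mean absolute deviation values can be checked with a few lines of Python. This is only a minimal sketch; the helper name `mad` is illustrative and does not come from the original post.

```python
from statistics import fmean

def mad(data):
    """Mean absolute deviation: average distance of each value from the mean."""
    center = fmean(data)
    return fmean([abs(x - center) for x in data])

first = [2, 4, 6, 8, 20]
second = [2, 4, 4, 4, 6, 6, 6, 8, 8, 8, 20]

print(round(mad(first), 2))   # 4.8
print(round(mad(second), 2))  # 2.98
```

The second set has mean 76/11 (about 6.91), so its deviations are not whole numbers, which is why the result is only approximate.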

Source: Robert Kaplinsky with help from Pamela Franklin

17 comments

  1. Wouldn’t the mean absolute deviation of the second data set be approximately 2.98? The mean would be 6.91. Also the mean absolute deviation can never be negative, correct? Or am I missing something?

    • Robert Kaplinsky

      Thanks for catching this, Pamela. Here’s what I’m seeing:
      – I somehow calculated the MAD incorrectly. The first set has a MAD of 4.8 and the second, as you said, of ~2.98.
      – As for the MAD never being negative, you are correct. Perhaps the ~6.91 looked like a -6.91? The first one had a tilde for “approximately” and the second one had a negative sign.

      Either way, I’ll fix this accordingly.

  2. The mean absolute deviation of a dataset is the average distance between each data point and the mean. It gives us an idea about the variability in a dataset.

  7. How are those identical box plots? The first data set has a lower quartile of 3, while the second data set has a lower quartile of 4.
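Whether the two lower quartiles agree depends on the quartile convention, which the original answer does not state. The sketch below compares two common textbook conventions, both based on medians of the lower and upper halves; the function name `five_number` and its `include_median` flag are illustrative choices, not anything from the post.

```python
from statistics import median

def five_number(data, include_median):
    """Return (min, Q1, median, Q3, max) using medians of the two halves.

    For an odd-sized data set, include_median decides whether the middle
    value is counted in both halves (True) or in neither (False).
    """
    xs = sorted(data)
    half = len(xs) // 2
    if len(xs) % 2 == 1 and include_median:
        lower, upper = xs[:half + 1], xs[half:]
    else:
        lower, upper = xs[:half], xs[len(xs) - half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

first = [2, 4, 6, 8, 20]
second = [2, 4, 4, 4, 6, 6, 6, 8, 8, 8, 20]

# Median excluded from the halves: Q1 is 3 for the first set and 4 for the second.
print(five_number(first, False), five_number(second, False))
# Median included in both halves: both sets give Q1 = 4 and Q3 = 8.
print(five_number(first, True), five_number(second, True))
```

With the median excluded, the summaries differ (the first set also gets an upper quartile of 14 rather than 8), which matches the objection in this comment. With the median included, the two box plots coincide.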

  8. Rudolf Österreicher

    Here’s an easy way to construct two such sets:

    Take any set with an odd number of elements (*where the sum of the absolute distances between the quartiles and the mean is not equal to two times the MAD of the list), for example {2, 4, 6, 8, 10}. Here the sum of the absolute distances between the quartiles and the mean is |3-6| + |9-6| = 6, which is not equal to 2*MAD = 2*(|2-6| + |4-6| + |6-6| + |8-6| + |10-6|)/5 = 2*12/5 = 4.8.
    Now add the first and third quartiles as elements. In this example, q1 = 3 and q3 = 9, so the new list is {2, 3, 4, 6, 8, 9, 10}. By adding these, the minimum and maximum don’t change, and the median doesn’t change because you add one element on each side. The quartiles don’t change either: adding one number to each half would normally shift the 1st and 3rd quartiles by half a position, but because the added numbers are the quartiles themselves, the quartiles of the new list are the same as before.

    Similarly, take any set with an even number of elements (where the sum of the absolute distances between the quartiles and the mean and between the median and the mean is not equal to 3*MAD of the list), for example {2, 4, 6, 8}. Here the sum of the absolute distances between the quartiles and the mean and between the median and the mean is |3-5| + |5-5| + |7-5| = 4, which is not equal to 3*MAD = 3*(|2-5| + |4-5| + |6-5| + |8-5|)/4 = 3*8/4 = 6. Now add the 1st and 3rd quartiles and the median to the list. The new list will have the same boxplot and a different MAD.

    * Proof: Let s be the sum of the absolute distances between the n data points and the mean. Then MAD = s/n. Assuming the added values leave the mean unchanged (as they do in both examples above), adding the two quartiles q1 and q3 turns the MAD into (s + |q1 - mean| + |q3 - mean|)/(n+2).
    By solving the equation s/n = (s + |q1 - mean| + |q3 - mean|)/(n+2) for |q1 - mean| + |q3 - mean|, you can show that the MADs are different if |q1 - mean| + |q3 - mean| is not equal to 2s/n = 2*MAD (or, in the case of also adding the median to a list with an even number of elements, if |q1 - mean| + |median - mean| + |q3 - mean| is not equal to 3s/n = 3*MAD).

    If this criterion is not met, the MAD will stay the same. For example, the quartiles of the list {1, 1, 1, 1} are q1 = 1, q2 = median = 1 and q3 = 1, which together have a distance of 0 from the mean, which is equal to 3*MAD = 3*0 = 0. So the boxplot of {1, 1, 1, 1, 1, 1, 1} is the same, but so is the MAD.

    • Rudolf Österreicher

      As you already conjectured, there are infinitely many solutions, so here are just a few examples constructed the way that I described in my comment above (including examples with decimal numbers, negative numbers, repeated numbers, outliers and samples with only unique numbers):

      {1, 1, 2, 2} and {1, 1, 1, 1.5, 2, 2, 2}

      {2, 4, 6, 8, 20} and {2, 3, 4, 6, 8, 14, 20}

      {0, 1.5, 1.5, 2, 2, 2} and {0, 1.5, 1.5, 1.5, 1.75, 2, 2, 2, 2}

      {1, 2, 3, 4, 100} and {1, 1.5, 2, 3, 4, 52, 100}

      {-1, 1, 2, 6, 12} and {-1, 0, 1, 2, 6, 9, 12}

    • Rudolf Österreicher

      Quick note, since the definition of the quartiles is not universally agreed upon: I’m specifically using type 2 quartiles (R’s quantile type 2, the averaged inverted CDF definition), because that’s the one I’m used to.
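The construction from this comment can be sketched in Python as follows. The function names are illustrative, and the quartile rule is an assumption: the sketch takes each quartile as the median of a half that excludes the overall median, which reproduces q1 = 3 and q3 = 9 for {2, 4, 6, 8, 10} but is not necessarily the definition the commenter names.

```python
from statistics import fmean, median

def five_number(data):
    """(min, Q1, median, Q3, max); quartiles are medians of the halves,
    with the middle value left out of both halves when the size is odd."""
    xs = sorted(data)
    half = len(xs) // 2
    return (xs[0], median(xs[:half]), median(xs),
            median(xs[len(xs) - half:]), xs[-1])

def mad(data):
    """Mean absolute deviation around the mean."""
    center = fmean(data)
    return fmean([abs(x - center) for x in data])

def add_quartiles(data):
    """Append Q1 and Q3 (plus the median when the size is even)."""
    _, q1, med, q3, _ = five_number(data)
    extra = [q1, q3] if len(data) % 2 == 1 else [q1, med, q3]
    return sorted(list(data) + extra)

for original in ([2, 4, 6, 8, 10], [2, 4, 6, 8], [1, 1, 1, 1]):
    extended = add_quartiles(original)
    same_plot = five_number(original) == five_number(extended)
    print(original, same_plot, round(mad(original), 3), round(mad(extended), 3))
```

For the first two lists the box plot is unchanged while the MAD moves (2.4 to about 2.571, and 2.0 to about 1.714). For {1, 1, 1, 1} both stay the same, which is the exceptional case described above.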

  9. Rudolf Österreicher

    If we restrict ourselves to non-negative integers only (not written as fractions), the two data sets necessarily have to share numbers (because both need to have the same minimum and the same maximum). But even so, there are many solutions for two or more data sets that only contain integers between 0 and 9 (and at most once in each data set) and have the same boxplot. There are 99 different boxplots that describe two or more such data sets. Here are 4 examples of data sets that only contain distinct numbers between 0 and 9 and produce the same boxplot:

    (0, 3, 5, 7, 9) and (0, 3, 4, 6, 7, 9) and (0, 3, 4, 5, 6, 7, 9) and (0, 1, 3, 4, 5, 6, 7, 8, 9) and (0, 2, 3, 4, 5, 6, 7, 8, 9)

    (1, 3, 7, 9) / (1, 2, 3, 7, 8, 9) and (1, 2, 5, 8, 9) and (1, 2, 4, 6, 8, 9) and (1, 2, 3, 5, 6, 8, 9) and (1, 2, 3, 5, 7, 8, 9) and (1, 2, 4, 5, 6, 8, 9) and (1, 2, 4, 5, 7, 8, 9)

    (0, 2, 4, 6, 9) and (0, 2, 3, 5, 6, 9) and (0, 2, 3, 4, 5, 6, 9) and (0, 1, 2, 3, 4, 5, 6, 7, 9) and (0, 1, 2, 3, 4, 5, 6, 8, 9)

    (0, 2, 4, 7, 9) and (0, 2, 3, 5, 7, 9) and (0, 2, 3, 4, 5, 7, 9) and (0, 2, 3, 4, 6, 7, 9) and (0, 1, 2, 3, 4, 5, 7, 8, 9) and (0, 1, 2, 3, 4, 6, 7, 8, 9)

    • Rudolf Österreicher

      And they have different MADs, of course, except for the two data sets separated by a forward slash.

    • Rudolf Österreicher

      I mistakenly used the “wrong” definition for the quartiles to calculate these. With the Moore & McCabe / TI-83 definition of the quartiles, we get 174 different boxplots that represent 2 or more different data sets that only contain the digits 0-9 at most once per set.

      Here are 4 examples of sets of data sets where each data set only contains distinct numbers between 0 and 9 and all data sets in a set produce the same boxplot (now with the Moore & McCabe-definition of the quartiles), but have different MADs (except for the ones separated by forward slashes):

      (0, 1, 2, 4, 5, 7, 8) / (0, 1, 3, 4, 6, 7, 8) and (0, 1, 2, 4, 6, 7, 8) and (0, 1, 3, 4, 5, 7, 8) and (0, 2, 6, 8) / (0, 1, 2, 6, 7, 8) and (0, 1, 3, 5, 7, 8) and (0, 2, 4, 6, 8)

      (1, 3, 7, 9) / (1, 2, 3, 7, 8, 9) and (1, 3, 5, 7, 9) and (1, 2, 4, 6, 8, 9) and (1, 2, 3, 5, 7, 8, 9) and (1, 2, 4, 5, 6, 8, 9) and (1, 2, 4, 5, 7, 8, 9) / (1, 2, 3, 5, 6, 8, 9)

      (0, 2, 4, 7, 9) and (0, 1, 2, 4, 5, 8, 9) and (0, 1, 2, 4, 6, 8, 9) and (0, 1, 2, 4, 7, 8, 9) and (0, 1, 3, 4, 5, 8, 9) and (0, 1, 3, 4, 6, 8, 9) and (0, 1, 3, 4, 7, 8, 9) and (0, 1, 2, 6, 8, 9) and (0, 1, 3, 5, 8, 9)

      (0, 2, 5, 7, 9) and (0, 1, 2, 5, 6, 8, 9) and (0, 1, 2, 5, 7, 8, 9) and (0, 1, 3, 5, 6, 8, 9) and (0, 1, 3, 5, 7, 8, 9) and (0, 1, 4, 5, 6, 8, 9) and (0, 1, 4, 5, 7, 8, 9) and (0, 1, 3, 7, 8, 9) and (0, 1, 4, 6, 8, 9)
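A brute-force search along these lines can be sketched in Python. Several things here are assumptions rather than details from the comment: the quartiles are medians of the halves with the middle value excluded (my reading of the Moore & McCabe / TI-83 convention), only data sets with at least two digits are considered, and all names are illustrative. The count it produces may therefore differ from the figure quoted above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations
from statistics import median

def five_number(xs):
    """(min, Q1, median, Q3, max) for an already sorted sequence."""
    half = len(xs) // 2
    return (xs[0], median(xs[:half]), median(xs),
            median(xs[len(xs) - half:]), xs[-1])

def mad_exact(xs):
    """Mean absolute deviation as an exact fraction, safe to compare with ==."""
    center = Fraction(sum(xs), len(xs))
    return sum(abs(x - center) for x in xs) / len(xs)

def shared_box_plots(min_size=2):
    """Group all subsets of the digits 0-9 by box plot and keep the
    box plots that describe two or more data sets."""
    groups = defaultdict(list)
    for size in range(min_size, 11):
        for subset in combinations(range(10), size):  # tuples come out sorted
            groups[five_number(subset)].append(subset)
    return {plot: sets for plot, sets in groups.items() if len(sets) >= 2}

shared = shared_box_plots()
for data in shared[five_number((0, 2, 6, 8))]:
    print(data, mad_exact(data))
```

The loop at the end lists every data set that shares a box plot with (0, 2, 6, 8), together with its exact MAD, so pairs with equal MADs (the ones separated by forward slashes above) are easy to spot.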
