# cumulative frequency graphs - grouped data?

Hi everyone,

I'm drawing a cumulative frequency table & graph for coursework comparing the lengths of words in a newspaper and a magazine. I've read in my study guide that data must be grouped before it can be processed into a cumulative frequency table and graph.

I understand the reason for this when the range of data is large - but my data ranges from 1 letter to the biggest word I found, which has 15 letters. Must I group this relatively small data set if I want to plot a cumulative frequency graph? Will I get lower marks if I don't? I could plot it in 5 groups of 3 letters - but I don't see the point of this considering that 1 - 15 isn't that large. I'd be grateful to know what you think about whether or not grouping the data is necessary!

Thanks

Gav

##### Share on other sites

I am not familiar with your mark scheme, which will have the final say on it :@

But I would have thought that if you justify your grouping (so you would have 15 groups, with one word length in each group) then you would not lose marks.
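For anyone who wants to see what the 15 single-length groups look like as a table, here is a minimal sketch. The word lengths are made up for illustration (they are not Gav's data), and the function name `cumulative_frequency_table` is just a hypothetical helper.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample of word lengths (made up, not the poster's data)
word_lengths = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 7, 9, 15]

def cumulative_frequency_table(lengths, max_length=15):
    """Return (length, frequency, cumulative frequency) for lengths 1..max_length."""
    counts = Counter(lengths)
    classes = list(range(1, max_length + 1))
    freqs = [counts.get(c, 0) for c in classes]   # 0 for lengths that never occur
    cum = list(accumulate(freqs))                 # running total of the frequencies
    return list(zip(classes, freqs, cum))

table = cumulative_frequency_table(word_lengths)
for length, freq, cum in table:
    print(f"{length:>2} letters: frequency {freq:>2}, cumulative {cum:>2}")
```

The last cumulative value should always equal the total number of words counted, which is a handy check before plotting.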

##### Share on other sites

Can I just ask one more thing? I'm not sure what the overall benefit of cumulative frequency analysis is. Is it the fact that it enables us to read upper and lower quartiles? So we can see if the spread differs between two data sets, and maybe whether one set has a higher upper quartile than the other?

I think that what you said is correct and, in addition, it shows the distribution of your data, i.e. the steepest part of the curve shows the highest density of results. Similar to a histogram, I guess.
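A rough sketch of reading quartiles off cumulative frequencies for two data sets. Both samples are invented for illustration, `read_off` is a hypothetical helper, and the rule used here (first length whose cumulative frequency reaches the given fraction of n) is only one convention, so check what your mark scheme expects.

```python
from collections import Counter
from itertools import accumulate

def read_off(lengths, fraction):
    """Smallest word length whose cumulative frequency reaches fraction * n."""
    counts = Counter(lengths)
    classes = sorted(counts)
    cum = accumulate(counts[c] for c in classes)
    target = fraction * len(lengths)
    for c, running in zip(classes, cum):
        if running >= target:
            return c
    return classes[-1]

# Made-up word lengths for illustration only
newspaper = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 11]
magazine = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7]

for name, data in (("newspaper", newspaper), ("magazine", magazine)):
    lq, med, uq = (read_off(data, f) for f in (0.25, 0.5, 0.75))
    print(f"{name}: lower quartile {lq}, median {med}, upper quartile {uq}, IQR {uq - lq}")
```

With these invented numbers the newspaper sample has the higher upper quartile and the wider interquartile range, which is the kind of comparison the quartiles are meant to support.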

##### Share on other sites

thanks for that Klaynos - I will have 15 groups, each increasing by 1 letter up to a maximum of 15 letters.

Can I just ask one more thing? I'm not sure what the overall benefit of cumulative frequency analysis is. Is it the fact that it enables us to read upper and lower quartiles? So we can see if the spread differs between two data sets, and maybe whether one set has a higher upper quartile than the other?

##### Share on other sites

I'm going to have to admit not really knowing after at least 4 years of not drawing them and my stats text book is buried at the bottom of my wardrobe...

##### Share on other sites

lol! ok thanks anyway

##### Share on other sites

One thing cumulative graphs are good for is easier comparison between curves. That is, because the cumulative graphs have already been normalized between 0 and 1, you can directly compare two sets of data, no matter whether the actual sizes or ranges of the data are very different.
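To illustrate the normalization, here is a small sketch that turns raw frequencies into relative cumulative frequencies running from 0 to 1. The two frequency lists are hypothetical and deliberately have different sample sizes; `relative_cumulative` is an assumed helper name.

```python
from itertools import accumulate

def relative_cumulative(freqs):
    """Running total of the frequencies divided by the overall total."""
    total = sum(freqs)
    return [running / total for running in accumulate(freqs)]

# Hypothetical frequencies for word lengths 1..6 (made up for illustration)
newspaper_freqs = [2, 6, 10, 12, 6, 4]     # 40 words in total
magazine_freqs = [10, 30, 40, 15, 4, 1]    # 100 words in total

news_rel = relative_cumulative(newspaper_freqs)
mag_rel = relative_cumulative(magazine_freqs)

for length, (n, m) in enumerate(zip(news_rel, mag_rel), start=1):
    print(f"<= {length} letters: newspaper {n:.2f}, magazine {m:.2f}")
```

Both curves end at 1 even though one sample has 40 words and the other 100, so they can be drawn on the same axes and compared directly.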

Some of the numbers of interest are more easily found on a cumulative graph. For example, the median is the point where the graph crosses 0.5. It doesn't matter if the actual data is bimodal, trimodal, or has no clear mode at all. With some of those multimodal graphs it can be hard to guess where the median is, because of the different sized peaks.
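A quick sketch of locating the 0.5 crossing on a relative cumulative curve. The sample is made up and skewed on purpose, and `half_crossing` is a hypothetical helper. In this sample the value read at 0.5 matches the median and differs from the arithmetic mean; for even sample sizes the exact convention for the median can vary.

```python
from collections import Counter
from statistics import mean, median

def half_crossing(values):
    """First value at which the relative cumulative frequency reaches 0.5."""
    counts = Counter(values)
    n = len(values)
    running = 0
    for v in sorted(counts):
        running += counts[v]
        if running / n >= 0.5:
            return v

# Made-up, skewed sample of word lengths (one very long word)
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 15]

print("0.5 crossing:", half_crossing(sample))
print("median:", median(sample))
print("mean:", mean(sample))
```

The single 15-letter word pulls the mean upwards but leaves the 0.5 crossing where it is, which is why the crossing is read as the median.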

Finally, the cumulative distribution has to finish at one, because it is the running integral of the frequency distribution, whose total area is one. That has some nice properties, especially when converting variables or means. For example, you use the integral properties when you want to calculate the geometric mean or the harmonic mean from a set of data that is set up to calculate the number mean. Conversion between data sets comes into play, too. For example, say you have a particle diameter distribution, and you want to convert that to a particle mass distribution. Integral properties come into play there as well, and it is usually much easier to work with the cumulative functions rather than the frequency distributions.

All that said, I think that people in general have a much greater intuition about the frequency distributions. For example, they can look at your word length frequency distribution and see that the most common word length is around 3 or 4 letters, say, because that is where the peak of the curve will be. It would probably take a person a long time to look at the cumulative distribution and gather that info. But both forms of the curve have their uses.
