User not logged in - login - register
Home Calendar Books School Tool Photo Gallery Message Boards Users Statistics Advertise Site Info
go to bottom | |
 Message Boards » » Box and Whisker plot in Excel Page [1]  
neodata686
All American
11577 Posts
user info
edit post

So I'm running into some issues and would like every ones opinion. I have the following data:

0 - 1724
1 - 449
2 - 546
3 - 255
4 - 167
5 - 140
6 - 142
etc...

The first column is time and the second column is hits. So 546 people had a time of 2. It's easy enough to graph this because I can use time as an X axis and volume or % of total as the Y axis. This gives me a nice curve.

But I want an average, mean, etc and a box column plot. I can't simply do the math on column 2 because it's total. The only way I can think to do it is manually split it out. So:

0 1724 times
1 449 times
2 546 times etc.

I could probably create a logic statement to expand out the cells in that manner but I'm too lazy. Any suggestions?

2/16/2011 4:21:12 PM

wdprice3
BinaryBuffonary
45908 Posts
user info
edit post

you can manually create a box/whisker plot using stacked columns, difference between the values, error bars, and different shading/line colors on the columns.

is that what you want in the end?

ehh, maybe not. I don't get what you're asking here...

ah, I get it now. /long day

I can only think of a manual way. highlight, say A1, in the cell location box, type :A1724 (for your 0 value) then hit enter, then type 1724, ctrl+shft+enter and it will populate the cells. of course, if you have much more data than ^ this would be a pain, if not, it's quick and easy

[Edited on February 16, 2011 at 5:41 PM. Reason : .]

2/16/2011 5:17:31 PM

neodata686
All American
11577 Posts
user info
edit post

Yeah sorry if I was a little vague. I know how to create a manual box and whisker plot in excel. There's plenty of guides online but my issue is converting the data into usual data. AKA so I'm able to find an average, each quartile, min, max, etc.

I have a time in seconds that a customer talks with someone:

1 - 449
2 - 546
3 - 255
4 - 167
5 - 140
6 - 142
etc...

So 449 customers had a talk time of 1 second etc all the way up to like 2500 seconds.

My problem here is I can't get the data from that column because it's volume and not split out. You would have to create 449 rows with a "1" in it, and 546 rows with a "2" in it etc to get the stats I need.

Make sense?

Would a weighted average give me what I want? I have a cumulative percentage for each.

^heh. my range is from 0 to 9000+

[Edited on February 16, 2011 at 5:43 PM. Reason : s]

2/16/2011 5:38:21 PM

darkone
(\/) (;,,,;) (\/)
11608 Posts
user info
edit post

You're data isn't normally distributed, so a box and whisker plot isn't really appropriate. Means and quartiles don't really mean much in such cases and trying to use them can actually obfuscate the actual nature of the distribution of your data. Just plot a histogram.

2/16/2011 5:41:36 PM

wdprice3
BinaryBuffonary
45908 Posts
user info
edit post

weighting them may work.

but from my post ^^:

I can only think of a manual way. highlight, say A1, then in the cell location box, type :A1724 (for your 0 value) then hit enter, then type 0, ctrl+shft+enter and it will populate the column of cells with 0s. of course, if you have much more data than ^ this would be a pain, if not, it's quick and easy
9001 times isn't fun

histogram was my next suggestion, however, your data doesn't have to be normally distributed. box plots are non-parametric. they're meant to show skewness.


[Edited on February 16, 2011 at 5:44 PM. Reason : .]

[Edited on February 16, 2011 at 5:47 PM. Reason : .]

2/16/2011 5:41:42 PM

neodata686
All American
11577 Posts
user info
edit post

Quote :
"You're data isn't normally distributed, so a box and whisker plot isn't really appropriate. Means and quartiles don't really mean much in such cases and trying to use them can actually obfuscate the actual nature of the distribution of your data. Just plot a histogram."




I guess that's not a normal distribution. I still want the average, mean, etc.

Still quantiles don't ONLY apply to normal distributions right? Quantiles are taken from any distribution? Not just a normal one?


[Edited on February 16, 2011 at 5:51 PM. Reason : d]

2/16/2011 5:47:39 PM

wdprice3
BinaryBuffonary
45908 Posts
user info
edit post

histogram was my next suggestion, however, your data doesn't have to be normally distributed. box plots are non-parametric. they're meant to show skewness.

2/16/2011 5:51:03 PM

darkone
(\/) (;,,,;) (\/)
11608 Posts
user info
edit post

Quantiles still exists but they don't have the same utility in communicating the spread of the distribution. They're much harder to interpret in long tailed distributions.

Based on the plot in ^^ post, the mode is probably your most important descriptive statistic.

[Edited on February 16, 2011 at 5:53 PM. Reason : look at the purdy picture]

2/16/2011 5:51:37 PM

neodata686
All American
11577 Posts
user info
edit post

Quote :
"your data doesn't have to be normally distributed. box plots are non-parametric. they're meant to show skewness."


^^ True.

^ok makes sense. Long tailed distributions. I like that term.

Yes mode and average would be my key points.

I do have averages from another source (the vertical lines, which BTW took me a hot minute to figure out how to do in excel)

[Edited on February 16, 2011 at 5:58 PM. Reason : s]

2/16/2011 5:53:16 PM

darkone
(\/) (;,,,;) (\/)
11608 Posts
user info
edit post

<soapbox>
Box and whisker plots are obsolete as a modern statistical communication tool. Their origin lie in a time where complex high-resolutions images weren't feasible to produce. So the box and whisker plot makes clever use of the simple graphical elements that were available at the time for print and classroom use. But we don't have those same limitations any more. Looks at all the different options for plotting distribution information in this figure:

Which do you find most descriptive?
<\soapbox>

2/16/2011 6:06:54 PM

neodata686
All American
11577 Posts
user info
edit post

I guess I didn't specifically mean box and whisker. Maybe just a nice distribution visualization.

I like figure 1 (top right). I see your point though.

2/16/2011 7:59:51 PM

darkone
(\/) (;,,,;) (\/)
11608 Posts
user info
edit post

Sorry. Clear communication of data distributions in one of my buttons. I've recommended editors reject scientific papers I've reviewed because authors didn't disclosed the nature of their distributions and relied solely on mean and median values.

2/16/2011 9:25:55 PM

neodata686
All American
11577 Posts
user info
edit post

Gotcha no I completely understand. I just think in this instance while the mode is something that needs to be pointed out the mean is the key metric we're looking at. For example say we're looking at the cost of time then while the mode would tell us where the population is trending towards the mean is still going to give us how much money is spent on a particular user population (which requires taking into consideration the outliers).

I guess I didn't mean to use box and whisker in the traditional sense. Maybe just a visualization that represents the distribution. The chart I have is kind of messy. You're the expert though. My original question was how to convert the excel data in the first place. How to represent the data is more of a second step.

2/16/2011 11:17:01 PM

lewisje
All American
9196 Posts
user info
edit post

guys guys guys what you really need is a stem-and-leaf plot

2/17/2011 1:07:58 AM

darkone
(\/) (;,,,;) (\/)
11608 Posts
user info
edit post

I love TWW. You won't see a stats troll in too many other places.

2/17/2011 1:49:32 AM

rtc407
All American
6217 Posts
user info
edit post

Looks like you have gamma distributions:

http://en.wikipedia.org/wiki/Gamma_distribution

I'm working on travel time data with similar distributions but haven't jumped into the hard statistics yet

2/17/2011 4:59:51 PM

neodata686
All American
11577 Posts
user info
edit post

Gamma distribution!!!

2/17/2011 5:18:01 PM

 Message Boards » Tech Talk » Box and Whisker plot in Excel Page [1]  
go to top | |
Admin Options : move topic | lock topic

© 2024 by The Wolf Web - All Rights Reserved.
The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.
Powered by CrazyWeb v2.38 - our disclaimer.