Wyloch All American 4244 Posts |
I only know very basic statistics.
I have data that is calculated by two different codes. The results should ideally be identical but never are.
I need a way to calculate how closely the two sets of data match. They share the same domain of independent values.
Am I aiming for R-squared? Mean square error? Std dev? I don't even know what to google. 10/6/2007 9:09:42 PM |
The Coz Tempus Fugitive 26104 Posts |
What codes? What are you talking about? 10/6/2007 10:36:09 PM |
Wyloch All American 4244 Posts |
Can't talk about the codes other than to say they perform simulations. One of the codes calculates 'predicted data.' The other code uses real-world results to back-calculate and produce 'measured data.'
I need to verify that both sets of results agree with each other to within a certain amount (let's say 90% agreement, i.e. differing by no more than 10%).
The datasets cannot be fit via any method as they are sporadic in shape.
My initial thought was the following:
Dataset A (measured data)
Dataset B (predicted data)
% difference = (A - B) / B
Would that be an adequate way to verify agreement? 10/6/2007 10:45:25 PM |
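A minimal sketch of the pointwise check proposed above, assuming both datasets are sampled at the same independent values and that "within 90%" is read as "no point differs by more than 10%". The sample numbers and the function names are hypothetical. Note that (A - B) / B is undefined wherever B is zero, so the root-mean-square error is included as a summary that does not divide by the predicted values.

```python
import math

def relative_differences(measured, predicted):
    """Pointwise (A - B) / B for samples taken at the same independent values."""
    if len(measured) != len(predicted):
        raise ValueError("datasets must have the same number of points")
    # Raises ZeroDivisionError if any predicted value is exactly zero.
    return [(a - b) / b for a, b in zip(measured, predicted)]

def rmse(measured, predicted):
    """Root-mean-square error between the two datasets, in the data's own units."""
    squared = [(a - b) ** 2 for a, b in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical sample values, for illustration only.
measured = [10.2, 11.9, 15.1, 9.8]    # dataset A
predicted = [10.0, 12.0, 15.0, 10.0]  # dataset B

rel = relative_differences(measured, predicted)
worst = max(abs(r) for r in rel)   # largest pointwise disagreement
within_10_percent = worst <= 0.10  # assumed tolerance
error = rmse(measured, predicted)
```

The worst-case relative difference answers "does every point agree to within the tolerance?", while the RMSE condenses the overall disagreement into one number.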
darkone (\/) (;,,,;) (\/) 11610 Posts |
I think you want a correlation. 10/6/2007 11:03:08 PM |
philly4808 All American 710 Posts |
What is it you are trying to compare: means, variances, etc.? If you want to compare means, you probably need something like a two-sample t-test. That way you can see whether they differ from one another in a statistically significant way. A correlation measures the linear relationship between two variables, and R-squared tells you how much of the variation is explained by your model, so I'm not sure either of those is what you are looking for. 10/6/2007 11:10:19 PM |
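To illustrate the distinction drawn above, here is a rough sketch with made-up numbers in which the measured data is exactly twice the predicted data. The helper names are hypothetical, and a paired t statistic is used instead of the two-sample test mentioned above, on the assumption that each measured point has a matching predicted point.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t_statistic(x, y):
    """t statistic for the mean of the pointwise differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)

# Hypothetical data: measured is exactly double the predicted values.
predicted = [1.0, 2.0, 3.0, 4.0]
measured = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(measured, predicted)           # perfect linear relationship
t = paired_t_statistic(measured, predicted)  # tests the mean difference only
```

Here the correlation is perfect even though every measured value is 100% larger than its prediction, so a high correlation alone does not show that two datasets agree in magnitude.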
Wyloch All American 4244 Posts |
Hmm.. Problem is I'm too ignorant to even know what I actually want other than to say I'd like a single number which would be an indicator of how close the two sets match. Much like how R-squared is a 0 - 1 value, 1 being the best, that tells you how well a dataset fits its own mean curve. 'Cept I'm dealing with two datasets that cannot be fit. 10/6/2007 11:30:16 PM |
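One candidate for such a single number, sketched under assumptions: compute an R-squared-style score directly between the two datasets, with no curve fit involved. This is 1 minus the ratio of the squared disagreement to the spread of the measured data about its own mean (the form known in hydrology as the Nash-Sutcliffe efficiency). It equals 1 for identical datasets but, unlike a fitted R-squared, it can go negative when agreement is poor. The function name and sample values are hypothetical.

```python
def agreement_score(measured, predicted):
    """1 - SS_res / SS_tot, computed pointwise between two paired datasets."""
    mean_measured = sum(measured) / len(measured)
    # Squared disagreement between the two datasets.
    ss_res = sum((a - b) ** 2 for a, b in zip(measured, predicted))
    # Spread of the measured data about its own mean (zero if data is constant).
    ss_tot = sum((a - mean_measured) ** 2 for a in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical sample values, for illustration only.
measured = [10.2, 11.9, 15.1, 9.8]
predicted = [10.0, 12.0, 15.0, 10.0]

score = agreement_score(measured, predicted)
perfect = agreement_score(measured, measured)  # identical data gives 1.0
```

Because the score compares values point by point, it stays meaningful for time-dependent data where the overall averages do not.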
moron All American 34144 Posts |
If you're looking for a single number, what's wrong with just averaging the data and comparing the average?
Also the t test would result in a single number. 10/6/2007 11:36:06 PM |
Wyloch All American 4244 Posts |
Because the data is all time dependent. The averages mean nothing because the y values are changing constantly.
Will check out t test.
[Edited on October 6, 2007 at 11:37 PM. Reason : ] 10/6/2007 11:36:52 PM |
The Coz Tempus Fugitive 26104 Posts |
Hope the results of this analysis have no impact on public safety. 10/7/2007 12:16:05 AM |
skokiaan All American 26447 Posts |
^ who knows
Quote : | "Major : Nuclear engineering" |
10/7/2007 12:28:12 AM |
Wyloch All American 4244 Posts |
Of course it doesn't. I'm posting on TWW. This is a personal side project.
Additionally, nuclear technology today does not pose a threat to public safety.
[Edited on October 7, 2007 at 12:36 AM. Reason : ] 10/7/2007 12:35:25 AM |
darkone (\/) (;,,,;) (\/) 11610 Posts |
You want a correlation. 10/7/2007 12:55:20 AM |