Pearson Product Moment

jsteinfo

07-30-2009, 02:40 PM

Hi Folks,

Let me start off by apologizing for my ignorance. I am not well versed in statistics. I am a software developer supporting statisticians and engineers.

I have been asked to implement a Regression analysis that results in something called a "Pearson Product Moment". I have been looking through the JMSL reference manual and do not see such a thing.

Does JMSL support this analysis function? I was given this link for the definition of the analysis function: http://www.mnstate.edu/wasson/ed602pearsoncorr.htm but so far do not see such a thing in the JMSL docs.

Thanks!

Jesse

Disclaimer: I am also not a statistician, so I am not sure this applies.

Have a look at the JMSL class ContingencyTable (http://www.vni.com/products/imsl/jmsl/v501/manual/api/com/imsl/stat/ContingencyTable.html). It is the only place "Pearson" appears in the JMSL documentation, and there it is in the context of the Chi-squared statistic. If this isn't what you're looking for, I can certainly file an enhancement request to consider including this functionality in a future release.

Jesse,

FYI, like you and Ed, I am not a statistician either. I am just a fan of Karl Pearson, who is credited for his work on Pearson's family of distributions. So I couldn't resist replying to this thread.

If you are just looking to find the degree of association between the two variables in your regression analysis routine, below is a code example that calculates the Pearson Product Moment Correlation Coefficient, also called "Pearson's r".

Note: I used data from the link you provided.
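
For reference, the two formulas that the code below implements can be written out as follows. This is the standard textbook notation rather than anything taken from the JMSL docs; N is the number of (x, y) pairs, and the bars denote sample means. The two forms are algebraically equivalent, which is why both methods are expected to agree up to floating-point rounding.

```latex
% Definition (deviation-score) form, then the equivalent sums-of-products form
r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}
  = \frac{N \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{N \sum x_i^2 - \left(\sum x_i\right)^2}\,
          \sqrt{N \sum y_i^2 - \left(\sum y_i\right)^2}}
```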

public class PearsonProductMoment {

    public static void main(String[] args) throws Exception {
        // input data
        double[] x = { 3, 7, 2, 9, 8, 4, 1, 10, 6, 5 };
        double[] y = { 11, 1, 19, 5, 17, 3, 15, 9, 15, 8 };

        // Testing
        System.out.println("PPM CC using definition formula: "
                + definition(x, y));
        System.out.println("PPM CC using row score formula: "
                + Row_Score(x, y));
    }

    /*
     * Method 1: This method calculates the Pearson Product Moment Correlation
     * Coefficient by using the definition formula
     */
    public static double definition(double[] dataX, double[] dataY) {
        double meanX = 0.0;
        double meanY = 0.0;

        // Calculate the number of elements
        final int N = dataX.length;

        for (int i = 0; i < N; i++) {
            meanX += dataX[i];
            meanY += dataY[i];
        }
        meanX /= N;
        meanY /= N;

        double upperZxy = 0.0;
        double varX = 0.0, varY = 0.0;

        for (int i = 0; i < N; i++) {
            double Zxy = (dataX[i] - meanX) * (dataY[i] - meanY);
            upperZxy += Zxy;

            double varx = dataX[i] - meanX;
            varX += varx * varx;

            double vary = dataY[i] - meanY;
            varY += vary * vary;
        }

        double r_definition = upperZxy
                / (N * Math.sqrt(varX / N) * Math.sqrt(varY / N));
        return r_definition;
    }

    /*
     * Method 2: This method calculates the Pearson Product Moment Correlation
     * Coefficient by using the row score formula
     */
    public static double Row_Score(double[] dataX, double[] dataY) {
        final int N = dataX.length;
        double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0, sumYY = 0.0;

        for (int i = 0; i < N; i++) {
            sumX += dataX[i];
            sumY += dataY[i];
            sumYY += dataY[i] * dataY[i];
            sumXX += dataX[i] * dataX[i];
            sumXY += dataX[i] * dataY[i];
        }

        double numerator = (N * sumXY - (sumX * sumY));
        double denominator = Math.sqrt(N * sumXX - sumX * sumX)
                * Math.sqrt(N * sumYY - sumY * sumY);

        double r_Row_Score = numerator / denominator;
        return r_Row_Score;
    }
}

This calculation gives a result similar to the one in the original example at the link you gave:

PPM CC using definition formula: -0.3611811566150104

PPM CC using row score formula: -0.3611811566150105
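
One thing the code above does not do is check its inputs: arrays of different lengths or a constant series (zero variance) would give an exception or a meaningless result. Below is a minimal sketch of a guarded variant, based on the sums-of-products approach of the second method. The class name PearsonGuarded and the choice to return NaN for a constant series are my own assumptions for illustration, not anything from JMSL.

```java
// Sketch only: PearsonGuarded is a hypothetical helper, not part of JMSL.
final class PearsonGuarded {

    /** Pearson's r from running sums, with basic input checks. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        final int n = x.length;
        double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0, sumYY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
            sumYY += y[i] * y[i];
        }
        double spreadX = n * sumXX - sumX * sumX;
        double spreadY = n * sumYY - sumY * sumY;
        if (spreadX <= 0.0 || spreadY <= 0.0) {
            // Assumption: report "undefined" as NaN when a series has no variance.
            return Double.NaN;
        }
        return (n * sumXY - sumX * sumY) / Math.sqrt(spreadX * spreadY);
    }

    public static void main(String[] args) {
        double[] x = { 3, 7, 2, 9, 8, 4, 1, 10, 6, 5 };
        double[] y = { 11, 1, 19, 5, 17, 3, 15, 9, 15, 8 };
        System.out.println("r = " + pearson(x, y));
    }
}
```

For long series or values with a large common offset, the deviation-from-the-mean form of the first method is generally the numerically safer choice, since the sums-of-products form subtracts two large, nearly equal numbers.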

totallyunimodular

08-04-2009, 03:18 PM

The Pearson Product Moment is just the "usual" correlation, or r value, one might think of. fik was nice enough to provide code you could use; alternatively, use the JMSL CrossCorrelation class and take the lag = 0 correlation value from the results. For example, here is how to do this in Python using PyIMSL:

In [3]: from imsl.stat.crossCorrelation import crossCorrelation

In [4]: a = [3, 7, 2, 9, 8, 4, 1, 10, 6, 5]

In [5]: b = [11, 1, 19, 5, 17, 3, 15, 9, 15, 8]

In [6]: crossCorrelation(a, b, 1)

Out[6]: array([ 0.54774167, -0.36118116, 0.47072949])

So, at lag = -1 the cross correlation between a and b is 0.54774167.

At lag = +1 the cross correlation is 0.47072949.

And at lag = 0, i.e., the Pearson Product Moment, the cross correlation is -0.36118116 :)
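
To make the lag idea concrete in plain Java, here is a rough sketch of a lagged cross correlation. It is not the JMSL CrossCorrelation class or its API; the class and method names are made up for illustration. It assumes the common convention of pairing x[t] with y[t + lag], using the full-series means, and dividing by the lag-0 spread of both series. Under that assumption it gives about 0.5477, -0.3612 and 0.4707 for lags -1, 0 and +1 on the data above, in line with the PyIMSL output.

```java
// Sketch only: LaggedCorrelation is a hypothetical helper, not the JMSL class.
final class LaggedCorrelation {

    /** Cross correlation of x[t] with y[t + lag], normalized by lag-0 spreads. */
    static double crossCorrelation(double[] x, double[] y, int lag) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        final int n = x.length;
        if (Math.abs(lag) >= n) {
            throw new IllegalArgumentException("|lag| must be smaller than the series length");
        }
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations over the full series (the lag-0 spreads).
        double ssX = 0.0, ssY = 0.0;
        for (int i = 0; i < n; i++) {
            ssX += (x[i] - meanX) * (x[i] - meanX);
            ssY += (y[i] - meanY) * (y[i] - meanY);
        }

        // Only use indices t where both x[t] and y[t + lag] exist.
        int start = Math.max(0, -lag);
        int end = Math.min(n, n - lag);
        double cross = 0.0;
        for (int t = start; t < end; t++) {
            cross += (x[t] - meanX) * (y[t + lag] - meanY);
        }
        return cross / Math.sqrt(ssX * ssY);
    }

    public static void main(String[] args) {
        double[] a = { 3, 7, 2, 9, 8, 4, 1, 10, 6, 5 };
        double[] b = { 11, 1, 19, 5, 17, 3, 15, 9, 15, 8 };
        for (int lag = -1; lag <= 1; lag++) {
            System.out.println("lag " + lag + ": " + crossCorrelation(a, b, lag));
        }
    }
}
```

At lag 0 every pair is used and the expression reduces to the definition formula for Pearson's r.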
