Tuesday, March 4, 2014

Correlation and Regression with SPSS



The purpose of this paper is to state the assumptions of the Pearson correlation coefficient and simple linear regression, develop null and alternative hypotheses, determine whether to reject or retain the null hypothesis, report the SPSS analysis, and generate a scatterplot along with the SPSS syntax and output files.

Statistical Assumptions
The two statistical assumptions of the Pearson correlation are that (1) the variables are bivariately normally distributed, and (2) the cases represent a random sample from the population, with scores on the variables for one case independent of scores on these variables for other cases (Green & Salkind, 2014).

Brief Analysis

The research question is: Are age and the number of hours worked last week related in a statistically significant linear fashion?

The null hypothesis is: H0: ρ = 0; there is no correlation between the variables.

The alternative hypothesis is: H1: ρ ≠ 0; there is a real correlation between the variables.

The independent variable is age and the dependent variable is hours worked last week. A Pearson correlation coefficient was computed between the two continuous variables, age and hours worked last week. To control for Type I error, I used the Bonferroni approach, which required a p value of less than .025 (.05/2 = .025) for significance. The results in Table 1 show that the correlation was statistically significant at the .01 level: r(1483) = -.325, p < .001. There is a significant negative relationship between the age of participants and the number of hours worked last week, so I reject the null hypothesis. The effect size is r² = .105; age accounts for about 10.5% of the variance in hours worked last week.
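The figures in this paragraph can be cross-checked by hand. The following is a minimal Python sketch, not part of the SPSS output, that takes r = -.325 and the pairwise N = 1485 from Table 1 and recomputes the degrees of freedom, the Bonferroni-adjusted alpha, r squared, and the t statistic for the correlation. The variable names are illustrative. Because r is rounded to three decimals, the results only approximate the SPSS values: r squared comes out near .106 rather than the .105 in the Model Summary, and t comes out near the -13.221 in the Coefficients table.

```python
import math

# Values copied from Table 1 (SPSS Correlations output).
r = -0.325   # Pearson correlation between AGE and HRS1 (rounded)
n = 1485     # pairwise N for the AGE-HRS1 correlation

df = n - 2                     # degrees of freedom for a Pearson r
alpha_adjusted = 0.05 / 2      # Bonferroni threshold stated in the text
r_squared = r ** 2             # proportion of variance shared by the variables

# t test of H0: rho = 0, computed from r and its degrees of freedom.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r_squared)

print(f"df = {df}")
print(f"adjusted alpha = {alpha_adjusted}")
print(f"r squared = {r_squared:.3f}")
print(f"t = {t_stat:.2f}")
```

The sign of t follows the sign of r, which is why a negative correlation is reported as a negative relationship.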

A linear regression analysis was conducted to evaluate how well age predicts hours worked last week. The scatterplot for the two variables, shown in Figure 1, indicates that they are linearly related such that as age increases, the number of hours worked last week decreases.
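To make the direction and size of this relationship concrete, the sketch below applies the unstandardized coefficients from the SPSS Coefficients table (constant 48.162, slope -.458), which give the prediction equation: predicted hours = 48.162 - .458 × age. The helper name predicted_hours is hypothetical and only illustrates the arithmetic; it is not part of the SPSS analysis.

```python
# Coefficients copied from the SPSS Coefficients table.
INTERCEPT = 48.162   # (Constant), B
SLOPE = -0.458       # AGE OF RESPONDENT, B


def predicted_hours(age: float) -> float:
    """Predicted hours worked last week for a given age (illustrative helper)."""
    return INTERCEPT + SLOPE * age


# A least-squares line passes through the point of means, so plugging in
# the mean age (46.22) should land close to the mean of hours (26.97).
mean_age_prediction = predicted_hours(46.22)
print(f"Predicted hours at mean age: {mean_age_prediction:.2f}")

# Each additional decade of age changes the prediction by 10 * SLOPE hours.
decade_change = predicted_hours(50) - predicted_hours(40)
print(f"Change per 10 years of age: {decade_change:.2f}")
```

The small gap between the prediction at the mean age and the reported mean of 26.97 comes from rounding in the printed coefficients.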



Syntax and Output Files
Notes

Output Created: 01-FEB-2014 09:14:02
Comments:
Input
  Data: C:\Users\Deborah\Desktop\Stats\gss04student_corrrected.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 1500
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data for all variables in the model.
Syntax
UNIANOVA INCOME BY RACE
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=RACE(TUKEY QREGW C)
  /EMMEANS=TABLES(RACE)
  /PRINT=ETASQ HOMOGENEITY DESCRIPTIVE
  /CRITERIA=ALPHA(.05)
  /DESIGN=RACE.
Resources
  Processor Time: 00:00:00.08
  Elapsed Time: 00:00:00.08


Correlations

CORRELATIONS
  /VARIABLES=AGE HRS1
  /PRINT=TWOTAIL NOSIG
  /STATISTICS DESCRIPTIVES
  /MISSING=PAIRWISE.



Descriptive Statistics

                                    Mean     Std. Deviation   N
AGE OF RESPONDENT                   46.22    16.679           1495
NUMBER OF HOURS WORKED LAST WEEK    26.94    23.570           1490

Table 1.
Correlations

                                                          AGE OF RESPONDENT   NUMBER OF HOURS WORKED LAST WEEK
AGE OF RESPONDENT                  Pearson Correlation    1                   -.325**
                                   Sig. (2-tailed)                            .000
                                   N                      1495                1485
NUMBER OF HOURS WORKED LAST WEEK   Pearson Correlation    -.325**             1
                                   Sig. (2-tailed)        .000
                                   N                      1485                1490

**. Correlation is significant at the 0.01 level (2-tailed).

GRAPH
  /SCATTERPLOT(MATRIX)=AGE HRS1
  /MISSING=LISTWISE.

Graph
 [DataSet1] C:\Users\Deborah\Desktop\Stats\gss04student_corrrected.sav

GET
  FILE='C:\Users\Deborah\Desktop\Stats\gss04student_corrrected.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT HRS1
  /METHOD=ENTER AGE.

Regression
Descriptive Statistics

                                    Mean     Std. Deviation   N
NUMBER OF HOURS WORKED LAST WEEK    26.97    23.572           1485
AGE OF RESPONDENT                   46.22    16.697           1485

Correlations

                                                          NUMBER OF HOURS WORKED LAST WEEK   AGE OF RESPONDENT
Pearson Correlation   NUMBER OF HOURS WORKED LAST WEEK    1.000                              -.325
                      AGE OF RESPONDENT                   -.325                              1.000
Sig. (1-tailed)       NUMBER OF HOURS WORKED LAST WEEK    .                                  .000
                      AGE OF RESPONDENT                   .000                               .
N                     NUMBER OF HOURS WORKED LAST WEEK    1485                               1485
                      AGE OF RESPONDENT                   1485                               1485

Variables Entered/Removed(a)

Model   Variables Entered      Variables Removed   Method
1       AGE OF RESPONDENT(b)   .                   Enter

a. Dependent Variable: NUMBER OF HOURS WORKED LAST WEEK
b. All requested variables entered.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .325(a)   .105       .105                22.302

a. Predictors: (Constant), AGE OF RESPONDENT

ANOVA(a)

Model            Sum of Squares   df     Mean Square   F         Sig.
1   Regression   86941.814        1      86941.814     174.798   .000(b)
    Residual     737619.214       1483   497.383
    Total        824561.028       1484

a. Dependent Variable: NUMBER OF HOURS WORKED LAST WEEK
b. Predictors: (Constant), AGE OF RESPONDENT
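The entries of the ANOVA table are tied together by simple arithmetic, and the sketch below recomputes them from the printed sums of squares and degrees of freedom. It is an illustrative check in Python, not SPSS output, and the variable names are assumptions made for this example. It reproduces the F ratio, the R Square in the Model Summary, and the Std. Error of the Estimate to within rounding.

```python
import math

# Values copied from the SPSS ANOVA table.
ss_regression = 86941.814
ss_residual = 737619.214
df_regression = 1
df_residual = 1483

ss_total = ss_regression + ss_residual        # Total sum of squares
ms_regression = ss_regression / df_regression # Mean Square, Regression
ms_residual = ss_residual / df_residual       # Mean Square, Residual

f_stat = ms_regression / ms_residual          # F ratio
r_squared = ss_regression / ss_total          # R Square in the Model Summary
std_error_estimate = math.sqrt(ms_residual)   # Std. Error of the Estimate

print(f"F = {f_stat:.3f}")
print(f"R squared = {r_squared:.3f}")
print(f"Std. error of the estimate = {std_error_estimate:.3f}")
```

With a single predictor, F is also the square of the t statistic for that predictor, so the ANOVA test and the test of the age coefficient lead to the same conclusion.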

Coefficients(a)

                          Unstandardized Coefficients   Standardized Coefficients                      95.0% Confidence Interval for B
Model                     B         Std. Error          Beta                        t         Sig.     Lower Bound   Upper Bound
1   (Constant)            48.162    1.704                                           28.267    .000     44.820        51.504
    AGE OF RESPONDENT     -.458     .035                -.325                       -13.221   .000     -.526         -.390

a. Dependent Variable: NUMBER OF HOURS WORKED LAST WEEK
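The confidence interval for the age coefficient can be approximated from the printed B and Std. Error. The sketch below is a rough Python check under one stated assumption: with 1483 residual degrees of freedom the t critical value is close to the normal value of 1.96, so 1.96 is used in place of the exact t value. Because the printed B and Std. Error are rounded, the recomputed t and bounds match the SPSS figures only approximately.

```python
# Values copied from the SPSS Coefficients table (AGE OF RESPONDENT row).
b_age = -0.458    # unstandardized slope B
se_age = 0.035    # standard error of B (rounded in the output)

# Assumption: 1.96 approximates the two-tailed .05 t critical value
# for 1483 degrees of freedom.
CRITICAL = 1.96

t_age = b_age / se_age                 # t statistic for the slope
ci_lower = b_age - CRITICAL * se_age   # lower bound of the 95% interval
ci_upper = b_age + CRITICAL * se_age   # upper bound of the 95% interval

print(f"t = {t_age:.2f}")
print(f"95% CI for B: ({ci_lower:.3f}, {ci_upper:.3f})")
```

Since the interval lies entirely below zero, it agrees with the decision to reject the null hypothesis of no linear relationship.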



Charts (Figure 1.)          




