2.6 Draw Scatter Plots and Best Fitting Lines

 

Scatter plot

A graph of a set of data pairs (x, y)

 

Positive correlation

The relationship between paired data when y tends to increase as x increases

 

Negative correlation

The relationship between paired data when y tends to decrease as x increases

 

Correlation coefficient

A number, denoted by r, from -1 to 1 that measures how well a line fits a set of data pairs (x, y)
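The definition above does not say how r is computed. As a rough sketch, the standard Pearson formula can be written in a few lines of Python. The function name correlation_coefficient is an illustrative choice, not something from this section, and the data are the attendance figures from Example 2 later in this section.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r for paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Attendance data from Example 2 of this section.
games = [1, 2, 3, 4, 5, 6, 7]
attendance = [722, 763, 772, 826, 815, 857, 897]

r = correlation_coefficient(games, attendance)
print(round(r, 2))
```

For this data r comes out close to 1, which matches a strong positive correlation: attendance tends to rise as the season goes on.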

 

Best-fitting line

The line that lies as close as possible to all the data points

 

Example 1

Estimate correlation coefficients

For each scatter plot, describe the correlation shown and tell whether the correlation coefficient is closest to -1, -0.5, 0, 0.5, or 1.

a.                
 

b.                  


Solution

a.   The scatter plot shows a _strong negative_ correlation. So, the best estimate given is r = _-1_.

b.   The scatter plot shows a _weak positive_ correlation. So, r is between _0_ and _1_ but not too close to either one. The best estimate given is r = _0.5_.

 

APPROXIMATING A BEST-FITTING LINE

Step 1 Draw a _scatter plot_ of the data.

Step 2 Sketch the _line_ that appears to follow most closely the trend given by the data points. There should be about as many points _above_ the line as _below_ it.

Step 3 Choose _two points_ on the line, and estimate the coordinates of each point.

Step 4 Write an _equation_ of the line that passes through the two points from Step 3.
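Steps 3 and 4 can be sketched in Python as follows. Steps 1 and 2 are done by eye, so the code starts from two points that have already been read off the sketched line. The helper name line_through is hypothetical, and the sample points are the ones chosen in Example 2 of this section.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    m = (y2 - y1) / (x2 - x1)   # slope from the two points
    b = y1 - m * x1             # solve y1 = m*x1 + b for b
    return m, b

# Two points estimated from the sketched line (taken from Example 2).
m, b = line_through((1, 722), (2, 750))
print(f"y = {m:g}x + {b:g}")    # prints "y = 28x + 694"
```

Different readers will pick different points in Step 3, so the resulting equations will differ slightly. That is expected for an approximation made by eye.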

 

Example 2

Approximating a best-fitting line

 

The table below gives the number of people y who attended each of the first seven football games x of the season. Approximate the best-fitting line for the data.

x    1    2    3    4    5    6    7
y    722  763  772  826  815  857  897

1.        Draw a _scatter plot_.

 

2.        Sketch the best-fit line.

Be sure that about the same number of points lie above your line of fit as below it.

 
 


3.        Choose two points on the line. For the scatter plot shown, you might choose (1, _722_) and (2, _750_).


4.         Write an equation of the line. The line that passes through the two points has a slope of:

m = (750 - 722) / (2 - 1) = _28_

Use the point-slope form to write the equation.

y - y1 = m(x - x1)

Point-slope form

y - _722_ = _28(x - 1)_

Substitute for m, x1 and y1

y = _28x + 694_

Simplify.

An approximation of the best-fitting line is y = _28x + 694_.
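One quick check on the result is the guideline from Step 2: about as many data points should lie above the line as below it. The short Python sketch below counts them for y = 28x + 694, using the table from this example. The variable names are illustrative only.

```python
games = [1, 2, 3, 4, 5, 6, 7]
attendance = [722, 763, 772, 826, 815, 857, 897]

def line(x):
    # Approximate best-fitting line from Example 2.
    return 28 * x + 694

# Compare each observed attendance with the value the line gives.
above = sum(1 for x, y in zip(games, attendance) if y > line(x))
below = sum(1 for x, y in zip(games, attendance) if y < line(x))
on_line = len(games) - above - below

print(above, below, on_line)    # prints "3 3 1"
```

Three points fall above the line, three fall below, and one (the first game) lies on it, so the line is reasonably balanced.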

Example 3

Use a line of fit to make predictions

 

Use the equation of the best-fitting line from Example 2 to predict the number of people who will attend the tenth football game.

 

Because you are predicting the tenth game, substitute _10_ for x in the equation from Example 2.

y = _28x + 694_ = _28(10) + 694_ = _974_

You can predict that _974_ people will attend the tenth football game.
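For comparison, the sketch below makes the same prediction with a least-squares line. This is a different technique from the by-eye method used in this section; it is shown only as a sanity check, and the function name least_squares_line is an illustrative choice.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line (not the by-eye method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Attendance data from Example 2.
games = [1, 2, 3, 4, 5, 6, 7]
attendance = [722, 763, 772, 826, 815, 857, 897]

slope, intercept = least_squares_line(games, attendance)

sketched = 28 * 10 + 694            # prediction from the line in Example 2
fitted = slope * 10 + intercept     # prediction from the least-squares line
print(sketched, round(fitted))
```

The least-squares line is roughly y = 27x + 699.4, and its prediction for the tenth game is within a few people of 974. Both predictions extend the trend beyond the seven games that were observed, so they assume the trend continues.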

 

Complete the following exercises.

 

For each scatter plot, (a) tell whether the data have a positive correlation, a negative correlation, or no correlation, and (b) tell whether the correlation coefficient is closest to -1, -0.5, 0, 0.5, or 1.

1.                         

a.      positive correlation

b.      1

2.                                                     
 


a.       no correlation

b.       0


3.   The table gives the average class score y on each chapter test for the first six chapters x of the textbook.

x    1    2    3    4    5    6
y    84   83   86   88   87   90

 

a. Approximate the best-fitting line for the data.

b. Use your equation from part (a) to predict the test score for the 9th test that the class will take.

a. y = 1.3x + 82.1

b. about 94