Chapter 8- Real Science with Excel

8.1 Solving a Problem

The sulfate ion concentration in natural water can be determined by measuring the turbidity that results when an excess of BaCl2 is added to a measured quantity of the sample. A turbidimeter, the instrument used in this analysis, was calibrated with a series of standard Na2SO4 solutions. The following data were obtained in the calibration:

Mg (SO4)2- / L, Cx Turbidimeter Reading, R
0.000 0.052
5.000 1.665
10.00 2.322
15.00 3.912
20.00 4.999

First, carry out a linear regression for these data with the use of Microsoft Excelª . Put the value of Cx on the x (horizontal) axis and R on the y (vertical) axis. Second, plot the data with the use of Microsoft Excelª. Third, Print the data, the information from the linear regression analysis and the plot together with your name.

8.2 Entering Data into Excel

Move the cursor (it looks like a large +) to cell A1, click on cell A1 to select it and type Calibration Data.Ó Now press the return or enter key. The words Calibration Data should appear in cell A1.

 Next, move the cursor to cell A3 and enter the words Magnesium Concn. Move the cursor to cell B3 and type Turbidmeter Reading. Notice that the title Magnesium Concn is cut off. To show the entire title, change the width of the column by clicking on the line between the letters A and B near the top of the worksheet. The cursor should change so that it looks like this:

 

Move this new cursor between A and B and click and drag to adjust the width of column A. You could also use the autofit command mentioned in the previous section.

 Enter 0.000 in cell A4. Excel may format the data as 0. To format the number, drag the cursor over cells A4 to A9, choose format from the menu, cells from the drop down menu and number from the dialog box and 0.00. Type in two more zero's to get the dialog box to look like this:

  

Select OK. or hit the return button. Now drag the cursor from cell A4 to A8 and enter the data in the problem. Carry out a similar process to enter the Turbidimeter Reading in cells B4 to B8. Your spreadsheet should look something like this:

8.3 Charting Data

In order to prepare a graph of the data as an XY plot, click and drag to select all of the cell from A4 to A8 and B4 to B8. Your spreadsheet should look like this:

Then select the chart wizard tool on the menu

 

Move to the spread sheet and click and drag inside the spreadsheet to create a box. Don't worry about the placement of the box, you can move it or resize it later if needed. The following dialog box will appear:

This is correct, click on next or hit return. The next dialog box gives you a selection of charts.

In science we often use x-y charts. Click on this box to choose it and hit return or next. The next dialog box formats the chart. I usually like charts without lines so I choose the first type. Do this and hit return or next:

 

The next dialog box allows you to pick what column will include the data along the x axis. Hit next or choose return.

 

The final dialog box allows you to add or delete a legend, a title and to label the axis. I have made the choices shown below:

Finally hit return or choose finish to see the plot of Magnesium Concentration versus the Turbidimeter reading as shown below:

 

This initial chart can be modified to suit your needs. For example, the chart can be resized by clicking and dragging on the small black boxes on the side of the chart. If you want to know more about Excel, you should buy one of the many books available.

 From the chart the data look like they are reasonably linear. Next we will carry out a linear regression of the data.

 7. Linear Regression Analysis

 The linear regression analysis uses a least squaresÓ method to determine the equation of a straight line that best fits the data. When you carry out a regression analysis, the answer is returned as an equation in the form of

y = mx + b

 The goodness of fit is returned as an R2 value. To carry out the regression, double click on the graph to select the graph and then click on any point in the graph to select the points.

 

 

Go to the menu and select insert trendline:

 

In the dialog box click on options to get:

 

and choose automatic for trend line, display equation on chart and display R-squared Value on chart. Choose OK or hit return to get the chart and the equations you have chosen:

 

You may want to have more significant figures than those that are displayed. You can do this by double clicking on the box that contains the chart to select these labels and then choosing format from the menu bar and selected data labels from the drop down menu. Format the number as you did above.

 

In order to print the graph in a reasonable amount of time and to make it legible, remove the gray background by double clicking on the background to bring up a dialog box, Format Plot Area. Under area, click on none to remove the grey background and finally print the chart with the equation and R2 value.

 

Theory

The Meaning of R2

 R2 is a the coefficient of determination. It compares the estimated and the actual

y-values. It can have values from 0.00 to 1.00. If it is 1.00 there is no difference between the estimated y-values and the actual y-values. In other words, the fit between the line and the data is perfect and all the data fall on the straight line. On the other hand, if R2 is 0.00 the regression equation is no help in predicting a y-value.

 

8.4 The Next Step in Charting and Formatting

The data you are going to enter are typical data from a chemistry problem. These data consist of measurements of the sample solution pH as a function of volume of titrant base (in mL). For this example, the volume data will be entered into column A and the associated pH values will be placed in column B. Select cell A4, type mL Base and press Return. Then select cell B4, type pH and press Return. These data titles can be written in boldface by selecting the range of cells A4 to B4 and clicking the Bold button on the formatting tool bar, or by pressing the shift, command and B keys at the same time. These text entries can also be centered in their respective cells by clicking the center alignment button on the formatting tool bar.

The data for this titration are given as follows:

Titration Data

mL of Base

00.00

10.00

25.00

40.00

49.00

50.00

50.10

51.00

60.00

75.00

pH

2.88

4.16

4.76

5.36

6.45

8.73

10.00

11.00

11.99

12.35

Select cell A5 to begin entering the volume data. Type 0.00 and press Return. If the data are not formatted correctly (for example if the number appears with one place after the decimal) choose format cells from the menu bar, then choose numbers from the folders that are available. This will allow you to choose the format you want for the numbers you enter.

Prepare a chart of the data as an XY. Place a title on the chart and to put labels on the axes

 

Excel can modify the numbers you have entered. For example, you could use Excel take the antilog of the pH values in order to determine the Hydrogen ion concentration, [H+]. Next we will use Excel to make a linear least-squares fit to data and plot the resulting line with the data points.

Theory

 Although the theory of how the data you have entered can be converted to a straight line graph is beyond this course, here is a summary. From acid-base theory, it can be shown that the following equation is true:

 (10-pH)(mL of Base) =

Ka(mL at eq. Pt - mL of Base)

which is the formula for a straight line, y = mx + b. When you plot (10-pH)(mL of Base) versus the mL of Base you will get a straight line up to the equivalence point. The slope of that line is -Ka, and the x-axis intercept is the mL of base at the equivalence point.

In order to generate a straight line for the data you entered for the titration of a weak acid you will need to need to convert the data you have entered and then construct a Gran plot. This plot is used to when the slope of the line you generated previously is a maximum. The region of maximum slope is as shown in the figure below.

 

It is difficult to determine exactly where the slope is a maximum in the plot above. It is always easier to interpret data from straight line graphs. The following instructions will convert a portion of your data and allow you to obtain a linear relationship.

 

First, you must convert the data you have entered into the necessary data for the y axis. These data have the form [(10-pH)(mL of Base)]. This is accomplished using cell formulas. Select cell C5 to enter the first formula. All formulas in Excel must begin with an equal sign (=). The expression [(10-pH)(mL of Base)] is entered by substituting the cell references for the volume of base and the pH. Type =A5*10^-B5 and press Return. This instructs Excel to take the volume data point in cell A5 and multiply (*) it by 10 raised (^) to the negative value of the pH data point in cell B5. Since the value in cell A5 is zero, this product also equals zero, the result now shown in cell C5. If you did not get this result, erase the contents of cell C5 and carefully reenter the formula. If the formula was returned as the contents of cell C5, you forgot to include the equal sign.

 

Next title the data in column C so that it is pretty and informative. Select cell C4, type Vb*10-pH and press Return. Click the Bold button and center alignment button. Select Format from the Menu bar and choose Column, and Autofit Selection. The label should now fit and be boldface and centered.

 

One of the most powerful features of spreadsheets is their ability to automatically increment references. This enables you to apply the formula you just entered to the other data without having to repeatedly enter this formula. With C5 as the active cell, select Edit from the Menu bar and click Copy. Now select cells C6 to C15 by holding down the left mouse button, dragging the cursor to Cell C15 and releasing. Select Edit from the menu bar once again and click Paste. The formula in cell C5 should now be copied to the other cells in the selected range with the cell references automatically incremented. For example, click on cell C6 and you should find the expression =A6*10^-B6 in the Formula bar. Overall, your results should appear as shown below:

mL of Base

pH

Vb*10-pH

0.00

2.88

0.00E+00

10.00

4.16

6.92E-04

25.00

4.76

4.34E-04

40.00

5.36

1.75E-04

49.00

6.45

1.74E-05

49.90

7.46

1.73E-06

50.00

8.73

9.31E-08

50.10

10.00

5.01E-09

51.00

11.00

5.10E-10

60.00

11.99

6.14E-11

75.00

12.35

3.35E-11

 Notice that the results of the calculations have been expressed in scientific notation and have been formatted so that the numbers all have two places after the decimal. (2.09E-5 means 2.09 x 10-5).

 

There is a faster way to carry out the step described above. Click in cell C5 to make it active. Enter the formula =A5*10^-B5 and press Return. This will return the correct value. Now click in cell C5 and hold the mouse down to make the cursor look like a large + sign.

 

Finally, with the mouse held down drag to cell C15. This step is called a fill down, and should result in the same chart that you had previously, but without the added steps of cutting and pasting.

In order to determine which portion of the curve gives the best straight line you must prepare a chart of mL of Base along the x-axis versus the data as an X-Y plot so that looks like this:

 

From the chart the data look like they will be linear from 10.00 mL of Base to 50.00 mL of Base. This is the region that will be used to carry out a linear regression. Close or delete the chart. There is no need to save it.

 The linear regression analysis uses a least squares method to determine the equation of a straight line that best fits the data. When you carry out a regression analysis, the answer is returned as an array of 2 columns by 5 rows. In order to fit the data that will be returned form an array by selecting all the cells E6 to E10 and F6 to F10 This is a called a 2 column by 5 row array and the data that will be returned in the array is:

 

In the Menu bar choose Insert, and select function, next select Statistical and LINEST. The dialog box should look as it appears below:

  

Press return to get to the following dialog box.

 

Because of a bug in Excel, this dialog box does not work correctly. Therefore, click on Finish. This places the following formula in the formula bar:

 =LINEST(known_y's,known_x's,const,stats)

 

Double click on known_y's in the formula bar to select the words. Click and drag from C6 to C11. The words known y'sÓ in box labeled known_y's will be replaced with C6:C11. Click on known_x's and drag from A6 to A11. The box will now show A6:A11. Double click on the word constÓ and type in the word trueÓ and doubl click on the word labeled statÓ and type in the word true.Ó Now press the command and enter keys to return the array:

Vb*10-pH

     

0.00E+00

Statistical Analysis by LINEST

6.92E-04

-1.7309E-05

8.6595E-04

 

4.34E-04

2.7730E-08

1.1159E-06

 

1.75E-04

9.9999E-01

1.0229E-06

 

1.74E-05

3.8964E+05

4.0000E+00

 

1.73E-06

4.0772E-07

4.1856E-12

 

9.31E-08

     

5.01E-09

     

5.10E-10

     

6.14E-11

     

Microsoft has informed me the next version of the program will fix this problem with the function wizard.

From the array and using the fact that the array returns values as follows:

 

 

you can determine that the slope is -1.731 x 10-5, the y intercept is 0.00086595 and the R2 is 0.99999.

 You can now use Excel to calculate the value of the x-intercept. The x-intercept occurs when y = 0. At this point the formula of the line changes from y = mx + b to 0 = mx + b. Solving for x gives x = b/m . Pick a cell (E15 for example) and type y intercept =. Select another cell next to the cell you have chose (E16) and type the formula for the x intercept ( in this case it is = F6/E6). Press return to get the number 50.0273.

 It is possible to use another technique to obtain the same results. With this method almost all of the steps are automatic. Look at the bottom of the tools menu to see if the tool called data analysisÓ is available. If it does not appear at the end of the list, open analysis tools from the server disk (found in the Excel folder). Next choose Regression. Your dialog box should look like this:

 

Enter the y range and the x range as you have done before, press return and Excel will open a new worksheet (usually numbered 17) this has the complete statistical analysis for the regression line. Part of the data appear like this:

 

 

Coefficients

Intercept

0.00079207

X Variable 1

-1.553E-05

 

Where the slope is labeled as X Variable 1 and the intercept labeled Intercept.

 8.5 Advanced Techniques in Charting

In the previous section you have seen how put a trend line through the all of the data points. In many cases it is useful to superimpose a graph of the best straight line through some of the points. For example, it is clear that there a straight line through the all points below does not accurately reflect the data because the data are only linear between about 10 and about 50 mLs of base.

 

Excel can calculate the best straight line between 10 and 50 mLs of base so that you can plot a trend line through these points. The function you will enter is called TREND. The trend function is similar to the LINEST function because TREND also accepts four arguments: TREND (known_y's, known_x's, new x's,const). The first argument, known_y's, is the known values of your dependent variable, Vb*10-pH. The second argument, known_x's, is the known values of your independent variable, mL Base. To calculate the trend line data points that best fit your known data, simply omit the third and fourth arguments from this function. The TREND function will then generate an array that will be the same size as your known x's range. Remember the function wizard does not work with arrays and you have to enter the arguments by hand.

The data that is returned should look like this:

mL of Base

pH

Vb*10-pH

 

0.00

2.88

0.00E+00

Trend

10.00

4.16

6.92E-04

0.00069285

25.00

4.76

4.34E-04

0.00043321

40.00

5.36

1.75E-04

0.00017357

49.00

6.45

1.74E-05

1.7784E-05

49.90

7.46

1.73E-06

2.2051E-06

50.00

8.73

9.31E-08

4.7416E-07

50.10

10.00

5.01E-09

 

51.00

11.00

5.10E-10

 

60.00

11.99

6.14E-11

 

75.00

12.35

3.35E-11

 

 

Again because of a bug in the program, you may obtain data that does not look like the data above. The instructor may have illustrate this technique.

 The data in D6 to D11 are the points on the trend line that best describes the data set.

 To graph these data you will use some of the same techniques as you used before in section 6.6. You will use the data in the following cells: A5:A15, C5:C15 and D5:D15. First use the command key to select non contiguous columns and select A5: A15and C5: C15 and D5: D15. From the proper choice of formatting commands, you can show all of the points from 10 to 60 mLs of base, and plot, without showing the points, the trend line from 10 to 50 mLs of base. Your chart, after many adjustments should look like this:

 

Now make the chart look better by adding labels to the chart axes, adding a title, attaching comments to data points, changing the patterns of the border, adding a legend, and other chart techniques. We will stop here.