Graphing

From Chemistry Resource

Many important relationships can be expressed by graphing. In the analysis of materials, a standard curve is used to determine the concentration of a substance in a sample whose concentration is not known (the unknown sample).

The analyst first prepares samples of the substance at several known concentrations (the standards). The standards should be prepared in the same way as the unknown that is being measured. The analyst then measures the standards, one at a time, with an analytical instrument and records the data pair (Sample Concentration, Instrument Response) for each standard.

From data to graph

The following data were collected for a set of standards.

Sample Concentration (g/mL)          Instrument Response (mV)
0.0                                  0.00
10.0                                 0.14
20.0                                 0.28
30.0                                 0.41
40.0                                 0.52
50.0                                 0.71
65.0                                 0.84
75.0                                 1.04
80.0                                 1.15
Sample with unknown concentration    0.78


The instrument response is plotted on the y-axis and the concentration on the x-axis. By convention, the known (independent) variable is plotted on the x-axis and the measured (dependent) variable on the y-axis. The graph shows a linear relationship between the instrument response and the concentration of the sample. The line drawn on the graph is known as the trendline or the best-fit line. The equation of the trendline can be used to determine the unknown concentration of the sample.

GraphThis sampleGraph.png

Trendline

When the instrument response is graphed against the concentration of the sample, the data points exhibit a linear relationship. That is, the mathematical model that the data follow is:

 Instrument response = m * (Concentration) + b 

where m is the slope and b is the intercept. The relationship has the same form as the equation of a line,

<math>y=mx+b</math>

where y = instrument response, and x = concentration of the sample.
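
Once m and b are known, the same equation can be rearranged to give the concentration of an unknown from its measured response. As a worked step, using only the symbols defined above:

<math>x=\frac{y-b}{m}</math>

For the unknown in the table, y = 0.78 mV.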

Error.png

If we assume that errors affecting the concentrations (the values of the x-axis) are insignificant, then any difference between an experimental data point and the model would be due to error in measuring the instrument's response.

For each data point, the difference between the voltage measurement, Vmeasured, and the predicted voltage, Vpredicted, is a residual error, RE.

 RE = (Vmeasured – Vpredicted) 

Because these residual errors can be positive or negative, and would partly cancel if added directly, each data point’s RE is first squared. The squares are then summed to give a total residual error, REtot.

 REtot = Σ(Vmeasured – Vpredicted)² 

Choosing different values for the slope and intercept of the line leads to different total residual errors. The best-fit line is the line whose slope and intercept give the smallest total residual error.
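
To make the idea concrete, the following minimal Python sketch computes REtot for two trial lines using the standards from the table above. The function name total_residual_error and the two trial (slope, intercept) pairs are hypothetical, chosen only for illustration.

```python
# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]


def total_residual_error(m, b, xs, ys):
    """Sum of (V_measured - V_predicted)^2 for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Two hypothetical trial lines (illustrative values, not from the source).
re_tot_a = total_residual_error(0.0139, -0.006, concentrations, responses)
re_tot_b = total_residual_error(0.0150, 0.0, concentrations, responses)

# The line with the smaller total residual error is the better fit.
print(f"trial A: {re_tot_a:.5f}, trial B: {re_tot_b:.5f}")
```

Trial A gives the smaller total, so of these two candidates it is the better description of the data.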

Below are the equations to calculate the slope, m, and intercept, b of the best-fit line.

Slope.png
Intercept.png
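
The slope and intercept equations appear here only as images. As a minimal sketch, assuming they are the standard least-squares formulas (written out in the comments), the calculation for the standards table could look like this in Python. The names best_fit_line, unknown_response, and unknown_concentration are illustrative. This fit lets the intercept vary, so its result can differ slightly from an Excel trendline that is forced through (0,0).

```python
# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]


def best_fit_line(xs, ys):
    """Least-squares slope and intercept (assumed standard formulas).

    m = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    b = (Sy - m*Sx) / n
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


m, b = best_fit_line(concentrations, responses)

# Invert y = m*x + b to estimate the unknown concentration from its response.
unknown_response = 0.78  # mV, from the table
unknown_concentration = (unknown_response - b) / m

print(f"m = {m:.4f}, b = {b:.4f}, unknown = {unknown_concentration:.1f}")
# prints: m = 0.0139, b = -0.0062, unknown = 56.5
```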

The equation to calculate the correlation coefficient, r, which gives a measure of the reliability of the linear relationship between the x and y values, is shown below.

R value.png

A value of r = 1 indicates an exact linear relationship between x and y with a positive slope (r = –1 indicates an exact linear relationship with a negative slope). Values of r close to 1 indicate excellent linear reliability. If the correlation coefficient is much less than 1, predictions based on the linear relationship will not be reliable. The r² value is the square of the correlation coefficient, r.
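
The equation for r is likewise shown only as an image. As a hedged sketch, assuming it is the usual Pearson correlation coefficient, r and r² for the standards table could be computed as follows. The function name correlation_coefficient is illustrative.

```python
import math

# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]


def correlation_coefficient(xs, ys):
    """Pearson r (assumed formula):

    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


r = correlation_coefficient(concentrations, responses)
r_squared = r ** 2

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
# prints: r = 0.9973, r^2 = 0.9947
```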

Manual calculation of the best-fit line

Click the image on the right to view these manual calculations to get the best-fit line for the "Instrument Response VS. Concentration" graph.

The simplest way to get the best-fit line through a set of points in Excel is to add a trendline. To show the trendline for the "Instrument Response VS. Concentration" graph, we select Linear as the Trend/Regression Type, set the intercept to 0 so that the line passes through (0,0), and add the trendline.
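
For comparison with a trendline forced through (0,0), here is a minimal Python sketch of a least-squares fit with the intercept fixed at zero. It assumes the slope that minimizes Σ(y – m·x)² is m = Σxy / Σx²; the variable names are illustrative, and whether Excel reports exactly the same digits is not confirmed by the source.

```python
# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]

# Least-squares slope for the model y = m*x (intercept fixed at 0).
sum_xy = sum(x * y for x, y in zip(concentrations, responses))
sum_xx = sum(x * x for x in concentrations)
slope_through_origin = sum_xy / sum_xx

# With b = 0 the unknown concentration is simply response / slope.
unknown_response = 0.78  # mV, from the table
unknown_concentration = unknown_response / slope_through_origin

print(f"slope = {slope_through_origin:.4f}, unknown = {unknown_concentration:.1f}")
# prints: slope = 0.0138, unknown = 56.5
```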

External Link - Algorithm to obtain the best-fit line

Screen capture from http://www.dynamicgeometry.com

[The Geometer's Sketchpad Resource Center]

Click on the above link to see if you can visually find the least-squares best-fit line for a given set of data points.

   * Place your mouse over the red dot that is labeled y-intercept and drag the line up and down the y-axis.
   * Place your mouse over the red dot that is labeled slope and change the slope of the line.

Use a combination of these motions to obtain a line that will give the smallest total area of all the squares. Click the image to the right to view a screen capture of the best-fit line obtained. My best effort is 811. What’s yours?

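Minimizing the total area of the squares is the least-squares criterion that a computer program uses to find a best-fit line for a set of data that follows a linear mathematical model. A program typically computes the minimizing slope and intercept directly from formulas rather than by trial and error.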
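
The dragging exercise can be imitated in code. The sketch below is not the method used by the linked page or by Excel; it is a simple trial-and-error search (an assumption made for illustration) that nudges the slope and intercept in small steps and keeps any change that shrinks the total area of the squares. It uses the standards table from this page rather than the data in the linked applet, and the starting values and step sizes are arbitrary.

```python
# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]


def total_square_area(m, b):
    """Total area of the squares drawn on the residuals of y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(concentrations, responses))


# Start from a flat line through the origin (arbitrary starting guess).
m, b = 0.0, 0.0
step_m, step_b = 0.01, 0.1  # arbitrary initial "drag" sizes

while step_m > 1e-7:
    best = total_square_area(m, b)
    moved = False
    # Try dragging the slope up/down, then the intercept up/down.
    for dm, db in ((step_m, 0.0), (-step_m, 0.0), (0.0, step_b), (0.0, -step_b)):
        trial = total_square_area(m + dm, b + db)
        if trial < best:
            m, b, best = m + dm, b + db, trial
            moved = True
    if not moved:
        # No drag of this size helps, so make finer adjustments.
        step_m /= 2
        step_b /= 2

print(f"m = {m:.4f}, b = {b:.3f}, total area = {total_square_area(m, b):.5f}")
```

The search settles on nearly the same slope and intercept that the least-squares formulas give directly.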

Points to note when generating the graph

When using Excel to generate this graph, be sure to pay attention to the following points:

  • When adding the trendline, right-click a data point. From the context menu, select:
         * Linear as the Trend/Regression Type
         * Set intercept 0,0
         * Display line on chart
         * Display R-squared value on chart

    FormatTrendline.png
  • For this graph, the slope of the trendline needs to display 3 significant figures in order to get 3 significant figures for the sample’s unknown concentration. Right-click the equation of the trendline. From the context menu, select Format Trendline Label.
    FormatTrendlineLabel.png

  • Select Number on the left menu. In the Category menu, select Number. In the Decimal Places field, type '4'. Click Close.
    FormatTrendlineLabel2.png
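
The effect of the displayed precision can be checked with a short Python sketch. It assumes a trendline forced through (0,0), with slope Σxy / Σx², and compares a hypothetical two-decimal display of the slope with the four-decimal display set above. The variable names are illustrative.

```python
# Standards from the table: concentration (x) and instrument response in mV (y).
concentrations = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 65.0, 75.0, 80.0]
responses = [0.00, 0.14, 0.28, 0.41, 0.52, 0.71, 0.84, 1.04, 1.15]

# Slope of the least-squares line through the origin (assumed model y = m*x).
slope = sum(x * y for x, y in zip(concentrations, responses)) / sum(
    x * x for x in concentrations
)

unknown_response = 0.78  # mV, from the table

# Slope as it would read with 2 and with 4 decimal places on the chart label.
slope_2dp = round(slope, 2)  # 0.01   -> only 1 significant figure
slope_4dp = round(slope, 4)  # 0.0138 -> 3 significant figures

conc_2dp = unknown_response / slope_2dp
conc_4dp = unknown_response / slope_4dp

print(f"2 decimals: {conc_2dp:.1f}, 4 decimals: {conc_4dp:.1f}")
# prints: 2 decimals: 78.0, 4 decimals: 56.5
```

Reading the slope with too few digits changes the calculated concentration substantially, which is why the label is set to four decimal places.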