Overview
In the last post I introduced Red and Green, my two digital
temperature loggers.
In this really simple experiment I’ve put them next to
each other on the window sill of my office and let them run for as long as
they can.
According to the manufacturer, they can measure and record
the temperature every ten seconds for a total of 28,800 measurements.
That’s a period of 3 days and 8 hours.
The questions I want to answer are:
How much of the data I collect is actually useful?
What issues do I have in collecting and comparing the data?
Since I’ve placed them next to each other, I expect them to
record the same temperature. Do they?
You may remember the manufacturer claims they measure
temperature to within 0.5 º C, that is, within one-half of one degree Celsius.
Our instrumental error is 0.5 º C.
Comparing the measurements
In the last post I pointed out that Red and Green each have
a button on their fronts to start collecting data. I haven’t mastered pressing
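the two buttons at the same time, and in this experiment I started Red at 18/03/2015 16:08:28 but started
Green 65 seconds later, at 18/03/2015 16:09:33. Since both loggers record every ten seconds, each Green
reading lands five seconds after a Red reading.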
This is a systematic error. My measurement of the
difference in temperature between the two thermometers will be wrong in every
single case because they’re measuring temperatures at different times. As an
experimenter I can only hope the difference isn’t too much.
Here are the first
few sets of measurements:
| Red: Data Point | Red: Date/Time | Red: Temperature (°C) | Green: Data Point | Green: Date/Time | Green: Temperature (°C) |
|---|---|---|---|---|---|
| 1 | 18/03/2015 16:08:28 | 29.2 | | | |
| 2 | 18/03/2015 16:08:38 | 29.2 | | | |
| 3 | 18/03/2015 16:08:48 | 29.2 | | | |
| 4 | 18/03/2015 16:08:58 | 29.1 | | | |
| 5 | 18/03/2015 16:09:08 | 29.1 | | | |
| 6 | 18/03/2015 16:09:18 | 29 | | | |
| 7 | 18/03/2015 16:09:28 | 29 | 1 | 18/03/2015 16:09:33 | 27 |
| 8 | 18/03/2015 16:09:38 | 29 | 2 | 18/03/2015 16:09:43 | 27 |
| 9 | 18/03/2015 16:09:48 | 29 | 3 | 18/03/2015 16:09:53 | 27 |
| 10 | 18/03/2015 16:09:58 | 29 | 4 | 18/03/2015 16:10:03 | 27.1 |
| 11 | 18/03/2015 16:10:08 | 29 | 5 | 18/03/2015 16:10:13 | 27.2 |
| 12 | 18/03/2015 16:10:18 | 29 | 6 | 18/03/2015 16:10:23 | 27.3 |
| 13 | 18/03/2015 16:10:28 | 29 | 7 | 18/03/2015 16:10:33 | 27.4 |
| 14 | 18/03/2015 16:10:38 | 29 | 8 | 18/03/2015 16:10:43 | 27.4 |
| 15 | 18/03/2015 16:10:48 | 29 | 9 | 18/03/2015 16:10:53 | 27.5 |
As you can see, the first six measurements of Red don’t have
any corresponding values in Green.
Similarly, Green has six extra measurements at the end of its run that Red doesn’t.
To make a fair comparison, I need to exclude the six pairs
of non-matched measurements, reducing my total from 28,800 to 28,794. Remember
I’m talking about comparing pairs of measurements. When I exclude six
measurements from each set, that’s six pairs lost, not twelve.
This is a missing data
problem. There are lots of ways of dealing with missing data. This is the
simplest: just don’t use data pairs with missing data. Other techniques try to replace missing data
with some ‘appropriate’ value. The Australian Bureau of Meteorology and other
weather ‘authorities’ use homogenisation, where they replace the temperature
from one weather station with that from another station, possibly one hundreds
of kilometres away.
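To make the "drop the unmatched pairs" idea concrete, here is a minimal sketch in Python. It is only an illustration, not the program I actually used: the names `make_series` and `pair_readings` are made up for this example, and it assumes each logger's readings are already sorted by time. The temperatures are the fifteen Red and nine Green readings from the table above.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M:%S"  # day-first dates, as used in this post


def make_series(start, temps, step_seconds=10):
    """Build (timestamp, temperature) tuples, one reading every step_seconds."""
    t0 = datetime.strptime(start, FMT)
    return [(t0 + timedelta(seconds=step_seconds * i), temp)
            for i, temp in enumerate(temps)]


def pair_readings(red, green, max_gap=timedelta(seconds=5)):
    """Pair each Red reading with a Green reading taken within max_gap of it.

    Both lists must be sorted by time. Readings with no partner are dropped.
    When a Green reading is equally close to two Red readings, it goes to the
    earlier Red reading, which matches the pairing in the table above.
    """
    pairs = []
    g = 0
    for r_time, r_temp in red:
        # Skip Green readings that are too early to match this Red reading.
        while g < len(green) and green[g][0] < r_time - max_gap:
            g += 1
        if g < len(green) and abs(green[g][0] - r_time) <= max_gap:
            pairs.append((r_time, r_temp, green[g][0], green[g][1]))
            g += 1
    return pairs


# The readings shown in the table above.
red = make_series("18/03/2015 16:08:28",
                  [29.2, 29.2, 29.2, 29.1, 29.1, 29.0,
                   29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0])
green = make_series("18/03/2015 16:09:33",
                    [27.0, 27.0, 27.0, 27.1, 27.2, 27.3, 27.4, 27.4, 27.5])

pairs = pair_readings(red, green)
print(len(red) - len(pairs), "Red readings had no Green partner")
```

On this small sample the six Red readings taken before Green was started are the ones left without a partner.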
OK, what does the data look like? The best way to ‘see’ the
relationships in data is to draw a graph. Plus, I LOVE graphs.
There are several things to tick off in your mind when
looking at a graph:
- The graph’s title should tell you what the graph shows. This one is about measured temperature on the vertical axis against time on the horizontal axis.
- The axes should have labels that tell you how the compared quantities are measured. Here temperature is in º C and time is in tens of thousands of seconds, starting at a particular day and time.
- The colour of the line tells you which temperature logger is reported. Note that the lines are so similar that they’re indistinguishable except for the little corner right at the start.
The graph tells us lots of things. Here are a few:
- As noted, the loggers report close to the same temperature. We’ll look at how close later.
- The temperature goes up and down as we expect. We all know it’s hotter during the day and colder at night. To answer more detailed questions like “What’s the hottest time of the day?” we’ll have to look more closely at the data.
- It looks like the temperature got generally cooler as time progressed. The top temperature the first day (the first up-pointing lump) appears to be around 30 º C. The top temperature for the second day is about 25 º C and the top for the third day is about 23 º C. The lowest temperature for each day also appears to be cooler than that on the previous day.
If I go back to the data, I can extract the highest and
lowest temperatures. Since there’s lots of data (28,794 measurements to be
exact) I won’t do the extraction myself. I’ll do one of my favourite things:
I’ll write a program to do it. Here are
the results:
Note that I’m using Australian date format with the day
first, then the month.
For each day I’ve shown:
- The time of the first and last temperature reading for the day (Start and Finish);
- The number of measurements made that day (Count);
- The minimum temperature that day and the first time that temperature was reached (Minimum and Time of Minimum);
- The maximum temperature that day and the first time that temperature was reached (Maximum and Time of Maximum);
- The average temperature that day (Average); and
- The range of temperatures for that day, that is, the difference between the maximum and minimum temperatures (Range).
The Time of Minimum and Time of Maximum are the earliest
times when that temperature was reached. On 19/3/2015, for example, the minimum
was actually reached nine different times, in the 10 minutes 25 seconds between
07:16:38 and 07:27:03. During that period the temperature bounced between 18.6
and 18.7 º C.
I could have also reported the latest time the lowest
temperature was reached or the time of the middle of the 10 minute period. All
would be correct.
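The program itself isn't shown in this post, but a minimal sketch of the idea might look like the following. The function name `daily_summary` and the dictionary keys are my own choices for illustration, and the sample input is just Red's first fifteen readings from the table earlier, not a full day. It relies on the fact that Python's `min` and `max` return the first item they meet when several tie, which gives the "earliest time" convention described above.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_summary(readings):
    """Summarise (timestamp, temperature) readings, sorted by time, per day."""
    by_day = defaultdict(list)
    for when, temp in readings:
        by_day[when.date()].append((when, temp))

    rows = []
    for day, items in sorted(by_day.items()):
        temps = [temp for _, temp in items]
        # With ties, min() and max() return the first item encountered,
        # so these are the EARLIEST times the extremes were reached.
        lowest = min(items, key=lambda item: item[1])
        highest = max(items, key=lambda item: item[1])
        rows.append({
            "day": day,
            "start": items[0][0],
            "finish": items[-1][0],
            "count": len(items),
            "minimum": lowest[1],
            "time_of_minimum": lowest[0],
            "maximum": highest[1],
            "time_of_maximum": highest[0],
            "average": sum(temps) / len(temps),
            "range": highest[1] - lowest[1],
        })
    return rows


# Sample input: Red's first fifteen readings, ten seconds apart.
start = datetime(2015, 3, 18, 16, 8, 28)
red_temps = [29.2, 29.2, 29.2, 29.1, 29.1, 29.0,
             29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0]
readings = [(start + timedelta(seconds=10 * i), temp)
            for i, temp in enumerate(red_temps)]

rows = daily_summary(readings)
for row in rows:
    print(row["day"], row["count"], row["minimum"], row["time_of_minimum"],
          row["maximum"], row["time_of_maximum"], round(row["average"], 2))
```

To report the latest time instead of the earliest, the same search could be run over the readings in reverse order.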
Also, of course, the clocks in the thermometers aren’t exact
either and may well be slightly different to the clock on my laptop, so there’s
instrumental error in the time as
well as in the temperature.
Finally, we can compare how the two supposedly identical
thermometers measured the same temperature over the same time.
How well do the two temperature loggers agree? This graph
shows us. For most of the measurements, they differ by either + 0.1 or – 0.2 º
C. The differences at the start were considerably larger, as noted on the graph.
These differences are well within the manufacturer’s stated accuracy of 0.5 º C.
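The comparison behind that graph is simple arithmetic: subtract Green from Red for each pair and see how many differences fall inside the manufacturer's 0.5 º C. Here is a small sketch of that check, using only the nine matched pairs from the table at the top of the post. The function name `difference_report` is hypothetical, and this is not necessarily how the graph was produced.

```python
TOLERANCE = 0.5  # manufacturer's stated accuracy, in degrees Celsius


def difference_report(red_temps, green_temps, tolerance=TOLERANCE):
    """Return the Red-minus-Green differences and how many lie within tolerance."""
    # Round to one decimal place, the resolution the loggers report.
    diffs = [round(r - g, 1) for r, g in zip(red_temps, green_temps)]
    within = sum(1 for d in diffs if abs(d) <= tolerance)
    return diffs, within


# The nine matched pairs from the start of the run (Red 7-15, Green 1-9).
red_temps = [29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0]
green_temps = [27.0, 27.0, 27.0, 27.1, 27.2, 27.3, 27.4, 27.4, 27.5]

diffs, within = difference_report(red_temps, green_temps)
print(diffs)
print(f"{within} of {len(diffs)} differences are within {TOLERANCE} degrees")
```

These nine pairs come from the very start of the run, where the two loggers disagreed most, so none of them is within 0.5 º C. The gap is already shrinking, though, from 2.0 º C down to 1.5 º C.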
All this seems pretty pedantic. Most of us just see the
pattern in the wavy line and have a feel for what it represents. The tabular
data adds details that may (or may not) be of interest.
There’s one important thing we can check. For many years,
the average temperature was calculated as the average of the highest and lowest
temperatures.
As the picture shows, it’s not all that hard to produce a
thermometer that measures the high and low temperatures for a particular day.
The table below shows the minimum, maximum and average
temperatures calculated from the data we collected every ten seconds over a
three-day period. I’ve added a column for ‘Calculated Average’ that’s computed
by adding the maximum and minimum and dividing by two.
None of the averages calculated this way agree with the
average of the 8,640 measurements taken each day. In one case, on the 19th,
the difference is 1.2 º C. That’s more
than twice the error in accuracy that the manufacturer claims for the
temperature logger.
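The two kinds of average are easy to compare in code. The sketch below is only an illustration: the function names are mine, and the input is Red's first fifteen readings from the table at the top of the post rather than a whole day, so the gap it shows is much smaller than the 1.2 º C above.

```python
def true_average(temps):
    """Average of every reading."""
    return sum(temps) / len(temps)


def calculated_average(temps):
    """The old-style average: halfway between the highest and lowest reading."""
    return (max(temps) + min(temps)) / 2


# Red's first fifteen readings, in degrees Celsius.
red_temps = [29.2, 29.2, 29.2, 29.1, 29.1, 29.0,
             29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0]

avg = true_average(red_temps)
calc = calculated_average(red_temps)
print(f"Average of all readings: {avg:.2f}")
print(f"(max + min) / 2:         {calc:.2f}")
print(f"Difference:              {calc - avg:.2f}")
```

Even in two and a half minutes of data the two figures differ by about 0.05 º C, because most of the readings sit at the low end of the range. The max-and-min method ignores how long the temperature spends at each value.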
Finally
In this mind-numbingly simple experiment, we actually
uncovered a few important matters:
- The measurements contained several sources of error.
- There was instrumental error; the loggers can only measure temperature accurately to within 0.5 º C. The clocks in the loggers also will have some error, but we don’t have any estimate of that.
- We had missing data. Not all of the data we gathered could be compared as the loggers gathered data from slightly different periods of time.
- The data loggers experienced systematic errors: Red started about 2 º C warmer than Green, and each Red reading was taken about five seconds before its paired Green reading. The first of these systematic errors got smaller as the experiment unfolded but the second remained constant throughout.
- Calculating the average for the day based on just two values, the maximum and minimum temperature, does not give the same value as calculating the average of 8,640 separate measurements.
Based on this simple experiment, we can certainly wonder how
well the average temperatures claimed in the past compare to their true
values. In other words, what’s the
uncertainty in past temperature records?
After all, we are asked to believe in dangerous, anthropogenic
(human induced) global warming. Uncertainties in the temperature record need to
be clearly understood before policy decisions are made. It doesn’t matter what
scientists believe. What matters is, how
good are their numbers?