Data Collection and Analysis
The previous page told you a bit about how to describe a group of numbers.
Let's now look
at how these numbers should be collected during an experiment. For the
results of an experiment to be valid, there are several important steps
that must be followed.
The Sample
To collect data, you need to have something to measure. Your data could
come from people, animals, plants, or anything else that will provide
the data you need. It is essential that the
data come from a representative sample of the overall population that you
want to describe. For example, if you wanted to report on the "average
American", it would be impossible to test every person in the United
States. So, instead, you take a "sample" of all of these people. It is
important that the sample of subjects include people from all over
the country, of all ages, male and female, high and low income, and so on.
The results of an experiment or survey that sampled women only may
differ from the results if men were also included.
Because it is usually impossible to test every subject in a whole
population, only a small portion (a "sample") of the entire population is
tested. For the most accurate estimation of the whole population, it is
best if the subjects in an experiment are selected at random. This means that everyone in the population has
an equal chance of being in the experiment. In this way the results from
the sample group are used to estimate what would happen within the whole
population. Example populations that researchers might sample
include: middle school students, parents, cats, or computer users.
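As a rough sketch of what "selected at random" can mean in practice, the short Python example below draws a sample in which every member of the population has the same chance of being picked. The population of 1,000 student labels, the sample size of 50, and the function name `draw_random_sample` are all made up for illustration.

```python
import random

def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size members so that each one has an equal chance."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(population, sample_size)  # sampling without repeats

# Hypothetical population: labels for 1,000 middle school students.
population = [f"student_{i}" for i in range(1000)]

sample = draw_random_sample(population, 50, seed=1)
print(len(sample), "subjects selected, for example:", sample[:3])
```

A real study would also have to make sure the list it samples from actually covers the whole population it wants to describe.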
It is also important that there are enough subjects in the sample group to
make meaningful statements about the results. Do you think that an
average coming from 3 people is better or worse than an average coming
from 100 people? The accuracy of the sample data depends on the size
of the sample: in general, a larger sample group will provide a more
accurate "picture" of the population.
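One way to see the effect of sample size is to simulate it. The sketch below invents a population of 10,000 heights, repeatedly takes random samples of 3 people and of 100 people, and measures how much the sample averages jump around. The population values, the number of repeats (500), and the function name are assumptions chosen only for this illustration.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, repeats, rng):
    """Draw many random samples and return how much their averages vary."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

rng = random.Random(0)  # fixed seed so the example is repeatable

# Made-up population: 10,000 heights in cm, centered near 170.
population = [rng.gauss(170, 10) for _ in range(10_000)]

spread_small = spread_of_sample_means(population, 3, 500, rng)
spread_large = spread_of_sample_means(population, 100, 500, rng)

print(f"Averages of 3 people vary by about {spread_small:.1f} cm")
print(f"Averages of 100 people vary by about {spread_large:.1f} cm")
```

The averages from the larger samples should cluster much more tightly around the true population average, which is the sense in which a larger sample gives a more accurate picture.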
"Blind" Testing
Both the people in an experiment and the researchers may
consciously or unconsciously influence ("bias") the results of an
experiment when they know too much about which group each subject is in.
People may try to please the researcher with a particular type of
response if they know what treatment they have received. A researcher may
unconsciously treat subjects differently if he or she knows which
treatment a subject has received.
What would happen if a researcher knew which people received a drug for
pain and which people got a fake pill filled only with sugar? It is
possible that the researcher may influence the subjects to respond in
different ways. Perhaps the researcher would treat the subjects
differently knowing what treatment each subject received. If a person
knew he or she was only getting a sugar pill and not a pain killer, it is
possible that they would think, "Hey, of course this will not get rid of
my pain." On the other hand, if the person knew that they were receiving
a drug, they might think, "Hey, the Doc is giving me a drug that is sure
to cure me." Knowing what "should happen" may change the results of the
experiment.
To reduce this possibility, it is important that experiments be performed
"blind". This means that the subject does not know the
treatment that he or she is receiving. A "double blind" experiment is one
in which neither the researcher nor the subject knows what group the
subject is in. In this way, the subjects and researchers are much less
likely to influence the results because they do not have any expectations
about how each subject should perform. When all of the data have been
collected, then the researchers and subjects can be told which group each
subject was in.
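The bookkeeping behind a double blind experiment can be sketched in a few lines of Python. In this hypothetical example, subjects are randomly split into a drug group and a placebo group, but during the experiment everyone sees only a neutral pill code. The function name `assign_blind`, the subject labels, and the code format are invented for illustration; real trials use more careful procedures.

```python
import random

def assign_blind(subject_ids, seed=None):
    """Randomly split subjects into two groups hidden behind coded labels."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    # The sealed key records the real treatment for each subject.
    # It stays hidden until all of the data have been collected.
    sealed_key = {s: "drug" for s in shuffled[:half]}
    sealed_key.update({s: "placebo" for s in shuffled[half:]})

    # Researchers and subjects see only a neutral, unique pill code.
    codes = rng.sample(range(1000, 10000), len(shuffled))
    visible_labels = {s: f"pill-{c}" for s, c in zip(shuffled, codes)}
    return visible_labels, sealed_key

visible, key = assign_blind(["S1", "S2", "S3", "S4", "S5", "S6"], seed=7)
print(visible)  # what everyone sees during the experiment
```

Only after data collection ends would someone open `key` to find out which subjects received the drug.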
Let's go back to that "sugar pill" for a minute. It is possible that just
the thought of getting a real treatment can create the same effect as a
real treatment. This is called the "placebo effect". A placebo is a drug
or treatment that really has no
"active ingredient." It is important to have some subjects in every
experiment receive the placebo treatment. This allows the researcher to
separate the "real" effects of a drug or treatment from the effects of
merely being in the experiment. Also, some illnesses will cure
themselves. For example, the common cold will get better in about 7-10
days without any treatment. A placebo treatment will allow a researcher
to measure this spontaneous recovery. Sometimes placebos can have very
strong effects, but no one is really sure how placebos work.
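Here is a small sketch of how a placebo group helps separate the "real" effect of a drug from the effect of merely being in the experiment. The pain scores below are made up for illustration (0 means no pain, 10 means the worst pain), and the simple subtraction is only one basic way to compare the groups.

```python
from statistics import mean

# Made-up pain scores before and after treatment (0 = none, 10 = worst).
drug_before = [7, 8, 6, 7, 9]
drug_after = [3, 4, 2, 4, 5]
placebo_before = [7, 7, 8, 6, 9]
placebo_after = [5, 6, 6, 5, 7]

def average_improvement(before, after):
    """Average drop in pain score for one group."""
    return mean(b - a for b, a in zip(before, after))

drug_change = average_improvement(drug_before, drug_after)
placebo_change = average_improvement(placebo_before, placebo_after)

# Improvement seen in the placebo group includes the placebo effect and
# spontaneous recovery. Subtracting it leaves an estimate of the drug's
# own contribution.
real_effect = drug_change - placebo_change

print(f"Drug group improved by {drug_change:.1f} points")
print(f"Placebo group improved by {placebo_change:.1f} points")
print(f"Estimated effect of the drug itself: {real_effect:.1f} points")
```

Without the placebo group, all of the improvement in the drug group might be credited to the drug by mistake.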
Measurement
Experiments need data. To get data, a researcher must measure something.
Measurements come in many different varieties. For example, it is
possible to measure time, weight, length, number of responses, height,
pleasantness and brightness. The way numbers represent a particular
measurement is called the "scale" of measurement.
- Nominal Scale
- A nominal scale classifies data
according to a category only. For example, an experiment may examine
which color people select. No assumptions are made that any color has
more or less value than any other color. Colors differ qualitatively from
one another, but they do not differ quantitatively. A number could be
assigned to each color, but the number would not measure any amount. It
serves only to identify the color.
[Figure: A Nominal Scale]
- Ordinal Scale
- An ordinal scale classifies data according to rank. With ordinal
data, it is fair to say that one response is greater or less than another.
For example, if people were asked to rate the hotness of three chili
peppers, a scale of "hot", "hotter" and "hottest" could be used. Values
of "1" for "hot", "2" for "hotter" and "3" for "hottest" could be
assigned. However, and this is important, you
cannot say that the difference between the hot pepper and the hotter
pepper is the same as the difference between the hotter pepper and the
hottest pepper. It may be that you can eat a hot pepper without feeling
any pain. You may also be able to eat the hotter pepper, but your mouth
just tingles a bit. However, the hottest pepper is really, really
hot...so hot your whole mouth burns.
[Figure: An Ordinal Scale]
- Interval Scale
- An interval scale assumes that
the measurements are made in equal units. However, an interval scale does
not have to have a true zero. Good examples of interval scales are the
Fahrenheit and Celsius temperature scales. A temperature of "zero" does
not mean that there is no temperature...it is just an arbitrary zero
point.
[Figure: An Interval Scale]
- Ratio Scale
- Ratio scales are similar to interval scales, but a ratio scale also lets
you compare numbers as ratios. For example, if you measured the
time it takes 3 people to run a race, their times may be 10 seconds (Racer
A), 15 seconds (Racer B) and 20 seconds (Racer C). You can say
accurately that it took Racer C twice as long as Racer A. This works
because, unlike the interval scale, the ratio scale has a true zero value.
[Figure: A Ratio Scale]
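The four scales described above differ in which comparisons make sense. The Python sketch below walks through one example of each. The color codes and the list of choices are made up, while the chili pepper ranks and race times come from the examples in the text. The temperature check shows why "twice as hot" is not meaningful on an interval scale: the ratio changes when you switch from Celsius to Fahrenheit.

```python
from collections import Counter

# Nominal: numbers are only labels, so counting works but averaging does not.
color_codes = {1: "red", 2: "green", 3: "blue"}  # hypothetical coding
choices = [1, 3, 3, 2, 3, 1]                     # made-up responses
most_chosen = color_codes[Counter(choices).most_common(1)[0][0]]

# Ordinal: order is meaningful, but the gaps between ranks are not.
hotness = {"hot": 1, "hotter": 2, "hottest": 3}
is_hotter = hotness["hottest"] > hotness["hot"]

# Interval: equal units but an arbitrary zero, so ratios depend on the units.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

warm_c, cool_c = 20.0, 10.0
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio: a true zero, so ratios stay the same when the units change.
racer_a_s, racer_c_s = 10.0, 20.0
ratio_seconds = racer_c_s / racer_a_s
ratio_ms = (racer_c_s * 1000) / (racer_a_s * 1000)

print("Most chosen color:", most_chosen)
print("Hottest ranks above hot:", is_hotter)
print(f"Temperature ratio in Celsius: {ratio_c:.2f}, in Fahrenheit: {ratio_f:.2f}")
print(f"Race time ratio in seconds: {ratio_seconds:.2f}, in milliseconds: {ratio_ms:.2f}")
```

The two temperature ratios disagree, but the two race time ratios match, which is what having a true zero buys you.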
Did you know?
The word "placebo" is Latin for "I will please."