4.1.4.1. Linear Least Squares Regression
Modeling Workhorse
Linear least squares regression is by far the most widely used
modeling method. It is what most people mean when they say they have
used "regression", "linear regression" or "least squares" to fit a model
to their data. Not only is linear least squares regression the most widely
used modeling method, but it has been adapted to a broad range of
situations that are outside its direct scope. It plays a strong underlying
role in many other modeling methods, including the other methods discussed
in this section:
nonlinear least squares regression,
weighted least squares regression and
LOESS.
Definition of a Linear Least Squares Model
Used directly, with an appropriate data set, linear least squares
regression can fit the data with any function of the form
$$ f(\vec{x};\vec{\beta}) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \ldots $$
in which
- each explanatory variable in the function is multiplied by an unknown
parameter,
- there is at most one unknown parameter with no corresponding
explanatory variable, and
- all of the individual terms are summed to produce
the final function value.
In statistical terms, any function that meets these
criteria would be called a "linear function". The term "linear" is
used, even though the function may not be a straight line, because if the
unknown parameters are considered to be variables and the explanatory variables
are considered to be known coefficients corresponding to those "variables",
then the problem becomes a system (usually overdetermined) of linear equations
that can be solved for the values of the unknown parameters. To differentiate
the various meanings of the word "linear", the linear models being discussed
here are often said to be "linear in the parameters" or "statistically linear".
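To make the "system of linear equations" view concrete, the sketch below
(using NumPy, with purely illustrative data and variable names) builds a
design matrix with a column of ones for \(\beta_0\) and one column per
explanatory variable, then solves the resulting overdetermined system in
the least squares sense:
```python
import numpy as np

# Illustrative data: two explanatory variables and a response.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([0.5, 1.5, 1.0, 2.0, 2.5])
y  = np.array([2.1, 4.0, 4.6, 6.2, 7.1])

# Design matrix: a column of ones for beta_0, then one column per
# explanatory variable (each column multiplies one unknown parameter).
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the (usually overdetermined) system X @ beta ~= y
# in the least squares sense.
beta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # estimates of beta_0, beta_1, beta_2
```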
Why "Least Squares"?
Linear least squares regression also gets its name from the way the
estimates of the unknown parameters are computed. The "method of least
squares" that is used to obtain parameter estimates was independently
developed in the late 1700's and the early 1800's by the mathematicians
Karl Friedrich Gauss, Adrien Marie Legendre and (possibly) Robert Adrain
[Stigler (1978); Harter (1983); Stigler (1986)]
working in Germany, France and America, respectively. In the least squares
method the unknown parameters are estimated by minimizing the sum of the
squared deviations between the data and the model. The minimization process
reduces the overdetermined system of equations formed by the data to a
sensible system of \(p\) equations in \(p\) unknowns, where \(p\)
is the number of parameters in the functional part of the model.
This new system of equations is then solved to obtain the parameter estimates.
To learn more about how the method of least squares is used to estimate the
parameters, see
Section 4.4.3.1.
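In brief, and anticipating that section: writing the observations in matrix
form with design matrix \(X\) and response vector \(\vec{y}\), the least
squares criterion is
$$ S(\vec{\beta}) = \sum_{i=1}^{n} \left[ y_i - f(\vec{x}_i;\vec{\beta}) \right]^2 = (\vec{y} - X\vec{\beta})^T (\vec{y} - X\vec{\beta}) \, , $$
and setting the partial derivatives of \(S\) with respect to each parameter
equal to zero yields the normal equations,
$$ X^T X \hat{\vec{\beta}} = X^T \vec{y} \, , $$
the system of \(p\) equations in \(p\) unknowns whose solution gives the
parameter estimates.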
Examples of Linear Functions
As mentioned above, linear models are not limited to being straight lines
or planes, but include a fairly wide range of shapes. For example, a simple
quadratic curve,
$$ f(x;\vec{\beta}) = \beta_0 + \beta_1x + \beta_{11}x^2 \, , $$
is linear in the statistical sense. A straight-line model in
\(\log(x)\),
$$ f(x;\vec{\beta}) = \beta_0 + \beta_1\ln(x) \, , $$
or a polynomial in \(\sin(x)\),
$$ f(x;\vec{\beta}) = \beta_0 + \beta_1\sin(x) + \beta_2\sin(2x) + \beta_3\sin(3x) \, , $$
are also linear in the statistical sense because they are linear in the
parameters, though not with respect to the
observed explanatory variable, \(x\).
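Because each of these models is linear in the parameters, it can be fit
with ordinary linear least squares by treating each term as its own column
of the design matrix. A minimal sketch for the quadratic example, using
NumPy with made-up data:
```python
import numpy as np

# Illustrative data roughly following a quadratic trend.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 4.8, 9.7, 17.1, 26.3])

# f(x) = b0 + b1*x + b11*x^2 is linear in the parameters, so each
# known function of x (here 1, x, and x^2) becomes a design-matrix column.
X = np.column_stack([np.ones_like(x), x, x**2])
b0, b1, b11 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b0, b1, b11)
```
The same recipe works for the \(\ln(x)\) and \(\sin(x)\) models by
substituting the appropriate columns.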
Nonlinear Model Example
Just as models that are linear in the statistical sense do not
have to be linear with respect to the explanatory variables, nonlinear
models can be linear with respect to the explanatory variables, but
not with respect to the parameters. For example,
$$ f(x;\vec{\beta}) = \beta_0 + \beta_0\beta_1x $$
is linear in \(x\),
but it cannot be written in the general form of a linear model presented
above. This
is because the slope of this line is expressed as the product of two
parameters. As a result, nonlinear least squares regression could be
used to fit this model, but linear least squares cannot be used. For further
examples and discussion of nonlinear models see the next section,
Section 4.1.4.2.
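To see the distinction in practice, a model like this must be fit
iteratively from starting values rather than solved in one step. A minimal
sketch using SciPy's general-purpose nonlinear least squares routine
curve_fit, with illustrative data:
```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, b0, b1):
    # The slope is the product b0*b1, so the model is
    # nonlinear in the parameters even though it is linear in x.
    return b0 + b0 * b1 * x

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.1, 8.0, 9.9])

# Nonlinear least squares iterates from starting values p0.
params, cov = curve_fit(f, x, y, p0=[1.0, 1.0])
print(params)  # estimates of b0 and b1
```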
Advantages of Linear Least Squares
Linear least squares regression has earned its place as the primary tool
for process modeling because of its effectiveness and completeness.
Though there are types of data that are better described by functions
that are nonlinear in the parameters, many processes in science and
engineering are well-described by linear models. This is either because
the processes are inherently linear or because, over short ranges, any process
can be well-approximated by a linear model.
The estimates of the unknown parameters obtained from linear least squares
regression are the optimal estimates from a broad class of possible
parameter estimates under the usual assumptions used for process modeling.
Practically speaking, linear least squares regression makes very efficient
use of the data. Good results can be obtained with relatively small data sets.
Finally, the theory associated with linear regression
is well-understood and allows for construction of different types of
easily-interpretable statistical intervals for predictions, calibrations,
and optimizations. These statistical intervals can then be used
to give clear answers to scientific and engineering questions.
Disadvantages of Linear Least Squares
The main disadvantages of linear least squares are limitations in the shapes
that linear models can assume over long ranges, possibly poor extrapolation
properties, and sensitivity to outliers.
Linear models with nonlinear terms in the predictor variables curve relatively slowly, so for
inherently nonlinear processes it becomes increasingly difficult to find
a linear model that fits the data well as the range of the data increases.
As the explanatory variables become extreme, the output of the linear model
will also always be more extreme. This means that linear models
may not be effective for extrapolating the results of a process for which data
cannot be collected in the region of interest. Of course extrapolation is
potentially dangerous regardless of the model type.
Finally, while the method of least squares
often gives optimal estimates of the unknown parameters, it is very sensitive
to the presence of unusual data points in the data used to fit a model. One or
two outliers can sometimes seriously skew the results of a least squares
analysis. This makes
model validation,
especially with respect to outliers,
critical to obtaining sound answers to the questions motivating the construction
of the model.
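The sensitivity to outliers is easy to demonstrate numerically. In the
sketch below (NumPy, with fabricated data for illustration), corrupting a
single observation noticeably shifts both the estimated intercept and the
estimated slope of a straight-line fit:
```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.0, 5.9])  # roughly y = x
X = np.column_stack([np.ones_like(x), x])

clean_fit = np.linalg.lstsq(X, y, rcond=None)[0]

# Replace one response value with an outlier and refit.
y_out = y.copy()
y_out[-1] = 20.0
outlier_fit = np.linalg.lstsq(X, y_out, rcond=None)[0]

print(clean_fit)    # intercept and slope near (0, 1)
print(outlier_fit)  # both estimates pulled strongly by the single outlier
```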