function (unclassified-examples goals h performance-element)
function (examples goals h performance-element)
Coded examples have goal values (in a single list) followed by attribute values, both in fixed order
function (examples attributes goals)
function (example attributes goals)
function (example attributes goals)
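As a minimal illustration of this encoding (hypothetical values, not from any of the supplied domains), an example with a single yes/no goal and two attributes would be coded as:
    ((yes) red large)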
function (problem &optional stream depth)
File learning/algorithms/learning-curves.lisp
Functions for testing induction algorithms
Tries to be as generic as possible
Mainly for NN purposes, allows multiple goal attributes
A prediction is correct if it agrees on ALL goal attributes
function (induction-algorithm
examples -> hypothesis
performance-element
hypothesis + example -> prediction
examples
attributes
goals
trials
training-size-increment
&optional
error-fn)
this version uses incremental data sets rather than a new batch each time
function (induction-algorithm
examples -> hypothesis
performance-element
hypothesis + example -> prediction
examples
attributes
goals
trials
training-size-increment
&optional
error-fn)
function (h performance-element test-set goals &optional error-fn)
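A sketch of the "correct on ALL goal attributes" check described above, assuming purely for illustration that predictions and examples are association lists of (attribute . value) pairs; prediction-correct-p is a hypothetical name, not part of the library:
    (defun prediction-correct-p (prediction example goals)
      ;; Correct only if predicted and actual values agree on every goal attribute.
      (every #'(lambda (goal)
                 (equal (cdr (assoc goal prediction))
                        (cdr (assoc goal example))))
             goals))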
function (examples
attributes
goal
&optional
prior)
function (examples
attributes
goal)
dtpredict is the standard "performance element" that
interfaces with the example-generation and learning-curve functions
function (dt
example)
function (k
examples
attributes
goal)
select-test finds a test of size at most k that picks out a set of
examples with uniform classification. Returns test and subset.
function (k
examples
attributes
goal)
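The heart of select-test is the uniformity check on a candidate subset; a self-contained sketch under the same association-list assumption, with a hypothetical function name:
    (defun uniform-classification-p (examples goal)
      ;; True when every example shares the same value for the goal attribute.
      (let ((v (cdr (assoc goal (first examples)))))
        (every #'(lambda (ex) (equal (cdr (assoc goal ex)) v))
               examples)))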
function (k
examples
attributes
goal
test-attributes)
dlpredict is the standard "performance element" that
interfaces with the example-generation and learning-curve functions
function (dl
example)
make-connected-nn returns a multi-layer network with layers given by sizes
function (sizes
&optional
previous
g
dg)
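Hypothetical usage, assuming sizes simply lists the number of units in each layer from input to output:
    (make-connected-nn '(2 3 1))   ; 2 inputs, 3 hidden units, 1 output unit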
nn-learning establishes the basic epoch structure for updating.
It calls the desired updating mechanism to improve the network until
either all examples are correct or it runs out of epochs.
function (problem
network
learning-method
&key
tolerance
limit)
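The epoch structure, roughly; this is an illustrative stand-alone sketch in which train-by-epochs, update-fn, and network-error are placeholder names, not the actual interface:
    (defun train-by-epochs (network examples update-fn network-error tolerance limit)
      ;; Sweep the training set repeatedly, stopping early once the error
      ;; measure falls within tolerance.
      (dotimes (epoch limit network)
        (dolist (example examples)
          (funcall update-fn network example))
        (when (<= (funcall network-error network examples) tolerance)
          (return network))))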
nn-output is the standard "performance element" for neural networks
and interfaces to example-generating and learning-curve functions.
Since performance elements are required to take only two arguments
(hypothesis and example), nn-output is used in an appropriate
lambda-expression
function (network
unclassified-example
attributes
goals)
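A sketch of that wrapping (make-nn-performance-element is a hypothetical helper; the attribute and goal lists come from the learning problem):
    (defun make-nn-performance-element (attributes goals)
      ;; A performance element must take exactly (hypothesis example), so we
      ;; close over the attribute and goal lists.
      #'(lambda (network example)
          (nn-output network example attributes goals)))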
unit-output computes the output of a unit given a set of inputs
it always adds a bias input of -1 as the zeroth input
function (inputs
unit)
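The essence of the computation, sketched with the unit's weights and activation function passed in explicitly (unit-output-sketch is illustrative, not the actual function):
    (defun unit-output-sketch (inputs weights g)
      (let* ((all-inputs (cons -1 inputs))   ; bias input -1 as input 0
             (in (reduce #'+ (mapcar #'* weights all-inputs))))
        (funcall g in)))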
print-nn prints out the network relatively prettily
function (network)
perceptron learning - single-layer neural networks
make-perceptron returns a one-layer network with m units, n inputs each
function (n
m
&optional
g)
perceptron-learning is the standard "induction algorithm"
and interfaces to the learning-curve functions
function (problem)
Perceptron updating - simple version without lower bound on delta
Hertz, Krogh, and Palmer, eq. 5.19 (p.97)
function (perceptron
actual-inputs
predicted
target
&optional
learning-rate)
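The rule itself is the usual perceptron/delta update; a minimal sketch per weight (update-weight is a hypothetical name):
    (defun update-weight (weight input predicted target learning-rate)
      ;; Nudge the weight in the direction that reduces the error on this example.
      (+ weight (* learning-rate (- target predicted) input)))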
back-propagation learning - multi-layer neural networks
backprop-learning is the standard "induction algorithm"
and interfaces to the learning-curve functions
function (problem
&optional
hidden)
Backprop updating - Hertz, Krogh, and Palmer, p.117
function (network
actual-inputs
predicted
target
&optional
learning-rate)
function (rnetwork
network in reverse order
inputs
the inputs to the network
deltas
the "errors" for current layer
learning-rate)
function (layer
all-inputs
deltas
learning-rate)
compute-deltas propagates the deltas back from layer i to layer j
pretty ugly, partly because weights Wji are stored only at layer i
function (jlayer
ilayer
ideltas)
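In outline, each unit j in the earlier layer receives delta_j = g'(in_j) * sum_i W_ji delta_i, where the weights W_ji live with the units of layer i (hence the awkwardness noted above).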
Given an MDP, determine the q-values of the states.
Q-iteration iterates on the Q-values instead of the U-values.
Basic equation is Q(a,i) <- R(i) + sum_j M(a,i,j) max_a' Q(a',j)
where Q(a',j) MUST be the old value not the new.
function (mdp
&optional
qold
&key
epsilon)
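For instance, with hypothetical numbers R(i) = -0.04, two successor states reached with probabilities 0.8 and 0.2, and old values max_a' Q(a',j) of 0.5 and 0.1 respectively, one update gives Q(a,i) <- -0.04 + 0.8*0.5 + 0.2*0.1 = 0.38.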
Compute optimal policy from Q table
function (q)
Choice functions select an action under specific circumstances
Pick a random action
function (s
q)
Pick the currently best action
function (s
q)
Pick the currently best action with tie-breaking
function (s
q)
Updating the transition model according to the observed transition i->j.
Fairly tedious because of initializing new transition records.
function (j
current state (destination of transition)
percepts
in reverse chronological order
m
transition model, indexed by state
)
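In outline (a sketch of the standard frequency estimate, not necessarily the exact bookkeeping here): M(i,j) is estimated as N(i,j)/N(i), the fraction of observed departures from i that arrived at j, with fresh count records created the first time a state or destination is encountered.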
(passive-policy M) makes a policy of no-ops for use in value determination
function (m)
initial learning rate parameter
function ()
function (actions
choice-function)
Update current model to reflect the evidence from the most recent action
function (mdp
current description of envt.
percepts
in reverse chronological order
action
last action taken
)
function (actions
choice-function)
Given an environment model M, determine the values of states U.
Use value iteration, with initial values given by U itself.
Basic equation is U(i) <- R(i) + max_a f(sum_j M(a,i,j)U(j), N(a,i))
where f is the exploration function. Does not apply to terminal states.
function (mdp
&optional
uold
&key
epsilon)
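One common choice for f is the optimistic exploration function from the text: f(u,n) = R+ if n < N_e, and u otherwise, where R+ is an optimistic estimate of the best possible reward and N_e is a visit-count threshold.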
File learning/algorithms/dtl.lisp
decision tree learning algorithm - the standard "induction algorithm"
returns a tree in the format
(a1 (v11 . <subtree>) (v12 . <subtree>) ...)
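As an illustration (using the restaurant attributes, with a hypothetical shape), a learned tree might begin (patrons (none . no) (some . yes) (full . <subtree>)).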
File learning/algorithms/dll.lisp
decision list learning algorithm (Rivest)
returns a decision list, each element of which is
a test of the form (x . term), where each term is
of the form ((a1 . v1) (a2 . v2) ... (an . vn)).
The last element is the test (0).
only works for purely boolean attributes.
function (k
problem)
File learning/algorithms/nn.lisp
Code for layered feed-forward networks
Network is represented as a list of lists of units.
Inputs assumed to be the ordered attribute values in examples
Every unit gets input 0 set to -1
type (parents
sequence of indices of units in previous layer
children
sequence of indices of units in subsequent layer
weights
weights on links from parents
g
activation function
dg
activation gradient function g' (if it exists)
a
activation level
in
total weighted input
gradient
g'(in_i)
)
File learning/algorithms/perceptron.lisp
File learning/algorithms/multilayer.lisp
File learning/algorithms/q-iteration.lisp
Data structures and algorithms for calculating the Q-table for an
MDP. Q(a,i) is the value of doing action a in state i.
function (q
a
i)
File learning/domains/restaurant-multivalued.lisp
Restaurant example from chapter 18, encoded
using multivalued input attributes suitable for
decision-tree learning.
variable
File learning/domains/restaurant-real.lisp
Restaurant example from chapter 18, encoded
using real-valued input attributes suitable for
decision-tree learning or neural network learning.
variable
File learning/domains/restaurant-boolean.lisp
Restaurant learning problem encoded using boolean attributes only,
as appropriate for decision-list learning.
Target is encoded as a decision list.
variable
File learning/domains/majority-boolean.lisp
variable
File learning/domains/ex-19-4-boolean.lisp
Inductive learning example from exercise 19.4
variable
File learning/domains/and-boolean.lisp
Data for Boolean AND function
variable
File learning/domains/xor-boolean.lisp
Data for Boolean XOR function
variable
File learning/domains/4x3-passive-mdp.lisp
Passive, stochastic 4x3 environment for chapter 20.
Just one possible action (no-op), which arrives uniformly at a neighbouring square.
variable
File learning/agents/passive-lms-learner.lisp
Passive LMS learning agent.
When a given training sequence terminates,
update the utility of each state visited in the sequence
to reflect the rewards received from then on.
function ()
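In other words, each visited state's utility estimate is nudged toward the observed reward-to-go from that visit, i.e. a running average roughly of the form U(i) <- U(i) + alpha * (reward-to-go(i) - U(i)).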
File learning/agents/passive-adp-learner.lisp
Reinforcement learning agent that uses dynamic
programming to solve the Markov process
that it learns from its experience. Thus, the
main job is to update the model over time.
Being a passive agent, it simply does no-op
at each step, watching the world go by.
function ()
File learning/agents/passive-td-learner.lisp
Passive temporal-difference learning agent.
After each transition, update the utility of the
source state i to make it agree more closely with that
of the destination state j.
variable
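In outline this is the usual temporal-difference rule: U(i) <- U(i) + alpha * (R(i) + U(j) - U(i)), where alpha is the learning rate.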
File learning/agents/active-adp-learner.lisp
Reinforcement learning agent that uses dynamic
programming to solve the Markov decision process
that it learns from its experience. Thus, the
main job is to update the model over time.
function (actions)
File learning/agents/active-qi-learner.lisp
Reinforcement learning agent that uses dynamic
programming to solve the Markov decision process
that it learns from its experience. Thus, the
main job is to update the model over time.
function (actions)
File learning/agents/exploring-adp-learner.lisp
Reinforcement learning agent that uses dynamic
programming to solve the Markov decision process
that it learns from its experience. Thus, the
main job is to update the model over time.
Unlike the active-adp-learner, this agent uses
an "intelligent" exploration policy to make sure it
explores the state space reasonably quickly.
variable
File learning/agents/exploring-tdq-learner.lisp
Exploratory reinforcement learning agent using temporal differences.
Works without a model, using stochastic sampling of observed transitions
to mirror the effect of averaging with the model.
function (actions)
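In outline, the model-free update is the TD Q-learning rule: Q(a,i) <- Q(a,i) + alpha * (R(i) + max_a' Q(a',j) - Q(a,i)), applied after each observed transition i -> j under action a.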