Natural Language Processing (NLP) covers AI programs that attempt to understand, manipulate, and interpret human language. It can make use of machine learning as well as statistical and computational linguistics. The rise of Large Language Models (LLMs) has provided exciting opportunities, but it hasn't really changed the issues around understanding and taking action based on human language input; although ChatGPT and its kin can reply in useful ways, they still struggle to DO what we want instead of just talking.
Dictionary+
Microsoft Chatbot AI+
See also:
-
https://blog.varunramesh.net/posts/intro-parser-combinators/
Great educational post about classical language parsing.+
-
https://medium.com/explore-artificial-intelligence/an-introduction-to-recurrent-neural-networks-72c97bf0912
The Recurrent Neural Network (RNN) allows
information to persist from one cycle to the next. Like a
state machine, part of each step's input is the state produced by the
previous step. Since RNNs operate over sequences of vectors, we
can effectively implement variable-length models with not only one-to-one, but also
one-to-many, many-to-one, or many-to-many mappings in various shapes. There
are no constraints on the lengths of sequences, since the transformation is
fixed and can be applied as many times as we like. An RNN is implemented
as three matrices: W_xh, applied to the current
input; W_hh, applied to the previous hidden state; and W_hy, applied to the
hidden state to produce the output.
The RNN is first initialized with random values in those matrices, and then
they are updated over many training samples, each with a sequence of inputs
and desired outputs, to produce the desired output at each step in the
sequence. Once the RNN is trained, the matrices hold the learned
values; the hidden state is reset at the start of each sequence and updated
during each step. Example code (the __init__ below is not part of the original
snippet; it is filled in here as a sketch so the example runs):
import numpy as np

class RNN:
    # note: __init__ is not in the original snippet; filled in here as a sketch
    def __init__(self, input_size, hidden_size, output_size):
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.h = np.zeros((hidden_size, 1))                           # hidden state
    def step(self, x):
        # update the hidden state
        self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
        # compute the output vector
        y = np.dot(self.W_hy, self.h)
        return y
The np.tanh (hyperbolic tangent) function implements a non-linearity
that squashes the activations to the range [-1, 1]. The input x is combined
with the W_xh matrix via numpy's dot product, added
to the dot product of W_hh and the previous hidden state, and the sum is squashed
to produce the new hidden state. Finally, the output is computed by applying
W_hy to the hidden state and returned. A short usage sketch follows below.
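The sizes and random input data in this usage sketch are illustrative assumptions, not from the linked article; the point is that the hidden state carries context from one step of the sequence to the next:
rnn = RNN(input_size=10, hidden_size=32, output_size=4)   # sizes chosen arbitrarily
sequence = [np.random.randn(10, 1) for _ in range(5)]     # dummy input vectors
rnn.h = np.zeros((32, 1))        # reset the hidden state at the start of a sequence
outputs = [rnn.step(x) for x in sequence]
print(outputs[-1].shape)         # (4, 1): one output vector per step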
-
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
The major limitation of standard RNNs is that they are not really able to
access data from several states before the current one. LSTMs address that
issue. In each step, the LSTM processes the data from the prior step in several
ways, each using "gates" composed of a sigmoid NN layer and a pointwise multiplication.
The first gate removes data from the prior cell state C_{t-1},
using the prior hidden state and the new input x_t:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
This resets data that is no longer applicable based
on the input, e.g. when transitioning from one subject to another in a sentence:
"John is tall, but Paul is short" needs to forget "tall" when transitioning
from John to Paul.
The next gate adds in new information from the input vector. This is done
in two parts: a sigmoid layer decides which information to update,
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
and a tanh layer builds new candidate data from the input,
~C_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
These are combined with the prior gate to update the internal state:
C_t = f_t * C_{t-1} + i_t * ~C_t
Finally, the output is built from a filtered copy of the cell state. First
a sigmoid layer decides which part to output,
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
then a tanh is used to ensure the result is between
-1 and 1, and that is multiplied by the output gate:
h_t = o_t * tanh(C_t)
Obviously, with this level of complexity, training is time consuming.
h and C are the same size. W_f, W_i, and W_o
all map the dimension of [h_{t-1}, x_t] to the dimension of
C_t. h_{t-1} is stacked on top of x_t to
form a single vertical column vector; let its total height be N, and let M be the
dimension of C_t. W_f is then a matrix of M rows and N columns, so
W_f · [h_{t-1}, x_t] is [M x N][N x 1], which gives [M x 1], the
dimension of C_t.
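Putting the gate equations together, a minimal numpy sketch of a single LSTM step might look like the following (the function name lstm_step, the weight initialization, and the sizes are illustrative assumptions, not taken from the linked post):
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    z = np.vstack((h_prev, x))               # stack h_{t-1} on top of x_t: an N x 1 vector
    f = sigmoid(np.dot(W_f, z) + b_f)        # forget gate: what to drop from C_{t-1}
    i = sigmoid(np.dot(W_i, z) + b_i)        # input gate: what to update
    C_tilde = np.tanh(np.dot(W_C, z) + b_C)  # candidate values built from the input
    C = f * C_prev + i * C_tilde             # new cell state
    o = sigmoid(np.dot(W_o, z) + b_o)        # output gate: which parts of the cell to expose
    h = o * np.tanh(C)                       # new hidden state / output
    return h, C

# each W is M x N, where N = len(h_prev) + len(x) and M = len(C); each b is M x 1
M, x_dim = 8, 4
N = M + x_dim
rng = np.random.default_rng(0)
W_f, W_i, W_C, W_o = (rng.standard_normal((M, N)) * 0.1 for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros((M, 1))
h, C = lstm_step(rng.standard_normal((x_dim, 1)), np.zeros((M, 1)), np.zeros((M, 1)),
                 W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o)
print(h.shape, C.shape)   # (8, 1) (8, 1)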
-
https://notebooks.azure.com/hoyean-song/projects/tensorflow-tutorial/html/LSTM-breakdown-eng.ipynb
A notebook on Azure that demonstrates the above LSTM.
-
http://proceedings.mlr.press/v37/jozefowicz15.pdf
More on training RNNs and LSTMs
-
https://towardsdatascience.com/a-practitioners-guide-to-natural-language-processing-part-i-processing-understanding-text-9f4abfd13e72
Start of a series on NLP, covering the basics up through part-of-speech tagging
and other analysis.
-
https://chatbotsmagazine.com/contextual-chat-bots-with-tensorflow-4391749d0077
Contextual Chat Bots with Tensorflow+
-
http://www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/
Deep learning for chatbots+
-
http://willschenk.com/bot-design-patterns/?imm_mid=0e50a2&cmp=em-data-na-na-newsltr_20160622
Bot Design Patterns+
-
https://github.com/github/hubot
Build your own bots based on GitHub's Hubot+