Natural Language Processing (NLP) covers AI programs that attempt to understand, manipulate, and interpret human language. It can make use of machine learning as well as statistical and computational linguistics. The rise of Large Language Models (LLMs) has provided exciting opportunities, but it hasn't really changed the issues around understanding and taking action based on human language input; although ChatGPT and its kin can reply in useful ways, they still struggle to DO what we want instead of just talking.
Dictionary+
Microsoft Chatbot AI+
See also:
-
https://blog.varunramesh.net/posts/intro-parser-combinators/
Great educational post about classical language parsing.+
-
https://medium.com/explore-artificial-intelligence/an-introduction-to-recurrent-neural-networks-72c97bf0912
The Recurrent Neural Network (RNN) allows
information to persist from one cycle to the next. Like a
state machine, part of each step's input is the state produced by the
previous step. Since RNNs operate over sequences of vectors, we
can effectively implement variable-length models with not only one-to-one, but also
one-to-many, many-to-one, or many-to-many mappings in various shapes. There
are no constraints on the lengths of sequences, since the transformation is
fixed and can be applied as many times as we like. An RNN is implemented
as three matrices: W_xh, applied to the current
input; W_hh, applied to the previous hidden state; and W_hy, applied to the
hidden state to produce the output.
The RNN is first initialized with random values in those matrices, and then
they are updated over many training samples, each with a sequence of inputs
and desired outputs, to produce the desired output at each step in the
sequence. Once the RNN is trained, the matrices hold the learned
values; the hidden state is reset at the start of each sequence and updated
during each step. Example code (the __init__ below is not part of the original
snippet; it is filled in here as a sketch so the example runs):
import numpy as np

class RNN:
    # note: __init__ is not in the original snippet; filled in here as a sketch
    def __init__(self, input_size, hidden_size, output_size):
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.h = np.zeros((hidden_size, 1))                           # hidden state
    def step(self, x):
        # update the hidden state
        self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
        # compute the output vector
        y = np.dot(self.W_hy, self.h)
        return y
The np.tanh (hyperbolic tangent) function implements a non-linearity
that squashes the activations to the range [-1, 1]. The input x is combined
with the W_xh matrix via numpy's dot product, added
to the dot product of W_hh and the previous hidden state, and the sum is squashed
to produce the new hidden state. Finally, the output is computed by applying
W_hy to the hidden state and returned. A short usage sketch follows below.
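The sizes and random input data in this usage sketch are illustrative assumptions, not from the linked article; the point is that the hidden state carries context from one step of the sequence to the next:
rnn = RNN(input_size=10, hidden_size=32, output_size=4)   # sizes chosen arbitrarily
sequence = [np.random.randn(10, 1) for _ in range(5)]     # dummy input vectors
rnn.h = np.zeros((32, 1))        # reset the hidden state at the start of a sequence
outputs = [rnn.step(x) for x in sequence]
print(outputs[-1].shape)         # (4, 1): one output vector per step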
-
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
The major limitation of standard RNNs is that they are not really able to
access data from several states before the current one. LSTMs address that
issue. In each step, the LSTM processes the data from the prior step in several
ways, each using "gates" composed of a sigmoid NN layer and a pointwise multiplication.
The first gate removes data from the prior cell state C_{t-1},
using the prior hidden state and the new input x_t:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
This resets data that is no longer applicable based
on the input, e.g. when transitioning from one subject to another in a sentence:
"John is tall, but Paul is short" needs to forget "tall" when transitioning
from John to Paul.
The next gate adds in new information from the input vector. This is done
in two parts: a sigmoid layer decides which information to update,
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
and a tanh layer builds new candidate data from the input,
~C_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
These are combined with the prior gate to update the internal state:
C_t = f_t * C_{t-1} + i_t * ~C_t
Finally, the output is built from a filtered copy of the cell state. First
a sigmoid layer decides which part to output,
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
then a tanh is used to ensure the result is between
-1 and 1, and that is multiplied by the output gate:
h_t = o_t * tanh(C_t)
Obviously, with this level of complexity, training is time consuming.
h and C are the same size. W_f, W_i, and W_o
all map the dimension of [h_{t-1}, x_t] to the dimension of
C_t. h_{t-1} is stacked on top of x_t to
form a single vertical column vector; let its total height be N, and let M be the
dimension of C_t. W_f is then a matrix of M rows and N columns, so
W_f · [h_{t-1}, x_t] is [M x N][N x 1], which gives [M x 1], the
dimension of C_t.
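Putting the gate equations together, a minimal numpy sketch of a single LSTM step might look like the following (the function name lstm_step, the weight initialization, and the sizes are illustrative assumptions, not taken from the linked post):
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    z = np.vstack((h_prev, x))               # stack h_{t-1} on top of x_t: an N x 1 vector
    f = sigmoid(np.dot(W_f, z) + b_f)        # forget gate: what to drop from C_{t-1}
    i = sigmoid(np.dot(W_i, z) + b_i)        # input gate: what to update
    C_tilde = np.tanh(np.dot(W_C, z) + b_C)  # candidate values built from the input
    C = f * C_prev + i * C_tilde             # new cell state
    o = sigmoid(np.dot(W_o, z) + b_o)        # output gate: which parts of the cell to expose
    h = o * np.tanh(C)                       # new hidden state / output
    return h, C

# each W is M x N, where N = len(h_prev) + len(x) and M = len(C); each b is M x 1
M, x_dim = 8, 4
N = M + x_dim
rng = np.random.default_rng(0)
W_f, W_i, W_C, W_o = (rng.standard_normal((M, N)) * 0.1 for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros((M, 1))
h, C = lstm_step(rng.standard_normal((x_dim, 1)), np.zeros((M, 1)), np.zeros((M, 1)),
                 W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o)
print(h.shape, C.shape)   # (8, 1) (8, 1)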
-
https://notebooks.azure.com/hoyean-song/projects/tensorflow-tutorial/html/LSTM-breakdown-eng.ipynb
A notebook on Azure that demonstrates the above LSTM.
-
http://proceedings.mlr.press/v37/jozefowicz15.pdf
More on training RNNs and LSTMs
-
https://towardsdatascience.com/a-practitioners-guide-to-natural-language-processing-part-i-processing-understanding-text-9f4abfd13e72
Start of a series on NLP, covering the basics up through part-of-speech tagging
and other analysis.
-
https://chatbotsmagazine.com/contextual-chat-bots-with-tensorflow-4391749d0077
Contextual Chat Bots with Tensorflow+
-
http://www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/
Deep learning for chatbots+
-
http://willschenk.com/bot-design-patterns/?imm_mid=0e50a2&cmp=em-data-na-na-newsltr_20160622
Bot Design Patterns+
-
https://github.com/github/hubot
Build your own bots based on GitHub's Hubot+