The original text is reprinted from "Liu Yue's technology blog": https://v3u.cn/a_id_178
The concept of a chatbot is nothing new to us. Perhaps you have killed time flirting with Siri or chatted with Xiao AI in a spare moment. Either way, we have to admit that artificial intelligence has worked its way into our lives. There are plenty of bots on the market that expose third-party APIs: Microsoft Xiaobing, Turing Robot, Tencent Chat, Qingyunke and so on; whenever we want, we can hook them into an app or a web application. But how are these services implemented underneath? Could a robot, as in the American TV series Westworld, talk with humans relying only on the locally stored "mind ball", with no internet access at all? If you are not content with merely being an "API caller", follow along on this journey: this time we will use the deep learning libraries Keras/TensorFlow to build our own local chatbot that depends on no third-party interface and no network.
First install the related dependencies:
pip3 install tensorflow
pip3 install keras
pip3 install nltk
Then create a test script test_bot.py and import the required libraries:
import nltk
import ssl
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD
import pandas as pd
import pickle
import random
There is a pitfall here: nltk will report an error:
Resource punkt not found
Normally, a single line of downloader code would fix it:
import nltk
nltk.download('punkt')
However, because of network restrictions, downloading it through Python's built-in downloader is painfully slow, so we take a workaround and fetch the archive manually:
https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/tokenizers/punkt.zip
After decompression, put it under your user directory:
C:\Users\liuyue\nltk_data\tokenizers\punkt
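If nltk still cannot find the data, you can point it at the directory explicitly. A minimal sketch, assuming punkt.zip was unpacked so that the directory contains a tokenizers/punkt subfolder (adjust the path to wherever you actually put it):
import nltk

# tell nltk where the manually unpacked data lives;
# the appended directory must contain tokenizers/punkt
nltk.data.path.append(r"C:\Users\liuyue\nltk_data")

# should now tokenize without raising "Resource punkt not found"
print(nltk.word_tokenize("hello, is anyone there?"))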
OK, down to business. The main challenge in building a chatbot is to classify the user's input and recognize the human's real intent (this could also be solved with classical machine learning, but that is more involved; I am lazy, so I use deep learning with Keras). The second challenge is maintaining context, that is, analyzing and tracking the conversation. Usually we do not need to classify the user's intent any further; we simply treat the user's input as the answer to the chatbot's question. Here we use the Keras deep learning library to build the classification model.
The chatbot's intents and the patterns it needs to learn are defined in a simple variable; there is no need for a huge corpus. We all know that a bot builder without a corpus in hand gets laughed at, but our goal is to build a chatbot for one specific context. So the classification model, created around a small vocabulary, will only be able to recognize the small set of patterns it is trained on.
To put it bluntly, so-called machine learning means repeatedly teaching the machine to do one or a few things correctly: during training you demonstrate the correct behaviour again and again, and then expect the machine to generalize from those examples. Only this time we teach it just one thing and test its reaction. Isn't that a bit like training your pet dog at home? Except that a dog can't talk back to you.
Here is a simple example of the intents data variable. If you like, you can back it with a database and extend it indefinitely:
intents = {"intents": [
{"tag": "say hello",
"Patterns": ["hello", "hello", "excuse me", "anyone else", "Shifu", "excuse me", "beauty", "handsome boy", "pretty girl", "Hi"],
"Responses": ["hello", "it's you again", "have you eaten", "do you have anything to do"],
"context": [""]
},
{"tag": "farewell",
"Patterns": ["goodbye", "goodbye", "88", "see you later"],
"Responses": ["goodbye", "Bon Voyage", "see you next time", "goodbye to you"],
"context": [""]
},
]
}
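As mentioned above, the intents do not have to be hard-coded; a database, or even a plain external file, can feed them in. A minimal sketch, assuming a hypothetical intents.json file holding the same {"intents": [...]} structure:
import json

def load_intents(path="intents.json"):
    # read the same {"intents": [...]} structure shown above from disk
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# intents = load_intents()  # swap in for the inline dict once the file exists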
As you can see, I defined two context tags, "say hello" and "farewell", each containing user-input patterns and the machine's response data.
Before training the classification model we need to build a vocabulary. The patterns are processed first and the vocabulary is built from them; each word is reduced to a stem, a common root that helps match more combinations of user input.
# containers for the vocabulary, the intent classes and the (pattern, tag) pairs
words = []
classes = []
documents = []
ignore_words = ['?']

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern)
        # add to our words list
        words.extend(w)
        # add to documents in our corpus
        documents.append((w, intent['tag']))
        # add to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# stem, lowercase and deduplicate the vocabulary
words = [stemmer.stem(w.lower()) for w in words if w not in ignore_words]
words = sorted(list(set(words)))
classes = sorted(list(set(classes)))
print(len(classes), "context", classes)
print(len(words), "words", words)
Output:
2 context ['farewell', 'say hello']
14 words ['88 ','excuse me', 'hello', 'goodbye', 'see you later', 'see you later', 'see you later', 'handsome boy', 'master', 'hello', 'goodbye', 'is there anyone', 'beauty', 'excuse me','pretty girl ']
Training does not operate on the vocabulary itself, because words mean nothing to the machine; this is also a common misunderstanding about Chinese word-segmentation libraries. The machine really does not care whether you typed English or Chinese. We only need to convert each word or Chinese token into a bag-of-words array of 0s and 1s. The array's length equals the size of the vocabulary, and a position is set to 1 when the corresponding word appears in the current pattern.
# create our training data
training = []
# create an empty array for our output
output_empty = [0] * len(classes)
# training set, bag of words for each sentence
for doc in documents:
    # initialize our bag of words
    bag = []
    pattern_words = doc[0]
    pattern_words = [stemmer.stem(word.lower()) for word in pattern_words]
    for w in words:
        bag.append(1) if w in pattern_words else bag.append(0)
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1
    training.append([bag, output_row])

random.shuffle(training)
training = np.array(training)
train_x = list(training[:,0])
train_y = list(training[:,1])
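To make the 0/1 encoding concrete, you can peek at one row of the freshly built training set; a quick sanity check that is not part of the original script:
# each row pairs a bag-of-words vector with a one-hot intent label
print("sample bag:  ", train_x[0])
print("sample label:", train_y[0])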
Now we train. The model is built with Keras and consists of three layers. Because the dataset is small, the classification output is a multi-class array, which helps identify the encoded intent. Softmax activation produces the multi-class output (the result is a 0/1 array such as [1, 0, 0, ..., 0] that identifies the encoded intent).
model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.fit(np.array(train_x), np.array(train_y), epochs=200, batch_size=5, verbose=1)
Training runs for 200 epochs with a batch size of 5. Because my test sample is small, 100 epochs would also be enough; that is not the point here.
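If you would rather not hard-code the epoch count at all, Keras can stop once the loss stops improving; an optional sketch (not used in this article) that replaces the fit call above:
from keras.callbacks import EarlyStopping

# stop once the training loss has not improved for 20 consecutive epochs
early_stop = EarlyStopping(monitor='loss', patience=20, restore_best_weights=True)
model.fit(np.array(train_x), np.array(train_y),
          epochs=200, batch_size=5, verbose=1, callbacks=[early_stop])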
Start training:
14/14 [==============================] - 0s 32ms/step - loss: 0.7305 - acc: 0.5000
Epoch 2/200
14/14 [==============================] - 0s 391us/step - loss: 0.7458 - acc: 0.4286
Epoch 3/200
14/14 [==============================] - 0s 390us/step - loss: 0.7086 - acc: 0.3571
Epoch 4/200
14/14 [==============================] - 0s 395us/step - loss: 0.6941 - acc: 0.6429
Epoch 5/200
14/14 [==============================] - 0s 426us/step - loss: 0.6358 - acc: 0.7143
Epoch 6/200
14/14 [==============================] - 0s 356us/step - loss: 0.6287 - acc: 0.5714
Epoch 7/200
14/14 [==============================] - 0s 366us/step - loss: 0.6457 - acc: 0.6429
Epoch 8/200
14/14 [==============================] - 0s 899us/step - loss: 0.6336 - acc: 0.6429
Epoch 9/200
14/14 [==============================] - 0s 464us/step - loss: 0.5815 - acc: 0.6429
Epoch 10/200
14/14 [==============================] - 0s 408us/step - loss: 0.5895 - acc: 0.6429
Epoch 11/200
14/14 [==============================] - 0s 548us/step - loss: 0.6050 - acc: 0.6429
Epoch 12/200
14/14 [==============================] - 0s 468us/step - loss: 0.6254 - acc: 0.6429
Epoch 13/200
14/14 [==============================] - 0s 388us/step - loss: 0.4990 - acc: 0.7857
Epoch 14/200
14/14 [==============================] - 0s 392us/step - loss: 0.5880 - acc: 0.7143
Epoch 15/200
14/14 [==============================] - 0s 370us/step - loss: 0.5118 - acc: 0.8571
Epoch 16/200
14/14 [==============================] - 0s 457us/step - loss: 0.5579 - acc: 0.7143
Epoch 17/200
14/14 [==============================] - 0s 432us/step - loss: 0.4535 - acc: 0.7857
Epoch 18/200
14/14 [==============================] - 0s 357us/step - loss: 0.4367 - acc: 0.7857
Epoch 19/200
14/14 [==============================] - 0s 384us/step - loss: 0.4751 - acc: 0.7857
Epoch 20/200
14/14 [==============================] - 0s 346us/step - loss: 0.4404 - acc: 0.9286
Epoch 21/200
14/14 [==============================] - 0s 500us/step - loss: 0.4325 - acc: 0.8571
Epoch 22/200
14/14 [==============================] - 0s 400us/step - loss: 0.4104 - acc: 0.9286
Epoch 23/200
14/14 [==============================] - 0s 738us/step - loss: 0.4296 - acc: 0.7857
Epoch 24/200
14/14 [==============================] - 0s 387us/step - loss: 0.3706 - acc: 0.9286
Epoch 25/200
14/14 [==============================] - 0s 430us/step - loss: 0.4213 - acc: 0.8571
Epoch 26/200
14/14 [==============================] - 0s 351us/step - loss: 0.2867 - acc: 1.0000
Epoch 27/200
14/14 [==============================] - 0s 3ms/step - loss: 0.2903 - acc: 1.0000
Epoch 28/200
14/14 [==============================] - 0s 366us/step - loss: 0.3010 - acc: 0.9286
Epoch 29/200
14/14 [==============================] - 0s 404us/step - loss: 0.2466 - acc: 0.9286
Epoch 30/200
14/14 [==============================] - 0s 428us/step - loss: 0.3035 - acc: 0.7857
Epoch 31/200
14/14 [==============================] - 0s 407us/step - loss: 0.2075 - acc: 1.0000
Epoch 32/200
14/14 [==============================] - 0s 457us/step - loss: 0.2167 - acc: 0.9286
Epoch 33/200
14/14 [==============================] - 0s 613us/step - loss: 0.1266 - acc: 1.0000
Epoch 34/200
14/14 [==============================] - 0s 534us/step - loss: 0.2906 - acc: 0.9286
Epoch 35/200
14/14 [==============================] - 0s 463us/step - loss: 0.2560 - acc: 0.9286
Epoch 36/200
14/14 [==============================] - 0s 500us/step - loss: 0.1686 - acc: 1.0000
Epoch 37/200
14/14 [==============================] - 0s 387us/step - loss: 0.0922 - acc: 1.0000
Epoch 38/200
14/14 [==============================] - 0s 430us/step - loss: 0.1620 - acc: 1.0000
Epoch 39/200
14/14 [==============================] - 0s 371us/step - loss: 0.1104 - acc: 1.0000
Epoch 40/200
14/14 [==============================] - 0s 488us/step - loss: 0.1330 - acc: 1.0000
Epoch 41/200
14/14 [==============================] - 0s 381us/step - loss: 0.1322 - acc: 1.0000
Epoch 42/200
14/14 [==============================] - 0s 462us/step - loss: 0.0575 - acc: 1.0000
Epoch 43/200
14/14 [==============================] - 0s 1ms/step - loss: 0.1137 - acc: 1.0000
Epoch 44/200
14/14 [==============================] - 0s 450us/step - loss: 0.0245 - acc: 1.0000
Epoch 45/200
14/14 [==============================] - 0s 470us/step - loss: 0.1824 - acc: 1.0000
Epoch 46/200
14/14 [==============================] - 0s 444us/step - loss: 0.0822 - acc: 1.0000
Epoch 47/200
14/14 [==============================] - 0s 436us/step - loss: 0.0939 - acc: 1.0000
Epoch 48/200
14/14 [==============================] - 0s 396us/step - loss: 0.0288 - acc: 1.0000
Epoch 49/200
14/14 [==============================] - 0s 580us/step - loss: 0.1367 - acc: 0.9286
Epoch 50/200
14/14 [==============================] - 0s 351us/step - loss: 0.0363 - acc: 1.0000
Epoch 51/200
14/14 [==============================] - 0s 379us/step - loss: 0.0272 - acc: 1.0000
Epoch 52/200
14/14 [==============================] - 0s 358us/step - loss: 0.0712 - acc: 1.0000
Epoch 53/200
14/14 [==============================] - 0s 4ms/step - loss: 0.0426 - acc: 1.0000
Epoch 54/200
14/14 [==============================] - 0s 370us/step - loss: 0.0430 - acc: 1.0000
Epoch 55/200
14/14 [==============================] - 0s 368us/step - loss: 0.0292 - acc: 1.0000
Epoch 56/200
14/14 [==============================] - 0s 494us/step - loss: 0.0777 - acc: 1.0000
Epoch 57/200
14/14 [==============================] - 0s 356us/step - loss: 0.0496 - acc: 1.0000
Epoch 58/200
14/14 [==============================] - 0s 427us/step - loss: 0.1485 - acc: 1.0000
Epoch 59/200
14/14 [==============================] - 0s 381us/step - loss: 0.1006 - acc: 1.0000
Epoch 60/200
14/14 [==============================] - 0s 421us/step - loss: 0.0183 - acc: 1.0000
Epoch 61/200
14/14 [==============================] - 0s 344us/step - loss: 0.0788 - acc: 0.9286
Epoch 62/200
14/14 [==============================] - 0s 529us/step - loss: 0.0176 - acc: 1.0000
OK, after 200 epochs the model is trained. Now we declare the helper functions that turn a sentence into a bag-of-words vector:
def clean_up_sentence(sentence):
    # tokenize the pattern - split words into array
    sentence_words = nltk.word_tokenize(sentence)
    # stem each word - create short form for word
    sentence_words = [stemmer.stem(word.lower()) for word in sentence_words]
    return sentence_words

def bow(sentence, words, show_details=True):
    # tokenize the pattern
    sentence_words = clean_up_sentence(sentence)
    # bag of words - matrix of N words, vocabulary matrix
    bag = [0]*len(words)
    for s in sentence_words:
        for i, w in enumerate(words):
            if w == s:
                # assign 1 if current word is in the vocabulary position
                bag[i] = 1
                if show_details:
                    print("found in bag: %s" % w)
    return(np.array(bag))
Test it to see whether it hits the bag of words:
p = bow("hello", words)
print(p)
Return value:
found in bag: hello
[0 0 1 0 0 0 0 0 0 0 0 0 0 0]
Obviously, the match was successful and the word is in the bag.
Before packaging the model, we can use the model.predict function to classify user input and return the intent according to the computed probability (multiple intents can be returned, sorted in descending order of probability):
def classify_local(sentence):
    ERROR_THRESHOLD = 0.25
    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], str(r[1])))
    # return tuple of intent and probability
    return return_list
Test it:
print(classify_local('hello'))
Return value:
found in bag: hello
[('say hello', '0.999913')]
liuyue:mytornado liuyue$
Retest:
print(classify_local('88'))
Return value:
found in bag: 88
[('farewell', '0.9995449')]
Perfect, it matches the farewell context tag. If you like, test a few more inputs to refine the model.
Once testing is done, we can package the trained model so that it does not have to be retrained before every call:
json_file = model.to_json()
with open('v3ucn.json', "w") as file:
    file.write(json_file)

model.save_weights('./v3ucn.h5f')
Here the model is split into a structure file (JSON) and a weights file (h5f); save both for later use.
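Incidentally, if you prefer a single file, Keras can also save the structure and weights together; an alternative sketch (the filename v3ucn_full.h5 is just an example and is not used elsewhere in this article):
# save architecture and weights in one HDF5 file
model.save('v3ucn_full.h5')

# ...and later restore it without redefining the network
from keras.models import load_model
model = load_model('v3ucn_full.h5')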
Next, let's build an API for the chatbot. Here we use the currently popular framework FastAPI: put the model files into the project directory and write main.py:
import random

import pandas as pd
import uvicorn
from fastapi import FastAPI

# note: main.py also needs the bow()/clean_up_sentence() helpers and the
# words, classes and intents variables from the training script above
# (import them or paste them in here)

app = FastAPI()

def classify_local(sentence):
    ERROR_THRESHOLD = 0.25
    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], str(r[1])))
    # return tuple of intent and probability
    return return_list

@app.get('/')
async def root(word: str = None):
    # load the saved model on each request and expose it to classify_local
    global model
    from keras.models import model_from_json
    # load json and create model
    file = open("./v3ucn.json", 'r')
    model_json = file.read()
    file.close()
    model = model_from_json(model_json)
    model.load_weights("./v3ucn.h5f")
    wordlist = classify_local(word)
    a = ""
    for intent in intents['intents']:
        if intent['tag'] == wordlist[0][0]:
            a = random.choice(intent['responses'])
    return {'message': a}

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
Here, the block:
from keras.models import model_from_json
file = open("./v3ucn.json", 'r')
model_json = file.read()
file.close()
model = model_from_json(model_json)
model.load_weights("./v3ucn.h5f")
is what loads the model we just trained. Then start the service:
uvicorn main:app --reload
The effect is this:
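Once the service is running, you can query it from another script; a quick check, assuming the requests library is installed (pip3 install requests):
import requests

# the query parameter "word" carries the user input
resp = requests.get("http://127.0.0.1:8000/", params={"word": "hello"})
print(resp.json())  # something like: {'message': "it's you again"}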
Conclusion: there is no doubt that technology has changed our lives. A chatbot lets us hear birdsong even without a beauty at our side. I believe that in the near future a smiling, gracefully dressed "mechanical lady" will keep me company as well, a companionship as pleasant as a clear breeze and a bright moon.