Deep Learning Chatbot using Keras and Python – Part 2 (Text/word2vec inputs into LSTM)


Hello everyone, welcome to The Semicolon. In this tutorial we're going to continue building our chatbot. In the last video we pre-processed the data to keep it in the form of questions and answers. Each question was a vector of 15 words, and each of those 15 words was a vector of length 300, and that is what we are going to feed into our LSTM model, so let's get started. I am using all these libraries, Sequential from keras.models, gensim, keras.layers.recurrent and train_test_split, and then from the conversation pickle file which we saved I am loading vec_x and vec_y, and now we're going to convert them to numpy arrays. As you can see, our array is 86 sentences, each of length 15 because each sentence is composed of 15 words, and each word is a vector of length 300. This is our input and this is our output. Now we do a train-test split so we can measure accuracy, though it doesn't matter much because it's just words. Next we implement our LSTM, which is going to have four layers. Each LSTM layer has an output dimension of 300, because 300 is the length of each word vector; the input shape is 15 × 300, because one sentence is 15 words of 300 dimensions each; and return_sequences is True because we need all 15 output vectors of length 300, not just the last one. We initialize it with glorot_normal, the inner initialization is also glorot_normal, and the activation is sigmoid in every case. Then we compile it, and our loss is cosine proximity, because word2vec uses cosine similarity as the measure of similarity, or distance, between two words, so I thought it would give a very good measure of error here. So let's run it, it will take some time, and then I'm going to fit it in steps of 500 epochs. I start the training, but I've already trained and saved the models as well.
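The model described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: it uses the newer Keras argument names (`kernel_initializer`/`recurrent_initializer` and the `cosine_similarity` loss, where the video's older Keras calls them `init`/`inner_init` and `cosine_proximity`), random arrays stand in for the pickled vec_x/vec_y, and the choice of `adam` as optimizer is an assumption.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
from sklearn.model_selection import train_test_split

# Random arrays standing in for the pickled vec_x / vec_y from Part 1:
# 86 sentences, 15 words each, every word a 300-dimensional word2vec vector.
vec_x = np.random.rand(86, 15, 300)
vec_y = np.random.rand(86, 15, 300)
x_train, x_test, y_train, y_test = train_test_split(vec_x, vec_y, test_size=0.2)

model = Sequential()
# Four stacked LSTM layers; return_sequences=True makes every layer emit
# all 15 time steps (15 vectors of length 300), not just the final one.
model.add(LSTM(300, input_shape=(15, 300), return_sequences=True,
               kernel_initializer='glorot_normal',
               recurrent_initializer='glorot_normal',
               activation='sigmoid'))
for _ in range(3):
    model.add(LSTM(300, return_sequences=True,
                   kernel_initializer='glorot_normal',
                   recurrent_initializer='glorot_normal',
                   activation='sigmoid'))

# Cosine similarity mirrors word2vec's notion of closeness between words;
# older Keras versions (as in the video) name this loss 'cosine_proximity'.
# 'adam' is an assumed optimizer, not stated in the video.
model.compile(loss='cosine_similarity', optimizer='adam')

# The video trains repeatedly in steps of 500 epochs and saves the model:
# model.fit(x_train, y_train, epochs=500, validation_data=(x_test, y_test))

pred = model.predict(x_test[:1])
print(pred.shape)  # (1, 15, 300)
```

Because `return_sequences=True` is set on every layer, including the last, the model maps a 15 × 300 sentence to another 15 × 300 sentence, which is why the prediction can be decoded back word by word.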
So I will start the training, and you can watch it and realize it's too slow, which is why I've done it already. As you can see, the training has started and is going on, but I don't want to get into it, because I have already trained it on a faster computer and saved all these models. I have these files, which I will be uploading to my GitHub; you can download them from there, test them on your laptop with the same code, and check the accuracy. For now we will be using our chat program to judge how good our chatbot is. So let me stop the execution right here and go to the chat program. Our chat program is like this: I am importing Keras's load_model to load this model of mine, and this might take some time. Then I am displaying "Enter the message", tokenizing all the words of the message with x.lower() and the tokenizer, and then for every word I am finding its word vector, preparing the input in the same way as we prepared our training data in the last tutorial. So what we have here is a vector of 15 word vectors. After the 15th word I am clipping the sentence; I'm losing some information, but my chatbot is designed to handle 15 words only. So don't use long sentences, because they might confuse my chatbot. Now, if the length is less than 15, I am appending 'sentend', which is a vector of all ones, to the end of the sentence, to get a uniform input. Then I'm predicting the output of the input with model.predict, and using word2vec's most_similar function I am finding the most similar word for each output vector.
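The chat program's input preparation and output decoding described above can be sketched in pure numpy. This is a toy stand-in, not the real chat.py: a small random lookup table replaces the GoogleNews word2vec model, a plain `split()` replaces nltk's tokenizer, and `nearest_word` is a brute-force version of what gensim's `most_similar` does per word.

```python
import numpy as np

SENT_LEN, VEC_LEN = 15, 300
sentend = np.ones((VEC_LEN,), dtype=np.float64)  # all-ones padding marker

def encode(text, lookup):
    """Lowercase and tokenize, map words to vectors, clip to 15 words,
    and pad short sentences with the 'sentend' vector."""
    words = text.lower().split()                  # stand-in for nltk tokenizing
    vecs = [lookup[w] for w in words if w in lookup][:SENT_LEN]
    vecs += [sentend] * (SENT_LEN - len(vecs))
    return np.array([vecs])                       # (1, 15, 300) for model.predict

def nearest_word(vec, lookup):
    """Brute-force most_similar: the vocabulary word whose vector has the
    highest cosine similarity to vec."""
    sims = {w: np.dot(vec, v) / (np.linalg.norm(vec) * np.linalg.norm(v))
            for w, v in lookup.items()}
    return max(sims, key=sims.get)

# Toy vocabulary with random vectors instead of the 300-d GoogleNews model.
rng = np.random.RandomState(0)
lookup = {w: rng.rand(VEC_LEN) for w in ['hi', 'how', 'are', 'you', 'doing']}

batch = encode('Hi how are you', lookup)
print(batch.shape)                            # (1, 15, 300)
print(nearest_word(lookup['doing'], lookup))  # 'doing' maps back to itself
```

In the real chat loop, `batch` would go through `model.predict`, and `nearest_word` would be applied to each of the 15 predicted vectors to print the reply sentence.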
I am finding the most similar word for each word vector in the sentence and then printing the output sentence, so let's check how our model works out. I'm going to run this, and it's asking "Enter the message"; I would like to expand this so that you can see. So I write "hi" here, and okay, I haven't loaded the model yet, so this might take some more time; let's wait for it and then run this part once more. That got executed, and it's again asking "Enter the message", so I type "hi". Let's wait for it to reply; it might take one or two seconds, but it should be done pretty quickly because it's just prediction. The output is "how are you doing". So this is a chatbot trained on just the sentences I showed you before, and these extra tokens are padding words, because we appended the all-ones 'sentend' vector to the end of the sentence. So let's write "how are you". It said "i am doing a so can", which doesn't make much sense, but it's pretty good considering the chatbot was trained on only 86 sentences, so this is decent accuracy, and these are again the padding words. There are some other things we can try, like "how are you doing, I am doing fine"; let's see what it gives: "know m also good". With more training and more iterations I think we can always improve the accuracy. And this is it, this is our chatbot, which works and gives, I'd say, reasonable output. So this is it for our chatbot, guys. Hit the like button if you liked it, hit the subscribe button if you want to keep watching more of these, and share it with your friends. Thank you.

32 thoughts on “Deep Learning Chatbot using Keras and Python – Part 2 (Text/word2vec inputs into LSTM)”

  1. Thanks Semicolon. I struggled with the first part. How can I download the JSON file from GitHub? Or is there a way to load the file directly from there? And what is word2vec.bin? Why do you append sentend in line 28? Why not only append it if sentvec < 15? Thanks

  2. Hi, it's a nice video, but when training my model with LSTM it gives me: Input 0 is incompatible with layer lstm: expected ndim=3, found ndim=2

  3. Hi, can you explain the reasoning behind these 4 layers? Can we use more or fewer layers? How can we improve the accuracy?

  4. Hi… I had executed the model; however, while executing the file "chat.py" it is throwing an error:

    "File "C:\Users\XYZ\Miniconda3\lib\site-packages\keras\optimizers.py", line 79, in set_weights
    'provided weight shape ' + str(w.shape))
    ValueError: Optimizer weight shape (1200,) not compatible with provided weight shape (300, 300)"

    Please check this and let me know if you have any idea about this error.

  5. When I run chat.py with the downloaded model, it shows the error below:
    File "/Users/yuvikakoul/anaconda/envs/tensorflow/lib/python2.7/site-packages/keras/optimizers.py", line 79, in set_weights
    'provided weight shape ' + str(w.shape))
    ValueError: Optimizer weight shape (1200,) not compatible with provided weight shape (300, 300)

  6. Is anyone getting the right output as per the video?
    Can anyone suggest blogs or videos similar to this one?

  7. vec_x=np.array(vec_x,dtype=np.float64)
    ValueError: setting an array element with a sequence.

    I use tensorflow and Python 3.5.2

  8. Hi, thanks for the tutorial, it really helped. I just wanted to check my understanding. In the line where we add the LSTM layers, we are adding 4 LSTM layers one after the other sequentially. Each of them has an input shape of (15 x 300). Does that mean that all 15 words of the sentence (each encoded by word2vec, so the size is 300, I assume) are input in parallel? If they are input in parallel, does that also mean that each LSTM layer has 15 LSTM cells? Also, output_dim is 300, which corresponds to just one word vector, right? Then how come the output is a sentence and not just one word? I am a little unclear about the input and output part. Also, is there any specific reason why you chose 4 layers of LSTM, or is it just an experimental value?

  9. Getting ValueError: Error when checking target: expected lstm_4 to have 3 dimensions, but got array with shape (68, 1), while trying to run chatbotlstmtrain.py

  10. Thanks for the tutorials. Nice source material to build my own chatbot from.

    At present, my output is really bland. Training on your conversation text, I go from: "could I borrow a cup of sugar" to: "also also self self cup unk unk unk unk unk unk unk".
    Is there a minimum number of epochs you need before the model starts to return anything sensible? Or should I expect at least a couple of good words right off the bat? Thanks.

  11. Hi @The semicolon
    I want to create a chatbot that tells you the weather information when asked.
    So how can i identify the intent as "weather information" from the user input ?

  12. Hi, the part at 1.40 where you gave reason for the first arg of LSTM() 'output_dim' being 300 because that is the vector length for the embedding is not correct AFAIK.

  13. vec_x, vec_y = pickle.load(f)
    TypeError: 'NumpyArrayWrapper' object is not iterable
    Can you help me with this error?

  14. Hi, can I know how to train the chatbot? I followed your video and built one, and I wish it could answer questions nicely.

  15. Hello, Thanks a lot for the entire Data Analytics and Keras tutorial !
    Can you please tell, where can I find word2Vec.bin file..?

  16. Why have you used the sigmoid activation function? It has the disadvantage of vanishing gradients; why don't you use ReLU? And why did you choose glorot_normal as the weight initializer, when various other weight initializers are available?

  17. Hi, really good tutorial.
    Can we make an n:1 model for predicting the answer? How can I convert the Keras model to implement this?

Leave a Reply

Your email address will not be published. Required fields are marked *