# ChatBot using Markov Chain: how to understand the intent of user?

Hi,

I have found a link for creating a bot using a Markov chain: Markov Chain Code

They have provided a "run" option also.

My question about the program is that the program is showing the random unique responses? Am I right? The random responses look like meaningful English, and it seems they have provided different variations of the same sentences.

So the program only generates sentences randomly.

But to generate the random sentences we have to first identify the intent of the user, whether the user is asking about the price of item, whether the user is asking about the expiry date of item, whether the user is asking about the quantity of items in the stock (i.e. my objective is to create a bot for helping in buying online grocery items).

Zulfi.

##### Share on other sites
17 minutes ago, zak100 said:

My question about the program is that the program is showing the random unique responses?

You are correct. Your link www.codingame.com explains it in the first sentence:

Quote

In this article, I will present a simple algorithm, based on Markov Chains, to generate random sentences.

17 minutes ago, zak100 said:

But to generate the random sentences we have to first identify the intent of the user, whether the user is asking about the price of item,

I may take a look into some of your other threads on this topic to understand the context of your question better.

##### Share on other sites
9 hours ago, zak100 said:

we have to first identify the intent of the user, whether the user is asking about the price of item, whether the user is asking about the expiry date of item, whether the user is asking about the quantity of items in the stock (i.e. my objective is to create a bot for helping in buying online grocery items).

I do not see the connection between the title's Markov Chain and intent extraction in the context of a chatbot. Can you provide some details?

##### Share on other sites

Hi,

<I do not see the connection between the title's Markov Chain and intent extraction in the context of a chatbot>

Sorry, I mean that the intent is something which is extracted from the user's message.

But I think the code provided in the link only handles generating a response to the user. To provide a response, we need to know what the user is asking in their message. Again, I am calling this the intent of the user. We would provide the intent of the user to the Markov chain, and then the Markov chain will generate the response.

But the question is that how will we find the intent of user?

Zulfi.

##### Share on other sites
3 hours ago, zak100 said:

But the question is that how will we find the intent of user?

That is a large, complicated and very interesting question. Answering it would require many pages, so I'll provide something short that allows for a discussion.

There is indeed at least an indirect connection between Markov models and intents that I did not think of initially: the Hidden Markov Model (HMM). Is that what you are looking for? An HMM may be used for part-of-speech tagging and/or named-entity recognition, and the results can then be used to find the user intent.

Personally I would probably look into models based on neural network but in this case that is not an option:

On 11/12/2020 at 5:49 AM, zak100 said:

I did a course of NN sometime but since then I hate them

##### Share on other sites

The Markov model linked isn't trying to extract intent. I think it's predicting the next word in a sequence given an input word, based on a state space that contains transition probabilities for word pairs. It then uses that output as the input to predict the next word, until a termination marker follows an input word.
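A minimal sketch of that kind of word-level Markov chain, to make the mechanism concrete. It is not the code from the linked article; the corpus, the `END` marker and the function names are assumptions made up for illustration.

```python
import random
from collections import defaultdict

END = None  # hypothetical end-of-sentence marker

def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:] + [END]):
            chain[current].append(following)
    return chain

def generate(chain, start, rng=random, max_words=20):
    """Walk the chain from `start` until the end marker or max_words."""
    words = [start]
    while len(words) < max_words:
        candidates = chain.get(words[-1])
        if not candidates:
            break
        nxt = rng.choice(candidates)  # frequent followers are picked more often
        if nxt is END:
            break
        words.append(nxt)
    return " ".join(words)

# Hypothetical toy corpus; the linked example trains on a much larger text.
corpus = [
    "the price of apples is low",
    "the price of bananas is high",
    "the stock of apples is low",
]
chain = build_chain(corpus)
sentence = generate(chain, "the", random.Random(0))
print(sentence)
```

Every generated sentence is stitched together from word pairs seen in the corpus, which is why the output reads like English but does not respond to what the user actually asked.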

Intent is how humans parse such information, there's no reason an algorithm needs to parse words in the same way (and still give intelligible results).

If you want state-of-the-art natural language processing, check out GPT-3. It's a neural network based on an autoregressive model (so that it can take into account a number of previous words rather than just one), but I believe it still just predicts the next word sequentially.

##### Share on other sites

Hi,

Good information.

It has provided some information about how to use it with Python. But as far as I understand it is not possible to do so right now.

If anybody knows the trick of using GPT-3 with python please guide me.

Zulfi.

Hi,

I got the code related to Markov at:

and it worked, but the results are not impressive for solving specific problems:

The output is given below:

>What is the date today
count had sent him to Petersburg.
>How are you
the approach of the count had sent him to Petersburg.
the bearers,
>where you live
the cold listless gaze fixed itself upon nothing.
>quit
the young man he caught a momentary glimpse between their heads and backs of his gray,

==end bot program using Markov Chain

Can we make it better?

<
Personally I would probably look into models based on neural network but in this case that is not an option: >

Please tell me about NN solutions. I would look into that. I think it's better to use an NN than hard-coded data.

Zulfi.

##### Share on other sites
9 hours ago, zak100 said:

>What is the date today
count had sent him to Petersburg.

Is this a joke?

Anyway:

9 hours ago, zak100 said:

Can we make it better?

Yes. What have you tried so far? How did you extract the intent of the user (I guess you did not yet)? What data was used to train the bot to provide answers?

What did you use to capture the dynamic aspects of the response? The response to "What is the date today" is dynamic and handling that has similarities to a customer asking about the price of an item (as you stated in the opening post about groceries).

Edited by Ghideon
clarification and added the dynamic aspects

##### Share on other sites

Hi,

<Is this a joke?  >

I was reading about GPT-3 (thanks to our friend Prometheus). Even though it's generative, the author was saying that such jokes will occur. I have applied for its key but have not received any email yet. Have you tried GPT-3?

<What have you tried so far?> I have tried this Markovbot, but now I am trying to use the randomization technique discussed in my other post. Along with this, I want to work on NNs if I get some advice from you.

<How did you extract the intent of the user (I guess you did not yet)? >

I don't think it's possible with AI techniques; I have to use programming.

<What data was used to train the bot to provide answers? >

I used a paragraph from the book War and Peace, as discussed in the Stack Exchange post.

<What did you use to capture the dynamic aspects of the response? >

Thanks for increasing my knowledge. I did not know about this. Guide me how to handle such messages.

I am thinking to write some program.

If you have some ideas please let me know.

I am thinking of changing from grocery. It's hot right now, but it requires a lot of information. I am thinking of focusing on a student advisor instead. It requires faculty contacts, but these would be fewer than the items stored in a grocery store.

Zulfi.

##### Share on other sites

It seems my response in your other chatbot thread (intents classification) led you to only half the correct solution.  In the other thread you had a link to a data set which provided you with both input parameters and a series of responses that fit them.  I suggested a markov chain as a much simpler way to map those input phrases to output phrases than an ANN but you will still have to train it on that dataset (and likely format said data in a way the chain can learn to hop from the correct state to the next.)

EDIT

If you’re actually looking to properly model intent (I assumed you were looking for homework help), then that is a topic of ongoing research. GPT-3 is little more than a statistical model that, although a lot more complex, is similar to a Markov chain in that it maps words to the next word based on probabilities. It’s just that GPT has 3 billion parameters, while people tend to use Markov chains with, like, 3 parameters. GPT does not understand intent any more than a Markov chain does.

Edited by PoetheProgrammer

##### Share on other sites

Hi,

My friend, you are right, because the author showed some output which was a joke. I discussed this with Ghideon also. So we have to improve the response received from the Markov chain.

You are saying that GPT-3 also does the same thing and Ghideon is saying that we can improve its response. So let's try to improve the response from the Markov instead of focussing on NN.

<but you will still have to train it on that dataset>

You mean that instead of using the war&peace stuff, I have to focus on my domain related messages?

God bless you.

I would develop some messages and then ask you people about its chaining.

Zulfi.

##### Share on other sites

What's the purpose of the project? Are you focused more on learning how programming a chatbot works, or do you just want a working product (or something else)?

4 hours ago, PoetheProgrammer said:

It’s just GPT has 3 billion parameters

The full model has 175 billion parameters. Crazy stuff. I heard a rumour that GPT-4 will have 20 trillion parameters. How much do you think just making bigger models and feeding them more data will improve outcomes?

##### Share on other sites
16 hours ago, zak100 said:

Have you tried GPT-3?

Not yet*.

16 hours ago, zak100 said:

I don't think it's possible with AI techniques; I have to use programming.

That does not seem to match my experience from the area. I have made proof-of-concept chatbots based on machine learning** from two large commercial vendors. I did not explicitly program the parsing of the user input or the extraction of intents and entities.

16 hours ago, PoetheProgrammer said:

I suggested a markov chain as a much simpler way to map those input phrases to output phrases than an ANN but you will still have to train it on that dataset (and likely format said data in a way the chain can learn to hop from the correct state to the next.)

I have a followup question on that. In your opinion, is that a viable approach in a more general application than zak100's example? Assume we have a working input parsing, tokenising, stemming, intent and entity extraction etc in place. To perform back end calls (bot actions) and generate output we need to track the current state or context (I've seen different words used). Example: we have the following (sketchy) dialogue: (chatbot output in italics)

"How much does 5 bananas cost?"
"5 bananas costs 4€"
"Do you have apples in stock?"
"Yes we have apples in stock"
"What is the price of them?"
"One apple costs 0.35€"

The reasonable answer would be to present the price of apples, not the price of the bananas in the shopping basket. Would a Markov chain be useful to handle the user switching the context in the example above? Reason for asking: I have tried things related to this in more high-level frameworks where the underlying mechanism was not exposed. Your response to zak made me interested in possible implementations.
(Note that the above is just a quick example; it could be reasonable for the bot to answer "I want to buy them" with "I did not understand that, please try something else" or "sorry we do not have 'them' in stock".)

16 hours ago, zak100 said:

Thanks for increasing my knowledge. I did not know about this. Guide me how to handle such messages.

See the example above. The bot needs access to price and stock, probably via some calls (sometimes called bot actions) to a backend system, service API or other.
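A minimal sketch of such bot actions, assuming intent and entity extraction has already happened upstream. The intent names, the `DialogueContext` class, the stock counts and the in-memory dictionaries are hypothetical stand-ins for a real backend; the prices follow the example dialogue.

```python
# Hypothetical in-memory "backend"; a real bot would call a service API.
PRICES = {"banana": 0.80, "apple": 0.35}
STOCK = {"banana": 12, "apple": 40}
PRONOUNS = {"them", "it", "those"}
FALLBACK = "I did not understand that, please try something else"

class DialogueContext:
    """Remembers the item mentioned most recently."""

    def __init__(self):
        self.last_item = None

    def resolve(self, item):
        # A pronoun refers back to the last item; anything else replaces it.
        if item in PRONOUNS:
            return self.last_item
        self.last_item = item
        return item

def handle(intent, item, context):
    """Dispatch an already-extracted intent and entity to a bot action."""
    item = context.resolve(item)
    if item is None or item not in PRICES:
        return FALLBACK
    if intent == "ask_price":
        return f"One {item} costs {PRICES[item]:.2f}€"
    if intent == "ask_stock":
        answer = "Yes we have" if STOCK[item] > 0 else "Sorry, we do not have"
        return f"{answer} {item}s in stock"
    return FALLBACK

ctx = DialogueContext()
replies = [
    handle("ask_price", "banana", ctx),
    handle("ask_stock", "apple", ctx),
    handle("ask_price", "them", ctx),  # "them" resolves to apple, not banana
]
for reply in replies:
    print(reply)
```

The dynamic part of each answer comes from the lookup, not from the text generator, so the same structure would apply to "What is the date today".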

16 hours ago, zak100 said:

Along with this, I want to work on NNs if I get some advice from you.

Ok, I'll get some links to introduction material... or maybe not:

15 hours ago, zak100 said:

So let's try to improve the response from the Markov instead of focussing on NN.

15 hours ago, zak100 said:

You mean that instead of using the war&peace stuff, I have to focus on my domain related messages?

Yes. The chat bot will only know about the information it has available.
Training a chatbot to generate random sentences from a book will not make the chatbot capable of knowing how to make API calls to an online grocery shop.

*) I have had other requirements requiring other solutions.

**) If machine learning is part of what you call "AI techniques".

Edited by Ghideon

##### Share on other sites

Hi.

Yes machine learning is part of AI techniques.

If you think ML is good, I can switch to ML. I need guidance whether it's ML or Markov. But your experience tells me that ML would be easier for you.

Prometheus:

<What's the purpose of the project? Are you focused more on learning how programming a chatbot works, or do you just want a working product (or something else)? >

The purpose of the project is to become familiar with this area and, if possible, to come up with a chatbot to handle student queries because of the current COVID-19 situation. People prefer online interaction to face-to-face interaction.

Zulfi.

Edited by zak100

##### Share on other sites
8 hours ago, Ghideon said:

I have a followup question on that. In your opinion, is that a viable approach in a more general application than zak100's example? Assume we have a working input parsing, tokenising, stemming, intent and entity extraction etc in place. To perform back end calls (bot actions) and generate output we need to track the current state or context (I've seen different words used). Example: we have the following (sketchy) dialogue: (chatbot output in italics)

"How much does 5 bananas cost?"
"5 bananas costs 4€"
"Do you have apples in stock?"
"Yes we have apples in stock"
"What is the price of them?"
"One apple costs 0.35€"

The reasonable answer would be to present the price of apples, not the price of the bananas in the shopping basket. Would a Markov chain be useful to handle the user switching the context in the example above? Reason for asking: I have tried things related to this in more high-level frameworks where the underlying mechanism was not exposed. Your response to zak made me interested in possible implementations.
(Note that the above is just a quick example; it could be reasonable for the bot to answer "I want to buy them" with "I did not understand that, please try something else" or "sorry we do not have 'them' in stock".)

On its own, probably not. You could for sure use a Markov chain to get “price n apples” from the phrase “price of 5 apples”, and it’d be trivial to allow the same or a related chain to keep track of a state (so as to disregard bananas), but you would then need to delegate the looking up of some price from an inventory system, etc. A chain is just that: a chain of words, and it learns to hop to the most likely word based off the previous N words. By the time you implemented a chain to extract that kind of data it wouldn’t be a Markov chain.

You could use a chain like that to process words and, e.g., extract nouns and things about them (price n apples), but if you want real conversation you’ll need an object hierarchy to process nouns and a system to learn all the things they do; e.g. “price of bad apple” needs to know apples go “bad”, as in rot. To model proper human language such a hierarchy would need to be fairly complex and self-building.

edit: would need to be self building if the intent wasn’t to spend years training by manually writing out all these things.

Edited by PoetheProgrammer

##### Share on other sites

Hi,

For the Markov chain, I have the following question: suppose that, in the context of a college chatbot which is responsible for handling all types of queries, we want to know how the chatbot can detect whether the problem is related to fees. For example, the users may have stated the following types of sentences:

=(1) Is it possible to get some fee concession? My father is jobless

=(3)Can I get a fee waiver? I have lost the job./My father has lost the job.

=(4)I can’t pay the tuition, I was hospitalized due to Covid, please look into my problem.

For (1) to (3), users have used the word "fee", which can be used for detection of the problem, but in (4) the user uses the word "pay", and there can be other variations also, like:

=I don't have any money, and can't get any loan, please guide me with my semester debt?

Please guide me how the bot can handle such situations? This is related to fees, but there could be other situations also.

Zulfi.

##### Share on other sites
On 11/17/2020 at 6:40 AM, PoetheProgrammer said:

A chain is just that a chain of words and it learns to hop to the most likely word based off the previous N words.  By the time you implemented a chain to extract that kind of data it wouldn’t be a markov chain.

I agree. I think my question was not clear; I mean more a chain of dialog contexts* spanning several turns. Assuming parsing user input and calling backend bot logic is taken care of, what is a good way to draw the conclusion that the user is discussing the same topic or has moved on to a new topic? Example: The user has added items to the basket and wishes to pay. The dialogue moves on to handle checkout. In checkout it may be more likely that the user will ask about deliveries than about items in stock, and the bot may learn that to allow for better predictions. I have used similar things in some frameworks, but the implementation was a black box. As Zak seems to be trying to start more from scratch, your opinion on implementations could be interesting. I also note that this topic is huge and the research is ongoing. Prototypes I did last year are probably obsolete by now.

14 hours ago, zak100 said:

Please guide me how the bot can handle such situations? This is related to fees, but there could be other situations also.

Your examples are pretty narrow and this is a huge topic. Maybe you could start by designing more generally what the chatbot will handle? The answer is different if the bot concentrates on specific tasks than if it is capable of general help. For instance, you seem to assign the entity "fee", with possible synonyms "pay" and "debt", to an intent like payment.cant-afford. That implies that you have already assumed that the student only asks about issues with the ability to pay. But the student could use "pay", "fee" and "debt" in other intents. Examples: payment.how-do-I, payment.how-often and many others.
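To illustrate why a single keyword is weak evidence, here is a deliberately naive sketch that scores an utterance against a few example sentences per intent. The training sentences and the scoring rule are assumptions invented for this illustration; the frameworks mentioned in this thread use far more capable classifiers.

```python
import re
from collections import Counter

# Hypothetical training utterances; intent names follow the examples above.
EXAMPLES = {
    "payment.cant-afford": [
        "can i get a fee waiver i have lost the job",
        "i can't pay the tuition",
        "i don't have any money for my semester debt",
    ],
    "payment.how-do-I": [
        "how do i pay the fee",
        "where can i pay my tuition",
    ],
    "payment.how-often": [
        "how often do i pay the fee",
        "is the fee due every semester",
    ],
}

def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count how often each word appears in the examples of each intent."""
    return {
        intent: Counter(w for s in sentences for w in tokens(s))
        for intent, sentences in examples.items()
    }

def classify(text, model):
    """Score each intent by word overlap, normalised by its total word count."""
    words = tokens(text)
    scores = {
        intent: sum(counts[w] for w in words) / sum(counts.values())
        for intent, counts in model.items()
    }
    return max(scores, key=scores.get), scores

model = train(EXAMPLES)
best, scores = classify("How often is the fee due?", model)
print(best, scores)
```

The word "fee" occurs in all three intents, so the surrounding words decide the outcome. That is the reason for designing the set of intents before picking keywords.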

*) AKA topics in some frameworks(?)

Edited by Ghideon
clarifications

##### Share on other sites
8 hours ago, Ghideon said:

I agree. I think my question was not clear; I mean more a chain of dialog contexts* spanning several turns. Assuming parsing user input and calling backend bot logic is taken care of, what is a good way to draw the conclusion that the user is discussing the same topic or has moved on to a new topic? Example: The user has added items to the basket and wishes to pay. The dialogue moves on to handle checkout. In checkout it may be more likely that the user will ask about deliveries than about items in stock, and the bot may learn that to allow for better predictions. I have used similar things in some frameworks, but the implementation was a black box. As Zak seems to be trying to start more from scratch, your opinion on implementations could be interesting. I also note that this topic is huge and the research is ongoing. Prototypes I did last year are probably obsolete by now.

I would say that, if your goal is domain specific, as in the example of sales that we’re rolling with, you would need some hard-coded primitives/nouns that you can push to a “topic stack”. By that I mean that chains are fine for just handling responses, but you will have some logic that isn’t machine learning. Mainly, though, my point is that moving from discussing the items to checkout doesn’t actually change the topic; it adds a new, derived topic onto the “stack”, which in reality would be a higher-level chain than the Markov chain.

So perhaps you have a high-level chain that learns that users hop from discussing the item to the checkout, and as it sees you moving topics this high-level model moves the “chatbot” to a chain trained on topics regarding checkout of items (which would still have to delegate to some logic that, e.g., checks the weight and/or the freshness of the item and can then give proper answers to questions one would have at checkout).

As you say, this stuff is ongoing research, but you’ll definitely need to stack machine learning techniques alongside old-school search to both keep track of the “topic stack” and correctly answer it. That’s how I’d go about solving the OP’s problem, but from what I’m gathering from his post this is not a weekend project.
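A minimal sketch of such a high-level chain over topics rather than words, assuming each dialogue turn has already been labelled with a topic. The topic labels and the logged sessions are hypothetical.

```python
from collections import Counter, defaultdict

def learn_transitions(sessions):
    """Estimate P(next topic | current topic) from logged topic sequences."""
    counts = defaultdict(Counter)
    for topics in sessions:
        for current, following in zip(topics, topics[1:]):
            counts[current][following] += 1
    return {
        topic: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for topic, c in counts.items()
    }

def most_likely_next(transitions, topic):
    """Return the most probable next topic, or None if the topic is unseen."""
    options = transitions.get(topic)
    return max(options, key=options.get) if options else None

# Hypothetical logged sessions, one topic label per dialogue turn.
sessions = [
    ["items", "items", "checkout", "delivery"],
    ["items", "checkout", "delivery", "delivery"],
    ["items", "checkout", "payment"],
]
transitions = learn_transitions(sessions)
print(most_likely_next(transitions, "checkout"))
```

The predicted topic could then be used to pick which lower-level response chain or bot logic handles the next turn. Keeping the “topic stack” itself would need extra state that this sketch does not model.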

Edited by PoetheProgrammer
