{"id":447,"date":"2023-04-27T22:10:42","date_gmt":"2023-04-27T22:10:42","guid":{"rendered":"https:\/\/rodamoya.com\/?p=447"},"modified":"2023-05-26T07:46:14","modified_gmt":"2023-05-26T07:46:14","slug":"chatbot","status":"publish","type":"post","link":"https:\/\/rodamoya.com\/index.php\/2023\/04\/27\/chatbot\/","title":{"rendered":"Chatbot:"},"content":{"rendered":"\n<p>Chatbots have surged in popularity with ChatGPT and other AI chatbots. Creating my own chatbot is not easy because of the limited amount of data I have. However, using all my WhatsApp chats, I was able to create a decent chatbot that can even speak other languages, such as Catalan. Take a look at a conversation with my own chatbot.<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/rodamoya.com\/wp-content\/uploads\/2023\/04\/Actividad-_-Enric-Roda-_-LinkedIn-\u2014-Mozilla-Firefox-2023-04-03-15-49-05.mp4\"><\/video><\/figure>\n\n\n\n<p>The quality of a chatbot depends on the quality of the data and the algorithm.
Thanks to the massive amount of data that Google has and its highly skilled professionals, Google AI can hold remarkably deep conversations.<\/p>\n\n\n\n<p>My own chatbot, however, is based solely on a .txt file containing a WhatsApp conversation. Even so, I could see that the algorithm works, and I was able to hold a basic conversation.<\/p>\n\n\n\n<p><br>Next, we are going to see how the algorithm works:<br><br>1) Get all the words of the message sent by the user.<br>2) Read all the lines of the WhatsApp conversation:<br>\u00a0\u00a0\u00a02.1) Count the words that the message shares with each line.<br>\u00a0\u00a0\u00a02.2) Save the lines that share the most words with the message.<br>\u00a0\u00a0\u00a02.3) Get the answer to each saved line.<br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0(Answer = next line of the conversation)<br>\u00a0\u00a0\u00a02.4) Return the most repeated answer.<br><br>Below is a video of a conversation with the chatbot and the link to the GitHub repository, where you can enter the &#8220;.txt&#8221; file of your WhatsApp conversations and &#8220;talk to yourself&#8221;.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Imports\n\nimport pandas as pd\n\n# Read the data:\n\ntxtdata = pd.read_table(\".txt\", on_bad_lines='skip') # --&gt; Enter the .txt file here\n\n# Data preparation:\n\n## Split the txt file into separate columns\n\n# Name of the single column that holds every raw chat line:\ncolumn_name = txtdata.columns&#91;0]\n\n# Split each raw line into separate columns\ntxtdata&#91;\"Date\"] = txtdata&#91;column_name].str.split(\" \").str.get(0).str.title()\n\ntxtdata&#91;\"Hour\"] = txtdata&#91;column_name].str.split(\" \").str.get(1).str.title()\n\ntxtdata&#91;\"Name\"] = txtdata&#91;column_name].str.split(\" \").str.get(2).str.title()\n\ntxtdata&#91;\"Content\"] = txtdata&#91;column_name].str.split(\":\").str.get(3)\n\n\n# Save the columns:\n\n# Columns to keep:\ncolumns_save = &#91;'Date', 'Hour', 'Name', 'Content']\n# New df (copied so that the assignments below do not write to a slice of txtdata):\ndf = txtdata&#91;columns_save].copy()\n\n# Clean the data:\n\n# Delete the \"&#91;\" and \"]\" brackets and the \":\"\n\ndf&#91;\"Date\"] = df&#91;\"Date\"].str.split(\"&#91;\").str.get(1)\n\ndf&#91;\"Hour\"] = df&#91;\"Hour\"].str.split(\"]\").str.get(0)\n\ndf&#91;\"Name\"] = df&#91;\"Name\"].str.split(\":\").str.get(0)\n\n## Delete the NaN values\n\n# Keep only the rows that have a date\n\ndf = df&#91;~df&#91;\"Date\"].isna()]\n\n# Delete the rows whose content is missing or is an omitted image or audio message\n\ndf = df&#91;~df&#91;\"Content\"].str.contains(\"omitted\", na=True)].copy()\n\n# Merge\n\n# Create new columns used to merge the conversations:\ndf&#91;\"Change\"] = \"\"\ndf&#91;\"Count\"] = \"\"\n\n# Last row index that the while loop visits:\nmax_fun = df&#91;\"Name\"].count()\nmax_fun = max_fun - 2\n\n# Loop that detects when the writer changes and counts all the changes\n\n# Writing through df.iloc&#91;row, column] avoids the chained-assignment warning\nchange_col = df.columns.get_loc(\"Change\")\ncount_col = df.columns.get_loc(\"Count\")\ni = 0\ncoun = 0\nwhile i &lt;= max_fun:\n    if df&#91;\"Name\"].iloc&#91;i+1] == df&#91;\"Name\"].iloc&#91;i]:\n        i = i + 1\n        df.iloc&#91;i, change_col] = 1\n        df.iloc&#91;i, count_col] = coun\n    else:\n        i = i + 1\n        df.iloc&#91;i, change_col] = 0\n        coun = coun + 1\n        df.iloc&#91;i, count_col] = coun\n\n# Return the most repeated value in a list\n\ndef most_frequent(List):\n    return max(set(List), key = List.count)\n\n# Group the conversations:\n\n# Keep the first message of each run of consecutive messages by the same writer\ndf_gr = df.groupby(df&#91;\"Count\"]).first()\n\ndf_gr = df_gr&#91;:-1]\n\n# Chat Bot:\n\nwhile True:\n\n    # Read our message and get all its words\n    msg = input(\"Me:\")\n    list_msg = msg.split(\" \")\n\n    # Save the possible answers\n    Posiibles_respostes = &#91;]\n\n    # Read all the messages to detect the ones that are most similar to our msg\n    max_fun = df_gr&#91;\"Name\"].count()\n    max_fun = max_fun - 1\n    i = 0\n    max_count = 0\n    while i &lt; max_fun:\n\n        try:\n            # Get the content\n            content = df_gr&#91;\"Content\"].iloc&#91;i]\n\n            # Counter of the words shared by our msg and the content\n            count_simitud = 0\n\n            # Loop over all the words in our msg\n            for paraula in list_msg:\n\n                # If the word of the msg appears in the content, add 1 to the counter\n                if paraula in content:\n                    count_simitud = count_simitud + 1\n\n            # If the number of shared words is a new maximum, delete all the other saved answers\n            if count_simitud &gt; max_count:\n                Posiibles_respostes = &#91;]\n\n            # If the number of shared words equals or beats the maximum, save the maximum, its position, and the possible answer:\n            if count_simitud &gt;= max_count:\n\n                max_count = count_simitud\n                num_max = i\n\n                resposta_provisional = num_max + 1\n\n                # Add the next message of the conversation to the list of possible answers:\n                resposta = df_gr&#91;\"Content\"].iloc&#91;resposta_provisional]\n                Posiibles_respostes.append(resposta)\n\n        except Exception:\n            # Skip rows that cannot be compared (for example, missing content)\n            pass\n\n        # Add 2 because we want to read only the messages of the same person\n        i = i + 2\n\n    # Get the most frequent answer:\n    resposta_def = most_frequent(Posiibles_respostes)\n\n    resposta_def = resposta_def.replace(\"baby\", \"\")\n\n    # Print the answer of the Bot\n    print(\"Bot:\", resposta_def)\n    print(\"\")\n\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Chatbots have surged in popularity with ChatGPT and other AI chatbots. Creating my own chatbot is not easy because of the limited amount of data I have. However, using all my WhatsApp chats, I was able to create a decent chatbot that can even speak other languages, such as Catalan. Take a &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/rodamoya.com\/index.php\/2023\/04\/27\/chatbot\/\" class=\"more-link\">Read more<span class=\"screen-reader-text\"> &#8220;Chatbot:&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/posts\/447"}],"collection":[{"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/comments?post=447"}],"version-history":[{"count":3,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/posts\/447\/revisions"}],"predecessor-version":[{"id":570,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/posts\/447\/revisions\/570"}],"wp:attachment":[{"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/media?parent=447"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/categories?post=447"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rodamoya.com\/index.php\/wp-json\/wp\/v2\/tags?post=447"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}