Meena: Introduction and Implementation Guide

Updated: Feb 13

Introduction

Google recently introduced Meena, a chatbot that has attracted a lot of attention for its design and conversational quality. Meena is meant to hold conversations that feel natural, closer to how people talk to each other.

It is designed for open-ended conversation rather than a fixed set of tasks. Its architecture is a Transformer-based seq2seq model (the Evolved Transformer, which is also the model type configured later in this guide), and it was trained on a massive amount of text from social media conversations.

Key Features of Meena

One of the key aspects of Meena is its ability to understand context and respond accordingly. Unlike many other chatbots, it is not limited to predefined rules or keywords.

Implementation

Setting the Stage: Installing Necessary Packages

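Imagine preparing the stage for a grand show. Before we dive into the exciting world of creating a chatbot, we need to make sure we have all the right tools. In this case, we're using TensorFlow together with the Tensor2Tensor library to build our chatbot, so we start by installing the necessary packages. TensorFlow is pinned to version 1.15.2 because the code below uses the TensorFlow 1.x API. The lines that start with ! are shell commands, so run this code in a notebook environment such as Jupyter or Google Colab.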

!pip install -q -U tensorflow-gpu==1.15.2
!pip install -q -U tensorflow-datasets==3.2.1
!pip install -q -U tensor2tensor
import tensorflow as tf
from tensor2tensor import models
from tensor2tensor import problems
from tensor2tensor.utils import hparams_lib
from tensor2tensor.utils import registry
from tensor2tensor.data_generators import text_problems
import numpy as np
import re
import os
tf.get_logger().propagate = False

Downloading a Pretrained Model

Now that we've got our tools ready, it's time to bring in the star of our show - our chatbot, Meena. But before we do that, we need to make sure it's dressed to impress. That means downloading a pre-trained model that will serve as the backbone of our chatbot's intelligence. The archive used here is named Italian_108M; the commands below download it from Google Drive and unzip it into a directory of the same name.

model_name = "Italian_108M"
!gdown  https://drive.google.com/uc?id=1y0abt3nOKPo5DBfKx3b7A7pjm3GH3wi1
!unzip {model_name}.zip
MODEL_DIR = model_name + '/'

Setting the Parameters

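Next, we set the model architecture and the vocabulary size: MODEL = "evolved_transformer" selects the Evolved Transformer, and VOCAB_SIZE = 2**13 asks for a vocabulary of about 8,192 entries. The remaining values control how replies are produced: CONVERSATION_TURNS is the number of most recent turns kept as context, SAMPLING_TEMPERATURE controls how random the sampling is, NUM_SAMPLES is the number of candidate replies generated per turn, and MAX_LCS_RATIO is the overlap threshold at or above which a candidate is rejected for copying the conversation.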

MODEL = "evolved_transformer"
VOCAB_SIZE = 2**13
# sampling parameters
CONVERSATION_TURNS = 3
SAMPLING_TEMPERATURE = 0.88
NUM_SAMPLES = 5
MAX_LCS_RATIO = 0.9
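
To make the role of SAMPLING_TEMPERATURE more concrete, here is a minimal sketch of temperature scaling in plain Python. It uses the textbook definition (divide the logits by the temperature before applying softmax). The helper name softmax_with_temperature and the example logits are made up for illustration, and how Tensor2Tensor applies the value internally is not shown in this guide.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (1.0, 0.88, 0.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

With a temperature below 1, such as 0.88, the most likely token becomes somewhat more likely than at temperature 1, so replies are a little less random while still varying between samples.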

Enabling Eager Execution

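Next, we turn on TensorFlow eager execution, which evaluates operations immediately and makes the code easier to debug and interact with. The line tfe = tf.contrib.eager is only a shorthand for the module; the call tfe.enable_eager_execution() is what actually enables eager mode. Modes is a similar shorthand for tf.estimator.ModeKeys.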

tfe = tf.contrib.eager
tfe.enable_eager_execution()
Modes = tf.estimator.ModeKeys

Defining the Problem

The ChatBot class inherits from text_problems.Text2TextProblem, which tells Tensor2Tensor that our chatbot's problem is to transform one text into another. The @registry.register_problem decorator registers the class with the framework, and approx_vocab_size sets the target vocabulary size to VOCAB_SIZE.

@registry.register_problem
class ChatBot(text_problems.Text2TextProblem):
    @property
    def approx_vocab_size(self):
        return VOCAB_SIZE

Preprocessing and Postprocessing

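The preprocess_sentence function takes a sentence as input and performs several text preprocessing steps to make it suitable for the chatbot model. It converts the sentence to lowercase, puts spaces around the punctuation marks ? . ! and , adds a space after each apostrophe, collapses repeated spaces, and replaces every character that is not a letter, a digit, one of those punctuation marks, an apostrophe, or one of the accented vowels listed in the pattern with a space.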

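def preprocess_sentence(sentence):
    sentence = sentence.lower().strip()
    # put spaces around punctuation so that each mark stands on its own
    # eg: "he is a boy." => "he is a boy ."
    sentence = re.sub(r"([?.!,])", r" \1 ", sentence)
    # add a space after each apostrophe
    sentence = sentence.replace("'", "' ")
    # collapse runs of spaces (and double quotes) into a single space
    sentence = re.sub(r'[" "]+', " ", sentence)
    # replace anything except letters, digits, the kept punctuation,
    # apostrophes and the listed accented vowels with a space
    sentence = re.sub(r"[^a-zA-Z0-9?.!,àèìòùáéíóú']+", " ", sentence)
    sentence = sentence.strip()
    return sentence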

Now, let's move on to postprocess_sentence. This function takes a sentence generated by the chatbot and makes it more readable. It strips trailing spaces and periods, then removes the whitespace that sits before punctuation marks.

def postprocess_sentence(sentence):
    # remove trailing spaces and periods
    sentence = sentence.rstrip(" .")
    # remove whitespace before punctuation, eg: "bene , grazie" => "bene, grazie"
    return re.sub(r"\s+(\W)", r"\1", sentence)
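
To see what these steps do to a concrete sentence, the following sketch applies the same regular expressions one step at a time and prints the intermediate strings. The example sentences are made up for illustration.

```python
import re

# Preprocessing, step by step, on a sample user input.
text = "Ciao, come stai?"
lowered = text.lower().strip()
spaced = re.sub(r"([?.!,])", r" \1 ", lowered)        # spaces around punctuation
collapsed = re.sub(r'[" "]+', " ", spaced).strip()    # single spaces only

print(repr(spaced))     # 'ciao ,  come stai ? '
print(repr(collapsed))  # 'ciao , come stai ?'

# Postprocessing on a sample model output.
reply = "bene , grazie ."
trimmed = reply.rstrip(" .")                          # drop trailing spaces and periods
readable = re.sub(r"\s+(\W)", r"\1", trimmed)         # drop whitespace before punctuation

print(repr(readable))   # 'bene, grazie'
```

Preprocessing separates punctuation from words, and postprocessing reattaches it, so the user sees normally punctuated text.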

Encoding and Decoding

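The encode and decode functions convert text into the integer format that the chatbot model understands, and back again. encode joins the encoded turns of the conversation with the separator ID 2, ends the sequence with ID 1 (the end-of-sequence ID in Tensor2Tensor), and keeps only the last hparams.max_length IDs if the result is too long. decode cuts the output at the first ID 1 and converts the rest back to text. Both functions use objects that are not defined in this guide and must be created beforehand: encoders (a dictionary whose "inputs" entry is the text encoder) and hparams (the hyperparameters, including max_length).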


def encode(conversation, output_str=None):
    """Input str to features dict, ready for inference"""
    encoded_inputs = []
    for conversation_turn in conversation:
        # encode each turn and follow it with the separator ID 2
        encoded_inputs += encoders["inputs"].encode(conversation_turn) + [2]
    # replace the separator after the last turn with the end-of-sequence ID 1
    encoded_inputs.pop()
    encoded_inputs += [1]
    # if the input is too long, keep only the most recent IDs
    if len(encoded_inputs) > hparams.max_length:
        encoded_inputs = encoded_inputs[-hparams.max_length:]
    batch_inputs = tf.reshape(encoded_inputs, [1, -1, 1])  # Make it 3D.
    return {"inputs": batch_inputs}

def decode(integers):
    """List of ints to str"""
    integers = list(np.squeeze(integers))
    # cut the sequence at the first end-of-sequence ID
    if 1 in integers:
        integers = integers[:integers.index(1)]
    decoded = encoders["inputs"].decode(integers)
    return postprocess_sentence(decoded)
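
The ID layout that encode builds, and the trimming that decode performs, can be sketched without TensorFlow. The snippet below works on lists of integers that are assumed to be already encoded; the helper names build_input_ids and trim_at_eos and the token IDs 5 to 8 are hypothetical and only serve the illustration.

```python
SEP_ID = 2  # separator placed between turns, as in encode()
EOS_ID = 1  # end-of-sequence marker, as in encode() and decode()

def build_input_ids(encoded_turns, max_length):
    """Join already-encoded turns the way encode() does."""
    ids = []
    for turn in encoded_turns:
        ids += list(turn) + [SEP_ID]
    if ids:
        ids.pop()            # no separator after the last turn
    ids.append(EOS_ID)
    return ids[-max_length:]  # keep only the most recent IDs

def trim_at_eos(ids):
    """Drop the end-of-sequence ID and everything after it, as decode() does."""
    ids = list(ids)
    return ids[:ids.index(EOS_ID)] if EOS_ID in ids else ids

print(build_input_ids([[5, 6], [7]], max_length=10))  # [5, 6, 2, 7, 1]
print(build_input_ids([[5, 6], [7]], max_length=3))   # [2, 7, 1]
print(trim_at_eos([7, 8, 1, 0, 0]))                   # [7, 8]
```

Truncating from the left means that the oldest part of the conversation is dropped first, so the most recent turn always reaches the model.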

Calculating LCS Ratio

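The lcs_ratio function computes the length of the longest common subsequence (LCS) between the conversation context and a predicted response, divided by the length of the response. Because both arguments are strings here, the comparison is made character by character. A ratio close to 1 means that the response mostly repeats text that is already in the conversation, so the ratio is used to reject responses that simply copy the context.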

def lcs_ratio(context, predicted):
    m = len(context)
    n = len(predicted)
    if n == 0:
        # avoid dividing by zero; an empty response gets ratio 1.0 so that it is rejected
        return 1.0
    # L[i][j] is the LCS length of context[:i] and predicted[:j]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if context[i-1] == predicted[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    return L[m][n] / n
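
A few worked values may help to build intuition for the ratio. The sketch below is a variant of the same dynamic program that keeps only two rows of the table instead of the full matrix; the name lcs_ratio_two_rows is made up here, and returning 1.0 for an empty prediction is an assumption of this sketch (so that an empty reply would be rejected).

```python
def lcs_ratio_two_rows(context, predicted):
    """LCS length of the two strings divided by len(predicted), using two rows."""
    if not predicted:
        return 1.0  # assumption: treat an empty prediction as a full copy
    prev = [0] * (len(predicted) + 1)
    for c in context:
        curr = [0]
        for j, p in enumerate(predicted, start=1):
            if c == p:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(predicted)

print(lcs_ratio_two_rows("abcd", "abxd"))               # 0.75
print(lcs_ratio_two_rows("abc", "xyz"))                 # 0.0
print(lcs_ratio_two_rows("come stai ?", "come stai?"))  # 1.0
```

In the last example the reply only parrots the user's question, so its ratio of 1.0 is above a threshold of 0.9 and the reply would be discarded.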

Predicting

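The predict function takes a conversation (a list of turns) as input. It preprocesses each turn with preprocess_sentence, encodes the result, and asks the model for NUM_SAMPLES candidate responses. The candidates are sorted from highest to lowest score, and the first one whose lcs_ratio against the conversation is below MAX_LCS_RATIO is returned. If every candidate is rejected, a new batch is sampled. The function uses chatbot_model and ckpt_path, which are not defined in this guide and must be created beforehand.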

def predict(conversation):
    preprocessed = [preprocess_sentence(x) for x in conversation]
    encoded_inputs = encode(preprocessed)
    print("decoded input: " + decode(encoded_inputs["inputs"]))
    with tfe.restore_variables_on_create(ckpt_path):
        while True:
            output_candidates = [chatbot_model.infer(encoded_inputs, decode_length=1) for _ in range(NUM_SAMPLES)]
            output_candidates.sort(key = lambda x: -float(x["scores"]))

            for x in output_candidates:
                print(str(float(x["scores"])) + "\t" + decode(x["outputs"]))

            for candidate in output_candidates:
                decoded = decode(candidate["outputs"])
                if lcs_ratio(" ".join(preprocessed), decoded) < MAX_LCS_RATIO:
                    return decoded
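
The selection step inside predict can be tried out on its own with made-up candidates. The following is a minimal sketch, not the code that runs against the model: pick_response, the candidate dictionaries, and the stand-in overlap measure copy_ratio are all hypothetical, and in the real loop the overlap measure would be lcs_ratio.

```python
def pick_response(candidates, context, ratio_fn, max_ratio=0.9):
    """Return the best-scoring candidate whose overlap with the context is
    below max_ratio, or None if every candidate is rejected."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    for candidate in ranked:
        if ratio_fn(context, candidate["text"]) < max_ratio:
            return candidate["text"]
    return None

def copy_ratio(context, text):
    """Stand-in overlap measure: 1.0 for an exact copy of the context, else 0.0."""
    return 1.0 if text == context else 0.0

candidates = [
    {"score": -1.2, "text": "bene , grazie"},
    {"score": -0.5, "text": "come stai ?"},   # best score, but copies the context
    {"score": -3.0, "text": "non lo so"},
]
print(pick_response(candidates, "come stai ?", copy_ratio))  # bene , grazie
```

Returning None when nothing passes leaves it to the caller to sample again, which is what the while True loop in predict does.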

Managing the Conversation Loop

This code snippet manages an ongoing conversation with the user. It repeatedly prompts for input, appends it to the conversation history, trims the history to the most recent CONVERSATION_TURNS turns, generates a response with predict, and prints the response. The loop runs until the program is interrupted.

conversation = []
while True:
    sentence = input("Input: ")
    conversation.append(sentence)
    while len(conversation) > CONVERSATION_TURNS: 
        conversation.pop(0)
    response = predict(conversation)
    conversation.append(response)
    print(response)
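
The history trimming can also be expressed with collections.deque, which drops the oldest entries automatically once maxlen is reached. This is an alternative sketch rather than the code used above; chat_turn and the echo_predict stub are hypothetical names, and the stub only stands in for predict so that the example runs without a model.

```python
from collections import deque

CONVERSATION_TURNS = 3

def chat_turn(history, sentence, predict_fn):
    """Add the user's sentence, get a reply, and store the reply in the history."""
    history.append(sentence)               # the oldest turn falls out if the deque is full
    response = predict_fn(list(history))   # the model sees at most maxlen turns
    history.append(response)
    return response

def echo_predict(conversation):
    """Stub that stands in for predict(); it only echoes the last turn."""
    return "reply to: " + conversation[-1]

history = deque(maxlen=CONVERSATION_TURNS)
for sentence in ["ciao", "come stai ?", "bene"]:
    print(chat_turn(history, sentence, echo_predict))

print(list(history))  # ['reply to: come stai ?', 'bene', 'reply to: bene']
```

At the moment predict_fn is called, the deque holds the same turns that the explicit pop(0) loop would keep, so the model receives the same context in both versions.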

Conclusion

And that's the end! 🎉 We've finished our Meena introduction and implementation guide, and we really hope it helped you learn something useful.

Now that you've been introduced to Google's chatbot Meena, you've seen how to set up a Meena-style model and chat with it using Python, TensorFlow, and Tensor2Tensor.

Try New Things, Do Cool Stuff

As you move forward, don't be afraid to try new things and use this chatbot in your own projects.

Tell Us What You Think

Thanks for coming along on this journey! If you have questions, feedback, or fun stories to share, let us know. Your ideas help make natural language processing better.

If you require assistance with the implementation of chatbot-related projects, please don't hesitate to reach out to us.