Fine-tuning in Gemini AI: Building a Custom Chatbot with Our Own Data

AI · Gemini · Technology

Learn step by step how to create a unique chatbot tailored to your business. Thanks to the Google AI Studio tool, we can easily fine-tune Gemini models and create our own chatbot.

Fine-tuning in Gemini AI

1. But before we start
– What is Gemini AI?
– Advantages of Gemini AI
– Google AI Studio, a tool to make it simple
– What do we mean by a chatbot?
– What does the fine-tuning technique consist of?
2. Creating our own chatbot step by step
– What are we going to do?
– Step 1: Getting to know Google AI Studio!
– Step 2: Preparing our data
– Step 3: Uploading our dataset
– Step 4: Configuring the fine-tuning
– Step 5: Testing our model
3. Conclusions


But before we start

What is Gemini AI?

Gemini AI is a family of artificial intelligence models from Google, designed to enhance natural language and deep learning capabilities. Gemini is based on the evolution of previous Google models, such as PaLM (Pathways Language Model), and is part of Google’s strategy to lead in generative AI. Gemini stands out for its ability to understand, generate, and work with text, images, and other data modalities, allowing its use in a wide variety of applications.

Advantages of Gemini AI

  1. Multimodality: Gemini can handle multiple types of data, including text and images, enabling richer and more accurate interactions. This opens the door to tasks such as generating images from text descriptions and combined analysis of text and visual elements.
  2. Advanced dialogue capability: It is designed to improve the quality of conversations, providing more accurate and natural responses. It can also remember the context of a conversation, enhancing the experience in applications like chatbots and virtual assistants.
  3. Optimization for specific tasks: Gemini allows customizing models for specific tasks, such as sentiment analysis, code generation, language translation, and other language-based tasks.
  4. Integration with Google applications: Being part of the Google ecosystem, Gemini is seamlessly integrated into products like Google Search, Google Assistant, Google Docs, Google Workspace, and others, enhancing their functionality through generative AI capabilities.
  5. Scalability: Gemini can be used by both individual users and businesses, from small applications to large-scale enterprise solutions.

Google AI Studio, a tool to make it simple

Google AI Studio is a platform developed by Google that allows users to create, train, and deploy machine learning models, including language models like Gemini AI. It offers an intuitive interface that simplifies the fine-tuning process, meaning you can customize a pre-trained model like Gemini AI to suit your specific needs.

We can access the platform at https://aistudio.google.com/app.

What do we mean by a chatbot?

A chatbot is a computer program designed to simulate conversations with humans, either in writing or by voice. Chatbots can be used for a variety of purposes, such as customer service, technical support, entertainment, and education.

What does the fine-tuning technique consist of?

Fine-tuning is a key technique in AI development, used to customize existing models with specific data. It involves adjusting large language models with specific contexts for specialized tasks.

For example, you can train an AI model with a dataset of conversations in a professional environment so that the chatbot better understands the nuances of that context.

The interesting thing about fine-tuning is that instead of training an AI from scratch, it leverages the general knowledge of the base model and adapts it to particular needs, reducing time and costs.

Creating Our Own Chatbot Step by Step

What are we going to do?

Let’s suppose that in our company, the HR team is responsible for answering employees’ questions on various topics, such as: Where can I see my latest payslips? How many vacation days do we have? What holidays are set in the work calendar? What are the steps to request vacation or an absence? Among many others.

To free the HR department from this task and allow them to focus on higher-priority matters, the company is considering creating a chatbot that employees can interact with to resolve company-related questions.

Let’s see how, with the help of Gemini, we can easily create this chatbot that integrates with our business.

Step 1: Getting to Know Google AI Studio!

The first thing we will do is access Google AI Studio, the online platform that will allow us to fine-tune the Gemini model with our own data. To do this, we go to the website https://aistudio.google.com/app.

REMEMBER: You will need a Gmail account to access this tool.

Let’s take a quick look at the platform. We can see that we have a central panel where prompts can be written as if it were the Gemini chat itself. There are also two side panels, the left one with a menu of options, and the right one with a configuration and model adjustment menu.

Initially, we will focus on the left-hand menu, where we click on the option New tuned model.

On this screen, we have a Create a structured prompt button that allows us to create a new model adjustment from a prompt structure offered by the tool itself, or from a Google Sheets or CSV file containing our data. There is also the option to select existing data as examples.

In the Provide details section, we define the name of our tuned model and a brief description.

The last section lets us choose the model and its configuration. We will return to this configuration in later steps and look at each of its options.

Once we have taken a quick look, it is time to start. But to begin fine-tuning the Gemini model with our data, we need precisely that: our data.

Step 2: Preparing our data

To fine-tune the model with company data, we will need to gather information from different sources. We must keep in mind that as of today, this fine-tuning option from Google AI Studio only allows us to work with texts, and the fine-tuning only accepts examples of input-output pairs in a chat style. Multi-turn conversations are not supported at this time.

STRUCTURE OF OUR DATA

Your data examples should match the format and content of your production traffic. If your dataset includes keywords, instructions, or a specific format, the production data must follow the same scheme. For example, if your examples have “question:” and “context:”, the production traffic must include both in the same order. If the context is omitted, the model will not recognize the pattern, even if the question was already in the dataset.
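As a minimal sketch of this idea (the “question:” and “context:” keywords are just example labels from the paragraph above, not a Gemini requirement), both training rows and production traffic can be routed through a single formatting helper so the scheme never drifts:

```python
# One template for both training data and production traffic, so the
# tuned model always sees the pattern it was trained on.
def format_input(question: str, context: str) -> str:
    return f"question: {question}\ncontext: {context}"

# A training example and a production request share the same scheme:
train = format_input("How many vacation days do we have?", "HR policy")
prod = format_input("Where can I see my payslips?", "HR policy")
```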

An example of a dataset would be:
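For instance, a couple of hypothetical input/output rows for our HR use case (the answers here are invented) might look like this:

```
input:  How many vacation days do we have?
output: Employees have 23 working days of vacation per year.

input:  Where can I see my latest payslips?
output: Payslips are available in the employee portal, under My documents.
```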

DATA LIMITATIONS

We must know that datasets for fine-tuning the Gemini 1.5 Flash model (which, as we will see later, is the one recommended for this type of task) have certain limitations:

  • Inputs have a maximum of 40,000 characters
  • Outputs have a maximum of 5,000 characters
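A simple pre-flight check against these limits might look like the following sketch (the constants are the limits quoted above; adapt the rest to your own pipeline):

```python
# Character limits for 1.5 Flash tuning datasets, as quoted above.
MAX_INPUT_CHARS = 40_000
MAX_OUTPUT_CHARS = 5_000

def validate_example(text_input: str, output: str) -> list[str]:
    """Return a list of limit violations for one input/output pair."""
    errors = []
    if len(text_input) > MAX_INPUT_CHARS:
        errors.append(f"input too long ({len(text_input)} chars)")
    if len(output) > MAX_OUTPUT_CHARS:
        errors.append(f"output too long ({len(output)} chars)")
    return errors
```

Running this over every row before importing saves a failed upload later.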

WHAT SIZE IS RECOMMENDED FOR TRAINING DATA?

Fine-tuning can be done with as few as 20 examples, but the more data we have, the better. The guideline is usually between 100 and 500 examples, although Google provides a table indicating, depending on the task we want to perform, how many examples our dataset should contain:

OUR DATASET

Once we have seen the data format, limitations, and size recommendations for the dataset, we will proceed to create our own dataset.

As we mentioned earlier, we will simulate that, after hard work, the HR team provides us with answers to the most common questions employees ask, and we normalize all of them into input/output pairs.

For this example, I have created a Google Sheets file in my Google Drive, where I created the input and output columns and filled it with 100 examples:

It is important to note that the process of collecting, normalizing, and structuring the data is fundamental to being able to fine-tune our base model with quality. Although the data used for the demo are entirely fictitious, they needed to be coherent to achieve a good final result.
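If we prefer to build the file programmatically rather than by hand, a sketch using Python’s standard csv module (the rows are the same kind of fictitious HR examples as the rest of the demo) could be:

```python
# Building the two-column CSV that Google AI Studio imports.
import csv
import io

examples = [
    ("How many vacation days do we have?",
     "Employees have 23 working days of vacation per year."),
    ("Where can I see my latest payslips?",
     "Payslips are available in the employee portal under My documents."),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["input", "output"])  # first row will be used as the header
writer.writerows(examples)
csv_text = buffer.getvalue()  # write this to a .csv file or a Google Sheet
```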

Step 3: Uploading our dataset

To upload our dataset, which I have stored in a Google Sheets document in Drive, we click on the Import button:

Once we select the corresponding file, we click Insert. In the next window, we check the box Use first row as header. In the Assign to dropdowns, we select New input column and New output column for the corresponding columns, indicating the role of each one. Once the button is enabled, we click Import examples.

It is important to know that the file size cannot exceed 4 MB. Once the upload is complete, we can see a small preview of the data to ensure everything is correct.
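A quick way to verify the 4 MB limit before uploading is a one-line size check (a sketch; the path is whatever your export produced):

```python
# Pre-flight check that the dataset file respects the 4 MB import limit.
import os

MAX_IMPORT_BYTES = 4 * 1024 * 1024

def fits_import_limit(path: str) -> bool:
    return os.path.getsize(path) <= MAX_IMPORT_BYTES
```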

Step 4: Configuring the fine-tuning

Now we will configure our model adjustment based on the data, specifying the advanced configuration.

The model recommended by Google for fine-tuning is gemini-1.5-flash-001-tuning. This model is a large language model (LLM) that stands out mainly for:

  • Lightweight and fast: Designed for high-frequency, low-cost tasks.
  • Multimodal: Capable of processing and understanding text, image, audio, and video data formats.
  • Long context: It accepts very long input sequences and generates responses grounded in that context.

Returning to the configuration, we can see:

  • Tuning epoch: In the case of language models, an epoch is a complete cycle in which the model sees all the training data once. Imagine you are teaching a dog new tricks. An epoch would be like a complete training session where you teach it all the tricks you want it to learn.
  • Batch size: The batch size determines how many training examples are processed together in each training step. Think of training as feeding a dog. A batch would be like the amount of food you use to feed it. A large batch speeds up training but can make it harder to adjust.
  • Learning rate multiplier: The learning rate is like the speed at which the model learns. If you want the model to learn faster, increase the multiplier (more than 1). If you want it to learn more slowly and accurately, decrease the multiplier (between 0 and 1).
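To make these three knobs concrete, here is the rough arithmetic behind a tuning run (all values are illustrative, not Google’s recommendations):

```python
# Illustrative arithmetic for the three tuning settings above.
dataset_size = 100   # examples in our HR dataset
epoch_count = 5      # full passes over the data
batch_size = 4       # examples processed per training step

steps_per_epoch = dataset_size // batch_size   # weight updates per epoch
total_steps = steps_per_epoch * epoch_count    # weight updates in total

base_learning_rate = 0.001   # hypothetical base rate
multiplier = 0.5             # < 1: slower, steadier learning
effective_learning_rate = base_learning_rate * multiplier
```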

The recommended training parameters are:

We will use these values to adjust our model, add them to the form, and click Tune.

As we can see, once the tuning begins, its status appears in My library, and clicking on it navigates to the training detail.

In the detail or result of the adjustment, we can see the progress. At this point, we will focus on the Loss / Epochs graph.

THE LOSS CURVE

Imagine you are teaching a child to throw a ball into a basket. Each time the child throws, you measure how far the ball is from the basket. That “distance” is like the loss in model training.

The loss curve is a graph that shows how that “distance” (the loss) decreases as the child practices (each training epoch). Ideally, we want the ball to go into the basket (the loss to be zero), but in reality, we aim for it to get as close as possible.

Why is the loss curve important?

  • It tells us when to stop training: If we keep training too much, the model may start to “memorize” the data instead of generalizing, which worsens its performance on new data. The loss curve helps us find the right point where the model has learned enough.
  • It shows if the model is learning: If the loss curve does not decrease, it means the model is not improving, and there may be a problem with the training.

In summary:

  • The loss curve is a visual tool that helps us understand how our model is learning.
  • We look for the lowest point on the curve before it stabilizes, as that is where the model has achieved good performance without overfitting.

Example:

If the loss curve stabilizes between epochs 4 and 6, it means that training beyond epoch 4 will not give us a significant improvement in model performance. Therefore, we can set the number of epochs to 4 and save training time.

In short: The loss curve is like a map that guides us in training our model, helping us find the best point to stop the process and get the best results.
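The “stop when the curve flattens” rule can be sketched in a few lines (the loss values are invented to mirror the example above):

```python
# Toy version of "stop when the loss curve flattens".
def plateau_epoch(losses, tol=0.01):
    """First epoch whose improvement over the previous one drops below tol."""
    for epoch in range(1, len(losses)):
        if losses[epoch - 1] - losses[epoch] < tol:
            return epoch
    return len(losses)

losses = [1.20, 0.60, 0.30, 0.12, 0.115, 0.114]  # one value per epoch
best_epochs = plateau_epoch(losses)  # training past this point adds little
```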

Step 5: Testing our model

Once our fine-tuning is complete, we can test it by clicking on Use in chat.

This will navigate us to a chat prompt, where we can see in the right panel that our model is being used.

In this panel, we can observe different configuration parameters.

  • Token count: Represents the maximum number of tokens the model can generate in a response. A token is approximately equivalent to four characters. For example, if you set a token count of 100, the model will generate a response of approximately 60 to 80 words. In this case, for testing, we have a maximum of 16,384 tokens.
  • Temperature: This parameter controls the creativity and diversity of the model’s responses. A low temperature value (close to 0) will generate more predictable and concise responses, while a high value (close to 1) will produce more creative and diverse responses, but potentially less coherent. For our case, we will leave it at a medium point of 0.5.
  • Advanced settings:
      • Add stop sequence: By enabling this option, you can specify a sequence of tokens that will indicate to the model when to stop generating text. This is useful for controlling the length of responses or preventing the model from generating repetitive text.
      • Output length: This parameter allows you to set a maximum length for the response generated by the model, measured in tokens.
      • TopK: This parameter is used to limit the model’s vocabulary during text generation. Only the “K” most probable tokens will be considered at each step of generation. A low TopK value makes responses more predictable, while a high value increases diversity.
      • TopP: Similar to TopK, but instead of a fixed number of tokens, TopP defines a cumulative probability: only the most probable tokens whose combined probability reaches TopP are considered. This allows finer control over the diversity of responses.
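To build intuition for these parameters, here is a toy next-token distribution showing how temperature, TopK, and TopP reshape the choice (the numbers are invented, and this is the standard sampling math, not Gemini’s internals):

```python
# Toy illustration of temperature, TopK and TopP on a 4-token vocabulary.
import math

def softmax(logits, temperature=1.0):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_indices(probs, k):
    """Keep only the K most probable tokens."""
    return sorted(sorted(range(len(probs)), key=lambda i: -probs[i])[:k])

def top_p_indices(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return sorted(kept)

logits = [2.0, 1.0, 0.5, 0.1]
cold = softmax(logits, temperature=0.1)  # near-deterministic
hot = softmax(logits, temperature=1.0)   # more diverse
```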

My prompt will be:

I have been assigned an on-call shift and would like to know how these shifts are organized and function:

Let’s test the model a bit more and have a conversation touching on different sections. We can see that the model adapts well and responds clearly based on our training data.

BONUS: Getting the code to implement our model in different programming languages.

Working with Google AI Studio is great, but we probably want to integrate this chatbot into our system. To do this, we can get the code to implement it where it fits best.

To do this, in the upper right corner, click on Get code, and the options for using our model in different languages will open:

It is important to note that to use our model in some languages, we will need OAuth and an API Key. For this, the project must be registered in Google Cloud.

The main idea of this article is to teach the fine-tuning process using the Google AI Studio tool. However, here are the guides for obtaining an API key and authenticating with OAuth, so you can use your model on any platform:

Get an API key

Authentication with OAuth

Conclusions

In this article, we have learned step by step how to fine-tune the Gemini model using our own dataset easily with Google AI Studio to create a chatbot capable of answering employees’ questions about the company they work for.

Models like Gemini offer endless options when using fine-tuning techniques for specific tasks. We can take advantage of all the model’s benefits and apply them in the context of our business.

The Google AI Studio tool makes the work easier for those who are not very familiar with languages like Python and work environments like Colab notebooks.

Without a doubt, Google makes it easy to integrate with generative AI models. So, are you ready to create your own chatbot?
