Machine Learning for Android Developers

1. Introduction
2. What is machine learning?
3. What machine learning tools are available for Android developers? And what do they offer us?
4. ML Kit
5. Firebase ML
6. Differences between ML Kit and Firebase ML
7. TensorFlow Lite and custom models
8. Conclusions

With the rise of artificial intelligence in recent years, concepts like machine learning are on everyone’s lips. But it is nothing new; in fact, it has been with us for a long time.

As Android developers, it is worth knowing which machine learning tools exist and what we can do with them in our mobile applications. But before looking at the main tools, let’s answer a question: What is machine learning?

What is machine learning?

Machine learning is a branch of artificial intelligence that focuses on developing algorithms and techniques that allow computer systems to learn and improve automatically from data and experience, without being explicitly programmed for each specific task. In other words, it means training a machine to perform tasks based on patterns found in data.
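To make "learning from data" concrete, here is a minimal sketch in plain Kotlin, assuming the simplest possible model: a straight line fitted by ordinary least squares. The names `Line` and `fitLine` and the sample data are hypothetical, chosen only for illustration.

```kotlin
// A "model" here is just a line: y = slope * x + intercept.
data class Line(val slope: Double, val intercept: Double) {
    fun predict(x: Double): Double = slope * x + intercept
}

// "Training": derive slope and intercept from example data (ordinary least squares).
fun fitLine(xs: List<Double>, ys: List<Double>): Line {
    require(xs.size == ys.size && xs.size >= 2) { "need at least two paired samples" }
    val meanX = xs.average()
    val meanY = ys.average()
    val covariance = xs.indices.sumOf { (xs[it] - meanX) * (ys[it] - meanY) }
    val variance = xs.sumOf { (it - meanX) * (it - meanX) }
    require(variance > 0.0) { "x values must not all be equal" }
    val slope = covariance / variance
    return Line(slope, meanY - slope * meanX)
}

fun main() {
    // Hypothetical data: hours studied vs. exam grade.
    val hours = listOf(1.0, 2.0, 3.0, 4.0)
    val grades = listOf(5.0, 6.0, 7.0, 8.0)
    val model = fitLine(hours, grades)
    println(model.predict(5.0)) // prints 9.0
}
```

Nothing in `fitLine` encodes the rule "one extra hour adds one point"; that pattern is recovered from the examples, which is the idea the definition above describes.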

The most common uses of machine learning are:

Predictive Analytics: Predicting outcomes based on historical data.
Content Filtering: Search engines and social networks use it to recommend relevant content. For example, Netflix recommendations.
Virtual Assistants and Chatbots: Siri, Google Assistant, and Alexa rely on ML to understand and respond to queries.
Pattern Recognition: Voice, facial, and handwriting recognition.
Medicine and Diagnostics: Analyzing medical images to diagnose diseases, predict health risks, and personalize treatments.

What machine learning tools are available for Android developers? And what do they offer us?

Let’s imagine the following practical case: the Android project we are working on is an application for students and teachers where they can manage events, calendars, exams, grades, notices, etc.

The client of this project asks us about the feasibility of adding a text recognition option to the app, so that if the teacher writes notes on the board, students can take a photo of it and convert the image to text to save it later and consult it in the app when needed. Additionally, they want to add the option to translate all the content into English when saving it.

At first, if we are not familiar with the machine learning tools available for Android, we might think this is an almost impossible task. How do we convert an image to text? And how do we offer to translate an entire text that was just captured from an image?

This is where different machine learning platforms or tools come into play. The main machine learning platforms for mobile devices are ML Kit and Firebase ML, and on a slightly more advanced level, TensorFlow Lite.

ML Kit

ML Kit is built by Google and draws on the company’s own machine learning technology. It is aimed at mobile app developers.

All processing is done on the device, which makes it faster and enables real-time use cases such as processing images from the camera. Because the models run on the device, it also works offline.

It is very easy to use, since model training and integration are abstracted behind an API designed for apps.

ML Kit offers us:

VISION API:

Barcode scanning
Face detection
Face mesh detection
Text recognition
Image labeling
Object detection and tracking
Digital ink recognition
Pose detection
Selfie segmentation
Subject segmentation
Document scanner

NATURAL LANGUAGE API:

Language identification
Translation
Smart replies
Entity extraction

Let’s go back to our practical case. After analyzing what ML Kit offers, which API could we use to integrate text recognition from the board and its subsequent translation? We see that both Text Recognition v2 and Translation fit our needs.

Text Recognition allows us to recognize text in different languages and retrieve it, while Translation allows us to translate the retrieved text on the device. Let’s see how to implement one of these APIs in our Android project.

First, we add the dependencies for the ML Kit library, in this case, for translation:
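A minimal sketch of that declaration, assuming the Gradle Kotlin DSL in the app module. The version number shown is an assumption; check the ML Kit release notes for the current one.

```kotlin
// app/build.gradle.kts
dependencies {
    // ML Kit on-device translation. The version is an assumption, not a recommendation.
    implementation("com.google.mlkit:translate:17.0.3")
}
```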

We create a Translator object and configure it with the source and target languages. NOTE: If you don’t know the input language, you could use the Language identification API also provided by ML Kit:
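This step could look roughly like the sketch below. The function names are hypothetical, English as the target comes from our practical case, and the second function additionally assumes the `com.google.mlkit:language-id` dependency has been added.

```kotlin
import com.google.mlkit.nl.languageid.LanguageIdentification
import com.google.mlkit.nl.translate.TranslateLanguage
import com.google.mlkit.nl.translate.Translation
import com.google.mlkit.nl.translate.Translator
import com.google.mlkit.nl.translate.TranslatorOptions

// Builds an on-device translator from a known source language to English.
// Example: createTranslatorToEnglish(TranslateLanguage.SPANISH)
fun createTranslatorToEnglish(sourceLanguage: String): Translator {
    val options = TranslatorOptions.Builder()
        .setSourceLanguage(sourceLanguage)
        .setTargetLanguage(TranslateLanguage.ENGLISH)
        .build()
    return Translation.getClient(options)
}

// When the source language is unknown, identify it from the text first.
fun createTranslatorForUnknownSource(
    text: String,
    onReady: (Translator) -> Unit,
    onError: (Exception) -> Unit
) {
    LanguageIdentification.getClient()
        .identifyLanguage(text)
        .addOnSuccessListener { tag ->
            // fromLanguageTag returns null when the tag has no supported
            // translation language (this covers the undetermined tag "und").
            val source = TranslateLanguage.fromLanguageTag(tag)
            if (source != null) {
                onReady(createTranslatorToEnglish(source))
            } else {
                onError(IllegalArgumentException("Unsupported or undetermined language: $tag"))
            }
        }
        .addOnFailureListener { e -> onError(e) }
}
```

A `Translator` holds native resources, so it should be closed with `close()` once it is no longer needed, for example when the screen that owns it is destroyed.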

We ensure that the necessary model for our translation is downloaded:
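A minimal sketch of the download step, assuming the translator was created as described above. The function name and the Wi-Fi-only condition are assumptions made for this example; drop `requireWifi()` if downloads over mobile data are acceptable.

```kotlin
import com.google.mlkit.common.model.DownloadConditions
import com.google.mlkit.nl.translate.Translator

// Downloads the language model for this translator if it is not on the device yet.
fun ensureModelDownloaded(
    translator: Translator,
    onReady: () -> Unit,
    onError: (Exception) -> Unit
) {
    // Assumption: only download over Wi-Fi to avoid using the student's mobile data.
    val conditions = DownloadConditions.Builder()
        .requireWifi()
        .build()

    translator.downloadModelIfNeeded(conditions)
        .addOnSuccessListener { onReady() }           // model is available locally
        .addOnFailureListener { e -> onError(e) }     // e.g. no network, no storage
}
```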

Once we ensure that the model has been downloaded correctly, we pass the text string to be translated and get the translated result:
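The translation call itself could be sketched as follows. `translateNotes` and its callbacks are hypothetical names; the only ML Kit call used is `Translator.translate`, which returns its result asynchronously.

```kotlin
import com.google.mlkit.nl.translate.Translator

// Translates the given text with an already configured translator
// whose model has been downloaded.
fun translateNotes(
    translator: Translator,
    text: String,
    onResult: (String) -> Unit,
    onError: (Exception) -> Unit
) {
    translator.translate(text)
        .addOnSuccessListener { translatedText -> onResult(translatedText) }
        .addOnFailureListener { e -> onError(e) }
}
```

In our practical case, `onResult` is where the app would save the translated notes for later consultation.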

You can find more information about the APIs at: https://developers.google.com/ml-kit

Firebase ML

Firebase ML uses Google’s cloud infrastructure to perform machine learning operations. Broadly speaking, this means that Google Cloud servers handle the processing of the models.

This platform offers cloud-based machine learning services that are easy to integrate into our mobile applications, so we benefit from the power of Google Cloud’s infrastructure. An important difference from ML Kit, besides being able to run in the cloud and/or on the device, is Firebase ML’s ability to host and deploy custom models.

Another factor to consider is the price. Firebase’s free Spark plan includes integration with custom models, but not the Cloud Vision API. To use that API you need an active Blaze (pay-as-you-go) plan, which costs approximately $1.50 for every 1,000 requests.
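To get a feel for what that rate means for a project, here is a rough estimate in Kotlin. It assumes a flat rate equal to the approximate figure quoted above and ignores any free tier or volume discount; the usage numbers are hypothetical.

```kotlin
// Assumption: flat rate of $1.50 per 1,000 requests, as quoted above.
const val USD_PER_THOUSAND_REQUESTS = 1.5

fun estimateMonthlyCostUsd(requestsPerMonth: Long): Double {
    require(requestsPerMonth >= 0) { "request count cannot be negative" }
    return requestsPerMonth / 1000.0 * USD_PER_THOUSAND_REQUESTS
}

fun main() {
    // Hypothetical usage: 500 students, 2 board photos per school day, 20 school days.
    val requests = 500L * 2 * 20
    println(estimateMonthlyCostUsd(requests)) // prints 30.0
}
```

The actual bill depends on the current Cloud Vision pricing, so treat the result as an order of magnitude to discuss with the client.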

Returning to our practical case after reviewing Firebase ML, we can see that it offers nothing we could not already do with ML Kit. In fact, the documentation for Recognize Text states that the SDK is deprecated and recommends using ML Kit instead. Note that this does not mean that Firebase ML as a whole is obsolete compared to ML Kit, only that some of its APIs are no longer available.

For more information on Firebase ML: https://firebase.google.com/docs/ml?hl=es

Differences between ML Kit and Firebase ML

Now that we have learned about ML Kit and Firebase ML, it may be unclear how the two platforms differ.

Broadly speaking, Firebase ML and ML Kit are just abstraction layers of all the machine learning functions designed to run on mobile devices, which in one way or another perform the same work.

But what are the differences?

With this table, we can more clearly see what differentiates both platforms. Although the final result is the same, the path to achieving it differs, and we must see which one adapts better to the needs of each project.

Now, considering our initial practical case, which one should we choose between ML Kit and Firebase ML? In my opinion, ML Kit: it has no additional costs, it works even without a connection and, most importantly, the text recognition SDK offered by Firebase ML is deprecated.

However, it is always interesting to know all the options we have, their advantages and disadvantages, and present them to the client so that they are aware at all times of the characteristics of each platform.

TensorFlow Lite and custom models

Earlier in this article, I mentioned “custom models” several times, but what are they? A custom model is a model that has been trained specifically to solve a particular problem or task in a specific domain, using data chosen for that purpose.

Working with custom models already requires developers to have machine learning knowledge, since creating one involves:

Data collection and formalization
Model training with the data
Task adaptation
Optimization
Iteration and learning
Flexibility and adaptability

Unlike the platforms seen so far, custom models remove the abstraction layer between the developer and the machine learning task. This is a good point to talk about TensorFlow Lite.

TensorFlow Lite is a library based on TensorFlow but lighter, specifically designed for mobile devices and embedded systems. It allows running machine learning models on the device, but wait, couldn’t we already do this with ML Kit? Correct, but unlike ML Kit, TensorFlow Lite allows customizing or creating models.
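As a rough illustration of running a custom model on the device, the sketch below uses the TensorFlow Lite `Interpreter` class from the `org.tensorflow:tensorflow-lite` artifact. The function name and the tensor shapes (one row of float features in, one row of floats out) are assumptions; a real model dictates its own input and output shapes.

```kotlin
import org.tensorflow.lite.Interpreter
import java.io.File

// Runs a single inference with a .tflite model stored on the device.
// Assumed shapes: input [1, features.size], output [1, outputSize].
fun runCustomModel(modelFile: File, features: FloatArray, outputSize: Int): FloatArray {
    val input = arrayOf(features)
    val output = arrayOf(FloatArray(outputSize))

    val interpreter = Interpreter(modelFile)
    try {
        interpreter.run(input, output)
    } finally {
        interpreter.close() // release native resources
    }
    return output[0]
}
```

Creating an interpreter per call keeps the example short; an app that runs inference repeatedly would normally keep one interpreter alive and reuse it.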

If we return to our practical case, in the tasks of text recognition and translation, we do not need to train any model. The pre-trained models integrated into ML Kit are more than sufficient to cover our needs.
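To show how little code a pre-trained model requires, here is a minimal sketch of the recognition step with ML Kit Text Recognition v2. It assumes the `com.google.mlkit:text-recognition` dependency (Latin script), a photo already loaded as a `Bitmap` with no rotation to correct, and hypothetical function and callback names.

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Extracts the text written on the board from a photo taken by the student.
fun recognizeBoardText(
    photo: Bitmap,
    onText: (String) -> Unit,
    onError: (Exception) -> Unit
) {
    // Assumption: the bitmap is already upright, so rotationDegrees is 0.
    val image = InputImage.fromBitmap(photo, 0)
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

    recognizer.process(image)
        .addOnSuccessListener { result -> onText(result.text) } // full recognized text
        .addOnFailureListener { e -> onError(e) }
}
```

No training data, model file, or tuning appears anywhere in this code, which is exactly the point of the pre-trained APIs.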

Now suppose we wanted to add student attendance registration based on facial recognition. A pre-trained face detection model alone is not enough, since we need to confirm in each case that the student is who they claim to be. For this, we would need to train a model with photos of the students in question, that is, create a custom facial recognition model and integrate it into our application, which is where TensorFlow Lite comes in. It is also worth noting here, where it makes more sense, that Firebase ML allows hosting and deploying custom TensorFlow Lite models.
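A sketch of how an app might fetch such a hosted model, assuming the `com.google.firebase:firebase-ml-modeldownloader` and `org.tensorflow:tensorflow-lite` dependencies and a model already uploaded to the Firebase console. The function name is hypothetical, and the Wi-Fi condition and download type are choices made for this example.

```kotlin
import com.google.firebase.ml.modeldownloader.CustomModelDownloadConditions
import com.google.firebase.ml.modeldownloader.DownloadType
import com.google.firebase.ml.modeldownloader.FirebaseModelDownloader
import org.tensorflow.lite.Interpreter

// Downloads a custom model hosted in Firebase ML and wraps it in a TFLite interpreter.
// modelName is whatever name the model was given when it was uploaded.
fun loadHostedModel(
    modelName: String,
    onReady: (Interpreter) -> Unit,
    onError: (Exception) -> Unit
) {
    val conditions = CustomModelDownloadConditions.Builder()
        .requireWifi() // assumption: model files can be large
        .build()

    FirebaseModelDownloader.getInstance()
        .getModel(modelName, DownloadType.LOCAL_MODEL_UPDATE_IN_BACKGROUND, conditions)
        .addOnSuccessListener { model ->
            val file = model.file // null if the file is not available locally
            if (file != null) {
                onReady(Interpreter(file))
            } else {
                onError(IllegalStateException("Model file for $modelName is not available"))
            }
        }
        .addOnFailureListener { e -> onError(e) }
}
```

Hosting the model this way means a retrained version can reach users without publishing a new release of the app.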

Conclusions

These are some of the machine learning platforms for mobile devices that we should know as Android developers in order to face integration challenges in our apps. It is important to highlight that there are others based on cloud services, such as OpenAI, AWS, or Azure, which offer SDKs for Java or Kotlin on Android, or REST APIs, and which could also cover our practical case, among many others.

A special mention goes to the Gemini API for Android developers, which was released shortly before this article was written and provides a client SDK to integrate Gemini, Google’s generative AI. It is an alternative to keep a close eye on, and one that will disrupt the concept of machine learning on Android devices.

For more information on any of these platforms, you can visit:

AWS: AWS Machine Learning
Azure: Azure Machine Learning
OpenAI: OpenAI for Developers
Gemini API: Gemini Template

I would also like to mention a web tool from Google that recommends which platform and model to use according to our needs.

As we can see in the image, if we select ANDROID / IOS > VISION > TEXT RECOGNITION, the recommendation is to use ML Kit Text Recognition.

This can help us when we are not sure which platform/model to use or if we do not know if the platform can offer us that service.

You can try it at: https://developers.google.com/learn/topics/on-device-ml

In short, with the rise of generative artificial intelligence and the constant growth of mobile hardware, our apps increasingly need to integrate machine learning features. That means we have to know which platforms and tools exist, what they offer, and how to choose the one that best suits the requirements of each project.
