Machine Learning is very much in vogue at the moment, and many Android folks are producing excellent guides to creating and training models. However, I do not feel that, in the vast majority of cases, Android developers will be required to train their own models. That does not mean that using a model someone else has created is a straightforward task, though. In this short series, we'll take a look at some of the problems and pain points that I encountered while attempting to incorporate a pre-trained Tensor Flow Lite model into an Android app.
To start with, I should explain the bold assertion that I made in the introductory text: I do not feel that, in the vast majority of cases, Android developers will be required to train their own models. My reasoning is that creating and training an AI or ML model is a discipline quite different from Android development, and in most cases there will be a specialized team responsible for it. In the same way that many projects have a team responsible for the backend web services consumed by Android, iOS, and web clients, companies developing AI / ML based solutions will have a separate team dedicated to creating and refining the model, which can then be used across multiple clients.
That would appear to make the app developer's job a lot easier, and that is very much the case. However, there are still some pitfalls, which exist in part because this is all very new; implementing a model that someone else has created still requires some knowledge and understanding of that model in order to incorporate it properly into an Android app.
Let's start with a basic understanding of what we're trying to achieve, and set the starting point. One of the standard models included with Tensor Flow is the MNIST handwritten digit classifier. The MNIST dataset is a collection of hand-drawn numerical digits which can be used to train a model to recognize such digits, and it felt like it would be fairly easy to write an Android app which would allow the user to draw a numeric digit on the screen, and then use a model trained with the MNIST dataset to identify the digit which was drawn.
In order to incorporate this into the app, I would need to create a Tensor Flow model, convert it to Tensor Flow Lite format (as we do not have a full Tensor Flow runtime on Android, only Tensor Flow Lite), and then incorporate it into the app using ML Kit. It was a conscious decision, made at the beginning, that I wanted to use ML Kit rather than using Tensor Flow Lite directly. The reasons for this should become apparent as we go.
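To give a feel for the conversion step, here is a minimal sketch using the `tf.lite.TFLiteConverter` API from Tensor Flow 2.x. This is not the exact route I took (the scripts I eventually used differ), and the tiny untrained model here is a stand-in purely so the snippet is self-contained:

```python
import tensorflow as tf

# Stand-in model: a real MNIST classifier would be trained first.
# MNIST inputs are 28x28 single-channel images; outputs are 10 class scores.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert the Keras model to the Tensor Flow Lite flatbuffer format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# The resulting bytes are what ships inside the Android app
with open("mnist.tflite", "wb") as f:
    f.write(tflite_model)
```

The `.tflite` file produced here is the artifact that gets bundled into (or downloaded by) the Android app and handed to ML Kit.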
So the first thing was to train a model which I could incorporate. I won't go into detail here because that isn't the point of this series of articles, but I will explain a little about the route I had to take. First I tried the official Tensor Flow MNIST model. This was easy enough to train, but converting it to Tensor Flow Lite proved tricky and I gave up. Next I found this project by Tianxing Li which uses the same MNIST dataset but contains scripts to train and test the Tensor Flow model, and another script to convert it to Tensor Flow Lite. There are full instructions on the project website for how to do that.
Some people may have noticed that the project I used to generate the model also includes an Android app which does exactly what I was setting out to do. However, there is one important distinction: it incorporates the model using the Android Tensor Flow Lite library directly, whereas I wanted to use ML Kit. While this may not seem like a big change, I encountered quite a few problems and misunderstandings along the way, which I'll be sharing as we go, so it felt like a worthwhile exercise to document them.
So with our model trained and converted to Tensor Flow Lite, we're ready to start building the app, right? Well, not quite. Although in reality that's precisely what I did, I encountered some problems quite early on which caused quite a bit of head scratching and frustration. Although I had my model, I had no idea of the input and output formats of the data. I knew that the MNIST dataset consisted of a number of image files, each 28×28 pixels, and I was able to infer certain things by studying Tianxing's Android code. However, this will not be the case if you are simply given the model. The best option here would be to speak with the team that created the model to obtain the input and output formats, but I also found a useful tool which can obtain them directly from the model itself. The tool is part of the Tensor Flow GitHub and creates an HTML visualization of a Tensor Flow Lite model.
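If you have Python and Tensor Flow to hand, you can also query a `.tflite` model's tensor metadata directly with the `tf.lite.Interpreter` API, without any visualization tool. A small sketch (the in-memory stand-in model here is an assumption so the snippet runs on its own; in practice you would pass `model_path="your_model.tflite"` instead of `model_content=`):

```python
import tensorflow as tf

# Stand-in: build and convert a tiny model so this snippet is self-contained.
# With a real file you'd use tf.lite.Interpreter(model_path="your_model.tflite").
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

# Each entry describes one tensor: its name, shape, and dtype
for detail in interpreter.get_input_details():
    print("input:", detail["shape"], detail["dtype"])   # e.g. [ 1 28 28  1]
for detail in interpreter.get_output_details():
    print("output:", detail["shape"], detail["dtype"])  # e.g. [ 1 10]
```

These shapes and dtypes are exactly the pieces of information we'll need later when telling ML Kit how to feed data into the model and read results back out.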