Implementing Mobile AR & AI
How difficult is it, really?
Author: Edward Caulfield
Release Date: October 5, 2017
Few things are as intimidating to the average person as computer programming. It is arcane, it is full of very odd people and it is difficult for most people to understand and appreciate. In the world of computer programming, few things are as intimidating as Artificial Intelligence and Augmented Reality. They are arcane, full of very odd people, and are difficult for most programmers to implement and use productively.
As little as three years ago, both AR & AI were pretty much relegated to the high priests of the nerds. Both were very interesting topics that the majority of people didn’t really understand and didn’t want to tangle with. But things are rapidly changing.
As for AR, thanks to Google Glass, Pokémon Go, HoloLens, ARKit / ARCore and a smattering of other technologies, the world is beginning to grasp the vast potential of this technology. AI, however, remains suspect and foreign to most people. Most of us know AI only from dystopian science fiction movies, and we don’t know how much of that fantasy world translates into real life.
What most folks don’t appreciate is that both AR & AI have been with us for decades and it is only lately that, due to some very impressive technological breakthroughs and a lot of hard work by some exceptionally bright folks, both are now starting to hit society’s mental mainstream. So, naturally, the average nerds among us will ask “How difficult are they to implement, either individually or together?”
I will start with AR, if for no other reason than it is a visual technology, and most of us understand things we can see more easily than things we cannot. Although a lot of people like to pretend that mobile AR became possible with the advent of Apple’s ARKit, the reality is that AR has been on mobile for years. Historically, you have had four choices:
- Write everything yourself from the ground up (very expensive, but the ultimate in flexibility!)
- Use an SDK, such as Vuforia or Kudan, to roll your own AR app (much less expensive, still very flexible)
- Use a browser-based CMS tool with a proprietary scanner ($50 and one day of work to get your first AR experience)
- Use AWE.media and have a fully platform-independent, browser-based development & deployment tool
While ARKit and ARCore can be seen as advancements in some corners because they add native depth detection and bring AR closer to the mobile OS, they can also be seen as a huge step backward for the consumer because they fracture the app development landscape by promoting platform-specific development. This, just when we are getting used to “Write Once, Run Anywhere” as a legitimate goal.
What ARKit and ARCore do, however, is make it much easier to create (cringe) platform-specific AR applications and, most importantly, bring platform-native depth perception to the mix, which nobody else has delivered at production quality. Although ARKit and ARCore have both started out with very small and simple interfaces, this will change in time, and AR SDK builders are going to have to fight ever harder to justify their existence. Cross-platform development tools such as Xamarin and Unity will become even more important to the AR development environment.
How easy is it to implement AR with ARKit or ARCore? Wonderfully easy! Both ARKit and ARCore are very small SDKs that are very easy to implement. Each has a geometry tool that facilitates the placement of virtual objects into the camera’s view and gives the impression that these virtual objects are fixed in three-dimensional space. As you move around, the virtual object stays in place. You can walk over / under / around it, viewing the object from literally any angle. You can interact with the object – pushing, pulling, shrinking, growing and throwing it. Additionally, both ARKit and ARCore have plane detection, currently limited to horizontal planes.
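Under the hood, “fixed in three-dimensional space” just means the object has a constant world-frame coordinate while the camera’s pose changes from frame to frame. Here is a minimal numpy sketch of that idea – purely illustrative, not ARKit or ARCore code, and the poses are made up:

```python
import numpy as np

def pose_matrix(translation, yaw):
    """4x4 camera-to-world pose: a rotation about the y axis plus a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    T[:3, 3] = translation
    return T

# A virtual object anchored 2 m in front of the world origin (homogeneous coords).
anchor_world = np.array([0.0, 0.0, -2.0, 1.0])

# Two camera poses: the user starts at the origin, then steps right and turns 30°.
pose_a = pose_matrix([0.0, 0.0, 0.0], yaw=0.0)
pose_b = pose_matrix([1.0, 0.0, 0.0], yaw=np.radians(30))

# To render each frame, transform the fixed world-space anchor into the
# current camera's frame.
in_cam_a = np.linalg.inv(pose_a) @ anchor_world
in_cam_b = np.linalg.inv(pose_b) @ anchor_world

# The world coordinates never change; only the camera-frame view of them does,
# which is exactly why the object appears locked in place as you move around it.
```

The frameworks do the genuinely hard part – estimating those camera poses from the camera feed and motion sensors in real time – and hand you the matrix math for free.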
All of this is available to the programmer with precious little coding. ARKit and ARCore are doing all of the really hard work for you. Perhaps the biggest negative here is the platform requirements, because both ARKit and ARCore are very demanding.
- Phones – iPhone 6s or later (ARKit requires the A9 chip or newer)
- Pads – the 2017 iPad or any iPad Pro
- Android – the ARCore preview runs on the Google Pixel and Samsung Galaxy S8, on Android 7.0 or later
To program with ARKit or ARCore it helps to have a basic understanding of linear algebra, which underpins the three-dimensional transformations, but in the end, if you know which recipe to follow, you don’t really have to understand how the transforms work. Below are a few links that will give you a picture of what it takes to get AR up and running in a mobile app. Admittedly, there are far more ARKit-based examples than ARCore ones, if for no other reason than ARKit has been out a little longer. Let’s see if the ratio still holds in six months.
- ARKit Examples
- ARCore Examples
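To make the linear-algebra point concrete, consider horizontal plane detection. At its core, it amounts to finding the height that best explains a cloud of tracked feature points. The toy sketch below fits a horizontal plane to synthetic points with a RANSAC-style vote – nothing like the production algorithms inside ARKit or ARCore, just an illustration of the kind of math involved:

```python
import numpy as np

# Synthetic feature points: most lie on a horizontal floor at y = 0,
# plus a few stray points (furniture, sensor noise).
rng = np.random.default_rng(42)
floor = np.column_stack([rng.uniform(-1, 1, 40),
                         rng.normal(0.0, 0.005, 40),
                         rng.uniform(-1, 1, 40)])
stray = np.column_stack([rng.uniform(-1, 1, 8),
                         rng.uniform(0.3, 1.0, 8),
                         rng.uniform(-1, 1, 8)])
points = np.vstack([floor, stray])

def detect_horizontal_plane(pts, tol=0.02, trials=50):
    """RANSAC-style search for the height y = h supported by the most points."""
    best_h, best_support = None, 0
    for y in rng.choice(pts[:, 1], size=trials):
        inliers = np.abs(pts[:, 1] - y) < tol
        if inliers.sum() > best_support:
            # Refine the height as the mean of the supporting points.
            best_h, best_support = pts[inliers, 1].mean(), int(inliers.sum())
    return best_h, best_support

height, support = detect_horizontal_plane(points)
```

The real systems fuse camera and IMU data and run far more sophisticated estimators, but the payoff is the same: a plane you can anchor virtual objects to.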
Suffice it to say, it is now possible to write depth-aware AR content into your iOS and Android apps with ease. The challenge now is to make apps where AR adds legitimate value.
What about AI? Do you still have to be “really smart” to plug in and use AI? No, not anymore. Numerous Machine Learning tools have been made publicly available over the last year that take 90% of the complexity out of using AI and out of getting it running in your mobile app.
What you will have to do is understand a few concepts. Modern Machine Learning engines, such as TensorFlow, create “models” which can be loaded onto your mobile phone with your app and used to perform specific tasks. For example, for Image Classification, you can use the Inception V3 model, which can be trimmed down to about 25 MB. That can seem like a lot for a mobile app, but with the average game weighing in at over 60 MB, 25 MB is hardly overwhelming.
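Part of how a model gets trimmed for mobile is weight quantization: storing each 32-bit float weight as an 8-bit integer plus a shared scale, cutting storage by roughly 4x at the cost of a tiny rounding error. A hypothetical numpy sketch of the idea – not the actual TensorFlow tooling or the real Inception pipeline:

```python
import numpy as np

# Pretend these are one layer's float32 weights.
weights = np.random.default_rng(1).normal(0, 0.1, size=10_000).astype(np.float32)

# Linear 8-bit quantization: map [min, max] onto the 256 uint8 levels.
lo, hi = weights.min(), weights.max()
scale = (hi - lo) / 255.0
quantized = np.round((weights - lo) / scale).astype(np.uint8)

# Dequantize when the weights are used: a small error, a quarter the storage.
restored = quantized.astype(np.float32) * scale + lo
max_err = np.abs(weights - restored).max()

print(weights.nbytes, quantized.nbytes)   # 40000 vs 10000 bytes
```

Neural networks tend to tolerate this loss of precision surprisingly well, which is why a desktop-sized model can survive the trip to a phone.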
In theory, the steps for using a model are few.
- Find a model appropriate for the task you have (e.g. Image Classification, Medical Diagnostics, Natural Language Processing, etc.) or create your own model (lots of work, requiring lots of data)
- Train and test the model with your data
- Implement the model in your mobile app
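The three steps above can be sketched end to end in a toy version, with a stand-in model. Everything here is an assumption for illustration – synthetic data and plain-numpy logistic regression instead of a real architecture and a real framework:

```python
import numpy as np

rng = np.random.default_rng(7)

# Step 1: "find or create a model" -- here, the simplest possible one:
# logistic regression on two features. Real tasks need real architectures.
def predict(X, w, b):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Step 2: train and test the model with your data
# (synthetic here: two well-separated blobs of points).
X = np.vstack([rng.normal(-1, 0.5, (100, 2)), rng.normal(1, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

w, b = np.zeros(2), 0.0
for _ in range(500):                      # plain gradient descent
    p = predict(X, w, b)
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * (p - y).mean()

accuracy = ((predict(X, w, b) > 0.5) == y).mean()

# Step 3: "implement the model in your app" amounts to shipping the learned
# parameters with the binary (here just w and b; for Inception V3, ~25 MB
# of weights).
params = {"w": w.tolist(), "b": b}
```

The shape of the workflow – pick a model, fit it to data, ship the learned parameters – is the same whether the model is two numbers or millions of them.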
In reality, things are much more difficult. The challenge is now finding a model for the work you want to do. Finding a model for Image Classification is easy because this is one of the primary use cases for AI. However, while you can easily find papers and discussions on any number of AI models, finding models that you can download and use, at any price, is currently a dark art. The only location I found that had available models was a TensorFlow repository on GitHub. I searched the Internet inside and out and contacted a dozen folks involved in Machine Learning and Artificial Intelligence and not one knew of another repository for models, paid or free. I guess we’re still just a little early.
So, this leaves you to create your own model, collect data and train the model. The tools available, such as Google’s TensorFlow, Apple’s Core ML and Microsoft’s Azure Machine Learning Studio, make model creation, training and validation a reasonable proposition. You are now able to experiment and work with Machine Learning without a degree in Computer Science. The real challenges are now creating a model that is actually successful at the desired task and generating sufficient data to train and validate the model.
Isn’t that where we were 12 months ago? No, not really. 12 months ago you would probably have spent 10 times the effort to achieve half the results. Machine Learning has moved from high-end research to applicable technology. I suspect that in another 12-24 months we’ll see another layer of abstraction that makes model creation, training and evaluation even more intuitive. Then you’ll be stuck with gathering useful data for training. With a little luck, a market for Machine Learning models might develop, letting you take a model that has proven successful at the desired skill and simply add training data specific to your product / field.
When all is said and done, the tech giants have done some amazing work to let us use AI without having to be AI experts AND to give us the ability to create, train and evaluate models with a reasonable learning curve.
The upshot is that both AR & AI have been brought within the reach of the everyday developer. While it takes a lot of intelligence to understand both technologies in depth, it doesn’t take a genius to implement them in apps.
Why is this worth writing about? Because the work done to bring AR & AI programming to the masses will make AR & AI available to millions of already deployed Smartphones. The cost of implementing AR & AI just hit a new low at the same time that their commercial quality just hit a new high.
For AR, the use cases are very broad and compelling:
- Training (Global higher education market is a $2T industry that is only 2% digitised)
- Service & Support ($47T globally)
- Health Care ($1.6T globally)
- Emergency Response ($88B globally)
- And the list goes on and on
For AI, there is simply not a single industry that I can think of, save for maybe striptease, that won’t be impacted in the very near future. Everything and everyone, from dishwashers to space stations, from the assembly line worker to the brain surgeon, is going to experience the influence of AI, whether they recognise it or not.
One thing that I think is abundantly clear after a review of the current AI landscape is that those who state it will take another 40 years before AI makes itself felt in a significant way couldn’t be more wrong. It has gotten to the point where a moderately intelligent person can create an AI-enabled application that delivers performance that was unthinkable only 12 months ago. The AI rocket has launched and it won’t be stopped.