When you take your Machine Learning Models to Production for Real Time Predictions

06 / Jun / 2017 by Aseem Bansal


We had a use case where we needed machine learning predictions in real time. By real time we mean a prediction latency of less than 10 milliseconds. In this post, we discuss what it took to get that kind of latency from machine learning models that we can use in production. So far, our experience has been with simple classification models: naive Bayes, logistic regression, and decision trees. Everything that follows is based on that experience only; more complex models may need additional changes.

Considerations for Solutions 

We had a lot of data. We wanted to start simple but use something that could scale and had machine learning capabilities. We chose Spark because it fit those requirements. We began with Spark 1.6 and kept upgrading as new versions (2.0, 2.1) were released, because they added features that we needed.

Spark 1.6 had two packages for machine learning: MLlib (stable) and ml (incubating, based on the then-upcoming DataFrame API).

We used MLlib for our initial purposes.

Advantages:

  • It had implementations of the simple models that we wanted to start with.
  • Models trained with it gave us the performance we needed for making predictions.

Disadvantages:

  • The RDD-based API was not great to work with, which lowered developer productivity.
  • After training the models, saving them somewhere turned into a challenge.

We could live with a slight hit to developer productivity, but not being able to keep the model somewhere for integration into our other systems was a blocker. We looked into the available solutions and found that:

  • Partial support for exporting to PMML format was already built in.
  • Libraries were available to export to PMML format.

The problem was that neither option supported all models. We did not need every model, but locking into something that might not keep up with our requirements would be a big challenge. Also, the Spark roadmap explicitly stated that the ml package was the future and that the MLlib package would be deprecated. So if we exported our models to PMML, what would happen when new versions came out and we needed to use them? The ml package, which was supposed to be the future, had no support for importing PMML.

We found a library, MLeap, that fit our requirements. It supports nearly all of the models and transformers that Spark provides. There are a few that it does not support, such as SQLTransformer, but they can be worked around easily. Our testing showed that the latency met our requirements. Our web app is in Java, so we tried using the library from Java. That was not painless, as the library's interface is in Scala. The situation seems to have improved since then, as the library has added a Java DSL. We were able to save the trained models both as MLeap bundles and in Spark's native format. Currently, we load the MLeap bundles into our web app, but also storing the models in Spark's native format keeps us free of any lock-in to the library.
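For readers who want to run a similar latency check, here is a minimal sketch in plain Java. It is not the harness we used. The class name `LatencyCheck`, the coefficients, and the hand-written logistic regression scorer are all made up for illustration. The scorer simply stands in for whatever model your web app has loaded. The idea is to time each prediction after a warm-up phase and compare a high percentile against the 10 millisecond budget.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Hypothetical harness: checks per-prediction latency against a 10 ms budget. */
class LatencyCheck {

    // Accumulates results so the JIT cannot discard the timed calls as dead code.
    static double sink = 0.0;

    /** Stand-in model (an assumption): logistic regression, sigmoid(w . x + b). */
    static double predict(double[] weights, double bias, double[] features) {
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Returns the p-th percentile of the samples using the nearest-rank method. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Times one model call per input row and returns the latencies in nanoseconds. */
    static long[] measure(ToDoubleFunction<double[]> model, double[][] inputs, int warmup) {
        // Warm up first so JIT compilation does not distort the measurements.
        for (int i = 0; i < warmup; i++) {
            sink += model.applyAsDouble(inputs[i % inputs.length]);
        }
        long[] latencies = new long[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            long start = System.nanoTime();
            sink += model.applyAsDouble(inputs[i]);
            latencies[i] = System.nanoTime() - start;
        }
        return latencies;
    }

    public static void main(String[] args) {
        double[] weights = {0.4, -1.2, 0.7}; // made-up coefficients
        double bias = 0.1;                   // made-up intercept
        Random rng = new Random(42);
        double[][] inputs = new double[10_000][3];
        for (double[] row : inputs) {
            for (int j = 0; j < row.length; j++) {
                row[j] = rng.nextGaussian();
            }
        }
        long[] latencies = measure(x -> predict(weights, bias, x), inputs, 1_000);
        long p99 = percentile(latencies, 99.0);
        long budgetNanos = 10_000_000L; // 10 ms expressed in nanoseconds
        System.out.printf("p99 = %d ns, within budget: %b%n", p99, p99 < budgetNanos);
    }
}
```

To check a real model, replace the lambda passed to `measure` with a call into your loaded model. The sketch looks at a high percentile rather than the average because a real-time budget is usually broken by the slowest requests, not the typical one.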

Hopefully, this library will also help you out in case you need to take your machine learning models to production for real-time predictions.

