Chris, a top-rated data scientist at ACME Bank, supports one of the bank's key corporate initiatives: increasing the number of bank services used per customer. He has created multiple predictive models against a set of sample data, using a proven open source predictive algorithm. The models capture customers’ transactional behavior and, more importantly, the potential for cross-selling other bank services to them. He trained and validated the models and is confident that they will achieve their goals once they score every customer record in production. Mission accomplished? On the contrary, Chris has a huge dilemma. So far, he has done all of his model creation work on his own small system, using a limited amount of sample data. He cannot operationalize his models on that system, because exporting all the production data, which could easily be hundreds of millions of records, would take too long. More importantly, the system cannot scale to the level required to score the various models rapidly, securely, and consistently. So what can he do to make sure his hard work is not wasted?
How does Vantage provide the ability to operationalize your predictive models?
Does Chris’ dilemma sound familiar to you? A great number of companies that have ventured into advanced analytics are currently stuck, because they have not been able to gain business value from the predictive models their data science teams produced. They are struggling to get the answers that would propel their business to the next level. Teradata Vantage customers have not been immune to this, which is why Vantage already offers the ability to operationalize R and Python models in Vantage. That was just the tip of the iceberg, though. Teradata now expands this capability with the Bring Your Own Model (BYOM) feature, which allows models in standard interchange formats (PMML, ONNX, and MLeap), as well as proprietary formats like H2O MOJO, to be scored in Vantage via an easy-to-use BYOM Predict function. The new Vantage BYOM feature allows data scientists and data engineers like Chris to overcome this common dilemma and finally operationalize all their predictive models.
How does the BYOM Predict function work?
The BYOM Predict function provides all the advantages that Vantage is known to deliver: scalability, ease of use, and flexibility. While data science teams may prefer to use external tools, whether open source or commercial, to create machine learning models, the limitations of those tools and of company processes often prevent them from actually putting the models to use. The BYOM Predict function closes this gap in an easy two-step process. First, export the predictive model to a standard model interchange format such as PMML. Second, import the PMML-formatted model into Vantage. Once this is done, the BYOM Predict function is invoked with a SQL query or, if you prefer, from a Python or R client via the Vantage client libraries for Python (teradataml) or R (tdplyr). You do not need to code scoring algorithms on your own; the Vantage function takes care of it all. The input data and the resulting predictions are stored in Vantage. Once the predictions are generated, they are readily accessible to business units and, more importantly, actionable through automated applications or frontline sales representatives.
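To make the scoring step more concrete, here is a minimal sketch of what the SQL call might look like once a PMML model has been loaded into a Vantage table. It only assembles the query text and does not connect to Vantage. Everything named here is an assumption for illustration: the table names (`customer_features`, `byom_models`), the model identifier, the `model_id` column, and the `mldb` database where the function is assumed to be installed. The `PMMLPredict ... ON ... DIMENSION ... USING Accumulate(...)` shape reflects the general pattern of BYOM scoring queries as commonly documented; check the function reference for your Vantage release before relying on it.

```python
def build_pmml_predict_sql(input_table, model_table, model_id, id_column,
                           function_db="mldb"):
    """Return a SQL string that scores input_table with a stored PMML model.

    All identifiers are hypothetical. The query shape (input table as the
    first ON clause, the model row as a DIMENSION input, Accumulate to carry
    the key column through) is an assumption to verify against the Vantage
    BYOM documentation.
    """
    # Escape single quotes so the model id is a valid SQL string literal.
    safe_model_id = model_id.replace("'", "''")
    return (
        f"SELECT * FROM {function_db}.PMMLPredict(\n"
        f"    ON {input_table}\n"
        f"    ON (SELECT * FROM {model_table} "
        f"WHERE model_id = '{safe_model_id}') DIMENSION\n"
        f"    USING Accumulate('{id_column}')\n"
        f") AS scored;"
    )


if __name__ == "__main__":
    # Hypothetical names for Chris' cross-sell scenario.
    print(build_pmml_predict_sql(
        input_table="customer_features",
        model_table="byom_models",
        model_id="cross_sell_v1",
        id_column="customer_id",
    ))
```

The resulting statement could then be run from any SQL client connected to Vantage. Because the model is selected by its identifier, scoring a different model against the same customer table would only require changing `model_id`.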
What business values and outcomes are realized?
The BYOM Predict function provides operationalization without disrupting the data scientists’ current activities. They can continue to use their preferred modeling software and platform, which leads to faster time to value. It’s a big win for the organization: not only is business value achieved, it is also repeatable and scalable. The Vantage BYOM Predict function is capable of scoring multiple predictive models concurrently, in a performant manner that no other platform can match. These operational attributes should not be overlooked, because the process does not end when a score is generated. As business outcomes are realized, there will be increasing demand for more advanced analytic models as well as refinement of existing ones. Without Vantage, it is very difficult to execute such iterations at scale and in an agile manner. With these benefits in mind, Vantage BYOM is a win-win solution for the many data scientists out there facing the same dilemma as Chris.
Jefferson is Teradata’s Advanced Analytics Product Manager responsible for Vantage Open Analytics, which includes Vantage Client Analytics Libraries for Python and R, Script Table Operator R, Python In-database analytics and Bring Your Own Model Scoring Functions. Jefferson’s vast experience in database and analytics was cultivated over the years, starting at a company doing cloud and text analysis before they became industry buzzwords. In his spare time, he enjoys food trips to different places, checking out history books and museums, and occasionally taking time to binge-watch quality movies.
View all posts by Jefferson Uy