Connect Art of Business with Science of Data

The amount of data generated is increasing exponentially owing to the convergence of technological forces such as cloud, social platforms, IoT and mobile. Conventional methods of managing, transforming and analyzing the data available both internally (within the enterprise) and externally (social, image, unstructured, etc.) are no longer sufficient. What is required is a transformative approach that can be deployed in complex AI pipelines, Master Data Management systems and big data platforms such as Data Lakes, one that applies reasoning to learning and delivers knowledge-based analytics. Capabilities in Statistical AI (ML and related techniques) are now table stakes. What we need is more than that: AI Plus, or Smart AI, which also leverages Semantics (Symbolic AI) to drive intelligence by adding a layer of reasoning over and above the regular ML techniques.

Business users find it challenging to understand and gain confidence in the predictions that machine learning computations deliver. They do not have an integrated decision-making system that combines BI, real-time alerts and predictions with what-if analysis so that they can take business decisions at the right time. They need platforms that host ML models for live scoring and evaluation and that integrate knowledge graphs for informed decision making: AI Dashboards. The platforms need to take data from various sources (relational, NoSQL and graph databases, as well as files and streams in formats such as comma- or tab-delimited text, XML, JSON and semantic RDF triples) and support harmonization of data with disparate schemas. The harmonized data feeds AI pipelines that deliver predictions using machine learning, knowledge extracted from text and unstructured data using natural language processing, and social, geospatial and network security analytics using graph analysis. Harnessing deep insights from the data also requires interpreting and analyzing it using subject-specific schemas of relationships and meaning (ontologies). Domain ontologies can be loaded into the same platform as just another data set, and the runtime applies the ontology to the source data, yielding rich insights along with relationship and context information that is otherwise buried in the data, in order to develop Knowledge Graphs.



Technical Collaboration with Franz


OAKLAND, Calif. — January 31, 2017 — Franz Inc., an early innovator in Artificial Intelligence (AI) and leading supplier of Semantic Graph Database technology and Lead Semantics, a Big Data Analytics start-up delivering cloud based Advanced Analytics and Data Science, today announced their partnership to deliver Smart-Data Integrated Data Science.

"The integration of Lead Semantics' Hiddime and AllegroGraph delivers new types of analytic outcomes and insights to provide 'Smart Data' for the Enterprise", said Dr. Jans Aasman, CEO, Franz Inc. "AllegroGraph will bring knowledge integration to the Hiddime platform for one of a kind data science capabilities that will deliver unique value for each user."

Emerging Analytics Startups in India


Lead Semantics, a new-generation AI company integrating knowledge bases and machine learning, develops products and services targeting 'Semantics Integrated Data Science' for both Enterprise and Cloud environments. Its product is a first-of-its-kind Semantic Cloud-BI tool that enables advanced analytics on the cloud. As an 'Interactive Discovery and Exploratory Analytics' tool (IDEA tool) in the cloud, it enables business end users with little IT knowledge to deliver routine to sophisticated BI and advanced analytics with just point-and-click interactions in the browser.

Their data science teams deliver NLP, Graph, Machine Learning and Semantic Technology projects that also include integration of complex Big Data engineering pipelines feeding into BI data warehouses and Smart Data Lakes. Their pedigree and experience uniquely position them to take advantage of the recent surge in interest in Smart Data and to deliver the cutting-edge data science that enterprises are striving for globally.


In a short span of two years, we have delivered cutting-edge solutions to our growing base of global clientele. Our solutions cover a wide spectrum, from "old-fashioned" Predictive Analytics, with a heavy dose of ML techniques, to Text Analytics and Advanced Text Analytics, where we leverage NLP and Semantic Graph technologies along with ML. We have always focused on solving the business problem, as we believe that technology and technique are only a means to an end. We have helped deliver a Master Data Management solution in a unique way by leveraging Semantic Graph technology. Whenever a customer approaches us for a prediction, we take a wing-to-wing perspective and also advise on whether the customer needs to build a Data Lake, as that becomes the seed for all of the Advanced Analytics that can be delivered for the business.

We have worked across a range of business domains, including Banking & Financial, Insurance, Healthcare, Retail, Consumer Goods, Government, Education, Energy and Digital Native Organizations.

We believe in working hand-in-hand with our customers and co-creating solutions that they need. Below is a sample of a few solutions.

Predictive Analytics

Loan Recovery Probability

Our Probability of Recovery Model helped one of the largest microfinance lending firms in India save INR 10 Cr every quarter by enabling it to formulate an optimum recovery strategy. Its loan portfolio consists of 15 million loans given to 7.8 million clients, where each client can have multiple loans. The total value of the portfolio is about INR 138 billion (13,800 Cr). All the loans are issued with no security or collateral but can be issued to fund specific income-generating assets. Loan repayments are made weekly with a fixed installment (EWI) amount.

Workforce Optimization

One of the most significant costs in the retail industry is labor, so optimizing its use is very important for saving costs. We provided a solution to a convenience store chain operator in the US that optimizes its workforce based on demand. Our accurate predictions of the hourly number of transactions ensure that there are neither too many nor too few people handling checkouts at the cash registers.

Livelihood Recommendations

Governments disburse money for several welfare schemes, but they can’t always know whether such schemes have the desired effect. The AP government has a program that gives loans to self-help groups of women in rural areas. The purpose of this program is to help provide sustainable income to these women and their families so that they can be financially independent. They approached us to address a specific concern about the likely livelihoods that these women can pursue to generate a desired income. A blanket recommendation won’t work, since there will be differences in geographies, skills, opportunities, demographics, etc. So we dug through heaps of data, which was in the untidiest form, and detected patterns in the historical data that led to region-wise recommendations of livelihoods based on the various factors.

Probability of Default
Kitchen Sales Prediction
Food Items Sales Prediction
Hospital Readmission Prediction
Retail Sales Prediction
Students’ Grade Prediction
Anomaly Detection
Fraud Detection

Text Analytics

Text Extraction

Our customer faced the challenging task of collecting, extracting and analyzing content that arrived as scanned documents in PDF, PNG and JPEG formats. Many of the documents were POS receipts. The customer wanted to extract the relevant fields of data from these forms, combine them with other data sets and drive analytics that could help in making business decisions. Since we were already helping the customer deliver BI and AI analytics, we offered to automate this process. We currently process 1,000 invoices per day and can scale up to 20,000 invoices per day. Previously this was a manual process that took 20 man-days; now it is a compute process of around two hours with no manual intervention.

Sentiment Analysis

Managing social media content, and how netizens receive it, plays a large part in a company’s marketing strategy. One of the largest auto parts manufacturers sought our help for this very purpose. Although obtaining adequate data was a challenge, we devised a solution that scraped web data, processed it and analyzed sentiment using our NLP expertise. Our solution provided an effective content management strategy, not only by predicting how a specific piece of content will be received but also by helping design the content so that it is received as desired.

Keyword Extraction; Topic Modeling; Document Similarity

In a large organization, many documents are created and stored in several places. Some of them may be duplicates or different versions of the same document. One of the largest petroleum companies in the Middle East faced this problem and asked for a comprehensive solution. We devised a solution that extracts the appropriate keywords from the documents, classifies the documents and identifies any duplicates. This helps them manage their text content much more efficiently.
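
The keyword-extraction and duplicate-identification steps described above can be illustrated with a minimal sketch. It is not the delivered solution: the stop-word list, the Jaccard similarity measure, the 0.8 threshold and the sample file names are all assumptions chosen for illustration.

```python
import re
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


def keywords(text):
    """Return the set of lower-cased tokens that are not stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_near_duplicates(docs, threshold=0.8):
    """Return pairs of document names whose keyword sets overlap heavily."""
    keys = {name: keywords(text) for name, text in docs.items()}
    return [
        (x, y)
        for x, y in combinations(sorted(keys), 2)
        if jaccard(keys[x], keys[y]) >= threshold
    ]


# Hypothetical sample documents.
sample_docs = {
    "a.txt": "Pipeline inspection report for the northern field",
    "b.txt": "Inspection report for the northern field pipeline",
    "c.txt": "Quarterly budget summary",
}
print(find_near_duplicates(sample_docs))
```

Comparing every pair of documents is quadratic in the number of documents, so a large repository would need an indexing or hashing step first; the sketch only shows the comparison itself.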

Document Search
Extraction of Concepts, Entities, Relationships

IoT (Internet of Things)

Predictive Maintenance

The energy industry’s global leader in wind power solutions wanted to use existing domain-specific knowledge about windmills for better condition monitoring, predictive maintenance and prediction of fault patterns well in advance of the fault event. The challenges were many: a) huge volumes of signal data from various channels or sensors; b) a continuous feed of vibration data from sensors (IIoT); c) about 25,600 observations recorded per second for each sensor (Big Data); and d) the need to find various patterns in the signal data. We solved the problem using both Statistical AI (ML) and Symbolic AI (Semantics). Leveraging Machine Learning and signal transformation methods, we spotted various patterns in the signal data in both the frequency and amplitude domains. Leveraging Semantics, we introduced domain knowledge about wind turbines, including information about thresholds and the physics of the system. This knowledge was connected to the normal signal data to help us draw conclusions when detecting various failure modes and to monitor the overall health of the system.

NILM, Energy Consumption Management

Graph Analytics

Graph Algo

We delivered a project to a hospital in the US that holds a large volume of patient diagnosis data in a triple store. Each time a patient visits the hospital for a diagnosis, a new ID is generated for that patient. The problem was therefore to group all the diagnoses belonging to the same patient even though every visit carries a different ID. We solved it using the Pregel algorithm, the patients’ metadata, Hadoop, Spark and Spark GraphX. Pregel is computationally intensive because its basic idea is to run the same computation on every vertex of a graph. The algorithm works in iterations: on every iteration it processes the incoming messages for a vertex, can update the vertex’s value and can send messages to other vertices. Execution stops when no vertex sends a message during an iteration.
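
The vertex-centric iteration described above can be sketched in plain Python as a connected-components computation. This is an illustration only, not the Spark GraphX code used in the project. It assumes that each visit ID is a vertex and that an edge links two visits whose patient metadata match; how such edges are derived is not covered here, and all names are hypothetical.

```python
from collections import defaultdict


def pregel_connected_components(vertices, edges):
    """Label every vertex with the smallest vertex ID in its component.

    Returns (labels, supersteps). Each superstep processes the messages
    sent in the previous one, as in the Pregel model.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Initial vertex value: its own ID.
    value = {v: v for v in vertices}

    # Superstep 0: every vertex sends its value to its neighbours.
    inbox = defaultdict(list)
    for v in vertices:
        for n in neighbors[v]:
            inbox[n].append(value[v])

    supersteps = 0
    while inbox:  # halt when no vertex sent a message in the last superstep
        supersteps += 1
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            smallest = min(messages)
            if smallest < value[v]:
                # Update the vertex value and notify the neighbours.
                value[v] = smallest
                for n in neighbors[v]:
                    outbox[n].append(smallest)
        inbox = outbox
    return value, supersteps


def group_visits(labels):
    """Group visit IDs that ended up with the same component label."""
    groups = defaultdict(list)
    for visit, label in labels.items():
        groups[label].append(visit)
    return {label: sorted(members) for label, members in groups.items()}


# Hypothetical visit IDs; edges link visits assumed to share patient metadata.
visits = ["v1", "v2", "v3", "v4", "v5"]
same_patient = [("v1", "v2"), ("v2", "v3"), ("v4", "v5")]
labels, steps = pregel_connected_components(visits, same_patient)
print(group_visits(labels))
```

Each group in the result corresponds to one patient under the stated assumption. In a distributed engine the per-vertex work inside a superstep runs in parallel across the cluster, which is what makes the approach practical on large graphs.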


We delivered a project to an event management company that conducts events for occasions such as New Year, Christmas and Dussehra. They conducted a three-day New Year event at Ramoji Film City, Hyderabad, and wanted to send promotions to a targeted audience likely to attend, so they targeted the Twitter followers of their attendees. They needed to find the Twitter followers of those who attended on the 29th, specifically followers who had not attended on either the 29th or the 30th but were likely to attend the next day’s event, and target those people with promotions. We used Twitter data, ML and a graph database to identify the audience to target.

Knowledge Graph & Cognitive Analytics

Intelligent Search & Recommendation Engine

We built an Intelligent Search and Recommendation Engine that delivers personalized recommendations for a specific customer based on similarity of tastes, patterns and product characteristics with other customers. We leveraged a Semantic Graph database, which is extremely useful for integrating varied data (enterprise, social) and thereby enables faster real-time queries, an essential element of a search and recommendation engine.

Intelligent Operations

Knowledge Graphs for Intelligent Operations: pertinent knowledge, when combined with operational data (transactions, events, etc.), enriches the business context and enhances situational awareness and line of sight, leading to better decision making and efficient operations. Knowledge is embedded in the relationships between business entities and in the rules governing those entities. Operational data is dynamic and real-time, whereas organizational knowledge changes more slowly: people, places, things, occasions (times), reference data, rules, etc.

Here is a scenario: if an ice machine fails in a store, how can we arrange ice so that sales in that store are not affected? To do that, we need to know whom to contact, which store is nearest, and who the transport manager and district manager are. This is where we combine the organizational knowledge graph with events data (the ice-machine failure detection event is operational data) to arrange ice for that store. This is difficult to achieve in a regular RDBMS but straightforward with a Semantic Graph Database, where updates and additions can be made to the existing semantic knowledge without any changes to the underlying schema.
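
The ice-machine scenario can be sketched with a small in-memory set of subject-predicate-object triples. This is a hedged illustration rather than a Semantic Graph Database query: the store IDs, people, predicate names and event format are all hypothetical.

```python
# Organizational knowledge as (subject, predicate, object) triples.
# Every identifier below is hypothetical and exists only for illustration.
TRIPLES = {
    ("store_12", "nearestStore", "store_15"),
    ("store_12", "inDistrict", "district_north"),
    ("district_north", "districtManager", "alice"),
    ("district_north", "transportManager", "bob"),
    ("store_15", "hasEquipment", "ice_machine"),
}


def objects(triples, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def plan_ice_response(triples, event):
    """Combine an operational event with organizational knowledge."""
    store = event["store"]
    plan = {"contacts": [], "source_stores": []}
    # Whom to contact: the managers of the district the store belongs to.
    for district in objects(triples, store, "inDistrict"):
        plan["contacts"] += objects(triples, district, "districtManager")
        plan["contacts"] += objects(triples, district, "transportManager")
    # Where to get ice: nearby stores that have a working ice machine on record.
    for other in objects(triples, store, "nearestStore"):
        if "ice_machine" in objects(triples, other, "hasEquipment"):
            plan["source_stores"].append(other)
    return plan


# The failure event is operational data; the triples are slow-changing knowledge.
failure_event = {"type": "ice_machine_failure", "store": "store_12"}
print(plan_ice_response(TRIPLES, failure_event))

# New knowledge is added as more triples, with no schema change.
extended = TRIPLES | {
    ("store_12", "nearestStore", "store_18"),
    ("store_18", "hasEquipment", "ice_machine"),
}
print(plan_ice_response(extended, failure_event))
```

The second call shows the point made above about schema flexibility: registering another nearby store only adds triples, and the same lookup picks it up unchanged.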

Smart Data Lake

A US-based retailer wanted to access the data from all of its stores in one place, without connecting to each store, and then drive analytics on the cloud in near real time based on the data collected. This meant that we needed to connect some 70 stores, each with around 300 tables, 100 JSON files, etc. We built a data lake on the cloud with an in-memory database that scales up to 1 TB of data on a single node, which enabled querying that is 50x faster than traditional databases. We used a cluster of four nodes running the in-memory database, Python, etc., and created an application to monitor the cluster’s health. An Alert Server integrated into the data platform enables high-value alerts over business conditions coupled with operational events (e.g. IoT and external events). A Web App server integrated into the data platform enables the delivery of rich, interactive, real-time web and mobile applications.

Master Data Management

This is a client of our partner, Franz: a multinational and world leader in corrective lenses. Its catalogs contain hundreds of thousands of variations of stock and finished lens products, which are offered at more than 500 locations worldwide. After trying traditional RDB/MDM technologies and then Neo4j, which could not scale beyond a POC, the client called in Franz. Franz deployed AllegroGraph, the world’s best Semantic Graph database. One of the applications of the Master Data Management system at the client was to fulfill incoming orders. Currently, the Master Data Management system is called more than 2 million times per day and provides a sub-second (0.04 s) response for every query. Franz and Lead Semantics bring to the table this successful, proven and unique technology to solve problems such as Customer 360 and One Golden Master that were hitherto not easily solved.