Hidden Debt of Machine Learning Platforms

With advances in artificial intelligence and machine learning, from smart assistants like Alexa and Siri to tagging our friends on social networks, it is not surprising that every aspect of our lives is being touched and revolutionized by this new technology.

It is also reshaping the enterprise space and driving a transformation in the manufacturing sector, paving the way for the Industry 4.0 revolution.

However, creating and maintaining an AI/ML framework is fundamentally different from building traditional software solutions. As D. Sculley and his co-authors point out in their seminal paper ‘Hidden Technical Debt in Machine Learning Systems’ [i], there is a technical price an enterprise has to pay to implement and maintain AI/ML systems before it can reap their rewards.

Source: Hidden Technical Debt in Machine Learning Systems, D. Sculley et al. [i]

In this white paper, we go over the system-level thinking needed to implement a functional ML system within an enterprise environment.

In conventional software development, there is an increasing focus on module isolation and code segmentation. ML systems, however, depend intricately on their input signals, feature lists and so on, which makes it harder to follow conventional software models. For example, if a system has N features, removing one of them could affect the entire system, including its weights and connectivity, and could require recalibrating everything.
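The feature-removal effect described above can be illustrated with a very small least-squares fit. This is only a sketch under stated assumptions: the data, the two features and the helper names `fit_one` and `fit_two` are hypothetical and chosen for illustration, not taken from any real system. It shows that when two correlated features are used and one is dropped, the weight of the remaining feature shifts.

```python
# Hypothetical toy data: two correlated features and a target built as
# y = 2 * x1 + 3 * x2 (all values are illustrative assumptions).
x1 = [1, 2, 3, 4]
x2 = [1, 1, 2, 2]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]


def fit_one(x, y):
    """Least-squares weight for y ~ w * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fit_two(x1, x2, y):
    """Least-squares weights for y ~ w1 * x1 + w2 * x2 (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


w_full = fit_two(x1, x2, y)   # weights with both features present
w_reduced = fit_one(x1, y)    # weight of x1 after x2 is removed

print("with both features:", w_full)       # x1 weight is 2.0
print("after removing x2: ", w_reduced)    # x1 weight becomes 3.7
```

Because the two features are correlated, the remaining feature absorbs part of the removed one's contribution, so the model has to be refit rather than simply trimmed.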

Similarly, adding a new feature could change the dynamics of the system. To create and maintain a robust AI/ML-based system, one possible option is therefore to divide the system into multiple stages with well-defined boundaries and interfaces.

Lastly, the framework needs to integrate with the enterprise's existing frameworks and data lakes and needs to be DevOps friendly, so that it becomes a sustainable AI/ML solution and not just a point solution for one-time use.

The following flow diagram depicts one such possible implementation, in which the entire flow is divided into four stages.

Figure 1: Potential four-stage implementation of an industrial AI/ML system

  • Stage 1 - Sensor Network: The first stage of an industrial AI/ML system could be a sensor ingest/integration stage, which collects data from different sensors or sensor networks. These sensors could include microphones, cameras, vibration sensors etc.
  • Stage 2 - Data Lakes: As highlighted in D. Sculley's paper, 90% of the effort in a machine learning system goes into cleaning and filtering input signals, performing ETL operations and making the data usable for downstream systems. So the next stage of the system processes the sensor data, which might arrive in different formats, such as audio files or time series, and creates a well-defined data lake for the next stage of processing.
  • Stage 3 - Data Labeling, Annotation and Feature Extraction: This step is needed during the model definition or training phase, and during model updates. The input data needs to be labeled and the right feature sets need to be defined, so that the right ML algorithms can be developed.
    • Once the models are well defined and trained, another critical step is to decide the timing and the potential trigger points at which the model should be updated.
  • Stage 4 - Predictive Analytics: This is the outcome of all the hard work. At this stage, the data and the trained ML model can perform predictive analytics to provide valuable insights.
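The idea of four stages with well-defined boundaries can be sketched in code roughly as follows. This is a minimal illustration, not the mSense implementation: every class and function name here (`SensorReading`, `CleanRecord`, `FeatureVector`, `ingest`, `to_data_lake`, `extract_features`, `predict`) is a hypothetical placeholder, and the "model" is a simple threshold standing in for a trained ML model. The point is only that each stage consumes the previous stage's output type and nothing else.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SensorReading:
    """Stage 1 output: raw samples, where None marks a dropped sample."""
    sensor_id: str
    samples: List[Optional[float]]


@dataclass(frozen=True)
class CleanRecord:
    """Stage 2 output: cleaned samples ready for the data lake."""
    sensor_id: str
    samples: List[float]


@dataclass(frozen=True)
class FeatureVector:
    """Stage 3 output: features derived from one clean record."""
    sensor_id: str
    mean_abs: float
    peak: float


def ingest(sensor_id: str, raw: List[Optional[float]]) -> SensorReading:
    """Stage 1: collect data from a sensor."""
    return SensorReading(sensor_id, list(raw))


def to_data_lake(reading: SensorReading) -> CleanRecord:
    """Stage 2: filter out unusable samples (a stand-in for real ETL)."""
    clean = [s for s in reading.samples if s is not None]
    if not clean:
        raise ValueError("no usable samples for " + reading.sensor_id)
    return CleanRecord(reading.sensor_id, clean)


def extract_features(record: CleanRecord) -> FeatureVector:
    """Stage 3: turn a clean record into a fixed set of features."""
    magnitudes = [abs(s) for s in record.samples]
    return FeatureVector(record.sensor_id,
                         sum(magnitudes) / len(magnitudes),
                         max(magnitudes))


def predict(features: FeatureVector, peak_limit: float = 1.0) -> str:
    """Stage 4: placeholder threshold rule in place of a trained model."""
    return "alert" if features.peak > peak_limit else "normal"


# Usage: each stage only sees the previous stage's output.
reading = ingest("vibration-01", [0.2, None, -1.5, 0.4])
features = extract_features(to_data_lake(reading))
result = predict(features)
print(features, result)
```

With interfaces like these, a change inside one stage (for example a new cleaning rule) does not force changes in the others as long as the output type stays the same.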

In upcoming blog posts, we will delve into these stages in detail and show how the mSense implementation leverages this structured approach to audio classification for predictive analytics. We will also highlight the DevOps friendliness of the mSense platform, which makes it easier to integrate into the enterprise framework.

In the meantime, please visit us at www.msense.ai to get more details.

[i] Hidden Technical Debt in Machine Learning Systems, D. Sculley et al., https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

Artificial Intelligence: Driving the New Industrial Revolution (Industry 4.0)

With advancements in new technologies such as the Internet of Things (IoT), Cloud Computing and Machine Learning (ML), the modern manufacturing process has come a long way during the last few years. A new paradigm has emerged, the Industry 4.0 revolution, in which the combination of sensors, edge data processing and ML algorithms allows manufacturing plants not only to digitize, automate, and optimize the workflow, but also to detect mechanical issues early.

Companies, as shown in Figure 1 [i], are trying to move up the value chain by exploiting the top 5 digital technologies, among them Cloud Computing, Network Control Systems, Equipment with Embedded Sensor and Control Systems, and AI. However, they struggle with how best to utilize their existing data, pull together new sensor data, and combine it with software tools and talent to deliver on this transformative business potential.

Figure 1: Production-related technology leads the fourth industrial revolution [i]

AI is not a very good organizational fit

A recent PWC report [ii] highlighted that AI requires new organizational thinking. Customarily, Chief Information Officers lead most digital transformation efforts. However, as mentioned above, AI is a convergence of data, sensor physics, distributed networks/IoT, and ML, and it spans the manufacturing enterprise from business units to manufacturing operations, data specialists, and IT.

As testimony to this new organizational thinking, Figure 2 shows the varied approaches taken from 1,000 executives' 2019 AI plans. The most common option is to create an internal center of excellence, which might not be the right approach considering the scarcity of talent and expertise in this area.

Source: PWC 2019 AI Predictions

Figure 2: The many 2019 AI organizational approaches

You do not want a research experiment

Is an AI approach the right one to solve your problem? Most enterprises do not want to spend the time and money to run a research experiment to answer this question. Instead, what is needed early on is a way to know what results AI can achieve and whether all the pieces, such as sufficient data, are available.

To complicate this question, AI often involves a trade-off between the amount of labeled data and the accuracy of the result, expressed for example as a classification confidence percentage. Understanding whether a confidence score of 60% or 70% provides sufficient accuracy is paramount to harvesting the early potential benefits of AI.
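One way to explore the 60% versus 70% question above is to compare how many predictions survive each confidence cutoff and how accurate the surviving ones are. The sketch below assumes a small set of made-up (confidence, was-the-prediction-correct) pairs; the data and the function name `evaluate_at_threshold` are hypothetical and serve only to show the calculation.

```python
from typing import List, Optional, Tuple

# Hypothetical validation results: (model confidence, prediction was correct).
predictions: List[Tuple[float, bool]] = [
    (0.95, True), (0.85, True), (0.75, True), (0.72, False),
    (0.65, True), (0.62, False), (0.55, False), (0.40, False),
]


def evaluate_at_threshold(
    results: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, Optional[float]]:
    """Return (coverage, accuracy) for predictions at or above the threshold.

    coverage: fraction of all predictions that are accepted.
    accuracy: fraction of accepted predictions that are correct,
              or None when nothing is accepted.
    """
    accepted = [ok for conf, ok in results if conf >= threshold]
    coverage = len(accepted) / len(results)
    accuracy = sum(accepted) / len(accepted) if accepted else None
    return coverage, accuracy


for threshold in (0.6, 0.7):
    coverage, accuracy = evaluate_at_threshold(predictions, threshold)
    print(f"threshold={threshold}: coverage={coverage:.2f}, accuracy={accuracy:.2f}")
```

On this toy data, raising the cutoff from 60% to 70% improves the accuracy of accepted predictions but leaves fewer predictions to act on, which is the kind of trade-off an enterprise needs to weigh early.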

The enterprise AI workflow

Getting the right people to work together at the right time will realize the full potential of AI. An enterprise's areas of expertise must be leveraged to integrate data, ML algorithms, and IT infrastructure into an end-to-end solution, from manufacturing data such as sensor readings, to the appropriate ML algorithm, to a tangible business ROI.

What is needed is an enterprise AI software workflow that allows for iterative collaboration and an appreciation of each step in the AI journey. A practical enterprise workflow will automatically handle most of the tedious steps in the process while exposing critical steps, such as feature engineering, for review and implementation.

Capturing manufacturing expertise, such as factory floor inspections, and converting it into labeled data for ML training to improve assembly run rates is just one example of how an integrated end-to-end AI workflow can realize the full potential of AI to transform businesses.

As described in subsequent posts, mSense offers a customer-proven, practical enterprise AI workflow solution that accommodates the enterprise's current organization and talent.


[ii] https://www.pwc.com/us/en/services/consulting/library/artificial-intelligence-predictions-2019