Anyone who's worked in tech knows the sting of launching something, only to abandon it for the next project. This issue is particularly acute with AI and machine learning (ML) projects, where continuous improvement is possible without a major software release, provided companies make the right investments. For years, one of the biggest complaints I’ve heard from AI/ML teams is that they can’t make their products better because the right monitoring tools aren't available.
AI might be the new hot thing, but as Cassie Kozyrkov wisely noted on LinkedIn, "Like most things in life, the best approach in enterprise is to start not with the technology, but with the business problem you’re trying to solve. AI may be the solution you need, but it should be what you try after traditional programming fails." AI makes sense when the value is high enough to justify the investment, complexity, and reduced control that come with it.
If you're committed to making AI work for your business, you're probably aiming to create what Andrew Ng calls a "virtuous cycle," or an ML flywheel: a self-perpetuating cycle that drives growth. More customers generate more data, which improves the experience, attracting even more customers. It sounds great but is incredibly challenging to execute, as I’ve covered in a previous post.
To succeed, you need to complete the full turn of the flywheel, and many companies never get there because they don't invest in monitoring. Without robust monitoring tools, it’s nearly impossible to analyze user interactions, track features, or understand the underlying reasons for user actions. Without this investment, your AI project will never realize its full potential.
The Role of Monitoring Tools
Machine learning model monitoring systems are designed to provide visibility into how models perform in production, enabling teams to address weaknesses before they escalate into significant issues. Without them, you are hoping for the best and betting that customers will submit feedback.
This proactive approach contrasts sharply with the reactive strategies many teams employ, where dashboarding and troubleshooting only occur in response to customer complaints.
Imagine your tool is designed to help clients optimize their pricing strategies. It can make recommendations, but how do you know if these recommendations are effective? How well are they working? And when does a change in the market require you to adjust your approach? This is where monitoring becomes essential. By tracking how performance changes as production data drifts, and by checking that ML-driven recommendations continue to outperform traditional methods, you can maintain the reliability and effectiveness of your solutions.
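To make the pricing example a little more concrete, here is a minimal Python sketch of one such check: comparing revenue under ML-recommended prices against a group that still uses the traditional method. The function names, the idea of keeping a baseline group, and the 2% threshold are all illustrative assumptions on my part, not a prescribed method.

```python
from statistics import mean


def uplift_over_baseline(model_revenue, baseline_revenue):
    """Relative uplift in mean revenue: ML-recommended prices vs. a baseline.

    Both arguments are hypothetical lists of per-client (or per-period)
    revenue figures; the baseline group keeps using the traditional method.
    """
    baseline_mean = mean(baseline_revenue)
    if baseline_mean == 0:
        raise ValueError("baseline mean is zero, so uplift is undefined")
    return (mean(model_revenue) - baseline_mean) / baseline_mean


def needs_review(model_revenue, baseline_revenue, min_uplift=0.02):
    """Flag the model when its uplift drops below an assumed 2% threshold."""
    return uplift_over_baseline(model_revenue, baseline_revenue) < min_uplift


# Made-up numbers, for illustration only.
model = [105, 110, 98, 107]
baseline = [100, 100, 100, 100]
print(uplift_over_baseline(model, baseline))
print(needs_review(model, baseline))
```

Run on a schedule, a check like this tells you when the model has stopped earning its keep relative to the simpler approach. A real version would also need to account for sample size and noise before raising a flag.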
Failing to monitor these aspects can lead to several risks:
Poor Performance: Without continuous monitoring, models can provide incorrect or misleading information, leading to poor business decisions.
Brand Risk: Unmonitored recommendations can embarrass your brand. LinkedIn used to encourage me to make changes to my profile so I could increase my number of viewers by 1%, which didn’t make them look great.
Inconsistency: If your models are not monitored, their performance can degrade over time due to factors like data drift, leading to inconsistent outcomes.
Attrition: Without these investments, team members who are highly sought after in the market become disengaged and may leave.
Arthur AI integrates LLMs within their platform, particularly for financial institutions that must meet stringent regulatory requirements. These companies operate with a higher bar for accuracy and transparency, making monitoring essential to their success.
The risk of not addressing these issues is significant. Companies often launch first-version (v1) features without adequate data support, leading to a failure in closing feedback loops. This not only affects the reliability of results but also strains resources as teams are forced to troubleshoot preventable problems.
For a long time, deployment has been the final step in the software process. For ML/AI projects, deployment is just the beginning. There's substantial value to be gained from continuously tuning and optimizing model performance. Without this ongoing effort, companies risk not only the effectiveness of their AI solutions but also their overall resource management, including the detection of security breaches and anomalies through predictive analytics.
Putting AI into a product requires a shift in focus from merely understanding and improving user experience to ensuring robust monitoring and analysis. Without adequate tools, teams struggle to validate new ideas and manage technical debt, leading to incomplete or suboptimal solutions.
In ML, there is no "unit test and ship" approach. Monitoring performance necessitates long-term investments in infrastructure, data, talent, and ROI assessment. This approach differs significantly from the traditional "Agile" methodologies many organizations follow, highlighting the need for a culture shift towards embracing continuous monitoring and uncertainty management in ML.
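As a rough illustration of what monitoring after deployment can look like, the Python sketch below keeps a rolling window of prediction outcomes and raises a flag when accuracy in that window drops below a threshold. The class name, window size, and thresholds are hypothetical choices for the example, and it assumes you eventually learn the true outcome for each prediction.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent predictions (a simple sketch)."""

    def __init__(self, window=500, alert_below=0.80, min_samples=50):
        # deque(maxlen=...) silently drops the oldest outcome when full.
        self.outcomes = deque(maxlen=window)
        self.alert_below = alert_below
        self.min_samples = min_samples

    def record(self, predicted, actual):
        """Store whether a prediction matched the real-world outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """Alert only once there is enough data to trust the number."""
        if len(self.outcomes) < self.min_samples:
            return False
        return self.accuracy() < self.alert_below


# Made-up predictions and outcomes, for illustration only.
monitor = RollingAccuracyMonitor(window=4, alert_below=0.75, min_samples=4)
for predicted, actual in [(1, 1), (1, 1), (0, 1), (0, 1)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.should_alert())
```

The point is less the code than the habit: the model's performance is measured continuously against real outcomes, not once before launch.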
Why Are Companies Not Making the Investment?
Many companies shy away from investing in ML model monitoring for several reasons, despite the clear benefits:
Lack of Understanding: Many executives are unfamiliar with the nuances of ML monitoring and its importance. These leaders are often not accustomed to the type of investments required for continuous model improvement and risk management.
Perceived Complexity and Cost: The talent, data, infrastructure, and computing power required for effective ML monitoring are expensive. Adding another layer of tooling can feel overwhelming, especially when companies believe they already have some form of monitoring or feedback collection in place.
Prioritizing Short-Term Gains: Most tech organizations focus on projects that customers can see and interact with—features that marketing can promote and sales teams can sell. Monitoring doesn’t fit into this category, even though its absence can lead to the failure of these very features.
Delegating to IT Teams: Companies often delegate monitoring to IT departments, viewing it as an operational task rather than a strategic investment. This approach can lead to insufficient attention and resources being allocated to monitoring.
Overconfidence in Early Tests: There’s a common misconception that if a model works well in the lab, it will perform just as well in production.
Companies are operating in a tight economic environment, and it's perfectly reasonable to save money where you can in a way that your customers won’t feel. This doesn’t have to be a yes or no decision. You can manage your investment and improve your monitoring capability over time.
What Should I Know About Implementing ML Monitoring?
Focus on Product Goals: The primary objective of ML monitoring should align with the goals of your product, whether that’s decreasing costs, increasing engagement, boosting revenue, or optimizing outcomes within specific constraints like time, cost, or risk. Here are some ideas:
Ecommerce: Increase average order value by 10% through personalized recommendations, monitored by tracking recommendation accuracy, customer behavior data drift, and fairness across user segments.
Subscription Services: Reduce churn rate by 15% by targeting at-risk customers with personalized interventions, monitored by checking churn prediction accuracy, tracking customer behavior changes, and ensuring fairness in models.
Marketing: Improve lead conversion rate by 15% through targeted ads and personalized messaging, monitored by tracking lead scoring and segmentation model performance, customer response changes, and ensuring fairness in targeting.
It Doesn’t Need to Be Perfect: If you aren’t making highly impactful decisions regarding health, loans, credit, apartments, or similar things, you don’t need the best monitoring system. You need something that makes sense for your budget and scale. Over time, you can upgrade.
Evaluation Focus: Your evaluations should be directly tied to your product goals. Are your models providing value? If so, what kind? For newer companies, the focus may be lighter, but the principles remain the same—ensure that the monitoring is aligned with delivering tangible value.
Continuous Investment: Just as you wouldn’t ship a product and forget about it, you shouldn’t neglect your models after deployment. Continuous monitoring and improvement are essential. If a model is no longer delivering value, be prepared to sunset it rather than letting it deteriorate over time.
Challenges in Data Management: Effective ML monitoring requires robust data management practices, which many organizations struggle with. Without clean, well-organized data, monitoring efforts can be hampered, leading to unreliable insights.
Other Considerations: When implementing ML monitoring, pay attention to several critical factors:
Data Drift: Monitor for changes in the input data that could affect model performance.
Accuracy: Continuously evaluate the accuracy of your models against real-world outcomes.
Data Quality: Ensure that the data feeding your models is clean, consistent, and relevant.
Fairness: Assess whether your models are making equitable decisions across different user groups.
Estimated Accuracy: When real-world outcomes arrive late or not at all, track an estimate of prediction accuracy and adjust as needed.
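Two of the factors above, data drift and fairness, can be sketched in a few lines of Python. The first function computes a Population Stability Index (PSI), one common way to compare a feature's distribution in production against a reference sample. The second computes accuracy per user group, which is only one simple lens on fairness. The function names, the equal-width binning, and the record format are assumptions made for illustration.

```python
import math
from collections import defaultdict


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature."""
    if not current:
        raise ValueError("current sample is empty")
    lo, hi = min(reference), max(reference)
    if lo == hi:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp outliers into edge bins
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def accuracy_by_group(records):
    """Accuracy per group from (group, predicted, actual) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {group: hits[group] / totals[group] for group in totals}


# Made-up data, for illustration only.
reference = [float(i) for i in range(100)]
shifted = [v + 50 for v in reference]
print(population_stability_index(reference, shifted))
print(accuracy_by_group([("a", 1, 1), ("a", 0, 1), ("b", 1, 1), ("b", 1, 1)]))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on your data and how costly a missed drift is. Likewise, a large accuracy gap between groups is a prompt to investigate, not a verdict on its own.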
Carbon, a FinTech company in Lagos, uses machine learning to offer fast and convenient consumer loans. By processing 150,000 loan applications each month through DataRobot’s prediction API and tracking deployments with MLOps, Carbon can assess each customer’s likelihood of default and adjust lending terms accordingly. This approach also helps the company manage fraud and anti-money laundering risks, demonstrating the importance of robust ML monitoring in high-stakes industries.
Conclusion
While the initial costs and complexities may seem daunting, the long-term benefits of robust monitoring far outweigh the risks of neglect. Proper monitoring ensures that your AI and ML projects continue to deliver value, adapt to changing conditions, and avoid costly failures. As AI becomes increasingly integral to business operations, those who prioritize monitoring and continuous improvement will be best positioned to lead in their industries.
Resources
The definitive guide to AI / ML monitoring (Mona Labs)
Machine Learning Monitoring: why it matters and how to get it right (AI Infrastructure Alliance)