“Culture eats strategy for breakfast”: The argument for creating an analytics culture

The following is a guest blog from Craig Lukasik, one of my former MBA students from the Poole College of Management’s supply chain program. I was really pleased when he contacted me about this blog, and am delighted to feature it here. Thanks Craig!

Some managers ignore catchy phrases like this one, which was popularized by managers in the early 2000s and is often incorrectly attributed to the management guru Peter Drucker. Today’s executives may want to give this old adage a closer look because empirical data continues to underscore just how important culture is to the performance of a firm.

Management studies dating back to the 1950s reinforce culture as a key ingredient of successful organizations, and a recent study by North Carolina State University’s Dr. Handfield and Universidade Federal do Espirito Santo’s Dr. Marcos Paulo Valadares de Oliveira showed that an open analytics culture (OAC) strongly correlates with improved supply chain performance metrics. The study used a theoretical model composed of OAC, real-time analytics (RTA) and the analytical skill level of decision-making managers, and it found that the effect of an OAC was stronger than that of RTA alone. The study concludes that its…

“results support the research hypothesis that… Real-time analytic technologies are positively related to supply chain performance… An OAC mediates the positive relationship of real-time analytic technologies on supply chain performance, and… Analytics Skills enhances the positive relationship of Real-Time Analytic Technologies on supply chain performance.”

The takeaway is that investments in (a) RTA, (b) OAC and (c) analytical skills for decision-making managers are all crucial for achieving the best ROI and for ensuring a competitive position in the marketplace.

Supply Chain Metrics!?!? – Why should managers who work outside of “Supply Chain” care about this study?

Handfield & de Oliveira’s study uses well-established, industry-standard metrics for supply chain management. This study may be useful to all analytics-driven managers because studies comparing companies against each other often fail to find suitable metrics that provide an apples-to-apples comparison. True, 10-Ks and other standard financial filings provide metrics that are often used when comparing performance between companies.

However, the numbers in those filings are lagging, involve noise from other activities (e.g. write-offs and R&D), and, thus, do not offer the granularity and freshness of supply chain performance metrics. I believe Handfield & de Oliveira’s study provides insights for all industries where decision-making managers rely on analytical insights and that these metrics provide empirical evidence that underscores the importance of OAC, RTA & skills.

What is an “Open Analytics Culture” (OAC)?

For some readers, it might be easier to first define a “‘closed’ analytics culture”. A 1977 study states that

“…a ‘closed’ analytic culture is characterized by data being withheld in functional silos, a lack of standards preventing common interpretation of data and a “mechanistic” model that relies on the improved governance over diverse activities across the organization” (Galbraith, 1977).

Handfield & de Oliveira’s study defines an open analytics culture (OAC) as

“the extent to which organizational members (including top-level executives, middle managers and lower-level employees) make decisions based on the insights extracted from data and create shared data environments through mobile technologies for collaborative supply chain decision-making.”

Having worked with data as a software and data engineer since the early 1990s, I find these definitions to be accurate. In my experience with a variety of firms and industries that have large datasets that they need to transform from raw data (including unstructured data) to consumable datasets (via SQL or BI dashboards), symptoms that indicate a closed analytics culture include the following:

  • Difficulty and slowness that business stakeholders experience in trying to obtain the datasets or analytics they seek. They have to search in many systems within the firm or have to email and ask their IT contacts how to obtain the data they need.
  • Data copying – multiple copies of data are made as it passes hands; for instance, data is copied from a data analyst’s system (DA) to a business intelligence (BI) system to a Data Scientist’s workstation.
  • Inconsistent definitions of what data values mean, or inconsistent values across datasets for the same value. E.g. identifiers for a part number in a lookup (or “dimensional”) dataset may have outdated or inconsistent definitions.
  • Multiple departments implementing their own methods or using different tools for the processing of inbound data, “cleansing of data” (by applying data quality rules), applying machine learning techniques to datasets, or publishing predictive machine learning models.
  • Shadow IT – where non-IT departments leverage technology to build their own analytics solutions, charts or graphs, often introducing IT risk because of improper security measures or interpretation of the data. Shadow IT often happens when business stakeholders are frustrated with the slowness of obtaining the analytics they critically need.
  • Lack of or inconsistent enforcement of a unified governance model where end-users are assigned to groups that have permission to a subset of the firm’s data or analytics assets.
  • Lack of or inconsistent processes for the analytics pipeline where data is acquired by data engineers, it is processed by data analysts or engineers, and it is shaped into dashboards or datasets for consumption by decision-making managers or data scientists.

And, in my experience, an open analytics culture includes these properties:

  • Unified governance where permissions to datasets and analytics assets are applied to named groups that correspond to IT functions or business functions that align with a firm’s value chain.
  • Data lineage – where datasets and the columns within datasets can be traced back to their origin dataset within the analytics pipeline.
  • Unified platform – where data and analytics practitioners (i.e. data engineers, data analysts, business analysts, data scientists and data visualization experts) have access to the programming languages and libraries that are essential to their function within the analytics pipeline.
  • Fungibility and agility through an enterprise-wide analytics platform that allows data and analytics practitioners to move between projects and initiatives without having to endure the learning curve of yet another analytics
  • Consistent data quality and taxonomy where data values and column definitions are consistent across the
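To make the data lineage property above concrete, here is a minimal sketch in Python. It assumes lineage is recorded as a simple mapping from each (dataset, column) pair to the pairs it was derived from. The dataset and column names are hypothetical and do not come from any particular platform, which would normally capture this metadata automatically.

```python
# Hypothetical lineage map: each (dataset, column) points to the
# (dataset, column) pairs it was derived from. All names are illustrative.
LINEAGE = {
    ("dash_defect_rate", "defect_rate"): [
        ("curated_inspections", "defect_count"),
        ("curated_inspections", "unit_count"),
    ],
    ("curated_inspections", "defect_count"): [("raw_sensor_events", "payload")],
    ("curated_inspections", "unit_count"): [("raw_sensor_events", "payload")],
}


def trace_to_origin(dataset, column, lineage=LINEAGE):
    """Return the set of origin (dataset, column) pairs for a given column."""
    origins = set()
    seen = set()
    stack = [(dataset, column)]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited via another derivation path
        seen.add(node)
        parents = lineage.get(node, [])
        if parents:
            stack.extend(parents)  # keep walking upstream
        else:
            origins.add(node)  # no recorded parents: treat as an origin
    return origins


print(trace_to_origin("dash_defect_rate", "defect_rate"))
```

With a map like this, a stakeholder who questions a dashboard number can see which raw dataset it ultimately came from instead of emailing IT to find out.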

Nowadays, analytics practitioners are leveraging software-as-a-service (SaaS) solutions for piecemeal portions of the analytics pipeline, are building their own analytics tools and datasets on the cloud, or are leveraging analytics platform-as-a-service (PaaS) offerings.

With virtually unlimited computational resources available and a slew of cloud-based building blocks, what is holding firms back from achieving a high-functioning analytics culture that provides its decision-making managers with the real-time analytics (RTA) that they need to make the best decisions they can using the firm’s data?

A few hurdles get in the way of maximizing analytics potential. Let’s first look up to the clouds and understand how they have both helped and hindered firms in the process of creating analytical solutions and data-driven insights.

Cloudy analytics with a chance of raining costs

In 2006, Amazon (AMZN) started a revolution. It created Amazon Web Services (AWS) where it sells on-demand compute capacity and managed services, accessible via APIs or through a web console. The “cloud” eventually became a ubiquitous concept inside and outside of Information Technology circles. In the early days of the cloud, many firms in regulated industries (banking, healthcare, etc.) were reluctant to move their applications and analytical workloads to the cloud because of privacy and security concerns.

Over time, AWS built trust with its customers and users, and it added a number of certifications, including HIPAA and PCI-DSS, to address the concerns of CTOs of regulated firms. More clouds appeared in the virtual sky too, with Microsoft’s Azure and Google’s GCP becoming AWS’s biggest cloud contenders. Now, AWS has well over one million active users and has millions of servers deployed in multiple regions around the globe. The reluctance of banks, healthcare companies, and firms in other industries that worried about the risk of data breaches melted over time, and now those companies join other firms with their armies of cloud architects, cloud engineers, cloud administrators, and other cloud specialists feverishly deploying workloads and applications to the cloud.

Singin’ in the rain

“I’m singin’ in the rain, just singin’ in the rain. What a glorious feeling, I’m happy again.

I’m laughin’ at clouds, so dark up above.

The sun’s in my heart and I’m ready for love.”

– From Gene Kelly’s “Singin’ In The Rain”

As a software developer and architect for most of my career, I was ecstatic to work in the cloud. Before the clouds appeared, building software required finding “on-premise” servers or databases with capacity for a new application or service, or it required a lengthy process of procurement, design, installation, and administration.

With the cloud, software developers can provision servers (or managed services, such as relational databases) within seconds, and all of that can just as easily be deprovisioned when no longer required. “Infrastructure as code” allowed “coders” to obtain infrastructure on-demand, with a few lines of code and configuration. IT departments across all industries, often under pressure to reduce CAPEX spend, rushed to build software that ran in the cloud, whose costs were instead booked as OPEX. Firms sometimes opted to build solutions with the cloud service provider’s native services and APIs, and in other cases to use an ever-expanding suite of specialized Platform-as-a-Service (PaaS) providers that run “on top” of the cloud resources while providing a simplified user and administrative experience.

For me as a technologist, building software and analytics pipelines in the cloud became as easy and joyful as playing with Legos, where numerous building blocks (native cloud services) could be snapped together to build a solution!

A storm is brewing

Unfortunately, all that freedom, flexibility, and computational power resulted in what a July 2022 CIO.com article describes as an “all-you-can-eat buffet” for developers and engineers. Sub-departments within IT departments leveraged the cloud building blocks in their own ways, created unmanaged copies of identical datasets, and performed data engineering and data science in a variety of ways using a variety of tools. Now

“…many CIOs in the mid to latter phases of their cloud migrations are increasingly trapped in a vexing quagmire — handing their CFOs massive monthly cloud bills with no ROI to show for it.”

So what happened? The linkage between intention (e.g. “deliver analytics that predict defects on the manufacturing line”) and the cloud-based software that IT deployed was so poorly managed that it became hard to decipher how all that cloud spend related to IT initiatives. Executives know there’s value in the analytical solutions, but they often struggle to understand how that value relates to impact within the value chain.

Optimal supply chain management performance hinges on managers being able to make the best decisions they can, with the best data and analytical insights available to them. Predictive modeling and data transformations that power these insights are increasingly likely to be a part of this “vexing quagmire” of cloud spend. Today firms risk inadvertently impeding the innovation necessary to remain competitive as CFOs are on the prowl to slash costs as part of their efforts to optimize spend across the firm.

I was recently on the DM Radio show and podcast, where the show’s title was “Think Your Analytics Are Working? Think Again!” We explored the question: when it comes to analytics on the cloud, how do you know whether or not there’s ROI for all that cloud spend and investment in analytics and engineering? I argued that Michael Porter’s Value Chain model allows a firm to take a first-principles approach to building a bridge between the value chain and how analytics are delivered.

Think in terms of flows

Most executives and corporate strategists are familiar with Michael Porter’s Value Chain. The model’s simplicity means that it is as easy for the suits in the C-suite to grasp as it is for the IT “techies” in their hoodies. Professionals in HR and accounting are already, and possibly unknowingly, using the value chain model. For instance, most employees fall under a “cost center” within their HR system. However, most employees probably ignore these values when filling out expense reports or other corporate forms. Furthermore, data analytics is a cross-cutting capability that can apply to all aspects of the value chain. Given the fungible nature of these analytics professionals, companies need disciplined processes to ensure that their activities are tracked to the appropriate part of the value chain, as opposed to a general IT cost center.

So an executive can know that their investments in analytics are working when they can draw a clear line from a cloud, PaaS or SaaS bill line-item to an analytic deliverable (dashboards or predictive models) along the value chain. The value chain can be thought of as a fluid, flowing stream that the CEO is ultimately accountable for optimizing by removing waste and adding incremental value. Analytics help decision-making managers streamline the flow and apply innovations that make it more valuable.

The flow of the analytics pipeline involves:

  1. Data Engineers ingesting data (e.g. sensor data from factories)
  2. Collecting that data (e.g. in a storage bucket in the cloud)
  3. Transforming the data (e.g. building and hydrating a data model with data)
  4. Publishing datasets for consumption by humans (e.g. writing SQL)
  5. Data scientists building machine learning models that can be integrated with applications within the
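The first four steps above can be sketched as a tiny Python pipeline. This is an illustration only: the sensor readings are hard-coded, the "storage bucket" is an in-memory dictionary, and every function and field name is hypothetical. The point it tries to show is that each stage works against the same shared store instead of passing copies between separate systems.

```python
from statistics import mean


def ingest():
    """Step 1: ingest raw readings (hard-coded stand-in for factory sensor data)."""
    return [
        {"line": "A", "temp_c": 71.0},
        {"line": "A", "temp_c": 73.0},
        {"line": "B", "temp_c": 64.0},
    ]


def collect(readings, store):
    """Step 2: land the raw readings in a shared store (stand-in for a cloud bucket)."""
    store.setdefault("raw", []).extend(readings)
    return store


def transform(store):
    """Step 3: build a small data model (average temperature per line)."""
    by_line = {}
    for reading in store["raw"]:
        by_line.setdefault(reading["line"], []).append(reading["temp_c"])
    store["avg_temp_by_line"] = {line: mean(vals) for line, vals in by_line.items()}
    return store


def publish(store):
    """Step 4: expose a consumable dataset, sorted for stable output."""
    return sorted(store["avg_temp_by_line"].items())


store = transform(collect(ingest(), {}))
published = publish(store)
print(published)
```

A data scientist building a model (step 5) would read from the same `store`, not from an exported copy.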

If the analytics practitioners (data engineers, data analysts, BI & dashboard authors, and data scientists) are working in different systems and have to copy data, the flow of the analytics pipeline is suboptimal. Ideally, these professionals should work in an analytics platform that provides unified governance and makes collaboration and data sharing as seamless and efficient as possible.

Executives are hungry to understand how analytics investments across the value chain activities map to outcomes and shareholder value. Likewise, executives seek to monitor cost and return on data assets as they map to these activities so they can fine-tune how analytics projects are funded. Vanity metrics and dashboards excite viewers but are useless to shareholder value. Therefore, measuring an IT department’s output (e.g. the number of dashboards published) is a fool’s errand. CEOs know where in their firm’s value chain its managers struggle to make good or consistent decisions, so, ideally, the CEO should be able to redeploy the firm’s scarce analytics professionals where it makes the most sense.

Executives see the value chain as a flow that needs continuous improvement and optimization. Thus, the firm should remove barriers that prevent it from achieving real-time analytics and predictive analytics.

The urgency of Data Democratization

Ten years ago, the complexity of data democratization made it an infeasible goal that only a few companies could achieve. Today, there are few barriers that prevent a firm from building an open analytics culture and empowering business stakeholders with RTA. Executives should seek a platform where analytics practitioners can iterate quickly and can adapt to various projects while increasing data fluency and reliability.

My experience validates all these statements. I have worked with a variety of firms and industries, and the engineers and practitioners have similar resumes. So it is not just about hiring the “right people.” I have observed, however, that when there’s an OAC, the team shows greater fluency regarding how their individual mission (e.g. build a data pipeline) maps to the firm’s value chain. These orgs are often light-years ahead of the other firms and are deep into advanced analytics, have RTA via streaming technologies, and have MLOps honed such that new models can be deployed daily (or more frequently).

At Databricks, I help firms democratize data. Data democratization involves dissolving data silos and replacing departmental data fiefdoms with unified data governance and a platform that optimizes the flow between the various roles of the analytics profession. It also allows executives to treat these professionals as fungible assets that can be deployed quickly throughout the value chain to handle any disruptions. And with Databricks, SQL Warehouses (used by analysts and BI tools like Tableau) and compute clusters (used by data engineers and data scientists) can be tagged so that the CFO is no longer clueless when staring at a cloud provider’s bill.
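As a rough illustration of why tagging helps, the sketch below rolls up bill line items by a value chain tag. It is plain Python and does not use any Databricks or cloud provider API. The bill structure, the resource names, the dollar amounts, and the `value_chain` tag key are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical bill line items; the fields, names, and amounts are illustrative.
BILL = [
    {"resource": "sql-warehouse-1", "cost_usd": 1200.0,
     "tags": {"value_chain": "operations"}},
    {"resource": "cluster-ml-7", "cost_usd": 800.0,
     "tags": {"value_chain": "operations"}},
    {"resource": "cluster-etl-2", "cost_usd": 500.0,
     "tags": {"value_chain": "inbound_logistics"}},
    {"resource": "cluster-adhoc", "cost_usd": 300.0, "tags": {}},
]


def cost_by_activity(line_items, tag_key="value_chain"):
    """Sum cost per value chain activity; untagged spend gets its own bucket."""
    totals = defaultdict(float)
    for item in line_items:
        activity = item["tags"].get(tag_key, "untagged")
        totals[activity] += item["cost_usd"]
    return dict(totals)


for activity, total in sorted(cost_by_activity(BILL).items()):
    print(f"{activity}: ${total:,.2f}")
```

The "untagged" bucket is the interesting one: it is the part of the bill that nobody can yet connect to a value chain activity.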

Executives know their company has achieved data democratization when they hear statements like these:

  • “Databricks Lakehouse has helped Edmunds significantly reduce the time it takes to unlock the value of our data so that everybody can get access to those insights faster.”
    • Quote from a Databricks blog from Greg Rokita, Associate Vice President of Technology, Edmunds
  • “Today, we’re predominantly leveraging Databricks in a couple of key ways: first, the creation of highly performant and reliable data pipelines for ingestion and data transformation at large Second, we’re utilizing Delta Lake, which allows us to provide a single source of truth for our data.”
  • “Using machine learning on Databricks #Lakehouse Platform, our business has experienced $2 Million in cost avoidance through manufacturing upset event reduction in the first year.” “Databricks is a fantastic development environment for Python-centric data scientists and deep learning engineers and enables collaboration for end-to-end ML.”
  • “The Databricks platform is enabling everyone in our integrated drug development process – from physician-scientists to computational biologists – to easily access, analyze, and extract insights from all of our data.”
    • Quote from a Databricks blog from Jeffrey Reid, PhD, Head of Genome Informatics at
  • “What Databricks Lakehouse has given us is a foundation for the most innovative digital freight marketplace that leverages data and AI to deliver the best experience possible for carriers and shippers.”
    • Quote from a Databricks blog from Joe Spinelle, Director, Engineering & Technology at B. Hunt.
  • “Databricks Lakehouse has been the key to gathering new insights of our autonomous mowers and also to predict maintenance issues and ensure our service teams are equipped with the right insights to fix problems before they significantly impact our customers.”
    • Quote from a Databricks blog from Linus Wallin, Data Engineer, Husqvarna AI
  • “We’ve seen major improvements in the speed we have data available for We have a number of jobs that used to take 6 hours and now take only 6 seconds.”
    • Quote from a Databricks blog from Alessio Basso, Chief Architect,
  • “Databricks has helped Comcast scale to processing billions of transactions and terabytes of data ”
    • Quote from a Databricks blog from Jan Neumann, VP Machine Learning at Comcast
  • “Our ability to embed ML and AI in all aspects of our business has been crucial in creating more value for our clients. Azure Databricks and MLflow are core to our ability to deliver on this value.”
    • Quote from a Databricks blog from Anurag Sehgal, Managing Director, Head of Global Markets, Credit Suisse
  • “Adopting the Databricks Lakehouse Platform has enabled a variety of teams and personas to do more with our With this unifying and collaborative platform, we’ve been able to utilize a single environment for all types of users and their preferred tools, keeping operations backed by a consistent set of data.”
  • “Databricks has provided one platform for our data and analytics teams to access and share data across ABN AMRO, delivering ML-based solutions that drive automation and insight throughout the company.”
    • Quote from a Databricks blog from Stefan Groot, Head of Analytics Engineering, ABN AMRO
  • “Databricks Lakehouse has provided a unified platform that brings together data and AI to deliver predictive solutions that help to protect our customers and our business by stopping fraud before it happens.”

More stories about data democratization and the impact of a Lakehouse platform can be found here: https://databricks.com/customers.

About the Author

Craig Lukasik is a technologist who thrives on using technology to enhance business outcomes and fuel innovation. He obtained his MBA from NC State’s Poole College of Management Jenkins MBA program in 2009 and his B.S. in Computer Science from the University of Michigan, Ann Arbor, in 1997. He has worked in a number of industries: automotive, pharmaceuticals, wealth/family office management, real-time derivatives training (investment banking), and even had a stint at the American Kennel Club as a consultant. He enjoys creating software system architectures, building data pipelines, and writing code.