SAP recently extended the capabilities of SAP Datasphere and SAP Analytics Cloud, enhancing two of its major data analytics solutions with generative AI, data governance, and knowledge graphing capabilities in its efforts to establish a unified business data fabric for customers.
As detailed during the SAP Data Unleashed virtual event, new data modeling capabilities within SAP Datasphere, along with vector capabilities in SAP HANA Cloud, are now generally available. These features improve the data service’s interactions with large language models (LLMs), grounding generative-AI outputs in business context and reducing AI hallucinations.
To dive deeper into the announcements and reflect on SAP’s larger data strategy, ASUG connected with Irfan Khan, President & Chief Product Officer, SAP HANA Database & Analytics.
A 12-year veteran of SAP, Khan currently leads global development, product, and solution management for the company’s entire database, data management, and analytics portfolio, which includes SAP HANA, SAP Analytics Cloud, and SAP Datasphere. Prior to assuming this role in 2021, Khan spent six years in sales, most prominently as the president and chief revenue officer for the SAP Platform & Technologies organization, where he managed global sales and go-to-market (GTM) for all SAP database assets.
Below, Khan discusses the critical role that connecting enterprise data will play in enabling SAP customers to leverage generative AI, the company’s vision for extended planning and analysis (xP&A), and the unifying role of the business data fabric in SAP’s innovation roadmap.
This interview has been edited and condensed.
ASUG: Let’s set the scene. A year ago, SAP announced SAP Datasphere and detailed its importance within the idea of architecting a “business data fabric” for customers. Can you recap that recent direction for our readers?
Irfan Khan: SAP Datasphere is an SAP managed service, part of SAP Business Technology Platform. It’s there as a means of helping customers navigate the highly diverse, heterogeneous environments and datasets they have to interact with. We coined the term “business data fabric” to bring architectural clarity to how we achieve our goals for our customers.
Our first goal: no customer should be left behind, and no data should be left behind. It doesn’t matter if you start off with an on-prem environment, or for that matter in a private cloud or public cloud environment. All of your data should be accessible to you. As we look at more generative-AI use cases, the last thing we’d want to do is to marginalize the value of some data because it’s inaccessible or because our customers haven't figured out how to pipeline the data into their main processes.
ASUG: How does SAP Analytics Cloud fit in? Among the announcements, SAP promised improved integration between SAP Datasphere and SAP Analytics Cloud, which will also provide users with access to generative-AI assistant Joule. Walk us through the integrations at play here.
Khan: In providing a richer experience for Joule, we will embed Joule directly within SAP Analytics Cloud, so that will serve as our foundational starting point. SAP is one of the largest planning vendors in the market—not that you would know that, because we have a very diffused planning foundation. You have supply chain planning with SAP Integrated Business Planning (IBP), workforce planning in SAP SuccessFactors, and territory planning in SAP Sales Cloud, formerly CallidusCloud. SAP has many different planning capabilities.
Whilst there’s value in planning, and whilst there’s value for users who’ve been using those explicit planning capabilities, there’s more benefit if you have seamlessly integrated end-to-end planning. This is the direction that we’ve been following; market commentators categorize it under the banner of extended planning and analysis (xP&A).
As a planning foundation, SAP Analytics Cloud in conjunction with SAP Datasphere gives us the ability to serve the needs of xP&A. It does so in two ways. First, SAP Datasphere gives you access to all the different planning content. Supply chain planning in IBP, for example, runs on SAP HANA Cloud, which is a foundation of SAP Datasphere. Now we have federation from SAP HANA to SAP HANA, through the SAP HANA Cloud running underneath IBP for supply chain planning. As a consequence, SAP Datasphere is able to consume planning models directly. We can do a semantic onboarding of the planning content that comes from those different planning solutions. That is, in itself, a huge value-add.
Add to that the new SAP Datasphere knowledge graph, and add Joule, and think about the correlation of those technologies with SAP Analytics Cloud as a planning foundation, which now has the ability to establish extended planning across all the different planning content across SAP. Now, with the knowledge graph, you can ask open-ended questions of that foundation.
If you’re a planner, and you want to go through a process and look for a forecast or predictive model, and you want to associate that with a planning foundation, that’s within your ability. If you’re a marketeer with 2 million euros to spend and you want to make the most significant impact with that 2 million, you can use a planning foundation to determine which is your most significant territory in terms of revenue. You could stipulate the territory has to have a demographic of over a million potential consumers, and you want to be able to convert that. Say you’re running a trade promotion; you’ve got to plan for that. Around supply chain planning, you’ve got to be able to figure out the finance implications. Perhaps you need to have trade promotions, where you incentivize certain sellers. This is about how you tie all those different areas together.
That’s a common scenario many SAP customers run today. They’re challenged, because each and every one of those experiences will result in a large amount of data movement, building out planning models, integrating planning models, and building a user experience on top of them. Joule is available through SAP Analytics Cloud, which integrates with SAP Datasphere, which in turn now offers knowledge graphing, which allows you to access a variety of different datasets and harmonize that data in one semantically enriched data layer, served up through SAP Datasphere. This allows you to accelerate those types of use cases, also using a combination of SAP BTP services on top of that.
ASUG: To be clear, xP&A capability is not replacing SAP Integrated Business Planning (IBP) but rather, as the name suggests, extending its reach.
Khan: That’s right. We wouldn’t for a moment consider eliminating SAP IBP in this process, because that’s not in the best interests of the customer. You use IBP for synchronized planning, and you look at it from the manufacturing process side. There is an excellence in the supply chain that IBP encapsulates. But what you do want to do is use the models that you’re using in scenarios around supply chain planning to extend their [connectivity] to your financial or workforce planning. That’s how we look at this whole evolution: extending P&A, so that you can run analytics on top of planning content, to do forecasting.
Think of a controller that comes up with a budget. If you want to have the planner decide where the planning content needs to go, planning values by market unit or by territory, and then you want to run complex predictive analytics on top of that, this links all those pieces together. What was missing before was that single, semantically rich harmonization layer, aka SAP Datasphere, with its notion of a business data fabric architecture that allows us to play to the strength of the entire customer landscape, instead of biasing it by what you can control because you can only access subsections of data.
ASUG: With generative AI sweeping across enterprise technology, data quality and security are top of mind for our members, as is transparency into how AI interacts with enterprise data, so that they can trust and validate the planning and analysis it generates.
Khan: One of the announcements we made involved extending our relationship and partnership with Collibra around generative-AI governance. That plays to what you said a moment ago: if you want to ensure you’re looking at data of the right quality, with the right semantic enrichment that you need, you need to have a level of governance around that. We don’t want hallucinations in these large language models. We’re approaching that with real pragmatism, considering how we can use the best of breed to be able to extend the value of SAP. Where we see the best of breed accelerating benefits, we want to capitalize on that.
Similarly, but internally, we also have our AI ethics approach. With SAP’s intrinsic understanding of customer data and business processes, we can understand the context of data, rather than taking it out of context. In HR systems, if you’re looking at employee demographics and you aren’t looking at them through a historical lens, perhaps there’s been a huge push, based upon certain KPIs that larger organizations have, that adds context to those demographics. Environmental, social, and governance (ESG) reporting falls into that as well. Look at the green balance sheet: customers now want to show how they’re progressing ecologically, in terms of carbon neutrality. You need to look at the broader context of that data. This is why, when we’re looking at AI, we’re looking at business context.
Joule is able to look at data with that same ethical lens. And you have to look through that, because otherwise you’re going to be asking questions that are either outside the realm of what would be considered to be ethical or out of context of the kinds of questions that need to be asked. We view this as one elaborate but elegant way of putting all these pieces together: elaborate because there are lots of moving parts, elegant because of the way we can precisely connect them using the business data fabric.
The value of integrating Joule across all of our different lines of business, along with our generative AI hub that gives you access to a broad range of LLMs, and using the foundations of SAP BTP—with its SAP Build Code capabilities also unlocking pro-code development—is that you can use generative AI with Joule to create the next generative-AI applications, build new data models, and even build test infrastructure around all of that. It really builds an end-to-end story for our customers, from the pro-code development side to the consumption side for planning capabilities, all the way to having a single digital-copilot experience across all of your SAP applications.
ASUG: SAP wants to empower both business and technical users to successfully navigate its solutions. What capabilities will these data-centric announcements help to build, for customers looking to ensure those in IT and on the business side of operations have equal confidence to make inquiries within SAP systems?
Khan: There is definitely a spectrum of different business users and technical users. And we can’t conflate one with the other. Take SAP Datasphere—that appeals to data architects and data professionals who need to build integration points for SAP and non-SAP data. One of the huge movements that we’ve been pursuing internally, across all of our different lines of business, concerns the ability for each line of business to create data products.
What is a data product? Essentially, it’s an encapsulation of a semantically enriched payload of data. A good example would be an invoice, right? An invoice is a semantically enriched dataset that has customer data, order information, line-item information, and VAT information; in aggregate, that constitutes an invoice. It’s not the raw data or any of those individual constituent parts that customers care about. They care about an invoice, and its supply chain; you care about an order and inventory. All of those things need to be reflected from the application side as data products. SAP Datasphere can consume those data products. It’s a consumer of data products by lines of business of SAP.
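Khan’s invoice example can be sketched as a data structure. The snippet below is a minimal, hypothetical Python rendering of a “data product” that bundles raw records with the semantics consumers actually ask for; all class and field names here are invented for illustration and do not come from any SAP API.

```python
from dataclasses import dataclass

# Hypothetical sketch: a "data product" packages raw records together
# with the semantically enriched answers consumers care about.

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float  # net price per unit


@dataclass
class InvoiceDataProduct:
    customer_id: str
    order_id: str
    vat_rate: float        # e.g. 0.19 for 19% VAT
    line_items: list

    @property
    def net_total(self) -> float:
        return sum(li.quantity * li.unit_price for li in self.line_items)

    @property
    def gross_total(self) -> float:
        # The enrichment: consumers ask for "the invoice total",
        # not for raw line items joined against a separate VAT table.
        return round(self.net_total * (1 + self.vat_rate), 2)


invoice = InvoiceDataProduct(
    customer_id="C-1001",
    order_id="O-777",
    vat_rate=0.19,
    line_items=[LineItem("widget", 2, 10.0), LineItem("gadget", 1, 5.0)],
)
print(invoice.gross_total)  # → 29.75
```

The point of the encapsulation is that a consumer of the data product never touches the constituent parts individually; the customer, order, line-item, and VAT data travel together with their business meaning.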
Imagine that you’ve got SAP SuccessFactors. And you create, for instance, specific data products around the onboarding experience. Or, in SAP Ariba, it could be based around the procurement experience within a network, around supplier spend analysis, for example. All of this information is disparately associated with each of the different lines of business. And you, as an SAP customer, could have multiple SAP properties. Maybe you have core ERP, in SAP S/4HANA, you’ve got SAP Ariba, and you’ve got SAP SuccessFactors.
How is it that you can combine all that data together in a seamless experience? As a business user, you’re relying upon IT to somehow stitch all that together for you: typically, to integrate it all together, put it in a data lake, and build it out in terms of a new data model. But the value of SAP Datasphere, of having these data products served up via different lines of business, is that we can semantically onboard those data products into SAP Datasphere. And by doing so, your data professional only needs to understand an inventory, which includes all the different data products out there.
Automatically, you’re building a real foundation: the business data fabric. This fabric is able to provide you with connectivity to data that ordinarily would involve a technical lift-and-shift or integration project. As a business user, you can interrogate those specific data products. You can build those planning use cases without having to go through IT to curate, provision, and pull all this data from those different environments. That is the benefit and the beauty of the entire end-to-end SAP data strategy coming to life.
Data products curated by different lines of business are rendered and made available through a catalog; SAP Datasphere is that catalog, registering those different data products. You can semantically onboard your BusinessObjects environment or objects in SAP Business Warehouse, or go to your SAP ERP Central Component (ECC), where your Core Data Services (CDS) views live. You can bring all that data together in one single location, in a virtualized way or in a physical way. For business users and technology users alike, we’re blurring the lines and taking away a lot of the heavy lifting that ordinarily would have been necessary.
ASUG: At last fall’s SAP TechEd, SAP announced vector capabilities for SAP HANA Cloud to enhance the enterprise’s interactions with LLMs and, by extension, its ability to leverage generative AI. Since then, how has the conversation about vector capabilities in SAP HANA Cloud progressed at SAP?
Khan: There’s a bunch of moving parts. You’ve got the vectoring capability in SAP HANA Cloud, knowledge graphs sitting inside of SAP Datasphere, Joule’s integration with SAP Analytics Cloud to provide a more copilot-based experience… All of these areas are correlated.
I’d say, without hesitation, that SAP HANA has been a major vehicle for innovation for a number of years. We talked about this in terms of the SAP HANA Cloud evolution, splitting out the storage and compute layers; that’s now part of the foundation of SAP HANA as it has modernized for the cloud-native world. When the first announcements were made around OpenAI and ChatGPT, it quickly became evident that LLMs are exceedingly good at answering general, text-based questions, where the data and information needed to generate a response are already encapsulated in the model. When you get into more specific, contextual business questions—for example, a question on value-added tax (VAT) harmonization across multiple jurisdictions—a lot of business context needs to be known.
With a vector capability, you can iterate through that retrieval process and provide business context to an LLM, knowing and trusting that it’s coming from SAP. That is orders of magnitude more important and valuable to the end customer, and it moves away from the mass hallucinations you get with LLMs trained purely on historical, public data from the Internet. An LLM can build you a model that will allow you to figure out what the VAT thresholds should be, but it doesn’t give you the answer; it gives you a means to solve the problem. Customers don’t want that. They want an answer, and they want it infused with exactly that level of value in context.
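The grounding pattern Khan describes can be sketched generically. The toy snippet below illustrates retrieval-augmented prompting: fetch the most relevant business document and inject it into the prompt before the question. Keyword overlap stands in for real vector similarity here, and the document store and function names are invented for illustration; this is not the SAP HANA Cloud vector engine API.

```python
import re

# Toy retrieval-augmented prompting sketch. A real system would compare
# dense embeddings in a vector store; keyword (Jaccard) overlap is used
# here only to keep the example self-contained.

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity(a: set, b: set) -> float:
    # Jaccard similarity between two token sets.
    return len(a & b) / len(a | b) if a | b else 0.0

# A tiny stand-in "document store" of business content.
documents = [
    "VAT thresholds and registration rules for EU jurisdictions",
    "Employee onboarding checklist for new hires",
    "Quarterly carbon emissions report for the green ledger",
]

def retrieve(question: str) -> str:
    # Return the single most relevant document for the question.
    q = tokens(question)
    return max(documents, key=lambda doc: similarity(q, tokens(doc)))

def grounded_prompt(question: str) -> str:
    # Trusted business context is injected ahead of the question, so the
    # LLM answers from that context instead of guessing from public data.
    return f"Context: {retrieve(question)}\nQuestion: {question}"

print(grounded_prompt("What are the VAT rules in multiple jurisdictions?"))
```

For a VAT question, the VAT document is retrieved and prepended, which is the “business context” step that separates a grounded answer from a generic one.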
If you can summarize data and drive actions based upon business context, that’s a value-add. We’re seeing that our customers want to look at generative AI not just in an artificial academic sense, but really plug it into business context. That’s why the whole business AI movement of SAP goes along with these data announcements. They’re inseparable. You won’t find business AI or any generative AI being successful without a very substantial data position.
For more coverage of recent SAP developments, dive into the SAP Data Unleashed announcements and the TechEd 2023 announcements that preceded them.