In my last column, Generative AI: What's Next and How to Get the Payoff, I concluded with a recommendation and a decision tree for selecting a generative AI tool. Or should I say "toy," given the still-ongoing hype?
Well, on reflection, "toy" is not the right word, as there is a multitude of beneficial use cases.
Now, I am bold enough to assume that you followed my recommendation and found a provider, and possibly an implementation partner, for a generative AI solution that fits you and your requirements, not only now but also for the foreseeable future. You have also educated your team on the established guardrails and governance. The team understands that this powerful new tool is there to help them do their work, not to replace them.
So, you and your team feel confident.
This means you have a tool and an eager team, and you are ready to run. Maybe you have even implemented your first simple, lower-risk use cases. You see the value, not least in creating highly personalized content, to name just one example. Still, you are wary of risks. One of them is, of course, the exposure of too much personal and/or company-sensitive data.
The problem: Convenience vs. safety and privacy
The crux here is that a good prompt, one that delivers immediately useful content (content that takes less time to review and validate than it would take to write from scratch), requires a lot of specific information. Chief among these are employee names for the signature, customer names, recent relevant customer behaviors, and perhaps a lot more.
Unless you have opted for a self-hosted AI or a private cloud, you cannot be sure where a prompt is stored. All of a sudden, you are in GDPR territory, and with it, the right to be forgotten that the regulation guarantees. And this already assumes that you have trained your model without any personally identifiable information (PII), since you cannot tell which PII the model uses, much less how and where it uses and stores it, or whether it gets anonymized in the process. By the way: do you have consent to use your customers' or employees' data for training or fine-tuning the model?
Let's not dive deep here into the ethical problems that working with LLMs raises, such as bias, sustainability, currency of data, or one's own honesty. That is a discussion all of its own. Instead, let's concentrate on the immediately relevant problem outlined above.
What does an architecture need to look like that protects the safety of corporate data and keeps customer and employee privacy intact, while providing employees with a tool that increases their efficiency? Is it possible to get both convenience and safety?
The solution: An AI security layer for safe LLM usage
It is safe to assume that, sooner rather than later, there will be many interfaces in business applications that allow for prompting LLMs or, more generally, external AIs. These systems will provide general knowledge about the world and/or offer the ability to generate responses that are formulated in high quality. In a corporate setting, a prompt will always happen in a specific business context. Therefore, to create a meaningful prompt for the LLM, the prompt that is entered by the user needs to be enriched with contextually relevant data from the company's business systems. This data may come from the application database, data warehouse, data lake, lakehouse, or whatever other internal repository. This enrichment needs to stay as minimal as possible.
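As a sketch of what minimal enrichment could look like, the helper below attaches only an explicitly allowed subset of context fields to the user's prompt. The function and field names are illustrative, not any specific product's API:

```python
def enrich_prompt(user_prompt: str, context: dict[str, str],
                  allowed_fields: set[str]) -> str:
    """Append only the context fields the use case actually requires,
    keeping the enrichment as minimal as possible."""
    relevant = {k: v for k, v in context.items() if k in allowed_fields}
    if not relevant:
        return user_prompt
    context_block = "\n".join(f"{k}: {v}" for k, v in sorted(relevant.items()))
    return f"{user_prompt}\n\nContext:\n{context_block}"
```

Note the design choice: the allow-list inverts the default, so a context field stays inside the company unless someone has deliberately decided the use case needs it.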
From this moment at the latest, it is necessary to technically protect the data that is sent to the LLM. This means that, before submission to the LLM, it needs to pass through the company's internal security layer. This AI security layer is part of the company's IT platform. It ensures that no critical data is sent to an external processor (unless rules explicitly allow this). One can argue that the prompt itself and the enrichment step need to be part of this layer, too, but that is not relevant for this discussion.
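A minimal sketch of such an outbound gate, assuming the blocked-data rules are expressed as regular expressions; the patterns below are illustrative placeholders, and a real deployment would maintain them as governed policy:

```python
import re

# Illustrative policy patterns; in practice these are maintained by compliance.
BLOCKED_PATTERNS = [
    r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b",  # card-number-like sequences
    r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b",         # IBAN-like strings
]

def gate_outbound(payload: str, allow_rule: bool = False) -> str:
    """Refuse to pass payloads containing critical data to an external
    processor unless an explicit rule allows the transfer."""
    if not allow_rule:
        for pattern in BLOCKED_PATTERNS:
            if re.search(pattern, payload):
                raise ValueError("Payload contains data blocked by policy")
    return payload
```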
Once the data is enriched and ready for submission, it needs to be masked in a way that keeps the context intact and protects sensitive data, while remaining restorable by the AI security layer. Basically, the system needs to store the substitutions it made, for later reversal and for auditability. Once this has happened, the prompt can be submitted to the LLM via a secured channel.
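A minimal sketch of reversible masking with a stored substitution map. Detecting which entities are sensitive is assumed to happen upstream (for example via an NER component); the placeholder format is an assumption for illustration:

```python
import uuid

def mask(prompt: str, entities: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each sensitive entity with an opaque placeholder and
    record the substitutions for later reversal and auditing."""
    substitutions: dict[str, str] = {}
    masked = prompt
    for entity in entities:
        placeholder = f"<ENT_{uuid.uuid4().hex[:8]}>"
        masked = masked.replace(entity, placeholder)
        substitutions[placeholder] = entity
    return masked, substitutions

def unmask(text: str, substitutions: dict[str, str]) -> str:
    """Restore the original entities in the LLM's response."""
    for placeholder, entity in substitutions.items():
        text = text.replace(placeholder, entity)
    return text
```

Because the placeholders are freshly generated on every call, each prompt automatically gets a different masking, so the external model never sees a stable pseudonym it could correlate across prompts.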
The generated output may then be subjected to a profanity check that makes sure corporate communication policies are not violated, before being unmasked again.
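A deliberately simple word-list sketch of such a check; a real deployment would use a policy engine or a moderation service, and the blocked list here is a pure placeholder:

```python
# Placeholder policy list; in practice maintained by compliance.
BLOCKED_WORDS = {"damn", "stupid", "idiot"}

def passes_policy_check(text: str) -> bool:
    """Return True if no blocked word appears in the (still masked) output."""
    tokens = {t.strip(".,!?;:").lower() for t in text.split()}
    return tokens.isdisjoint(BLOCKED_WORDS)
```

Running the check before unmasking has a nice side effect: the checking component never sees the real customer or employee names.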
Following this, an audit log entry, minimally covering the prompt, the mask, the profanity check result, and the output, is created so that compliance checks can be performed.
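The audit record itself can be as simple as a structured log line; the field names below are illustrative, not a standard schema:

```python
import json
from datetime import datetime, timezone

def audit_record(prompt: str, substitutions: dict[str, str],
                 policy_passed: bool, output: str) -> str:
    """Serialize the minimal audit fields so compliance can later replay
    what was sent, how it was masked, and what came back."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "mask": substitutions,
        "policy_check_passed": policy_passed,
        "output": output,
    }
    return json.dumps(entry)
```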
Last but not least, the result is presented back to the user, who can reject it, accept it, refine it manually, or have the system refine it. Importantly, subsequent prompts use different masking.
On acceptance or rejection, or whenever the user decides to, the prompts that were sent to the LLM are deleted in the LLM system. Although most LLM providers ensure that prompts sent via the API are not used for training, this makes doubly sure.
An LLM provider that does not offer this capability should simply be avoided.
Why is this technology-centric article in this column?
Glad you asked!
Well, first, I have a tech background, but more importantly, there is a big business benefit to be had. An architecture that considers this AI security layer can help businesses drive their engagement to a whole new quality level. Bi-directional communication with customers can become individualized at scale while protecting customer and company data. Done right, this personalization works with only zero-party or first-party data.
This, along with a judiciously chosen frequency of communication, helps build trust and relevance. At the same time, the offered level of automation reduces the effort required of employees by removing mundane tasks. The potential business benefit along both dimensions is considerable, not to mention the regulatory and compliance implications.
What do business leaders need to do?
Ideally, the AI security layer is part of the software infrastructure that has already been delivered by one or more of your strategic application partners. Many companies have more than one, e.g., SAP as a transactional back end and Salesforce or another vendor covering customer-facing processes. So, it pays off to contact your partners.
- Talk to your strategic software and implementation partners about how the AI security layer can be implemented on their platforms and technology. Ask them how data from the various business applications is accessed securely; how they make sure that no sensitive data, encrypted or not, crosses the corporate boundary; how they make sure that no data is stored outside the corporate boundary; and how they ensure that prompts are not answered with profanity or with responses whose bias exceeds what corporate policy accepts (there is no way to eliminate bias entirely, only to keep it at acceptable levels).
- Assess flexibility, cost, and implementation efforts, as well as the vendors’ roadmaps and their viability.
- If your strategic partners cannot deliver an AI security layer, evaluate additional vendors using the same questions you asked your strategic ones; in this case, you will likely need to follow a best-of-breed strategy.
- Get clarity about your implementation partners’ skill sets and ability to implement an AI security layer.
In any case, do not underestimate the importance of this technology layer.