Skyflow, a Palo Alto, California-based company that makes it easier for developers to embed data privacy into their applications, today announced the launch of a “privacy vault” for large language models (LLMs).
The solution, as the name suggests, provides enterprises with a layer of data privacy and security throughout the lifecycle of their LLMs, starting with data collection and continuing through model training and deployment.
The launch comes as companies across industries race to embed LLMs, such as the GPT model suite, into their workflows to simplify processes and increase productivity.
Why a privacy vault for GPT models?
LLMs are all the rage these days, helping with tasks such as text generation, image generation and summarization. However, most models out there are trained on publicly available data. This makes them suitable for broad public use, but not so much for business-specific applications.
For LLMs to work in specific business environments, companies must train them on their internal knowledge. A few have already done this or are in the process of doing so, but the task is not easy: the internal, business-critical data used to train the model must be protected at every stage of the process.
This is exactly where Skyflow’s GPT privacy vault comes in.
Delivered via API, the solution creates a secure environment, allowing users to define their sensitive data dictionary and protect that information at all stages of the model lifecycle: data collection, preparation, model training, interaction and deployment. Once fully integrated, the vault uses the dictionary and automatically redacts or tokenizes the chosen information as it flows through GPT – without diminishing the value of the output in any way.
“Skyflow’s patented polymorphic encryption technique allows the model to seamlessly process protected data as if it were plain text,” Skyflow co-founder and CEO Anshu Sharma told VentureBeat. “It protects all sensitive data flowing into GPT models and only discloses sensitive information to authorized parties after it has been processed and returned by the model.”
For example, Sharma explained that sensitive data elements in plain text, such as email addresses and social security numbers, are exchanged for Skyflow-managed tokens before input is provided to GPTs. This information is protected by multiple layers of encryption and fine-grained access control during model training, and is finally detokenized after the GPT model returns its output. The result is a seamless output experience for authorized end-users, while plain-text sensitive data never passes through the GPT model.
“This works because GPT LLMs already break down inputs to analyze patterns and relationships between them and then make predictions about what comes next in the sequence. So tokenizing or redacting sensitive data with Skyflow before input is delivered to the LLM has no impact on the quality of GPT LLM output – the patterns and relationships remain the same as before the plain-text sensitive data was tokenized by Skyflow,” Sharma added.
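The tokenize-before-input, detokenize-after-output flow described above can be sketched in a few lines. This is not Skyflow’s actual API; the pattern names, functions and regexes below are purely illustrative assumptions standing in for a managed sensitive-data dictionary and vault.

```python
import re
import uuid

# A minimal stand-in for a "sensitive data dictionary": regex patterns
# for two data types. (Illustrative only -- not Skyflow's API.)
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def tokenize(text: str):
    """Swap each sensitive value for an opaque token; the token->value
    map stays in the vault and never reaches the model."""
    vault = {}
    for label, pattern in PATTERNS.items():
        def swap(match):
            token = f"<{label}_{uuid.uuid4().hex[:8]}>"
            vault[token] = match.group(0)
            return token
        text = pattern.sub(swap, text)
    return text, vault

def detokenize(text: str, vault: dict) -> str:
    """Restore original values in model output for authorized users."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text

prompt = "Contact jane@example.com about SSN 123-45-6789."
safe_prompt, vault = tokenize(prompt)  # tokens only, no raw values
model_output = safe_prompt             # stand-in for a GPT call echoing tokens
restored = detokenize(model_output, vault)
```

Because the tokens preserve the position and structure of the original values, the model sees the same patterns and relationships in the input, which is why output quality is unaffected.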
The offering can be integrated into a company’s existing data infrastructure. It also supports multi-party training, where two or more entities can share anonymized datasets and train models to unlock insights.
Multiple use cases
While Skyflow’s CEO didn’t share how many companies use the GPT privacy vault, he did note that the offering, an extension of the company’s existing privacy-focused solutions, helps protect sensitive clinical trial data throughout the drug development cycle, as well as customer data used by travel platforms to improve customer experiences.
IBM is also a Skyflow customer and uses the company’s products to de-identify sensitive information in large datasets before analyzing it through AI/ML.
Notably, there are also alternative approaches to the privacy issue, such as creating a private cloud environment for running individual models or a private instance of ChatGPT. But those could turn out to be much more expensive than Skyflow’s solution.