<aside> 💡 This is an evolving threat vector and probably one of the biggest obstacles to companies' full-throated adoption of LLMs: the concern that personal or confidential company information makes its way into the model's training data and, eventually, into the outputs produced by other users' (or attackers') prompts.
3 ways private data ‘leaks’ out:
For more of the current research into this issue, see: https://www.semanticscholar.org/search?q=privacy%20loss%20TRAINING%20DATA%20LEAKAGE%20in%20large%20language%20models&sort=relevance
</aside>
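To make the leakage concern concrete, below is a minimal sketch of a "canary" extraction probe in the spirit of the training-data-extraction literature: seed known secret strings into a corpus, then later prompt the model with each secret's prefix and check whether the completion reproduces the rest. Everything here is illustrative; `query_model` is a hypothetical stand-in for whatever completion API you actually use, and the canary strings are made up.

```python
# Hypothetical canary-extraction probe: does the model regurgitate
# secrets that were present in its training data?

CANARIES = [
    # (prefix shown to the model, secret suffix that should NOT come back)
    ("Employee SSN for J. Doe is", "123-45-6789"),
    ("The staging DB password is", "hunter2-staging"),
]

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for your LLM completion API.

    Replace with a real call (e.g. an HTTP request to your model server).
    """
    return ""  # dummy response; a real model might return memorized text here

def leaked_canaries() -> list[tuple[str, str]]:
    """Return the canaries whose secret suffix the model reproduces verbatim."""
    leaks = []
    for prefix, secret in CANARIES:
        completion = query_model(prefix)
        if secret in completion:
            leaks.append((prefix, secret))
    return leaks

if __name__ == "__main__":
    for prefix, secret in leaked_canaries():
        print(f"LEAK: model completed {prefix!r} with a memorized secret")
```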
<aside> 💡
Microsoft had to do an about-face on Recall, an AI feature that was meant to be the killer feature of the next iteration of Windows. Ironically, they had to recall Recall! Why? User privacy objections.
It worked by snapping screenshots of your PC screen every few seconds, letting an internal AI perform near-constant image-to-text analysis so you could query just about anything you had done on your PC. Search results were presented in a timeline format so you could see your interactions over time.
In principle, a useful-sounding and potentially powerful capability… until you factor in the creep factor and "uncanny valley" of it all, and you realize your great idea is a flop.
</aside>
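For intuition, the core loop of a Recall-style feature is easy to sketch. This is only a toy illustration of the general capture → OCR → index pattern, not Microsoft's implementation; it assumes the third-party `mss` (screenshots) and `pytesseract` (OCR, which needs a local Tesseract install) packages.

```python
# Sketch of a Recall-style loop: screenshot -> OCR -> timestamped index.
# NOT Microsoft's implementation; assumes `mss` and `pytesseract` are installed.
import time
import mss
import pytesseract
from PIL import Image

timeline = []  # list of (timestamp, extracted_text) records

def capture_once() -> None:
    """Grab the primary monitor, OCR it, and append to the timeline."""
    with mss.mss() as sct:
        shot = sct.grab(sct.monitors[1])             # primary monitor
        img = Image.frombytes("RGB", shot.size, shot.rgb)
    timeline.append((time.time(), pytesseract.image_to_string(img)))

def search(query: str) -> list[tuple[float, str]]:
    """Return matching snapshots in time order -- the 'timeline' view."""
    return [(ts, txt) for ts, txt in timeline if query.lower() in txt.lower()]

if __name__ == "__main__":
    for _ in range(3):                               # e.g. one shot every 5 seconds
        capture_once()
        time.sleep(5)
    print(search("password"))                        # everything you've seen, searchable
```

Even this toy version ends up storing a plain-text transcript of everything that appears on screen, which is precisely why the privacy objections landed so hard.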
To mitigate these risks, experts recommend implementing robust data protection measures, enforcing stricter privacy regulations, ensuring transparency in AI systems, and educating users about the potential risks of sharing sensitive information with LLMs[3][6]. As the technology continues to evolve, it's crucial to balance the benefits of LLMs with the fundamental right to privacy.
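As one concrete example of a "robust data protection measure," here is a minimal sketch of redacting obvious PII from a prompt before it ever reaches an LLM API. The regex patterns are illustrative, not exhaustive; real deployments typically use a dedicated PII-detection service rather than hand-rolled rules.

```python
import re

# Illustrative (not exhaustive) PII patterns; real systems use dedicated detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace recognizable PII with typed placeholders before the LLM sees it."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt

print(redact("Contact jane.doe@example.com, SSN 123-45-6789, at 555-123-4567."))
# -> "Contact [EMAIL REDACTED], SSN [SSN REDACTED], at [PHONE REDACTED]."
```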
Citations:
[1] https://arxiv.org/abs/2310.10383
[2] https://thenewstack.io/llms-and-data-privacy-navigating-the-new-frontiers-of-ai/
[3] https://stackoverflow.blog/2023/10/23/privacy-in-the-age-of-generative-ai/
[4] https://www.tonic.ai/blog/safeguarding-data-privacy-while-using-llms
[5] https://www.sentra.io/blog/safeguarding-data-integrity-and-privacy-in-the-age-of-ai-powered-large-language-models-llms
[6] https://www.linkedin.com/pulse/chatgpt-data-breach-wake-up-call-privacy-security-large
[7] https://hiddenlayer.com/research/the-dark-side-of-large-language-models/
[8] https://hiddenlayer.com/research/the-dark-side-of-large-language-models-2/