In a world where artificial intelligence plays an increasingly significant role, many organizations are jumping on the AI bandwagon by feeding information into AI-driven engines like OpenAI’s ChatGPT, Microsoft’s Bing, and others. What they may not consider, however, is that this information may then be stored by these external providers and used to further train the AI models.
The best way to ensure the protection of sensitive data is by utilizing automated data classification solutions, which effectively scan company data, locate sensitive information, and flag any potential risks.
In my recent chat with Chris Stapenhurst, Product Manager at information management technology provider Veritas, we explored the potential risks of using novel generative AI engines and how data classification can help prevent them.
Sending Data into AI Engines: A Tricky Business
The widespread adoption of AI engines for both personal and business use has resulted in a surge of data being shared with AI-based systems. Of course, this has numerous benefits and immense potential to decrease workload, simplify complex tasks, and help open our creative chakra. However, organizations and individuals often overlook the fact that the information they provide may end up stored and utilized by AI providers.
“This lack of awareness raises concerns about data privacy, as sensitive information can unknowingly be used to train AI models,” Stapenhurst warns.
“It is, therefore, crucial for organizations to take control of the data they share and ensure its cleanliness before it reaches the cloud.”
How Does Data Classification Work?
Data classification is a method of categorizing data based on its content to gain insights into its nature and potential risks.
“Data classification should ideally occur before data is sent to AI-based engines or cloud providers since the enormous volumes of information involved make manual inspection impractical once the data is out of an organization’s control,” Stapenhurst explains.
Automated classification engines play a crucial role in efficiently scanning and identifying the content of data, including contracts, documents, emails, and policy types.
“These engines offer insights into whether data contains personal information, market abuse references, or off-channel signaling,” Stapenhurst adds.
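To make the idea concrete, here is a minimal sketch of pattern-based classification that runs before any text leaves the organization. It is an illustration only and not how Veritas' engine works: the pattern names, the regular expressions, and the `classify` and `safe_to_share` helpers are all assumptions made for this example, and a real engine would rely on far more robust detection.

```python
from __future__ import annotations

import re

# Hypothetical detection patterns, chosen for illustration only.
# A production engine would add validation, context rules, and many more patterns.
PATTERNS = {
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidential_marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def classify(text: str) -> set[str]:
    """Return the labels of every pattern that matches somewhere in the text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def safe_to_share(text: str) -> bool:
    """Treat text as shareable only if no sensitive pattern was detected."""
    return not classify(text)


if __name__ == "__main__":
    prompt = "Summarize this: contact jane.doe@example.com, SSN 123-45-6789."
    print(sorted(classify(prompt)))
    print(safe_to_share(prompt))
```

A check like `safe_to_share` would sit in front of whatever sends text to an external AI engine, so that flagged content can be reviewed or redacted first.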
Who Needs Data Classification, and Why?
The answer is simple: Everyone needs data classification. It is essential for almost all organizations, regardless of their size or industry.
“Many organizations primarily focus on capacity management or data management by repository, neglecting the comprehensive understanding of their data,” Stapenhurst notes.
“This lack of insight leads to the accumulation of what we call ‘dark data,’ leading to many potential risks.”
Dark data comprises obsolete (old), redundant (duplicate), or trivial (non-business) information that sits in a company’s information estate without the company knowing that it exists or exactly where it is.
“Organizations are often lucky if they know where even half of it is,” Stapenhurst notes.
“While trivial data is of low importance, obsolete and redundant data can be extremely important. They could be copies of sensitive data, old customer lists, or employee contracts,” he explains.
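As a rough illustration of how redundant and obsolete files might be surfaced, the sketch below groups files by content hash to find copies and uses the last-modified time to find old files. The three-year threshold, the function names, and the use of modification time as a proxy for "obsolete" are assumptions made for this example, not a description of any vendor's method.

```python
from __future__ import annotations

import hashlib
import time
from collections import defaultdict
from pathlib import Path

# Hypothetical threshold: files untouched for about three years count as obsolete.
OBSOLETE_AFTER_DAYS = 3 * 365


def file_digest(path: Path) -> str:
    """Hash the file contents so identical copies share one digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_redundant(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def find_obsolete(root: Path, now: float | None = None) -> list[Path]:
    """Return files under root last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - OBSOLETE_AFTER_DAYS * 86400
    return [
        path
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.stat().st_mtime < cutoff
    ]
```

Reading whole files into memory keeps the sketch short; for large repositories, hashing in chunks would be the more practical choice.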
Effective data classification enables organizations to proactively manage their data, reducing risks and enhancing data protection.
Aside from protecting information and ensuring compliance, data classification offers additional benefits. These include maintaining proper data governance, streamlining eDiscovery processes, enabling efficient file retrieval, enhancing IT-based tasks, and tailoring backup strategies based on data sensitivity. By implementing data classification, organizations gain control over their data, reduce risks associated with unknown information, and optimize their operations.
Veritas: Empowering Organizations with Data Classification
Veritas provides organizations with an advanced data classification engine, backed by an extensive library of over 1,100 data identification patterns and 300 compliance violation detection policies.
“We define our classification engine as a microservice since it’s integrated into various Veritas products, including file analysis and archiving tools,” Stapenhurst shares.
This enables multiple teams within an organization – from security and compliance to records management and privacy – to leverage classification insights and act upon them immediately.
“In this day and age, proper data management and specifically data classification is critical to be able to ensure no sensitive or private data leaks out and to prevent accidental breaches of contracts with both customers and partners,” Stapenhurst concludes.
To learn more about Veritas’ data classification and data management solutions, visit their website.
from UC Today https://ift.tt/d9VNTFH