It all starts with proper data governance. AI needs data, and data needs governance. Without it, an enterprise is exposed to substantial risk (GDPR fines can reach EUR 20 million or 4% of global annual turnover, whichever is higher), and the quality of any analytics or AI built on the data is compromised. At its core, data governance focuses on security, risk, and compliance, especially at data aggregation points such as data warehouses and data lakes, where substantial security blind spots remain. The twin questions “What data?” and “Who has access?” are the foundation of data access governance.
“What data?” - Effective data governance starts with ingesting existing governance rules, data assets, identities, and access privileges, and mapping how they relate to each other. Unauthorized replication of data for development, testing, or acceleration purposes is far too common. This shadow data can fly under the data governance radar and creates substantial risk if it goes undetected. Discovery is typically not limited to a single enterprise; it frequently extends across enterprises as the exchange of data with trusted partners in so-called clean rooms becomes standard. The discovery process should ideally run without agents or proxies to avoid additional latency or performance bottlenecks. It is also important that the discovery data itself is subject to the same governance rules as the underlying data, because exporting data or metadata to external systems outside a customer’s jurisdiction usually raises concerns.
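To make the shadow data problem concrete, here is a minimal sketch of one way unregistered copies could be spotted: fingerprint each table's schema and a sample of its rows, then look for unregistered tables whose fingerprint matches a registered one. The table names, the `registered` set, and the exact-match hashing are all assumptions made for illustration; a real platform would need fuzzier matching and would read from the warehouse catalog.

```python
import hashlib
from collections import defaultdict

def fingerprint(columns, sample_rows):
    """Hash the sorted column definitions and the sorted sample rows into one digest."""
    h = hashlib.sha256()
    for name, dtype in sorted(columns):
        h.update(f"{name}:{dtype};".encode())
    for row in sorted(map(repr, sample_rows)):
        h.update(row.encode())
    return h.hexdigest()

def find_shadow_copies(tables, registered):
    """Map each registered table to the unregistered tables sharing its fingerprint."""
    groups = defaultdict(list)
    for name, (columns, rows) in tables.items():
        groups[fingerprint(columns, rows)].append(name)
    findings = {}
    for names in groups.values():
        sources = [n for n in names if n in registered]
        copies = sorted(n for n in names if n not in registered)
        if sources and copies:
            findings[sources[0]] = copies
    return findings

# Hypothetical inventory: table name -> (column definitions, sampled rows)
tables = {
    "prod.customers": ([("id", "int"), ("email", "str")], [(1, "a@x.com"), (2, "b@y.org")]),
    "dev.cust_tmp": ([("id", "int"), ("email", "str")], [(2, "b@y.org"), (1, "a@x.com")]),
    "prod.orders": ([("id", "int"), ("total", "float")], [(10, 9.5)]),
}
registered = {"prod.customers", "prod.orders"}  # tables known to governance

shadow = find_shadow_copies(tables, registered)
print(shadow)
```

In this toy inventory, `dev.cust_tmp` holds the same rows as `prod.customers` in a different order, so it is reported as an ungoverned copy.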
Not all data is created equal. The value of data is ultimately determined by the business cases it feeds into. Independent of business cases, however, certain data enjoys special protection under laws and compliance rules because of its sensitive nature. Sensitivity is frequently more than a one-dimensional construct: under GDPR, for example, the location of the data is of critical importance. Modern data governance platforms automatically classify data and assign values based on deep reads and NLP. Correlating data sources with their value and risk profile highlights critical risks and guides remediation efforts.
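The idea that sensitivity has more than one dimension can be sketched in a few lines. The example below uses simple regular expressions as a stand-in for the deep reads and NLP mentioned above, and combines the detected labels with the storage location into a single score. The detectors, the weights, and the rule that doubles the score outside allowed regions are illustrative assumptions, not values taken from GDPR or from any product.

```python
import re

# Hypothetical detectors: plain regexes standing in for NLP-based classification.
DETECTORS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[a-z]{2,}", re.I),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
# Illustrative sensitivity weights per label.
SENSITIVITY = {"email": 2, "us_ssn": 5}

def classify(values):
    """Return the labels whose pattern matches at least one sampled value."""
    return {
        label
        for label, pattern in DETECTORS.items()
        if any(pattern.search(str(v)) for v in values)
    }

def risk_score(labels, stored_in, allowed_regions):
    """Sum the label weights; double the sum if sensitive data sits outside allowed regions."""
    base = sum(SENSITIVITY[label] for label in labels)
    if labels and stored_in not in allowed_regions:
        return base * 2
    return base

labels = classify(["jane@example.com", "123-45-6789", "hello"])
in_region = risk_score(labels, "eu-west", {"eu-west"})
out_of_region = risk_score(labels, "us-east", {"eu-west"})
print(sorted(labels), in_region, out_of_region)
```

The same column scores higher when it is stored outside the permitted region, which is the location dimension the paragraph describes.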
“Who has access?” - The questions “What data?” and “Who has access?” become even more complicated when the aperture is narrowed from the holistic to the atomic level, because the unit of work for data access governance is rarely an entire data warehouse but individual columns or cells. Defining access rules down to individual cells has become necessary, not just within enterprises but also between enterprises, where hundreds of millions of records are exchanged in clean rooms and each customer receives unique, fine-grained access to pieces of the same data set. Access rules are rarely static: employees and customers turn over, organizational charts change, and business processes get reorganized, and keeping systems up to date is frequently challenging. Additionally, access rights are frequently defined rather loosely, i.e., employees receive access based on a role or their position in the organization chart rather than on the data itself and the business outcomes it drives. Every overprovisioned account is a substantial security risk.
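Fine-grained access to pieces of the same data set can be pictured as a policy that combines a column allow-list with a row filter. The sketch below is an in-memory illustration only: the `Policy` class, the partner names, and the sample rows are hypothetical, and in practice such rules would be enforced by the warehouse's own row-level and column-level security rather than in application code.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    columns: set                                    # columns the principal may read
    row_filter: dict = field(default_factory=dict)  # column -> value a row must have

def apply_policy(rows, policy):
    """Keep rows that satisfy the row filter, then drop every column not granted."""
    visible = []
    for row in rows:
        if all(row.get(col) == val for col, val in policy.row_filter.items()):
            visible.append({c: v for c, v in row.items() if c in policy.columns})
    return visible

# Hypothetical shared data set in a clean room
rows = [
    {"id": 1, "region": "EU", "email": "a@x.com", "spend": 120},
    {"id": 2, "region": "US", "email": "b@y.org", "spend": 80},
]
partner_a = Policy(columns={"id", "spend"}, row_filter={"region": "EU"})
partner_b = Policy(columns={"id", "region"})

view_a = apply_policy(rows, partner_a)
view_b = apply_policy(rows, partner_b)
print(view_a)
print(view_b)
```

Both partners query the same rows, yet partner A sees only EU spend figures and partner B sees only identifiers and regions; neither sees the email column.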
Lastly, how can we defend our data against insider threats? What if the “Who” in “Who has access?” starts to exhibit abnormal behavior and tries to cause damage? It could be an insider, or an outsider who has found a way to hack into somebody’s account and is now out to exploit the assumed identity. Both cases require extensive behavior baselining to establish what is normal, so that deviations can be spotted immediately and shut down. Additionally, certain standard patterns typically precede data exfiltration. Monitoring for these patterns and detecting them instantly, even before an exfiltration attempt is made, is critical.
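Behavior baselining can be illustrated with a very simple statistical check: compare today's activity for a user against the mean and standard deviation of that user's recent history. This is a minimal sketch under strong assumptions (a single metric, a z-score threshold of 3, made-up counts); production systems typically model many signals and use more robust methods.

```python
from statistics import mean, stdev

def is_anomalous(history, today, threshold=3.0):
    """Flag today's count if it lies more than `threshold` standard deviations from the baseline mean.

    `history` needs at least two data points, as required by statistics.stdev.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return today != mu
    return abs(today - mu) / sigma > threshold

# Hypothetical rows read per day by one user over the past week
baseline = [100, 110, 95, 105, 90, 100, 108]

normal_day = is_anomalous(baseline, 104)
bulk_read = is_anomalous(baseline, 5000)
print(normal_day, bulk_read)
```

A sudden bulk read, one of the patterns that can precede exfiltration, stands far outside the baseline and is flagged, while ordinary day-to-day variation is not.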
“Why a dedicated solution?” - Every major enterprise has dozens of security products and suites, and consolidating technologies into increasingly comprehensive platforms offered by a handful of vendors seems to be a major trend. What does the seemingly unending labyrinth of security products, each with its own sophisticated acronym, currently offer to address the problems described above? The surprising answer is that there aren’t any solutions, because traditional security has focused on strengthening the perimeter, i.e., keeping bad actors out. As illustrated above, however, data governance requires an insider’s view. Because data governance is closely attached to the data, it is only logical to have governance and enforcement sit next to the data. Co-location also avoids transfers of data (including metadata), which increases the security of the overall system.
“Why Theom?” - Theom is an AI-driven, data-centric access governance platform designed to give organizations a comprehensive solution for managing, controlling, and monitoring access to sensitive data. Using Theom, enterprises can ensure that the right people have access to the right data at the right time, minimizing the risk of data breaches.
Understanding who can access the most sensitive data within the enterprise is a gap in today’s security tools. Understanding the reality of access, i.e., whether a privilege provisioned for a user was actually used, is a much bigger issue still. Theom’s granular access control engine allows organizations to define specific permissions for each user, group, or role. This prevents unauthorized access and maintains a strict level of security by limiting access to those who require it for their role. Administrators can easily configure and modify permissions as needed, ensuring a flexible and dynamic approach to access governance.
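The gap between provisioned and actually used privileges comes down to a set difference between what was granted and what appears in the access log. The following is a generic sketch of that comparison, not a description of how Theom implements it; the users, objects, and log format are hypothetical.

```python
def unused_grants(granted, access_log):
    """Return, per user, the granted (object, privilege) pairs that never appear in the log."""
    used = {}
    for user, obj, privilege in access_log:
        used.setdefault(user, set()).add((obj, privilege))
    report = {}
    for user, grants in granted.items():
        unused = grants - used.get(user, set())
        if unused:
            report[user] = sorted(unused)
    return report

# Hypothetical provisioning state: user -> set of (object, privilege)
granted = {
    "alice": {("sales.orders", "SELECT"), ("hr.salaries", "SELECT")},
    "bob": {("sales.orders", "SELECT")},
}
# Hypothetical access log entries: (user, object, privilege)
access_log = [
    ("alice", "sales.orders", "SELECT"),
    ("bob", "sales.orders", "SELECT"),
]

report = unused_grants(granted, access_log)
print(report)
```

Here alice holds a grant on `hr.salaries` that she has never exercised, which makes it a candidate for review or revocation.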
With Theom, enterprises can monitor and fix changes in access provisioning, role misuse, and atypical user behavior. They can also identify compromised users, analyze the potential blast radius, and prevent data breaches. For each access issue, Theom enables the quantification of potential liabilities based on the criticality and dollar value associated with the data. Theom has workflows to fix over-provisioning, so that user permissions are shrink-wrapped and the principle of least privilege is implemented. Theom also provides workflows for operational hygiene around identity cleanups.
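As a rough illustration of how a blast radius might be expressed in dollars, the sketch below multiplies record count, an assumed value per record, and a criticality weight for every data set a compromised user can reach. The formula, the weights, and all figures are hypothetical and chosen only to show the arithmetic; they do not reflect Theom's actual scoring.

```python
# Illustrative criticality multipliers (assumed, not from any standard).
CRITICALITY_WEIGHT = {"low": 1, "medium": 2, "high": 4}

def blast_radius(user, grants, datasets):
    """Estimate dollar exposure across every data set the given user can reach."""
    total = 0
    for name in grants.get(user, set()):
        ds = datasets[name]
        total += ds["records"] * ds["value_per_record"] * CRITICALITY_WEIGHT[ds["criticality"]]
    return total

# Hypothetical catalog with an assumed dollar value per record
datasets = {
    "crm.contacts": {"records": 10_000, "value_per_record": 2, "criticality": "high"},
    "web.logs": {"records": 100_000, "value_per_record": 1, "criticality": "low"},
}
grants = {
    "alice": {"crm.contacts", "web.logs"},
    "bob": {"web.logs"},
}

alice_exposure = blast_radius("alice", grants, datasets)
bob_exposure = blast_radius("bob", grants, datasets)
print(alice_exposure, bob_exposure)
```

Ranking users by this kind of estimate gives a simple way to decide which over-provisioned accounts to shrink first.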
Theom is embedded within your data lakes and warehouses and does not transfer any data out. Theom does not use proxies or cross-account roles to deliver data access governance. By harmonizing access, data, and security across clouds, Theom brings a unified and consistent approach to access governance across data clouds, reducing the complexity and overhead of managing multiple security solutions.