Data Access Control (DAC): Trino/Starburst – Collibra Integration is an Access Rule Governance and Metadata Harvesting framework. It leverages the Trino/Starburst Query Engine to source metadata from Big-Data sources in a transparent and seamless manner for Query Access Control. Trino/Starburst integrates with Collibra through the proprietary MIF framework. The integration combines the governance capabilities of Collibra with the Query Access Rule implementation features of Trino/Starburst. The DAC framework allows enrichment and provisioning of metadata along with access rule governance. The capabilities supported by Trino/Starburst (querying, Access Rules, connectors for Big-Data sources, and the query execution model) are brought under the governance umbrella of Collibra workflows.
In addition, Collibra’s Data Checkout and Provisioning features are combined with guided stewardship for table-level data access for a group of users, with the right selected individually for each table in the dataset. Together they provide a powerful tool for seamless and transparent governance of metadata brought from the different data sources of the enterprise data ecosystem. Industry-standard regulations and security practices are taken into account, based on the interacting entities: Data Usage (DU), Data Movement (DM), and Data Sharing Agreements (DSA). Assets are available to be leveraged as per the organization’s requirements.
The following business and functional challenges are encountered when provisioning access to big-data sources:
- Lack of a centralized governance platform that controls access to multiple big-data sources (e.g., Hive, Kafka, Elasticsearch, etc.).
- Non-homogeneous data-access sharing methods with different rules, sharing agreements and ownership
- Lack of dedicated Query Access Rule governance frameworks
- Data discovery challenges due to multiple metadata platforms and non-uniform discovery features
- Lack of a uniform and automated dataset checkout mechanism
- Automated security checks on data (e.g., PII, SDE elements) are not present during checkout and provisioning.
- All-or-nothing provisioning of datasets, with no control over the rights granted across a group of users
The DAC Trino/Starburst – Collibra Integration framework, which is built around Lorang Technology’s proprietary Metadata Integration Framework (MIF), addresses the above challenges in data access governance and provisioning.
It provides a simplified and effective governance mechanism with automated data-access provisioning by orchestrating metadata from target data sources (HDFS, Hive, HBase, Kafka, Elasticsearch, etc.) and Query Access Rules that are already harvested into Trino/Starburst. It further enhances and streamlines the Access Rule implementation and querying features present in Trino/Starburst. The MIF Framework acts as a bridge between Collibra and Trino/Starburst and integrates with their respective REST APIs. It exchanges data between Collibra and Trino/Starburst in JSON format and performs mapping, transformation and schema validation.
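As an illustration of the mapping and validation step, the sketch below converts a checkout request into table rules in the JSON layout used by Trino's file-based access control (a `tables` list whose entries carry `group`, `catalog`, `schema`, `table` and `privileges`). The shape of the incoming request and the function names are assumptions made for this example only; the actual MIF payloads and the way rules are delivered to Trino/Starburst are not described here and may differ.

```python
import json

# Privileges accepted in the `privileges` list of a Trino file-based table rule.
ALLOWED_PRIVILEGES = {"SELECT", "INSERT", "DELETE", "UPDATE", "OWNERSHIP", "GRANT_SELECT"}

# Hypothetical shape of one checkout entry coming from the governance side.
REQUIRED_FIELDS = ("group", "catalog", "schema", "table", "privileges")


def validate_entry(entry):
    """Check one checkout entry and return its privileges in upper case."""
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    privileges = [p.upper() for p in entry["privileges"]]
    unknown = sorted(set(privileges) - ALLOWED_PRIVILEGES)
    if unknown:
        raise ValueError(f"unsupported privileges: {unknown}")
    return privileges


def to_table_rule(entry):
    """Map one validated checkout entry to a table rule.

    Trino treats group, catalog, schema and table as regular expressions,
    so plain names are passed through unchanged here.
    """
    privileges = validate_entry(entry)
    return {
        "group": entry["group"],
        "catalog": entry["catalog"],
        "schema": entry["schema"],
        "table": entry["table"],
        "privileges": privileges,
    }


def build_rules(entries):
    """Build the rules document for a list of checkout entries."""
    return {"tables": [to_table_rule(entry) for entry in entries]}


if __name__ == "__main__":
    request = [
        {"group": "analysts", "catalog": "hive", "schema": "sales",
         "table": "orders", "privileges": ["select", "insert"]},
    ]
    print(json.dumps(build_rules(request), indent=2))
```

Rejecting unknown privileges before any rule is produced keeps a malformed request from reaching the query engine, which is the role schema validation plays in the exchange described above.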
Integration with ServiceNow is also available for manual provisioning of data. The Collibra Data Intelligence platform provides a Catalog of all the Data resources and the capability to request Data access. Collibra workflows with maker-checker functionality enable metadata normalization, enrichment and classification. The MIF Framework performs two-way data exchange between Collibra and Trino/Starburst, so that updates for Query Access Rule information are fed back to Trino/Starburst. Communication between the different systems (Collibra, Trino, Starburst and MIF) is secure, decoupled and achieved through RESTful calls.
DAC – Trino/Starburst Features:
- Unified metadata harvesting, governance, management, classification, and provisioning platform
- A unified Collibra Operating Model capable of mapping to different data sources from the Hadoop data ecosystem (e.g., Hive, HDFS, HBase, Kafka, Elasticsearch, etc.)
- Provides simplified Query access rule creation using Collibra workflows during the Dataset checkout process.
- Orchestrated dataset checkout process with guided stewardship for granting access to a group of users with their respective rights.
- Auto-conversion of metadata between Collibra and Trino/Starburst formats
- Synchronization of user-driven access controls to implement Trino/Starburst Access Rules
- Integrates with the Trino/Starburst Query Engine and Collibra, acting as a bridge between the two systems.
- Automated access rules extraction from Trino/Starburst to translate them to business user access policies.
- Unified access rule management platform for multiple access control tools.
- Seamless integration with the ServiceNow ticketing system for manual data provisioning.
To achieve Data Access Control, Collibra workflows are normally launched by a user, most likely on his or her own behalf: the user selects a dataset and indicates the right wanted on each table. In addition, we would like a workflow executed by a data steward (or data custodian) that concerns a group. The group may be provided manually, since it is unclear how Collibra could sync with an external group provider. For that group, the steward indicates the right (SELECT, INSERT, etc.) that must be set up for each table. One combination must be forbidden: a group must not hold WRITE access on a table from GCP and, at the same time, access to another table that contains sensitive data. This can be analyzed by the approver in the next step, or automatically if possible.
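The automatic variant of that check could look roughly like the following sketch. It reads the requirement as "flag any group that has a write privilege on one table and any privilege on a different table marked sensitive"; that reading, the `sensitive` flag, the grant layout and the set of privileges counted as write access are all assumptions for illustration, not part of the product.

```python
# Privileges treated as write access in this sketch (an assumption).
WRITE_PRIVILEGES = {"INSERT", "UPDATE", "DELETE"}


def find_conflicts(grants):
    """Return groups that combine write access with access to sensitive data.

    Each grant is a dict with the hypothetical keys:
    group (str), table (str), privileges (list of str), sensitive (bool).
    """
    writable = {}   # group -> tables the group may write to
    sensitive = {}  # group -> sensitive tables the group may access
    for grant in grants:
        privileges = {p.upper() for p in grant["privileges"]}
        if privileges & WRITE_PRIVILEGES:
            writable.setdefault(grant["group"], set()).add(grant["table"])
        if grant.get("sensitive") and privileges:
            sensitive.setdefault(grant["group"], set()).add(grant["table"])

    conflicts = {}
    for group, write_tables in writable.items():
        sensitive_tables = sensitive.get(group, set())
        # A conflict needs a writable table and a *different* sensitive table.
        if any(w != s for w in write_tables for s in sensitive_tables):
            conflicts[group] = {
                "write": sorted(write_tables),
                "sensitive": sorted(sensitive_tables),
            }
    return conflicts


if __name__ == "__main__":
    requested = [
        {"group": "analysts", "table": "gcp.sales.orders",
         "privileges": ["INSERT"], "sensitive": False},
        {"group": "analysts", "table": "hive.hr.salaries",
         "privileges": ["SELECT"], "sensitive": True},
    ]
    print(find_conflicts(requested))
```

A non-empty result could either block the request outright or be attached to the approval task so the approver sees the conflicting tables.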
Trino/Starburst – Collibra Integration allows users or groups to be granted access to tables with multiple privileges.
Please send your questions directly to the vendor via: