Google Cloud Data Catalog Bidirectional Collibra Integration
Overview
Google Cloud Data Catalog is Google Cloud’s centralized, serverless, and highly scalable data discovery and metadata management service. With it, you can build and manage an optimized index for searching your data assets.
This bidirectional integration can perform two types of synchronizations:
Google Cloud Platform to Collibra Platform
By integrating the GCP Data Catalog into your Collibra Platform instance, you can retrieve metadata for your GCP schemas, views, tables, files, streams, and spreadsheets quickly and efficiently. The integration then transforms the metadata into assets and upserts them to Collibra.
It is easy to track all changes in GCP as Collibra keeps a record of them. When schemas, columns, tables, tag templates, tag template fields, tags, and tag fields are removed from the GCP Data Catalog, they are preserved in Collibra and marked as ‘Obsolete’. In contrast, all assets that are still active in GCP will be marked as ‘Candidate’ in Collibra.
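The status rule described above can be pictured as a simple set comparison. The following is a minimal sketch only, not the integration's actual code: the class name AssetStatusSketch, the method classify, and the use of plain asset names as identifiers are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; names are not taken from the integration's source.
class AssetStatusSketch {

    // Status names as they appear in the text above.
    static final String CANDIDATE = "Candidate";
    static final String OBSOLETE = "Obsolete";

    /**
     * Assigns a status to every asset name known to either side.
     * Names still present in GCP map to 'Candidate'; names that exist only
     * in Collibra (that is, removed from GCP) map to 'Obsolete'.
     */
    static Map<String, String> classify(Set<String> inGcp, Set<String> inCollibra) {
        Set<String> all = new TreeSet<>(inGcp); // sorted for a stable iteration order
        all.addAll(inCollibra);
        Map<String, String> status = new LinkedHashMap<>();
        for (String name : all) {
            status.put(name, inGcp.contains(name) ? CANDIDATE : OBSOLETE);
        }
        return status;
    }

    public static void main(String[] args) {
        Set<String> gcp = new HashSet<>(Arrays.asList("sales.orders", "sales.customers"));
        Set<String> collibra = new HashSet<>(Arrays.asList("sales.orders", "sales.legacy_orders"));
        // sales.legacy_orders is no longer in GCP, so it is reported as Obsolete.
        System.out.println(classify(gcp, collibra));
    }
}
```

Nothing is deleted in this sketch: an asset that disappears from GCP keeps its entry and only changes status, which mirrors the record-keeping behaviour described above.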
Collibra Platform to Google Cloud Platform
This direction enables you to update the GCP Data Catalog tag template fields and tag fields according to the values set on the Collibra Platform. The tag template field values that are updated on GCP Data Catalog are:
- Display Name
- Is Required
- Type
- Description
For tag fields, only the field value is updated on GCP Data Catalog.
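To make the attribute list above concrete, here is a hedged sketch of how the four tag template field attributes might be compared before an update is sent. The class TagTemplateFieldDiffSketch, the holder FieldValues, and the method changedAttributes are hypothetical and do not come from the integration or from the Data Catalog client library; the sketch only shows a comparison of Collibra values against GCP values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration; names are not taken from the integration's source.
class TagTemplateFieldDiffSketch {

    /** Simple holder for the four attributes listed above. */
    static final class FieldValues {
        final String displayName;
        final boolean isRequired;
        final String type;
        final String description;

        FieldValues(String displayName, boolean isRequired, String type, String description) {
            this.displayName = displayName;
            this.isRequired = isRequired;
            this.type = type;
            this.description = description;
        }
    }

    /** Returns the attributes whose Collibra value differs from the GCP value. */
    static List<String> changedAttributes(FieldValues collibra, FieldValues gcp) {
        List<String> changed = new ArrayList<>();
        if (!Objects.equals(collibra.displayName, gcp.displayName)) changed.add("Display Name");
        if (collibra.isRequired != gcp.isRequired) changed.add("Is Required");
        if (!Objects.equals(collibra.type, gcp.type)) changed.add("Type");
        if (!Objects.equals(collibra.description, gcp.description)) changed.add("Description");
        return changed;
    }

    public static void main(String[] args) {
        FieldValues collibra = new FieldValues("Data Owner", true, "STRING", "Owner of the data set");
        FieldValues gcp = new FieldValues("Owner", true, "STRING", "Owner of the data set");
        // Only the display name differs, so only that attribute would need an update.
        System.out.println(changedAttributes(collibra, gcp));
    }
}
```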
The integration also accepts a tagTemplateId query parameter when it is triggered. When this parameter is supplied, the synchronisation of the above values is limited to the respective tag template and the tags related to it.
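The effect of the tagTemplateId parameter described above can be sketched as a filter over the tags to synchronise. This is an illustration under assumptions, not the integration's implementation: the classes TagTemplateFilterSketch and Tag and the method inScope are hypothetical, and treating a missing parameter as "synchronise everything" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; names are not taken from the integration's source.
class TagTemplateFilterSketch {

    /** Minimal stand-in for a tag and the tag template it relates to. */
    static final class Tag {
        final String name;
        final String tagTemplateId;

        Tag(String name, String tagTemplateId) {
            this.name = name;
            this.tagTemplateId = tagTemplateId;
        }
    }

    /**
     * Returns the tags to synchronise. A null or empty tagTemplateId is
     * treated as "no filter" (an assumption made for this sketch).
     */
    static List<Tag> inScope(List<Tag> tags, String tagTemplateId) {
        if (tagTemplateId == null || tagTemplateId.isEmpty()) {
            return new ArrayList<>(tags);
        }
        List<Tag> result = new ArrayList<>();
        for (Tag tag : tags) {
            if (tagTemplateId.equals(tag.tagTemplateId)) {
                result.add(tag);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tag> tags = Arrays.asList(
                new Tag("orders-pii", "pii_template"),
                new Tag("orders-quality", "quality_template"));
        // Only the tag related to pii_template remains in scope.
        for (Tag tag : inScope(tags, "pii_template")) {
            System.out.println(tag.name);
        }
    }
}
```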
Use Cases
Customers that leverage GCP Data Catalog typically want to synchronize the metadata between GCP Data Catalog and the Collibra Platform. This bidirectional integration allows them to do so in a straightforward manner.
- GCP Data Catalog to Collibra Platform: This direction of the integration allows replicating the structure that is available on GCP Data Catalog as assets on Collibra.
- Collibra to GCP Data Catalog: The reverse direction of the integration makes it possible to update the tag attributes on GCP Data Catalog based on their respective Collibra Platform values.
Therefore, each direction of this integration can help your company understand and update the data on GCP Data Catalog quickly and efficiently.
Elements in Scope
The integration is designed to retrieve the following metadata:
- Dataset
- Table
- Database View
- Column
- Sub-column
- Google Tag Template
- Google Tag Template Field
- Google Tag
- Google Tag Field
To receive support on this item, you can engage our Professional Services team or post any questions in the Data Citizens Community.
Release Notes
- Removed security vulnerabilities
- Upgraded to Spring Boot version 2.7.5
- Docker file added
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release History
Release Notes
- The functionality Remove tag from Data Catalog dataset/table/column has been added
- Docker file has been added
- Integration library and Spring Boot version updated to latest version
- Web security updated
- Collibra output module filter has been added for performance improvement
- Tag ID filter added to Java code
- Collibra output module boolean value has been added for Remove Tag and Asset ID values
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
- Fix for displaying Tag Fields from GCP Data Catalog to Collibra.
- Enhancements and added support to update the GCP Data Catalog tag field values.
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
- Updated Spring Boot Starter dependency to v2.5.12
- Updated the Spring Boot Integration Library dependency version in the pom.xml file to v1.1.5
Compatibility
- Spring Boot Library
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
Updated the Spring Boot Integration Library dependency version in the pom.xml file to v1.1.3, which supports the latest Collibra Platform versions (v2022.01+).
Compatibility
- Spring Boot Library
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
Updated properties for start class and Integration library version.
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
Updated the Log4j version from 2.16 to 2.17 due to vulnerabilities.
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
Updated logger log4j2 dependency to Apache log4j2 version 2.16.0.
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Collibra Platform v2021+
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
- Bi-directional functionality
- Restricted both the logic and the form of the previous integration (version 1.0.1)
- Backward compatibility with previous releases
- Created additional synchronised direction from Collibra to GCP
- Added 2 synchronisation options:
– Synchronise all tag fields within 1 specific template in 1 location
– Synchronise all tag fields on all templates in all locations within the same project
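The two synchronisation scopes above map naturally onto Data Catalog resource names, in which a tag template is addressed as projects/{project}/locations/{location}/tagTemplates/{tagTemplateId}. The sketch below is an assumption-laden illustration, not the integration's code: the class TagTemplateNameSketch and its methods are hypothetical helpers that only build and inspect such names as strings.

```java
// Hypothetical illustration; names are not taken from the integration's source.
class TagTemplateNameSketch {

    /** Builds the resource name of one tag template in one location. */
    static String tagTemplateName(String project, String location, String tagTemplateId) {
        return String.format("projects/%s/locations/%s/tagTemplates/%s",
                project, location, tagTemplateId);
    }

    /** True when the resource name belongs to the given project, in any location. */
    static boolean belongsToProject(String resourceName, String project) {
        return resourceName.startsWith("projects/" + project + "/");
    }

    public static void main(String[] args) {
        // Scope 1: a single template in a single location.
        String single = tagTemplateName("my-project", "us-central1", "pii_template");
        System.out.println(single);
        // Scope 2: any template in any location of the same project.
        System.out.println(belongsToProject(single, "my-project"));
    }
}
```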
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 1.8
- Spring Boot Integration Library
- Collibra Platform v2021+
- Google Cloud Data Catalog
License and Usage Requirements
Release Notes
- Added note regarding using an external KeyStore file
- Updated the log4j2.xml file to include the Collibra logger
- Updated the GCPDataCatalogService variables initialization
Compatibility
- Spring Boot Framework
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 8
- Spring Boot Framework
License and Usage Requirements
Release Notes
Initial release:
Spring Boot integration that retrieves metadata from Google Cloud Data Catalog, transforms it and upserts it as assets on the Collibra Platform instance.
Compatibility
- Spring Boot Framework 2.5.0
- Eclipse IDE
- Collibra Data Intelligence Cloud
- Collibra Data Intelligence On-Prem
Dependency
- Java Runtime Environment 8
- Spring Boot Framework 2.5.0
License and Usage Requirements
The following terms shall apply to the extent you receive the source code to this offering.
Notwithstanding the terms of the Binary Code License Agreement under which this integration template is licensed, Collibra grants you, the Licensee, the right to access the source code to the integrated template in order to copy and modify said source code for Licensee’s internal use purposes and solely for the purpose of developing connections and/or integrations with Collibra products and services.
Solely with respect to this integration template, the term “Software,” as defined under the Binary Code License Agreement, shall include the source code version thereof. Except with respect to the foregoing, all remaining terms of the Binary Code License Agreement shall apply to the license of integration template hereunder.