
Google Cloud Dataflow to Collibra Integration

Published by: Collibra Marketplace
Latest version: 1.0.6
Released: January 9, 2023
Community Offering

Your use of Community Offerings is subject to the Collibra Marketplace License Agreement. Read more.

Overview

For this specific integration (and all other Custom Integrations listed on the Collibra Marketplace), please read the following disclaimer:

  • This integration is a template that has been developed in cooperation with a few select clients based on their custom use cases and business needs.
  • While all effort has been made to encompass a range of typical usage scenarios, specific needs beyond this may require chargeable template customization.
  • With this in mind, we have made sure that the template is available as source code and readily modifiable to suit the client's particular use case.

Google Cloud Dataflow is used to manage and execute various data processing patterns. This integration helps analysts and data scientists understand where data comes from, where it has been, how it is being used, and who is using it. For example, it can be used to identify the root cause of bad data events and to analyze the impact of data changes before making them.

This Spring Boot integration retrieves job details (pipelines) from Google Cloud Dataflow, transforms them, and upserts them to a Collibra Platform instance as assets and technical lineage.
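To make the retrieve, transform, and upsert flow concrete, here is a minimal sketch in plain Java 8. It is not the integration's actual code: the `JobDetail` and `Asset` classes, the one-to-one mapping between them, and the in-memory map standing in for the Collibra Platform instance are all assumptions made for illustration. It only shows what "upsert" means here: create the asset if it does not exist yet, otherwise update it in place.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these types are hypothetical and are not part of the
// Collibra Integration Library or the Google Cloud Dataflow client library.
final class DataflowUpsertSketch {

    /** Hypothetical, simplified view of a job returned by Dataflow. */
    static final class JobDetail {
        final String name;
        final String state;

        JobDetail(String name, String state) {
            this.name = name;
            this.state = state;
        }
    }

    /** Hypothetical, simplified asset as it might be stored on the target side. */
    static final class Asset {
        final String name;
        final String status;

        Asset(String name, String status) {
            this.name = name;
            this.status = status;
        }
    }

    /** Transform step: maps one job to one asset (a one-to-one mapping is assumed). */
    static Asset toAsset(JobDetail job) {
        return new Asset(job.name, job.state);
    }

    /** Upsert step: inserts when the name is new, otherwise overwrites. Returns true on insert. */
    static boolean upsert(Map<String, Asset> store, Asset asset) {
        return store.put(asset.name, asset) == null;
    }

    /** Transforms and upserts every job; returns how many assets were newly created. */
    static int sync(List<JobDetail> jobs, Map<String, Asset> store) {
        int created = 0;
        for (JobDetail job : jobs) {
            if (upsert(store, toAsset(job))) {
                created++;
            }
        }
        return created;
    }

    public static void main(String[] args) {
        // The map stands in for the target instance; a real run would call its API instead.
        Map<String, Asset> store = new LinkedHashMap<>();

        int first = sync(Arrays.asList(
                new JobDetail("daily-load", "RUNNING"),
                new JobDetail("hourly-export", "DONE")), store);
        int second = sync(Arrays.asList(
                new JobDetail("daily-load", "DONE")), store);

        System.out.println("created on first run: " + first);
        System.out.println("created on second run: " + second);
        System.out.println("daily-load status: " + store.get("daily-load").status);
    }
}
```

Because the upsert is keyed on the asset name, running the sync repeatedly does not create duplicates; a later run only refreshes the attributes of assets that already exist.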

Use Cases

This integration will help you increase trust and data citizen engagement around data streams that pass through, are collected by, or are produced in Google Cloud Dataflow.

The data center makes it straightforward for you to track changes in your Google Cloud Dataflow metadata. This gives you the confidence to plan and engage with relevant business process owners accordingly.

Elements in Scope

The integration is designed to retrieve the following metadata:

  • Mapping Specification
  • Table
  • System
  • File
  • GCS Bucket
  • GCP Dataflow Step

To receive support on this item, you can engage our Professional Services team or post any questions in the Data Citizens Community.

More details

Release Notes
  • Added Docker files.
  • Updated the Collibra Integration Library version to 1.1.10.
  • Updated the Spring Boot version to 2.7.5.
  • Code enhancements.
Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 1.8
  • Spring Boot Integration Library
License and Usage Requirements

Release History

Version 1.0.5
June 24, 2022
Release Notes
  • Updated the Spring Boot Starter Parent version to 2.5.12 (CVE-2022-22965).
  • Updated the Collibra Integration Library version to 1.1.5.
Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 1.8
  • Spring Boot Integration Library
Version 1.0.4
March 18, 2022
Release Notes

Updated the Spring Boot Integration Library dependency version in the pom.xml file to v1.1.3, which supports the latest Collibra Platform versions (v2022.01+).

Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 1.8
  • Spring Boot Integration Library
Version 1.0.3
December 22, 2021
Release Notes

Updated the Log4j version from 2.16 to 2.17 due to vulnerabilities.

Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 8
  • Spring Boot Framework
Version 1.0.2
December 16, 2021
Release Notes

Updated the Log4j 2 logger dependency to Apache Log4j 2 version 2.16.0.

Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 8
  • Spring Boot Framework
Version 1.0.1
August 24, 2021
Release Notes
  • Added support for retrying WebClient requests
  • Added a note about using an external KeyStore file
  • Updated the log4j2.xml file to include the Collibra logger
Compatibility
  • Spring Boot Framework
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 8
  • Spring Boot Framework
Version 1.0.0
August 5, 2021
Release Notes

Initial release:

A Spring Boot integration that retrieves job details from Google Cloud Dataflow, transforms them, and upserts them to a Collibra Platform instance as assets and technical lineage (using the Collibra Lineage Harvester and the Collibra Integration Library).

Compatibility
  • Spring Boot Framework 2.5.0
  • Eclipse IDE
  • Collibra Data Intelligence Cloud
  • Collibra Data Intelligence On-Prem
Dependency
  • Java Runtime Environment 8
  • Spring Boot Framework 2.5.0

Need help? We have a coaching session that can help you with:

  • Initial information on the integration and prerequisites for custom Spring Boot integrations.
  • Expert session on debugging and development support for custom Spring Boot integrations.

See existing Q&A in the Data Citizens Community

Browse discussions with customers who also use this app. 

Start a New Topic in the Data Citizens Community

Collibra-hosted discussions will connect you to other customers who use this app.

The following terms shall apply to the extent you receive the source code to this offering.

Notwithstanding the terms of the Binary Code License Agreement under which this integration template is licensed, Collibra grants you, the Licensee, the right to access the source code to the integrated template in order to copy and modify said source code for Licensee’s internal use purposes and solely for the purpose of developing connections and/or integrations with Collibra products and services.

Solely with respect to this integration template, the term “Software,” as defined under the Binary Code License Agreement, shall include the source code version thereof. Except with respect to the foregoing, all remaining terms of the Binary Code License Agreement shall apply to the license of integration template hereunder.

Reviews

on January 27, 2023

Dear Team, could you please provide the integration steps from GCP Dataflow to Collibra? I see that other products and services have documented steps, but I couldn't find any for this one for my POC.

on January 27, 2023

Thanks for your question. The documentation is now online.
