JDBC Driver for Apache Spark
Overview
The Apache Spark JDBC driver is available in Collibra Catalog under ‘Collibra provided drivers’ and is used to register Apache Spark sources. With this driver, Collibra Catalog can register database information and extract the structure of the source into its schemas, tables and columns.
Collibra JDBC drivers can retrieve the following database components:
- Schemas
- Tables
- Views
- Columns
- Primary keys
- Foreign keys
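As an illustration of how a JDBC client can read these components, the following minimal sketch uses the standard java.sql.DatabaseMetaData interface. It is not Collibra's implementation. The class name, the helper names and the command-line arguments (JDBC URL, user, password) are hypothetical, and whether primary and foreign keys are reported depends on the underlying source.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names here are illustrative only.
final class MetadataSketch {

    // Builds a display name such as "sales.orders"; some sources report no schema.
    static String qualifiedName(String schema, String table) {
        return (schema == null || schema.isEmpty()) ? table : schema + "." + table;
    }

    // Lists the tables and views visible through the connection.
    static List<String> listTablesAndViews(DatabaseMetaData meta) throws SQLException {
        List<String> names = new ArrayList<>();
        try (ResultSet rs = meta.getTables(null, null, "%", new String[] {"TABLE", "VIEW"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_TYPE") + " "
                        + qualifiedName(rs.getString("TABLE_SCHEM"), rs.getString("TABLE_NAME")));
            }
        }
        return names;
    }

    // Lists the columns of one table as "name type" strings.
    static List<String> listColumns(DatabaseMetaData meta, String schema, String table)
            throws SQLException {
        List<String> columns = new ArrayList<>();
        try (ResultSet rs = meta.getColumns(null, schema, table, "%")) {
            while (rs.next()) {
                columns.add(rs.getString("COLUMN_NAME") + " " + rs.getString("TYPE_NAME"));
            }
        }
        return columns;
    }

    // Lists primary key column names; the result may be empty if the source reports none.
    static List<String> listPrimaryKeyColumns(DatabaseMetaData meta, String schema, String table)
            throws SQLException {
        List<String> keys = new ArrayList<>();
        try (ResultSet rs = meta.getPrimaryKeys(null, schema, table)) {
            while (rs.next()) {
                keys.add(rs.getString("COLUMN_NAME"));
            }
        }
        return keys;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: MetadataSketch <jdbc-url> <user> <password>");
            return;
        }
        // The URL format is driver-specific and is supplied by the caller (assumption).
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            DatabaseMetaData meta = conn.getMetaData();
            listTablesAndViews(meta).forEach(System.out::println);
        }
    }
}
```

Foreign keys can be read in the same way with DatabaseMetaData.getImportedKeys, which takes the same catalog, schema and table arguments as getPrimaryKeys.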
Release Notes
Apache Spark driver v8155 fixes an issue with views in Edge and Job Server.
Compatibility
- Collibra Cloud
- Collibra 5.7.5 and newer
Dependency
- Apache Spark 1.6 and above
License and Usage Requirements
- Collibra Catalog
- Metadata Integration
Release History
Release Notes
April 2022 release with latest updates and, where relevant, bug fixes.
Compatibility
- Collibra Cloud
- Collibra 5.7.5 and newer
Dependency
- Apache Spark 1.6 and above
License and Usage Requirements
- Collibra Catalog
- Metadata Integration
Release Notes
December 2021 release with latest bug fixes.
Compatibility
- Collibra Cloud
- Collibra 5.7.5 and newer
Dependency
- Apache Spark 1.6 and above
License and Usage Requirements
- Collibra Catalog
- Metadata Integration
Release Notes
The driver has been certified with the PLAIN (username/password) authentication scheme for the following:
- Ingestion of technical metadata from the source.
- Profiling of the registered data source and calculation of statistics that describe the data.
Compatibility
- Collibra Cloud
- Collibra 5.7.5 and newer
Dependency
- Apache Spark 1.6 and above
License and Usage Requirements
- Collibra Catalog
- Metadata Integration
Support is provided for the template as downloaded from the Marketplace, without changes to the way it works. These drivers are designed to only work with Collibra Catalog and cannot be used for other purposes.
Use of this solution requires a license to Collibra Data Intelligence Cloud and the purchase of Metadata Integrations at a volume commensurate with desired use. Please contact your sales representative or Customer Success Manager for more information.
Noor basha Shaik
Is this driver simply the Apache Spark SQL driver? If so, the permissions needed on the data source (in this case the Hive metastore?) are not explicitly mentioned in the documentation:
https://productresources.collibra.com/docs/collibra/latest/Content/Catalog/RegisterDataSource/JDBC/ref_db-connection-details-collibra-provided-driver.htm?catalog-connector-details=spark-sql
Marketplace Collibra (Admin)
Hi @noorbashashaik, thanks for this feedback.
Here’s our answer:
1. This is the Apache Spark SQL driver, provided by Collibra via our partner CData.
2. We have tested it with Databricks Cloud running in Azure.
For the permissions required on a Databricks source, please refer to the Databricks documentation.
Cheers,
Paulo