DataHub Releases
Summary
Version | Release Date | Links |
---|---|---|
v0.13.3 | 2024-05-23 | Release Notes, View on GitHub |
v0.13.2 | 2024-04-16 | Release Notes, View on GitHub |
v0.13.1 | 2024-04-02 | Release Notes, View on GitHub |
v0.13.0 | 2024-02-29 | View on GitHub |
v0.12.1 | 2023-12-08 | View on GitHub |
v0.12.0 | 2023-10-25 | View on GitHub |
v0.11.0 | 2023-09-08 | View on GitHub |
v0.10.5 | 2023-08-02 | View on GitHub |
v0.10.4 | 2023-06-09 | View on GitHub |
v0.10.3 | 2023-05-25 | View on GitHub |
v0.10.2 | 2023-04-13 | View on GitHub |
v0.10.1 | 2023-03-23 | View on GitHub |
v0.10.0 | 2023-02-07 | View on GitHub |
v0.9.6.1 | 2023-01-31 | View on GitHub |
v0.9.6 | 2023-01-13 | View on GitHub |
v0.9.5 | 2022-12-23 | View on GitHub |
v0.9.4 | 2022-12-20 | View on GitHub |
v0.9.3 | 2022-11-30 | View on GitHub |
v0.9.2 | 2022-11-04 | View on GitHub |
v0.9.1 | 2022-10-31 | View on GitHub |
v0.9.0 | 2022-10-11 | View on GitHub |
v0.8.45 | 2022-09-23 | View on GitHub |
v0.8.44 | 2022-09-01 | View on GitHub |
v0.8.43 | 2022-08-09 | View on GitHub |
v0.8.42 | 2022-08-03 | View on GitHub |
v0.8.41 | 2022-07-15 | View on GitHub |
v0.8.40 | 2022-06-30 | View on GitHub |
v0.8.39 | 2022-06-24 | View on GitHub |
v0.8.38 | 2022-06-09 | View on GitHub |
v0.13.3
Released on 2024-05-23 by @david-leifker.
DataHub Release Notes
User Experience
- NEW: Business Attributes: Business Attributes are used to standardize and manage data elements across multiple domains, projects, and applications. By linking dataset attributes to Business Attributes, organizations ensure uniformity and ease of updates, as changes made to a Business Attribute are automatically propagated across all linked datasets. #9863
- Improved UI for Dataset Properties: Added collapse functionality for long dataset properties, making it easier to navigate and view relevant information. [#10203](https://github.com/datahub-project/datahub/pull/10203)
- Pagination for Ingestion Tasks Listing: Added pagination to the tasks listing page, making it easier to manage and navigate through tasks. [#10293](https://github.com/datahub-project/datahub/pull/10293)
- Rich Text Support for Form Descriptions: Added support for rich text in form descriptions, enhancing the user experience. [#10425](https://github.com/datahub-project/datahub/pull/10425)
- New Analytics Charts: Added charts in the Analytics tab to identify Top Users and New Users. #10344
- Enhanced search functionality with customizable autocomplete configuration. [#10426](https://github.com/datahub-project/datahub/pull/10426)
Developer Experience
- Unified CI Workflow Updates: Improved CI build with unified workflow updates and disk space cleanup, making the build process more efficient. [#10353](https://github.com/datahub-project/datahub/pull/10353)
- Improved Logging for GraphQL Requests: Enhanced logging for GraphQL requests, providing better insights and debugging capabilities. [#10404](https://github.com/datahub-project/datahub/pull/10404)
- Enhanced Documentation for Lineage Feature Guide: Updated documentation for the lineage feature guide, making it easier to understand and implement. [#10401](https://github.com/datahub-project/datahub/pull/10401)
- Improved Documentation for SchemaField.label: Updated documentation for SchemaField.label, providing clearer guidance for developers. [#10251](https://github.com/datahub-project/datahub/pull/10251)
- Enhanced CI with Docker Image Publishing: Added Docker image publishing capabilities to the CI workflow, streamlining the deployment process. [#10193](https://github.com/datahub-project/datahub/pull/10193)
- Redesigned Docs Site Feedback Button: Improved the design of the feedback button in the documentation, making it more user-friendly. [#10182](https://github.com/datahub-project/datahub/pull/10182)
Metadata Ingestion
- Improved Data Profiling by early filtering of tables, correctly computing sample row counts, and combining unique count queries per table. #10378, #10319, #10322
- Airflow: Introduced support for `BigQueryInsertJobOperator`. #10452
- BigQuery: Added support for Table Clones and incremental column-level lineage.
- Snowflake: Improved reporting for usage aggregation and handled lineage errors; Improved ingestion performance with system sampling on very large tables. #10279, #10430
- Glue: Introduced support for delta schemas. [#10299](https://github.com/datahub-project/datahub/pull/10299)
- Redshift: Improved usage extraction by filtering out system queries. #10247
- Mode: Enhanced ingestion for Mode by adding dashboards into containers, improving data visualization and management. [#10563](https://github.com/datahub-project/datahub/pull/10563)
- PowerBI: Added support to automatically extract table lineage between PowerBI and Databricks. [#10416](https://github.com/datahub-project/datahub/pull/10416)
- dbt: Improved dbt ingestion by handling complex SQL and enhancing documentation, providing better data management and insights. [#10323](https://github.com/datahub-project/datahub/pull/10323)
- NiFi: Enhanced ingestion for NiFi with process group as browse path and incremental lineage, improving data organization and tracking. [#10202](https://github.com/datahub-project/datahub/pull/10202)
- Incubating Sigma and CockroachDB sources. #10037, #10226
Breaking Changes
- DynamoDB Connector: `aws_region` is now a required configuration. The connector will no longer loop through all AWS regions; instead, it will only use the region passed into the recipe configuration. [#10419](https://github.com/datahub-project/datahub/pull/10419)
- Custom Validators and Mutators: Dropped a previously required constructor. [#10389](https://github.com/datahub-project/datahub/pull/10389)
- FabricType RVW: Added as a new FabricType. Once metadata with this fabric type has been added, rolling back is not possible without manual cleanup in the databases. [#10472](https://github.com/datahub-project/datahub/pull/10472)
For full details on breaking changes, please refer to the updating DataHub documentation.
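As an illustration of the DynamoDB change listed above, the following is a minimal sketch of a pre-upgrade check on a recipe expressed as a Python dictionary. The `check_dynamodb_recipe` helper, the sample recipes, and the `us-west-2` value are hypothetical and not part of DataHub; the `source` / `type` / `config` layout is assumed to follow the usual recipe structure. Only the `aws_region` requirement itself comes from these notes.

```python
from typing import Any, Dict, List


def check_dynamodb_recipe(recipe: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a DynamoDB recipe.

    Hypothetical helper for illustration only; it is not a DataHub API.
    """
    problems: List[str] = []
    source = recipe.get("source", {})
    if source.get("type") != "dynamodb":
        # Not a DynamoDB recipe, so this breaking change does not apply.
        return problems
    config = source.get("config") or {}
    if not config.get("aws_region"):
        problems.append("source.config.aws_region is required as of v0.13.3")
    return problems


# A recipe that relied on the old behaviour of looping over all regions.
old_style = {"source": {"type": "dynamodb", "config": {}}}

# The same recipe with the region stated explicitly (example value).
new_style = {"source": {"type": "dynamodb", "config": {"aws_region": "us-west-2"}}}

print(check_dynamodb_recipe(old_style))
print(check_dynamodb_recipe(new_style))
```

A check of this kind could be run over existing recipes before upgrading, so that any recipe missing a region is found ahead of the first ingestion run on the new version.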
Contributors
A big thank you to all our contributors for this release!
First-Time Contributors
@bouaouda-achraf, @camilogutierrez, @dotan-mor, @egemenberk, @erikkvale, @guyr-ziprecruiter, @ishtartec, @jonasHanhan, @mrjefflewis, @noggi, @olgapenedo, @paguos, @richenc, @Rosmirose, @sagar-salvi-apptware, @timothyjin
Repeat Contributors
@ajoymajumdar, @deepgarg-visa, @dushayntAW, @filipe-caetano-ovo, @gaurav2733, @kevin1chun, @ksrinath, @Masterchen09, @mayurinehate, @ms32035, @Nelvin73, @rtekal, @sgomezvillamor, @shubhamjagtap639, @siladitya2, @skrydal
DataHub Maintainers
@anshbansal, @asikowitz, @chriscollins3456, @darnaut, @david-leifker, @eboneil, @gabe-lyons, @hsheth2, @jayacryl, @jjoyce0510, @RyanHolstien, @shirshanka, @sid-acryl, @treff7es, @yoonhyejin
Thank you all for your hard work and contributions!
What's Changed
- fix(ingest/bigquery): Supporting lineage extraction in case the select query result's target table is set on job by @treff7es in https://github.com/datahub-project/datahub/pull/10191
- fix(retention): fix time-based retention by @trialiya in https://github.com/datahub-project/datahub/pull/10118
- feat(lineage): give via and paths in entity lineage response by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10192
- fix(ingestion/datahub): implemented the filter to ignore/include URN for ingestion by @dushayntAW in https://github.com/datahub-project/datahub/pull/10174
- fix(ingestion/glue): fix to ingest the comment for partition key as description by @dushayntAW in https://github.com/datahub-project/datahub/pull/10189
- feat(ingest/looker): cleanup usage generation code by @hsheth2 in https://github.com/datahub-project/datahub/pull/10153
- fix(dev): fix env file overrides for profiles by @hsheth2 in https://github.com/datahub-project/datahub/pull/10194
- fix(ingestion/hive): ignore sampling for tagged column/table by @dushayntAW in https://github.com/datahub-project/datahub/pull/10096
- fix(ui/property): add collapse for long dataset properties by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10203
- saas release v0.3.1 release notes by @david-leifker in https://github.com/datahub-project/datahub/pull/10205
- fix(ingest/databricks): pin pandas for databricks ingestion by @mayurinehate in https://github.com/datahub-project/datahub/pull/10204
- Fixed issue where the custom defined aspects were missing from the API specification. by @ajoymajumdar in https://github.com/datahub-project/datahub/pull/10208
- feat(ingestion/transformer): Handle overlapping while mapping in extract ownership from tags transformer by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10201
- fix(build): avoid nested gradle commands by @hsheth2 in https://github.com/datahub-project/datahub/pull/10198
- feat(ingest/great_expectations): support in-memory (Pandas) data assets by @bouaouda-achraf in https://github.com/datahub-project/datahub/pull/9811
- ci(workflow): publish docker from pr with label by @david-leifker in https://github.com/datahub-project/datahub/pull/10193
- bump(version): bump classgraph version, add early package filter by @david-leifker in https://github.com/datahub-project/datahub/pull/10207
- fix(ingestion/mongodb): MongoDB source unable to parse datetimes with years > 9999 by @jonasHanhan in https://github.com/datahub-project/datahub/pull/10110
- fix(graphql-core): DomainEntitiesResolver does not support values FacetFilterInput parameter by @siladitya2 in https://github.com/datahub-project/datahub/pull/10188
- fix(graphql-core):Auto completion/suggestion of Domains are not working by @siladitya2 in https://github.com/datahub-project/datahub/pull/10150
- chore(usage-stats): measure time for getting buckets and aggregations by @darnaut in https://github.com/datahub-project/datahub/pull/10220
- test(search): introduce retry for search test by @david-leifker in https://github.com/datahub-project/datahub/pull/10206
- feat(ingest/bigquery): fix support for incremental column lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10222
- fix(ingest/dbt): better dbt timestamp parsing by @hsheth2 in https://github.com/datahub-project/datahub/pull/10223
- feat(ingest/sql): normalize bigquery partitioned tables when parsing by @hsheth2 in https://github.com/datahub-project/datahub/pull/10224
- docs: fix feedback button design by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10182
- docs: add discourse to community tab by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10181
- docs: edit the text and destination for sign up link by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10183
- fix(ingestion/datahub): moved urn_pattern config to source config by @dushayntAW in https://github.com/datahub-project/datahub/pull/10215
- fix(ingestion/airflow-plugin): ingesting the tags along with the data by @dushayntAW in https://github.com/datahub-project/datahub/pull/10216
- fix(ingest): suppress all column-level parsing errors by @hsheth2 in https://github.com/datahub-project/datahub/pull/10211
- fix(ci): unified workflow login logic by @david-leifker in https://github.com/datahub-project/datahub/pull/10235
- fix(lineage): fix lighting cache dataJob platform by @david-leifker in https://github.com/datahub-project/datahub/pull/10233
- feat(vianode): v3 of cll via datajob update by @david-leifker in https://github.com/datahub-project/datahub/pull/10221
- chore(build): bump actions versions by @david-leifker in https://github.com/datahub-project/datahub/pull/10240
- fix(ingest): avoid requiring sqlalchemy for dynamodb classification by @hsheth2 in https://github.com/datahub-project/datahub/pull/10213
- docs(cli/init): make datahub init docs more clear by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10245
- feat(ingest/redshift): filter out system queries from usage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10247
- feat(gql): support operationName by @hsheth2 in https://github.com/datahub-project/datahub/pull/10210
- fix(frontend): fix frontend script used in release checklist by @david-leifker in https://github.com/datahub-project/datahub/pull/10243
- docs(init): Update entrypoints.py to be more clear about acryl init by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10248
- fix(airflow): disable OL regardless of plugin status by @hsheth2 in https://github.com/datahub-project/datahub/pull/10250
- fix(ingestion/salesforce): added additional check for description by @dushayntAW in https://github.com/datahub-project/datahub/pull/10239
- feat(api): Add description parameter to editable dataset change entity event by @eboneil in https://github.com/datahub-project/datahub/pull/10237
- fix(ingest/bigquery): fix lineage if multiple sql expression passed in and destination table set by @treff7es in https://github.com/datahub-project/datahub/pull/10212
- feat(ingest/nifi): ingest process group as browse path v2, incremental lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/10202
- fix publish-datahub-jars workflow by @david-leifker in https://github.com/datahub-project/datahub/pull/10244
- fix(ingest/unity): Fix bug around unity notebook ingestion by @asikowitz in https://github.com/datahub-project/datahub/pull/10253
- feat(ingest/cockroachdb): add cockroachdb ingestion by @dotan-mor in https://github.com/datahub-project/datahub/pull/10226
- feat(ingestion/bigquery): support patterns for label -> tag capture by @olgapenedo in https://github.com/datahub-project/datahub/pull/10146
- feat(ingest/fivetran): use emails in owner user urns by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10229
- feat(cli): Make yaml loaders compatible with pydantic v2 by @eboneil in https://github.com/datahub-project/datahub/pull/10257
- fix(ingest): support pydantic v2 with properties subcommand by @hsheth2 in https://github.com/datahub-project/datahub/pull/10256
- feat(ingestion): Add `-e` flag to `uv` command in ingestion Dockerfiles by @skrydal in https://github.com/datahub-project/datahub/pull/10114
- fix(quickstart): remove unneeded init.sql by @darnaut in https://github.com/datahub-project/datahub/pull/10266
- fix(ingestion/airflow-plugin): replace deprecated calls by @ms32035 in https://github.com/datahub-project/datahub/pull/10238
- build(deps): bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/10109
- fix(metadata-io):Recently viewed, Recently Edited and Recently Searched section is missing in datahub home page by @siladitya2 in https://github.com/datahub-project/datahub/pull/10234
- Update datahub-executor docs by @noggi in https://github.com/datahub-project/datahub/pull/10263
- feat(access): Improve external role retrieval by @filipe-caetano-ovo in https://github.com/datahub-project/datahub/pull/10160
- fix(openapi): fix structured properties mapping by @david-leifker in https://github.com/datahub-project/datahub/pull/10260
- fix(authorization): fix restricted entity privmitives by @david-leifker in https://github.com/datahub-project/datahub/pull/10265
- fix(ingest/mongodb): schema_metadata referenced before assignment by @sid-acryl in https://github.com/datahub-project/datahub/pull/10169
- feat(ui/folder-structure-sort): sort folder structure alphabetically by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10268
- feat(ui/ingestion): add pagination on ingestion executions by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10269
- feat(access): Experimental policy debugger by @anshbansal in https://github.com/datahub-project/datahub/pull/9833
- feat(docs) Update updating-datahub.md for GA4 analytics change by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10196
- fix(docs): update docs for SchemaField.label by @hsheth2 in https://github.com/datahub-project/datahub/pull/10251
- feat(ingest): show custom model info by @hsheth2 in https://github.com/datahub-project/datahub/pull/10259
- fix(ingest/bigquery): Adding way to change api's batch size on schema init by @treff7es in https://github.com/datahub-project/datahub/pull/10255
- feat(ingest/mode): Mode improvements: by @treff7es in https://github.com/datahub-project/datahub/pull/10273
- fix(ingestion/powerbi): patch column lineage for powerbi report by @dushayntAW in https://github.com/datahub-project/datahub/pull/10270
- fix(ingestion/lite): An index with the name `aspect_idx` already exists … by @jonasHanhan in https://github.com/datahub-project/datahub/pull/10267
- feat(ingest/looker): browse path followups by @mayurinehate in https://github.com/datahub-project/datahub/pull/10217
- fix: revert signup page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10282
- feat: add posts to quickstart sample data by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10276
- fix(ingestion/transformer): tranformer to replace the externalUrl in dataset properties by @dushayntAW in https://github.com/datahub-project/datahub/pull/10281
- fix(ingestion/csv): add to support multiple ownership type for the sa… by @dushayntAW in https://github.com/datahub-project/datahub/pull/10287
- docs: update welcome acryl doc by @anshbansal in https://github.com/datahub-project/datahub/pull/10280
- feat(ui/backend/openapi/docs) : Add support for Business Attributes by @deepgarg-visa in https://github.com/datahub-project/datahub/pull/9863
- feat(ingest/sigma): Sigma connector integration by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10037
- feat(graph-retriever): implement graph retriever by @david-leifker in https://github.com/datahub-project/datahub/pull/10241
- fix(ingestion/scheduler): add extraArgs support for Ingestion Scheduler (e.g. for extra pip packages) by @Nelvin73 in https://github.com/datahub-project/datahub/pull/10195
- fix(spring): refactor spring configuration by @david-leifker in https://github.com/datahub-project/datahub/pull/10290
- fix(ingest): improve performance of get_allowed_list in AllowDenyPattern when dealing with large lists by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10219
- fix(oidc settings): use correct path for preferredJwsAlgorithm by @darnaut in https://github.com/datahub-project/datahub/pull/10302
- chore(ingest/presto-on-hive): Renaming presto-on-hive to hive-metastore source by @treff7es in https://github.com/datahub-project/datahub/pull/10278
- fix(ingest): disallow src.* imports, fix powerbi/sigma by @hsheth2 in https://github.com/datahub-project/datahub/pull/10292
- Cc fix broken cll impact analysis by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10303
- docs: add content describing diff between datahub and acryl datahub by @shirshanka in https://github.com/datahub-project/datahub/pull/10301
- docs: versions bump for 0.13.1 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10285
- doc(gms/scim): SCIM API user guide by @sid-acryl in https://github.com/datahub-project/datahub/pull/10311
- chore(docker): bump kafka docker base image by @david-leifker in https://github.com/datahub-project/datahub/pull/10313
- fix(ui) Show edited field descriptions in schema table by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10314
- chore(pyiceburg): set minimum version by @david-leifker in https://github.com/datahub-project/datahub/pull/10318
- fix(ingest/tableau): handle very large filter queries by @mayurinehate in https://github.com/datahub-project/datahub/pull/10295
- fix(ingest/databricks): handle and report config parse failure, updat… by @mayurinehate in https://github.com/datahub-project/datahub/pull/10261
- feat(ingest/airflow): support disabling iolet materialization by @hsheth2 in https://github.com/datahub-project/datahub/pull/10305
- feat(ingest/sigma): fix stateful ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/10321
- fix(ingest/profiling): compute sample row count correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/10319
- fix(ingest/transformers): Use set to store tags in AddDatasetTags by @asikowitz in https://github.com/datahub-project/datahub/pull/10317
- feat(views): apply views to homepage entity counts & recommendations by @ksrinath in https://github.com/datahub-project/datahub/pull/10283
- fix(ingest): make gms url configuration resilient in rest emitter by @anshbansal in https://github.com/datahub-project/datahub/pull/10316
- feat(ingest/profiling): allow unique count queries to be combined by @hsheth2 in https://github.com/datahub-project/datahub/pull/10322
- fix(ingest/kafka): clarify meta-mapping docs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10320
- feat(ingest): materialize terms produced by ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/10249
- openapi-v3 by @david-leifker in https://github.com/datahub-project/datahub/pull/9550
- chore(kafka-setup): bump kafka version by @david-leifker in https://github.com/datahub-project/datahub/pull/10329
- fix: make next as default version & create redirection by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10309
- feat(ui/tasks): add pagination on tasks listing page by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10293
- feat(ingest): mark acryl cloud package first-party for logging by @hsheth2 in https://github.com/datahub-project/datahub/pull/10334
- feat(ingest/classify): add pip dependency by @hsheth2 in https://github.com/datahub-project/datahub/pull/10335
- feat(ingest/metabase): add ability to exclude other users collections by @paguos in https://github.com/datahub-project/datahub/pull/10330
- chore(metadata) Addressing vulnerabilities by @rtekal in https://github.com/datahub-project/datahub/pull/10296
- fix(ingest/bigquery): set default `max_overflow` to -1 by @treff7es in https://github.com/datahub-project/datahub/pull/10342
- fix(auth-impl): handle empty entities in field resolver providers by @david-leifker in https://github.com/datahub-project/datahub/pull/10341
- feat(ingest): bump acryl-sqlglot dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10343
- fix(ingestion/transformer): updated transformer to avoid duplicating … by @dushayntAW in https://github.com/datahub-project/datahub/pull/10348
- feat(schema-registry): exclude schema reg onboot check from schema re… by @david-leifker in https://github.com/datahub-project/datahub/pull/10349
- fix(ingest/starburst): parse create_time datetime format by @ishtartec in https://github.com/datahub-project/datahub/pull/10345
- test(ingestion/sigma): Add integration test cases by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10356
- fix(ingestion/salesforce): escape markdown char for multiline description by @dushayntAW in https://github.com/datahub-project/datahub/pull/10351
- fix(mae): fix mae standalone platform consumer by @david-leifker in https://github.com/datahub-project/datahub/pull/10352
- fix(ingestion/qlik): Unable to ingest more than ten spaces by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10228
- docker(ingestion-base): set certificate location for python by @david-leifker in https://github.com/datahub-project/datahub/pull/10364
- build(ci): unified workflow update 1 by @david-leifker in https://github.com/datahub-project/datahub/pull/10353
- feat(ui): Adding new analytics charts for new users, top users past month by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/10344
- feat(ingestion/bigquery): support for table clones by @camilogutierrez in https://github.com/datahub-project/datahub/pull/10274
- build(ingest): update base requirements file by @anshbansal in https://github.com/datahub-project/datahub/pull/10368
- feat(ingest/mssql): improve docs on using odbc by @mrjefflewis in https://github.com/datahub-project/datahub/pull/10370
- feat(ingest/dbt): handle complex dbt sql + improve docs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10323
- feat:(entity-registry): add ability to search for float and double by @Rosmirose in https://github.com/datahub-project/datahub/pull/10324
- fix(hazelcast): fix cache value classloading by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10373
- docs(business-attribute):add info businessAttributeEntityEnable flag by @deepgarg-visa in https://github.com/datahub-project/datahub/pull/10379
- fix(ingest/bigquery): map date types correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/10383
- feat(ingest/dbt): use columns from manifest as a fallback by @hsheth2 in https://github.com/datahub-project/datahub/pull/10374
- fix(ingest/profiling): Filter tables early based on profile pattern filter by @treff7es in https://github.com/datahub-project/datahub/pull/10378
- feat(ingest/dbt): support a `datahub` section in meta mappings by @hsheth2 in https://github.com/datahub-project/datahub/pull/10371
- docs(observe): update docs for remote executor, databricks by @mayurinehate in https://github.com/datahub-project/datahub/pull/10393
- fix(graphql) Fix entity type filter clash with legacy filters by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10362
- fix(backend): do not lower-case dataset key parts i.e. data platform … by @ksrinath in https://github.com/datahub-project/datahub/pull/10385
- docs(search): document default search operator by @darnaut in https://github.com/datahub-project/datahub/pull/10397
- fix: add redirection for the past versions by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10395
- feat: add keywords for SEO by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10358
- docs: add slack utm component in docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10214
- perf(ingestion/fivetran): Connector performance optimization by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10346
- feat(graphql): Improve logging of GraphQL requests by @darnaut in https://github.com/datahub-project/datahub/pull/10404
- fix(ingest): map bigquery nested types properly by @hsheth2 in https://github.com/datahub-project/datahub/pull/10409
- fix(ingestion/looker): fix lineage for dimension group column by @sid-acryl in https://github.com/datahub-project/datahub/pull/10382
- feat(metabase): add stateful ingestion by @paguos in https://github.com/datahub-project/datahub/pull/10360
- docs(apis): Update datahub-apis.md to add link to search example by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10412
- feat(graphql): log query name if operation name is not provided by @darnaut in https://github.com/datahub-project/datahub/pull/10420
- DynamoDB IAM auth by @eboneil in https://github.com/datahub-project/datahub/pull/10419
- fix(ingest/bigquery): Fixing double sanitization of urns by @treff7es in https://github.com/datahub-project/datahub/pull/10386
- fix(ingestion/transformer): new transformer to clean user URN for DatasetUsageStatistics aspect by @dushayntAW in https://github.com/datahub-project/datahub/pull/10398
- fix(ingestion/airflow-plugin): emit the operation aspect by @dushayntAW in https://github.com/datahub-project/datahub/pull/10402
- feat(search): allow overriding case-sensitivity to zero by @david-leifker in https://github.com/datahub-project/datahub/pull/10422
- fix(ci): add labeled to list of pr types for ci by @david-leifker in https://github.com/datahub-project/datahub/pull/10363
- docs(ingest): update datahub sink doc to include an acryl example by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10411
- feat(ui) Support rich text for form descriptions by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10425
- feat(auth): improve authentication flow logging by @darnaut in https://github.com/datahub-project/datahub/pull/10428
- feat(upgrade): common base for mcl upgrades by @david-leifker in https://github.com/datahub-project/datahub/pull/10429
- feat(search): autocomplete custom configuration by @david-leifker in https://github.com/datahub-project/datahub/pull/10426
- fix(upgrade): fix upgrade npe by @david-leifker in https://github.com/datahub-project/datahub/pull/10436
- fix(docker): use distinct empty env files by @hsheth2 in https://github.com/datahub-project/datahub/pull/10438
- feat(ingest/snowflake): use system sampling on very large tables by @hsheth2 in https://github.com/datahub-project/datahub/pull/10430
- fix(ingest/bigquery): remove last modified timestamp fallback by @hsheth2 in https://github.com/datahub-project/datahub/pull/10431
- feat(cli): cache sql parsing intermediates by @hsheth2 in https://github.com/datahub-project/datahub/pull/10399
- docs: fix blog link by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10441
- fix(ingestion/tableau): Fix tableau custom sql lineage gap by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10359
- fix(changeEvents): add description-parameter to the change-event of a schemaField-description by @ksrinath in https://github.com/datahub-project/datahub/pull/10414
- feat(ci): add linting for cypress tests by @anshbansal in https://github.com/datahub-project/datahub/pull/10424
- feat(spark/openlineage): Openlineage 1.13.1 upgrade by @treff7es in https://github.com/datahub-project/datahub/pull/10433
- feat(ingestion): Copy urns from previous checkpoint state on ingestion failure by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10347
- fix(ingest/snowflake): add more reporting for usage aggregation, handle lineage errors by @mayurinehate in https://github.com/datahub-project/datahub/pull/10279
- feat(docker): Enable and expose Jetty statistics by @darnaut in https://github.com/datahub-project/datahub/pull/10448
- fix(ingest/mode): Improve query lineage by @treff7es in https://github.com/datahub-project/datahub/pull/10284
- feat(ingest): add actorUrn for ingestion through UI by @anshbansal in https://github.com/datahub-project/datahub/pull/10447
- fix(ingestion/airflow-plugin): warning log for non-materialized iolets by @dushayntAW in https://github.com/datahub-project/datahub/pull/10421
- fix(ingestion/salesforce): handle the label with none value scenario by @dushayntAW in https://github.com/datahub-project/datahub/pull/10446
- fix(ingestion): Explicitly set requirement on snowflake-connector-python to be newer or equal to 3.4.0 by @skrydal in https://github.com/datahub-project/datahub/pull/10445
- perf(ingest): speed up urn encode happy path by @hsheth2 in https://github.com/datahub-project/datahub/pull/10451
- feat(ingest/tableau): Fetch Upstreams From Columns by @egemenberk in https://github.com/datahub-project/datahub/pull/9874
- docs(ingest): fix typos and clarify ingestion recipe docs by @guyr-ziprecruiter in https://github.com/datahub-project/datahub/pull/10405
- fix(patch): update json patch library by @david-leifker in https://github.com/datahub-project/datahub/pull/10449
- fix(metadata-service): add PE processor to component scan by @darnaut in https://github.com/datahub-project/datahub/pull/10462
- fix(ingestion/airflow-plugin): bumping up the openlineage-airflow version by @dushayntAW in https://github.com/datahub-project/datahub/pull/10457
- fix(ingest/tableau): catch exception during sign out by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/10459
- fix(ingest/dbt): failures due to API change by @anshbansal in https://github.com/datahub-project/datahub/pull/10467
- fix(ingestion/kafka-connect): fixed the issue with ingestion requiring multiple substitutes by @dushayntAW in https://github.com/datahub-project/datahub/pull/10443
- feat(ingest/cli): add some URNs per aspect for easier debugging by @anshbansal in https://github.com/datahub-project/datahub/pull/10468
- fix(ingest/dbt): Adding fix if dbt data type is null by @treff7es in https://github.com/datahub-project/datahub/pull/10471
- fix(docs): adjust new requirements for DynamoDB ingestion by @darnaut in https://github.com/datahub-project/datahub/pull/10470
- feat(ingest/redshift): add timers for lineage v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/10460
- feat(fabricType): add fabric type RVW by @eboneil in https://github.com/datahub-project/datahub/pull/10472
- feat(structured-properties): immutable flag by @david-leifker in https://github.com/datahub-project/datahub/pull/10461
- fix(docker): mount newly added jetty-jmx.xml by @darnaut in https://github.com/datahub-project/datahub/pull/10475
- feat(plugins): spring custom plugins by @david-leifker in https://github.com/datahub-project/datahub/pull/10389
- docs(impact analysis): Add column level impact analysis graphql example by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10427
- fix(entity-registry): fix plugin load error by @david-leifker in https://github.com/datahub-project/datahub/pull/10476
- fix(openapi): fix lookupAspectSpec by @david-leifker in https://github.com/datahub-project/datahub/pull/10478
- fix(openapi-v3): comprehensive aspect name casing fix by @david-leifker in https://github.com/datahub-project/datahub/pull/10484
- feat(ingest/slack): Support profile ingestion using users:info by @asikowitz in https://github.com/datahub-project/datahub/pull/10410
- docs: fix docs utms & slack footer by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10481
- feat(docs): Updating assertion docs + adding schema assertion doc by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/10473
- fix(misc): misc fixes for OSS release by @david-leifker in https://github.com/datahub-project/datahub/pull/10493
- docs: sort feature section alphabetically by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10400
- docs(developers): add section regarding symbolic links on Windows 10/11 to developer's guide by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10487
- fix(ingestion/transformer): Add dataset domains based on tags using transformer by @sagar-salvi-apptware in https://github.com/datahub-project/datahub/pull/10458
- chore(ingest/presto-on-hive) Set enable_properties_merge to True by default by @dushayntAW in https://github.com/datahub-project/datahub/pull/10469
- fix(ci): documentation build fix by @anshbansal in https://github.com/datahub-project/datahub/pull/10507
- docs: 0.3.2 Acryl by @anshbansal in https://github.com/datahub-project/datahub/pull/10377
- feat(ingest/tableau): support platform instance mapping based off database server hostname by @richenc in https://github.com/datahub-project/datahub/pull/10254
- fix(ingestion/looker): deduplicate the view field by @sid-acryl in https://github.com/datahub-project/datahub/pull/10482
- fix(graphql): Support querying Posts and Queries by @asikowitz in https://github.com/datahub-project/datahub/pull/10502
- fix(ebean): fix auto-closeable ebean dao streams by @david-leifker in https://github.com/datahub-project/datahub/pull/10506
- feat(ingest/airflow): support BigQueryInsertJobOperator by @hsheth2 in https://github.com/datahub-project/datahub/pull/10452
- fix(ingest): avoid using `_inner_dict` in urn iterator by @hsheth2 in https://github.com/datahub-project/datahub/pull/10492
- fix(ingest/snowflake): use block sampling more conservatively by @hsheth2 in https://github.com/datahub-project/datahub/pull/10494
- feat(sdk): add DataHubGraph.get_timeseries_values() method by @hsheth2 in https://github.com/datahub-project/datahub/pull/10501
- fix(mcp): fix mcp key aspect by @david-leifker in https://github.com/datahub-project/datahub/pull/10503
- fix(ingest): fix bug in incremental lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10515
- chore(ingest): run pyupgrade for python 3.8 by @hsheth2 in https://github.com/datahub-project/datahub/pull/10513
- docs: update cli recommendation by @anshbansal in https://github.com/datahub-project/datahub/pull/10518
- Wrap non-required $ref properties in an object to mark as nullable by @timothyjin in https://github.com/datahub-project/datahub/pull/10514
- Fix formatting for #10514 by @timothyjin in https://github.com/datahub-project/datahub/pull/10525
- feat(ingestion/glue): delta schemas by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/10299
- fix(ingestion/snowflake): fix dataclass defaults for SnowflakeReport by @ms32035 in https://github.com/datahub-project/datahub/pull/10529
- Security/CWE 200 graphql introspection toggle by @erikkvale in https://github.com/datahub-project/datahub/pull/10531
- feat(neo4j): neo4j pagination as per v2 scrollApi for related entities by @deepgarg-visa in https://github.com/datahub-project/datahub/pull/10537
- docs: add api templates by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10521
- fix(ingestion/powerbi): handle special character #(tab) in native query parsing by @sid-acryl in https://github.com/datahub-project/datahub/pull/10520
- OpenAPI v3 Spec bug fixes: by @kevin1chun in https://github.com/datahub-project/datahub/pull/10548
- fix(assertions) aligned graphql AssertionType definition with the AssertionType defined in metadata-models by @jayacryl in https://github.com/datahub-project/datahub/pull/10534
- fix(smoke-test): pin requests to 2.31.0 by @darnaut in https://github.com/datahub-project/datahub/pull/10549
- fix(ingest/dbt): improve handling for CLL via ephemeral nodes by @hsheth2 in https://github.com/datahub-project/datahub/pull/10535
- feat(connections) Add Connection entity type and graphql endpoints by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10550
- doc(gms/scim-api): fix title and add overview by @sid-acryl in https://github.com/datahub-project/datahub/pull/10388
- docs: add guides on forms & structured properties by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10340
- fix(graphl): fix introspection setting by @david-leifker in https://github.com/datahub-project/datahub/pull/10560
- feat(ingest): bump acryl-sqlglot dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10554
- feat(ingest): auto-fix duplicate schema fieldPaths by @hsheth2 in https://github.com/datahub-project/datahub/pull/10526
- refactor(ingest): defer ctx.graph initialization by @hsheth2 in https://github.com/datahub-project/datahub/pull/10504
- consider all values of FabricType enum in DatasetUrn util by @kevin1chun in https://github.com/datahub-project/datahub/pull/10564
- fix(ingest/airflow): fix support for bigquery insert job operator by @hsheth2 in https://github.com/datahub-project/datahub/pull/10567
- fix(ingest/mode): Adding Dashboards into containers by @treff7es in https://github.com/datahub-project/datahub/pull/10563
- feat: update lineage feature guide by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10401
- docs: improve lineage docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10396
- fix(ingestion/powerbi): Databricks support for table lineage by @sid-acryl in https://github.com/datahub-project/datahub/pull/10416
- fix(ingest/dbt): resolve more dbt ephemeral node lineage gaps by @hsheth2 in https://github.com/datahub-project/datahub/pull/10553
- fix(ui) Fix preventing users from deleting personal views by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10510
- fix(lint): fix linting by @david-leifker in https://github.com/datahub-project/datahub/pull/10572
- build(jar): enable custom plugin lib by @david-leifker in https://github.com/datahub-project/datahub/pull/10552
New Contributors
- @bouaouda-achraf made their first contribution in https://github.com/datahub-project/datahub/pull/9811
- @jonasHanhan made their first contribution in https://github.com/datahub-project/datahub/pull/10110
- @dotan-mor made their first contribution in https://github.com/datahub-project/datahub/pull/10226
- @olgapenedo made their first contribution in https://github.com/datahub-project/datahub/pull/10146
- @paguos made their first contribution in https://github.com/datahub-project/datahub/pull/10330
- @ishtartec made their first contribution in https://github.com/datahub-project/datahub/pull/10345
- @camilogutierrez made their first contribution in https://github.com/datahub-project/datahub/pull/10274
- @mrjefflewis made their first contribution in https://github.com/datahub-project/datahub/pull/10370
- @Rosmirose made their first contribution in https://github.com/datahub-project/datahub/pull/10324
- @guyr-ziprecruiter made their first contribution in https://github.com/datahub-project/datahub/pull/10405
- @sagar-salvi-apptware made their first contribution in https://github.com/datahub-project/datahub/pull/10458
- @timothyjin made their first contribution in https://github.com/datahub-project/datahub/pull/10514
- @erikkvale made their first contribution in https://github.com/datahub-project/datahub/pull/10531
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.13.1...v0.13.3
v0.13.2
Released on 2024-04-16 by @david-leifker.
Hotfix Release
Fixes an MCL message deserialization bug that occurs when using the internal schema registry and running specific upgrade jobs:
policyFields (enabled by default):
BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_ENABLED:true
dataJobNodeCLL (disabled by default):
BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_ENABLED:false
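To see which of these upgrade jobs would run in a given environment, the flags and their stated defaults can be resolved as in the sketch below. This is a hypothetical helper, not DataHub code: the function name, the table of flags, and the assumption that values are the strings "true"/"false" (compared case-insensitively) are illustrative only.

```python
import os
from typing import List, Mapping

# Flag names and defaults exactly as listed in these release notes.
UPGRADE_JOB_FLAGS = {
    "policyFields": ("BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_ENABLED", True),
    "dataJobNodeCLL": ("BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_ENABLED", False),
}


def enabled_upgrade_jobs(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of the upgrade jobs that are switched on in `env`."""
    enabled = []
    for job, (var, default) in UPGRADE_JOB_FLAGS.items():
        raw = env.get(var)
        # Assumption: an unset variable falls back to the documented default,
        # and any value other than "true" counts as disabled.
        is_on = default if raw is None else raw.strip().lower() == "true"
        if is_on:
            enabled.append(job)
    return enabled
```

Calling `enabled_upgrade_jobs({})` with an empty mapping reflects the defaults above, where only policyFields is enabled.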
Example Error:
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 1
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 13 out of bounds for length 2
at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:460)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:188)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:260)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248)
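One way to check whether a deployment is hitting this bug is to scan the consumer logs for the two exception lines shown above. The sketch below is a hypothetical helper (the function and constant names are illustrative and not part of DataHub), and it assumes the logs are available as plain text lines.

```python
from typing import Iterable

# Substrings taken from the example error above.
ERROR_SIGNATURES = (
    "org.apache.kafka.common.errors.SerializationException: "
    "Error deserializing Avro message",
    "java.lang.ArrayIndexOutOfBoundsException",
)


def shows_mcl_deserialization_error(log_lines: Iterable[str]) -> bool:
    """Return True only if every signature appears somewhere in the log."""
    seen = set()
    for line in log_lines:
        for signature in ERROR_SIGNATURES:
            if signature in line:
                seen.add(signature)
    return len(seen) == len(ERROR_SIGNATURES)
```

Requiring both signatures keeps the check conservative: a SerializationException alone can have other causes.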
Recovery Directions:
If currently affected, please remove the topic prior to upgrading to v0.13.2 to remove the corrupted message. The default topic name is MetadataChangeLog_Versioned_v1; however, if you've customized the topic name, be sure to remove that topic instead.
If you are running Kafka per the example Helm chart for prerequisites, the following command will delete the topic:
kubectl exec -it prerequisites-kafka-broker-0 -c kafka -- kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic MetadataChangeLog_Versioned_v1
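The command above uses the default topic name. For a customized topic name, the same command can be assembled programmatically, as in this minimal sketch. The function name and its parameters are hypothetical; the pod, container, and bootstrap server defaults are simply copied from the command above and may differ in your cluster.

```python
import shlex
from typing import List

DEFAULT_TOPIC = "MetadataChangeLog_Versioned_v1"


def delete_topic_command(
    topic: str = DEFAULT_TOPIC,
    pod: str = "prerequisites-kafka-broker-0",
    container: str = "kafka",
    bootstrap_server: str = "localhost:9092",
) -> List[str]:
    """Build the kubectl argv that deletes `topic` from the Kafka broker pod."""
    return [
        "kubectl", "exec", "-it", pod, "-c", container, "--",
        "kafka-topics.sh",
        "--bootstrap-server", bootstrap_server,
        "--delete", "--topic", topic,
    ]


if __name__ == "__main__":
    # Print a shell-safe command line for review before running it.
    print(shlex.join(delete_topic_command()))
```

Returning an argv list rather than a single string means the result can be passed to `subprocess.run` without shell quoting concerns, while `shlex.join` produces a copy-pasteable form for review.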
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.13.1...v0.13.2
v0.13.1
Released on 2024-04-02 by @david-leifker.
DataHub Release Notes
User Experience
- Capture and Manage Common Joins between Datasets: Users can now view and manage common join relationships between datasets, making it easier than ever to capture best practices and bespoke join logic. Watch the walkthrough here! 8325
  - Heads up: you'll need to enable the `ER_MODEL_RELATIONSHIP_FEATURE_ENABLED` env variable to use this feature!
- Enhanced UI Interactions: Users can now enjoy an improved markdown editor and filter policies by active/inactive statuses, resulting in a more intuitive and manageable interface. 9949, 9958
- Visual Context for Groups: You can now include picture links for groups in the UI, adding a richer visual context and enhancing the navigational experience. 9882
- Improved Error Visibility: The UI now displays error messages related to data size limitations, allowing for better troubleshooting and user experience. 10038
Developer Experience
- Enhanced Kafka Compatibility: Updated client version for Kafka setup ensures better compatibility and functionality for developers. 9962
- Optimized Docker Build: Docker setups now respect pip mirrors, optimizing the build process especially in restricted network environments. 9963
- Advanced Error Handling: New error handling for duplicate class names and improved `fspath` lint error management enhance code reliability and quality. 9960, 9976
- Latest OpenSearch Image: Incorporation of OpenSearch image version 2.11.0 aligns with the latest stable releases, boosting performance and security. 9984
Metadata Ingestion
- NEW: Dagster Integration: You can now seamlessly ingest your Dagster Pipelines, Jobs, Ops, and lineage into DataHub. 10071
- Expanded Field Classification Support: This release introduces support for field-level classification during ingestion for Redshift, BigQuery, DynamoDB, and SQL Sources. 10013, 10031
- Enhanced Ingestion Capabilities: DataHub now offers stateful ingestion by default, optimizing routines for REST sinks and improving metadata accuracy across diverse sources like dbt and BigQuery. 9934, 10158, 10080
- Better Data Lineage: This release introduces support for OpenLineage in service of the Spark Lineage Beta Plugin; additionally, we now support incremental Column-Level Lineage, improving the accuracy of detecting column-level relationships during ingestion. 9870, 9967, 10090
- Schema Clarity: New descriptions support for JSON schema arrays and a mechanism to escape special characters in BigQuery table descriptions aid in clearer schema validation and ingestion processes. Databricks ingestion now supports Hive Metastore schemas with special characters. 9757, 9932, 10049
Version Upgrades
- Kafka client and OpenSearch image were updated to the latest versions.
Breaking Changes
This release introduces default settings for stateful ingestion and updates in handling dbt ingestion. For details on all breaking changes, view the full documentation here.
Contributors
MASSIVE shoutout to our contributors!
First-Time Contributors
akarsh991, alexs-101, AvaniSiddhapuraAPT, diegmonti, dushayntAW, filipe-caetano-ovo, HuanjieGuo, jayacryl, k7ragav, kopax-polyconseil, LePuppy, Nelvin73, pinakipb2, poorvi767, rae89, trialiya, valeral.
Repeat Contributors
ANich, shubhamjagtap639, sgomezvillamor, siladitya2, skrydal, sumitappt, Masterchen09, mayurinehate, ngamanda, gaurav2733, githendrik, jayasimhankv.
DataHub Maintainers
anshbansal, asikowitz, chriscollins3456, darnaut, david-leifker, eboneil, ethan-cartwright, gabe-lyons, hsheth2, pedro93, RyanHolstien, treff7es, yoonhyejin.
What's Changed
- bump(kafka-setup): client version bump by @david-leifker in https://github.com/datahub-project/datahub/pull/9962
- feat(ingest): throw codegen error on duplicate class names by @hsheth2 in https://github.com/datahub-project/datahub/pull/9960
- feat(docker): respect pip mirrors with uv by @hsheth2 in https://github.com/datahub-project/datahub/pull/9963
- Openlineage endpoint and Spark Lineage Beta Plugin by @treff7es in https://github.com/datahub-project/datahub/pull/9870
- fix(ingest/json-schema): adding support descriptions for array by @AvaniSiddhapuraAPT in https://github.com/datahub-project/datahub/pull/9757
- fix(ingest/redshift): fix bug in lineage v2 table renames by @hsheth2 in https://github.com/datahub-project/datahub/pull/9967
- feat(ingest): speed up to_obj() and validate() by @hsheth2 in https://github.com/datahub-project/datahub/pull/9969
- feat(ingest): fix fspath lint error by @hsheth2 in https://github.com/datahub-project/datahub/pull/9976
- docs: archive old version before 0.12.0 & fix broken links by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9957
- fix(ui/markdown-editor): arrows change field when editing description… by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9949
- feat(ui/policies): add filter for Active/Inactive/All on policy page by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9958
- feat(ui): add option to add picture link for groups by @akarsh991 in https://github.com/datahub-project/datahub/pull/9882
- feat(ingest): add Looks subtype + stop reemitting browsePathV2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/9978
- fix(ingest/bigquery): escape special characters for table descriptions by @AvaniSiddhapuraAPT in https://github.com/datahub-project/datahub/pull/9932
- feat(ui): add loading spin to access management table by @filipe-caetano-ovo in https://github.com/datahub-project/datahub/pull/9974
- fix(ingestion/fivetran): Fix fivetran get connector jobs bug by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9975
- feat(ingest/dbt): generate CLL for all node types by @hsheth2 in https://github.com/datahub-project/datahub/pull/9964
- chore(search): bump OpenSearch image version to 2.11.0 by @darnaut in https://github.com/datahub-project/datahub/pull/9984
- feat(ingest): enable stateful_ingestion by default for DataHub rest sink by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/9934
- feat(ingestion/cli): Adding check option to validate allow/deny and path_specs by @treff7es in https://github.com/datahub-project/datahub/pull/9983
- fix(ingest): only import PathSpec when necessary by @hsheth2 in https://github.com/datahub-project/datahub/pull/9989
- feat(config): add configuration to reprocess UI sourced events by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9988
- feat(pluginRegistry): add configuration to reduce runnable frequency by @RyanHolstien in https://github.com/datahub-project/datahub/pull/9990
- build(react): Fix typescript errors in test files by @sumitappt in https://github.com/datahub-project/datahub/pull/9982
- feat(docs): disable last update timestamps by @hsheth2 in https://github.com/datahub-project/datahub/pull/9987
- feat: add versioned content for 0.12.1 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9944
- doc: add version 0.13.0 by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9991
- fix: fix mobile view and subtitles on slack/calendar page by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9822
- fix(ingest/redshift): fix stl scan lineage for lineage v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/9986
- fix(ingest/delta-lake): support parsing nested types correctly by @dushayntAW in https://github.com/datahub-project/datahub/pull/9862
- fix(test): nested domains by @david-leifker in https://github.com/datahub-project/datahub/pull/9993
- fix(ci): refactor build-and-test command by @hsheth2 in https://github.com/datahub-project/datahub/pull/9999
- feat(ingest/snowflake): generate query nodes for snowflake by @mayurinehate in https://github.com/datahub-project/datahub/pull/9966
- fix(ingest/unity): creating group urn in case of group by @dushayntAW in https://github.com/datahub-project/datahub/pull/9951
- fix(ui/left-side-bar): hide data products option in left side bar by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10001
- feat(ingest/redshift): make query generation configurable by @hsheth2 in https://github.com/datahub-project/datahub/pull/10000
- fix(opensearch): Rollover usage events at a file size rather than time-based manner by @darnaut in https://github.com/datahub-project/datahub/pull/10006
- chore(java): bump java dependency versions by @david-leifker in https://github.com/datahub-project/datahub/pull/10009
- ci(react): Update package.json to enable lint check by @sumitappt in https://github.com/datahub-project/datahub/pull/10011
- fix(ui/ingest): trim leading and trailing whitespaces from the text f… by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10012
- fix(policy-backfull): fix policy backfill job by @david-leifker in https://github.com/datahub-project/datahub/pull/10016
- feat(opensearch): support for updating ISM policy used for usage events by @darnaut in https://github.com/datahub-project/datahub/pull/10018
- refactor(react): Provide option to skip importing theme in CustomThemeProvider; rearrange toplevel components by @asikowitz in https://github.com/datahub-project/datahub/pull/9940
- fix(openapi): fix openapi openlineage endpoint by @david-leifker in https://github.com/datahub-project/datahub/pull/10019
- feat(ingest): update sqlglot fork by @hsheth2 in https://github.com/datahub-project/datahub/pull/10022
- feat(ingest/superset): map awsathena platform name to athena by @LePuppy in https://github.com/datahub-project/datahub/pull/10005
- fix(ingest/redshift): patch instead of replace redshift custom properties by @ethan-cartwright in https://github.com/datahub-project/datahub/pull/9293
- fix(ingest/slack): tweak docs for slack source by @hsheth2 in https://github.com/datahub-project/datahub/pull/10007
- fix(ingest): use contextvar for cooperative timeout by @hsheth2 in https://github.com/datahub-project/datahub/pull/10021
- feat(ingest): improve custom package metadata by @hsheth2 in https://github.com/datahub-project/datahub/pull/9985
- feat(docs): build website using swc-loader instead of babel by @hsheth2 in https://github.com/datahub-project/datahub/pull/9977
- feat(ingest): add query formatting to sql aggregator by @hsheth2 in https://github.com/datahub-project/datahub/pull/10025
- feat(ingest): add DataHubGraph.emit_all method by @hsheth2 in https://github.com/datahub-project/datahub/pull/10002
- feat(ingestion): Support for Server-less Redshift by @skrydal in https://github.com/datahub-project/datahub/pull/9998
- fix(ingest/teradata): small teradata improvements by @treff7es in https://github.com/datahub-project/datahub/pull/9953
- feat(ingest): add classification for sql sources by @mayurinehate in https://github.com/datahub-project/datahub/pull/10013
- docs(monitoring): add health check endpoint by @kopax-polyconseil in https://github.com/datahub-project/datahub/pull/10033
- feat(ingest/dbt): capture both raw and compiled code by @hsheth2 in https://github.com/datahub-project/datahub/pull/10026
- fix(ingest/redshift): Temp table lineage fix by @treff7es in https://github.com/datahub-project/datahub/pull/10008
- feat(ingest): utilities for query logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10036
- docs: add missing api sample docs by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9869
- feat(gms): add aspect name to siblings hook log by @hsheth2 in https://github.com/datahub-project/datahub/pull/10044
- feat(ingest): add classification to bigquery, redshift by @mayurinehate in https://github.com/datahub-project/datahub/pull/10031
- fix(ui/lineage): show data is too large error when limitation exceeds by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10038
- feat(ci): exempt more names from community by @mayurinehate in https://github.com/datahub-project/datahub/pull/10039
- docs: improve versiondropdown design & set docs main to /features by @yoonhyejin in https://github.com/datahub-project/datahub/pull/9994
- fix(ingest/redshift): tweak lineage v2 queries by @hsheth2 in https://github.com/datahub-project/datahub/pull/10045
- chore(aws-msk-iam-auth): bump dependency version by @darnaut in https://github.com/datahub-project/datahub/pull/10063
- feat(lineage): add priority to via node by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10034
- docs(acryl-cloud): notes for 0.2.16 by @anshbansal in https://github.com/datahub-project/datahub/pull/10069
- fix(ingest/unity-catalog): generate sibling and lineage by @dushayntAW in https://github.com/datahub-project/datahub/pull/9894
- fix(ingest): only auto-enable stateful ingestion if pipeline name is set by @hsheth2 in https://github.com/datahub-project/datahub/pull/10075
- feat(ingest/s3): set default spark version by @hsheth2 in https://github.com/datahub-project/datahub/pull/10057
- feat(ingest): better rest emitter error message by @hsheth2 in https://github.com/datahub-project/datahub/pull/10073
- docs(sdk): Update API guide with example for Acryl by @gabe-lyons in https://github.com/datahub-project/datahub/pull/10072
- feat(ingest): check for private import path usages by @hsheth2 in https://github.com/datahub-project/datahub/pull/10059
- feat(ingest): add sql formatter utility by @hsheth2 in https://github.com/datahub-project/datahub/pull/10064
- feat(ingest): refactor LineageConfig class by @hsheth2 in https://github.com/datahub-project/datahub/pull/10074
- feat(ingest/dbt): point dbt assertions at dbt nodes by @hsheth2 in https://github.com/datahub-project/datahub/pull/10055
- feat(dbt): show source and compiled code in the UI by @hsheth2 in https://github.com/datahub-project/datahub/pull/10028
- feat(ui/ingest): ingestion form for Okta and AzureAD by @gaurav2733 in https://github.com/datahub-project/datahub/pull/9829
- Update domains docs to include nested domains by @eboneil in https://github.com/datahub-project/datahub/pull/9890
- fix(ingestion): Handle Redshift string length limit in Serverless mode by @skrydal in https://github.com/datahub-project/datahub/pull/10051
- build(deps): bump follow-redirects from 1.15.4 to 1.15.6 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/10060
- build(deps): bump es5-ext from 0.10.62 to 0.10.63 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/9927
- fix(lineage): fix array out of bounds error by @david-leifker in https://github.com/datahub-project/datahub/pull/10081
- Add owners, tags, glossary terms to dataset yaml loader by @eboneil in https://github.com/datahub-project/datahub/pull/9859
- Add rate limiting to slack source by @eboneil in https://github.com/datahub-project/datahub/pull/10082
- fix(metadata-ingestion)glue connector failure when Optional field Type of PartitionKey is absent for a Table by @siladitya2 in https://github.com/datahub-project/datahub/pull/10052
- feat(redshift): adds flag to skip all external tables by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/10040
- feat(models) : Joins (Datasets) schema, resolvers and UI by @poorvi767 in https://github.com/datahub-project/datahub/pull/8325
- feat(properties) Add upsertStructuredProperties graphql endpoint for assets by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9906
- Clean up logic for dataset.py yaml loader by @eboneil in https://github.com/datahub-project/datahub/pull/10089
- feat(ingest/dbt): add option to skip sources by @hsheth2 in https://github.com/datahub-project/datahub/pull/10077
- feat(ingest): support incremental column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10090
- feat(ingest/powerbi): add chart subtypes by @hsheth2 in https://github.com/datahub-project/datahub/pull/10076
- fix(ingest/metabase): Use connect_uri instead of display_uri to query Metabase API by @diegmonti in https://github.com/datahub-project/datahub/pull/9996
- feat(tableau): ability to force extraction of table/column level linage from SQL queries by @alexs-101 in https://github.com/datahub-project/datahub/pull/9838
- feat(ingest/datahub-gc): gc source to cleanup things by @anshbansal in https://github.com/datahub-project/datahub/pull/10085
- docs(acryl-cloud): fix year in notes from 2023 to 2024 by @anshbansal in https://github.com/datahub-project/datahub/pull/10095
- feeat(openapi): add batch endpoint to v2 using requestbody by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10100
- fix(ingest/dbt): fix config validator for skip_sources_in_lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/10098
- docs: add gtm tag by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10083
- docs: add doc for assertions & data contracts by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10029
- test(ingest/mssql): use non-ephemeral mapping port by @hsheth2 in https://github.com/datahub-project/datahub/pull/10104
- fix(ingestion/unity-catalog): patch owners and properties by @dushayntAW in https://github.com/datahub-project/datahub/pull/10086
- fix(ingestion/transformer): added new transformer to cleanup suffix/prefix in owner URN by @dushayntAW in https://github.com/datahub-project/datahub/pull/10067
- fix(ui/user-group): add non existent entity page for user by @gaurav2733 in https://github.com/datahub-project/datahub/pull/10004
- fix(resolver): Allow users to add/remove related terms for children glossary terms by @pinakipb2 in https://github.com/datahub-project/datahub/pull/9895
- Increase role member count in listRoles query to 20 from 10 by @jayasimhankv in https://github.com/datahub-project/datahub/pull/10020
- fix(frontend): exclude plugins/frontend/auth/user.props config does not exist warnings from log by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10043
- fix(ui): show dataset display name in browse paths v2 by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10054
- fix(metrics): get fieldName for GraphQL Mutation queries by @trialiya in https://github.com/datahub-project/datahub/pull/9972
- feat(UI): disable access management ui when no roles are linked to entity by @githendrik in https://github.com/datahub-project/datahub/pull/9610
- ci(filters): add graphql code to backend trigger by @david-leifker in https://github.com/datahub-project/datahub/pull/10113
- test(urn): add test case by @david-leifker in https://github.com/datahub-project/datahub/pull/10112
- fix(ui) Add min width to the usage stats component by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/10056
- log(system-update): Update DataHubStartupStep.java by @david-leifker in https://github.com/datahub-project/datahub/pull/9971
- fix(usage-stats): usage-stats error handling and filter by @david-leifker in https://github.com/datahub-project/datahub/pull/10105
- fix(elasticsearch logging): log how long bulk execution took by @darnaut in https://github.com/datahub-project/datahub/pull/10116
- feat(auth): view authorization by @david-leifker in https://github.com/datahub-project/datahub/pull/10066
- fix(searchContext): fix search flag immutability by @david-leifker in https://github.com/datahub-project/datahub/pull/10117
- fix(ingest/looker): use `external_base_url` for explore url generation by @k7ragav in https://github.com/datahub-project/datahub/pull/10093
- feat(ingest/dagster): Dagster source by @treff7es in https://github.com/datahub-project/datahub/pull/10071
- fix(forms) Fix a couple of small inconsistencies with forms by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/9928
- fix: exclude Elasticsearch ignore_throttled warnings from log by @Masterchen09 in https://github.com/datahub-project/datahub/pull/10042
- Update build-and-test.yml by @david-leifker in https://github.com/datahub-project/datahub/pull/10127
- fix(mae-consumer): fix aspect retriever injections mae-consumer by @david-leifker in https://github.com/datahub-project/datahub/pull/10125
- fix(docs): fix docs build by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10129
- fix(search): respect the search flags term bucket size by @david-leifker in https://github.com/datahub-project/datahub/pull/10130
- fix(ingestProposal): fix/handle no-op ingestion by @david-leifker in https://github.com/datahub-project/datahub/pull/10126
- fix(ci): simplify python release process by @hsheth2 in https://github.com/datahub-project/datahub/pull/10133
- feat(lineage): add a parameter to allow limiting the per hop exploration of lineage search by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10062
- feat(ingest/bigquery): Respect dataset and table patterns when ingesting lineage via catalog api by @ANich in https://github.com/datahub-project/datahub/pull/10080
- feat(ingest): emit platform for query entities by @hsheth2 in https://github.com/datahub-project/datahub/pull/10103
- feat(ingest): loosen pyarrow dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10141
- fix(ingest/dbt): respect `convert_column_urns_to_lowercase` in mapping CLL by @hsheth2 in https://github.com/datahub-project/datahub/pull/10132
- chore(ingestion-base): update base requirements by @david-leifker in https://github.com/datahub-project/datahub/pull/10142
- feat(ingest/dbt): dbt model performance by @hsheth2 in https://github.com/datahub-project/datahub/pull/9992
- fix(ingest/databricks): support hive metastore schemas with special char by @mayurinehate in https://github.com/datahub-project/datahub/pull/10049
- feat(ui): sort partition keys to the top of the table for better visibility by @ngamanda in https://github.com/datahub-project/datahub/pull/9959
- fix: OBS-729 | Filters: Fix alignment on nested dropdown by @sumitappt in https://github.com/datahub-project/datahub/pull/10140
- feat(ingest/dynamodb): add support for classification by @mayurinehate in https://github.com/datahub-project/datahub/pull/10138
- feat(incidents) incident resolution note more clearly displayed by @jayacryl in https://github.com/datahub-project/datahub/pull/10151
- fix(entity-client): fix entity client cache and test by @david-leifker in https://github.com/datahub-project/datahub/pull/10149
- chore(ingest): update doc & log detail by @HuanjieGuo in https://github.com/datahub-project/datahub/pull/10139
- feat(ingest): loosen airflow plugin dependencies requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/10106
- feat(ingest): fix validators by @hsheth2 in https://github.com/datahub-project/datahub/pull/10115
- feat(ingest/bigquery): improve debug logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/10101
- fix(graphQL): Ignore soft-deleted assertions in UI calls by @pedro93 in https://github.com/datahub-project/datahub/pull/10148
- fix(openapi): fix system-metadata response by @david-leifker in https://github.com/datahub-project/datahub/pull/10155
- docs: update markprompt project key by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10134
- add row type for athena types by @rae89 in https://github.com/datahub-project/datahub/pull/10131
- fix(setup): fix postgres setup to create temp table with no data by @trialiya in https://github.com/datahub-project/datahub/pull/10154
- feat(ingest/looker): update browse paths to align with looker UI by @mayurinehate in https://github.com/datahub-project/datahub/pull/10147
- feat(ingest/airflow): allow plugin to load on listener exception by @hsheth2 in https://github.com/datahub-project/datahub/pull/10152
- feat(ingestion/bigquery): BigQuery Owner Label to Datahub Ownership by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/10047
- feat(ingest): bump sqlglot dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/10144
- docs(website): tweak eyebrow copy by @hsheth2 in https://github.com/datahub-project/datahub/pull/10143
- docs: upgrade markprompt version by @yoonhyejin in https://github.com/datahub-project/datahub/pull/10159
- fix(openapi): fix index out of bounds for sort order by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10168
- fix(search): fix field name in api by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10170
- build(docker): prefix pr on pr sha tags by @david-leifker in https://github.com/datahub-project/datahub/pull/10171
- Revert docker helper changes by @david-leifker in https://github.com/datahub-project/datahub/pull/10172
- feat(metadata-jobs): improve consumer logging by @darnaut in https://github.com/datahub-project/datahub/pull/10173
- test(graph): refactor graph test by @david-leifker in https://github.com/datahub-project/datahub/pull/10175
- fix(ingest/tableau) Fix Tableau lineage ingestion from Clickhouse by @valeral in https://github.com/datahub-project/datahub/pull/10167
- <fix>[oracle ingestion]: get database name when using service by @Nelvin73 in https://github.com/datahub-project/datahub/pull/10158
- fix(docker): fix versioning for compose file post release by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10176
- fix(restoreIndices): batchSize vs limit by @david-leifker in https://github.com/datahub-project/datahub/pull/10178
- feat(ui): show classification in test connection by @hsheth2 in https://github.com/datahub-project/datahub/pull/10156
- fix(ingest): add classification dep for dynamodb by @hsheth2 in https://github.com/datahub-project/datahub/pull/10162
- feat(ingest/dbt): enable model performance and compiled code by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/10164
- refactor(docker): move to acryldata repo for all images by @david-leifker in https://github.com/datahub-project/datahub/pull/9459
- fix(github): fix docker publish by @david-leifker in https://github.com/datahub-project/datahub/pull/10186
- feat(lineage): mark nodes as explored by @RyanHolstien in https://github.com/datahub-project/datahub/pull/10180
- feat(ingest/gc): add index truncation logic by @anshbansal in https://github.com/datahub-project/datahub/pull/10099
- fix(entity-service): fix findFirst when already present by @david-leifker in https://github.com/datahub-project/datahub/pull/10187
- fix(ingestion/salesforce): fixed the issue by escaping the markdown string by @dushayntAW in https://github.com/datahub-project/datahub/pull/10157
New Contributors
- @AvaniSiddhapuraAPT made their first contribution in https://github.com/datahub-project/datahub/pull/9757
- @akarsh991 made their first contribution in https://github.com/datahub-project/datahub/pull/9882
- @filipe-caetano-ovo made their first contribution in https://github.com/datahub-project/datahub/pull/9974
- @dushayntAW made their first contribution in https://github.com/datahub-project/datahub/pull/9862
- @LePuppy made their first contribution in https://github.com/datahub-project/datahub/pull/10005
- @kopax-polyconseil made their first contribution in https://github.com/datahub-project/datahub/pull/10033
- @poorvi767 made their first contribution in https://github.com/datahub-project/datahub/pull/8325
- @diegmonti made their first contribution in https://github.com/datahub-project/datahub/pull/9996
- @alexs-101 made their first contribution in https://github.com/datahub-project/datahub/pull/9838
- @pinakipb2 made their first contribution in https://github.com/datahub-project/datahub/pull/9895
- @trialiya made their first contribution in https://github.com/datahub-project/datahub/pull/9972
- @k7ragav made their first contribution in https://github.com/datahub-project/datahub/pull/10093
- @jayacryl made their first contribution in https://github.com/datahub-project/datahub/pull/10151
- @HuanjieGuo made their first contribution in https://github.com/datahub-project/datahub/pull/10139
- @rae89 made their first contribution in https://github.com/datahub-project/datahub/pull/10131
- @valeral made their first contribution in https://github.com/datahub-project/datahub/pull/10167
- @Nelvin73 made their first contribution in https://github.com/datahub-project/datahub/pull/10158
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.13.0...v0.13.1
v0.13.0
Released on 2024-02-29 by @RyanHolstien.
View the release notes for v0.13.0 on GitHub.
DataHub v0.12.1
Released on 2023-12-08 by @david-leifker.
View the release notes for DataHub v0.12.1 on GitHub.
v0.12.1rc2
Released on 2023-11-28 by @david-leifker.
View the release notes for v0.12.1rc2 on GitHub.
v0.12.0
Released on 2023-10-25 by @pedro93.
View the release notes for v0.12.0 on GitHub.
v0.11.0
Released on 2023-09-08 by @iprentic.
View the release notes for v0.11.0 on GitHub.
v0.10.5
Released on 2023-08-02 by @david-leifker.
View the release notes for v0.10.5 on GitHub.
v0.10.4
Released on 2023-06-09 by @pedro93.
View the release notes for v0.10.4 on GitHub.
v0.10.3
Released on 2023-05-25 by @iprentic.
View the release notes for v0.10.3 on GitHub.
DataHub v0.10.2
Released on 2023-04-13 by @iprentic.
View the release notes for DataHub v0.10.2 on GitHub.
DataHub v0.10.1
Released on 2023-03-23 by @aditya-radhakrishnan.
View the release notes for DataHub v0.10.1 on GitHub.
DataHub v0.10.0
Released on 2023-02-07 by @david-leifker.
View the release notes for DataHub v0.10.0 on GitHub.
DataHub v0.9.6.1
Released on 2023-01-31 by @david-leifker.
View the release notes for DataHub v0.9.6.1 on GitHub.
DataHub v0.9.6
Released on 2023-01-13 by @maggiehays.
View the release notes for DataHub v0.9.6 on GitHub.
DataHub v0.9.5
Released on 2022-12-23 by @jjoyce0510.
View the release notes for DataHub v0.9.5 on GitHub.
[Known Issues] DataHub v0.9.4
Released on 2022-12-20 by @maggiehays.
View the release notes for [Known Issues] DataHub v0.9.4 on GitHub.
DataHub v0.9.3
Released on 2022-11-30 by @maggiehays.
View the release notes for DataHub v0.9.3 on GitHub.
DataHub v0.9.2
Released on 2022-11-04 by @maggiehays.
View the release notes for DataHub v0.9.2 on GitHub.
DataHub v0.9.1
Released on 2022-10-31 by @maggiehays.
View the release notes for DataHub v0.9.1 on GitHub.
DataHub v0.9.0
Released on 2022-10-11 by @szalai1.
View the release notes for DataHub v0.9.0 on GitHub.
DataHub v0.8.45
Released on 2022-09-23 by @gabe-lyons.
View the release notes for DataHub v0.8.45 on GitHub.
DataHub v0.8.44
Released on 2022-09-01 by @jjoyce0510.
View the release notes for DataHub v0.8.44 on GitHub.
DataHub v0.8.43
Released on 2022-08-09 by @maggiehays.
View the release notes for DataHub v0.8.43 on GitHub.
v0.8.42
Released on 2022-08-03 by @gabe-lyons.
View the release notes for v0.8.42 on GitHub.
v0.8.41
Released on 2022-07-15 by @anshbansal.
View the release notes for v0.8.41 on GitHub.
v0.8.40
Released on 2022-06-30 by @gabe-lyons.
View the release notes for v0.8.40 on GitHub.
v0.8.39
Released on 2022-06-24 by @maggiehays.
View the release notes for v0.8.39 on GitHub.
[!] DataHub v0.8.38
Released on 2022-06-09 by @jjoyce0510.
View the release notes for [!] DataHub v0.8.38 on GitHub.