Releases: databrickslabs/dqx
v0.1.13
- Fixed CLI installation and demo (#177). In this release, the dashboard name has been adjusted to comply with new API naming rules: it now contains only alphanumeric characters, hyphens, or underscores. The reference section has been split for clarity. In addition, the demo has been updated to work regardless of whether a path or a UC table is provided in the config. Furthermore, documentation has been refactored and updated to improve clarity. The following issues have been closed: #171 and #198. It may be necessary to uninstall and reinstall DQX to redeploy the dashboard.
- [Feature] Update is_(not)_in_range (#87) to support max/min limits from col (#153). In this release, the `is_in_range` and `is_not_in_range` quality rule functions have been updated to support a column as the minimum or maximum limit, in addition to a literal value. This change is accomplished through the introduction of optional `min_limit_col_expr` and `max_limit_col_expr` arguments, allowing users to specify a column expression as the minimum or maximum limit. Extensive testing, including unit tests and integration tests, has been conducted to ensure the correct behavior of the new functionality. These enhancements offer increased flexibility when defining quality rules, catering to a broader range of use cases and scenarios.
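The idea of taking a range limit from a column rather than a literal can be illustrated with a minimal plain-Python sketch. This is not the DQX implementation or its API: `row_in_range`, `min_limit_col`, and `max_limit_col` are hypothetical names, rows are modeled as dictionaries instead of Spark rows, and inclusive bounds and the handling of missing values are assumptions made for the example.

```python
from typing import Any, Mapping, Optional


def row_in_range(
    row: Mapping[str, Any],
    col_name: str,
    min_limit: Optional[Any] = None,
    max_limit: Optional[Any] = None,
    min_limit_col: Optional[str] = None,  # hypothetical: column supplying the lower bound
    max_limit_col: Optional[str] = None,  # hypothetical: column supplying the upper bound
) -> bool:
    """Return True when row[col_name] lies within the bounds (inclusive, by assumption)."""
    # A column-based limit, when given, takes precedence over the literal one.
    lower = row[min_limit_col] if min_limit_col is not None else min_limit
    upper = row[max_limit_col] if max_limit_col is not None else max_limit
    value = row[col_name]
    # Assumption: a missing value or a missing bound counts as out of range.
    if value is None or lower is None or upper is None:
        return False
    return lower <= value <= upper


rows = [
    {"price": 5, "floor": 1},
    {"price": 12, "floor": 1},
    {"price": 3, "floor": 4},
]
# Lower bound is read per row from the "floor" column; upper bound is a literal.
results = [row_in_range(r, "price", min_limit_col="floor", max_limit=10) for r in rows]
print(results)
```

The third row fails only because its own `floor` value is higher than its `price`, which is the kind of per-row comparison a literal limit cannot express.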
Contributors: @karthik-ballullaya-db, @mwojtyczka
v0.1.12
- Fixed installation process for Serverless (#150). This commit removes the pyspark dependency from the library to avoid Spark version conflicts in Serverless and future DBR versions. The CLI has been updated to install pyspark for local command execution.
- Updated demos and documentation (#169). In this release, the quality checks in the demos have been updated to better showcase the capabilities of DQX. Documentation has been updated in various places for increased clarity. Additional contributing guides have been added.
Contributors: @mwojtyczka
v0.1.11
What's Changed
- Provided option to customize reporting column names (#127). In this release, the DQEngine library has been enhanced to allow for customizable reporting column names. A new constructor has been added to DQEngine, which accepts an optional ExtraParams object for extra configurations. A new Enum class, DefaultColumnNames, has been added to represent the columns used for error and warning reporting. New tests have been added to verify the application of checks with custom column naming. These changes aim to improve the customizability, flexibility, and user experience of DQEngine by providing more control over the reporting columns and resolving issue #46. Contributors @hrfmartins @mwojtyczka
- Fixed parsing error when loading checks from a file (#165). In this release, we have addressed a parsing error that occurred when loading checks (data quality rules) from a file, fixing issue #162. The specific issue being resolved is a SQL expression parsing error. The changes include refactoring tests to eliminate code duplication and improve maintainability, as well as updating method and variable names to use `filepath` instead of "path". Additionally, new unit and integration tests have been added, and the changes have been manually tested, to ensure the updated code works correctly. Contributors @mwojtyczka
- Removed usage of try_cast spark function from the checks to make sure DQX can be run on more runtimes (#163). In this release, we have refactored the code to remove the usage of the `try_cast` Spark function and replace it with `cast` and `isNull` checks to improve code compatibility, particularly for runtimes where `try_cast` is not available. The affected functionality includes null and empty column checks, checking if a column value is in a list, and checking if a column value is a valid date or timestamp. We have added unit and integration tests to ensure functionality is working as intended. Contributors @mwojtyczka
- Added filter to rules so that you can make conditional checks (#141). The filter serves as a condition that data must meet to be evaluated by the check function, so a check is applied only to rows that meet the specified condition. This feature enhances the flexibility and customizability of data quality checks in the DQEngine. Contributors @pierre-monnet @mwojtyczka
Full Changelog: v0.1.8...v0.1.11
v0.1.10
What's Changed
- Fixed docs-build by @mwojtyczka in #129
- Patch user agent by @sundarshankar89 in #121
- New dashboard query, Update to demos and docs by @mwojtyczka in #133
- Support datetime arguments for column range functions by @ghanse in #142
- DQX engine refactor and docs update by @mwojtyczka in #138
- Add column functions to check for valid date strings by @ghanse in #144
- Generate rules for DLT as Python dictionary by @alexott in #148
- Make DQX compatible with Serverless by @mwojtyczka in #147
Full Changelog: v0.1.8...v0.1.10
v0.1.9
What's Changed
- Fixed docs-build by @mwojtyczka in #129
- Patch user agent by @sundarshankar89 in #121
- New dashboard query, Update to demos and docs by @mwojtyczka in #133
Full Changelog: v0.1.8...v0.1.9
v0.1.8
What's Changed
- Updated docs by @mwojtyczka in #117
- added search for docs by @sundarshankar89 in #119
- ✨ improve docs styling by @renardeinside in #118
- Add Dashboard as Code, DQX Data Quality Summary Dashboard by @nehamilak-db in #86
- updated profiling documentation with cost consideration by @canan-girgin in #126
- Release v0.1.8 by @mwojtyczka in #128
Full Changelog: v0.1.7...v0.1.8
v0.1.7
What's Changed
- Set cache invalidation for pypi badge by @mwojtyczka in #102
- Correct handling of Decimal, Short and Byte types by @alexott in #103
- ✨ introduce docs by @renardeinside in #104
- Rollback for readme and contributing by @mwojtyczka in #112
- 🛠️ fix docs path by @renardeinside in #111
- Updated runner for docs release by @mwojtyczka in #113
- 🔧 fix runner for docs deployment by @renardeinside in #114
- Updated docs by @mwojtyczka in #115
- Release v0.1.7 by @mwojtyczka in #116
Full Changelog: v0.1.6...v0.1.7
v0.1.6
What's Changed
- Fix for image links in README on PyPi by @alexott in #95
- added test methods for InstallationMixin.py, log.py and dlt_rules by @canan-girgin in #93
- issue 47 - new check is_not_null_and_not_empty_array and fixed timestamp mismatch issue in profiler by @dinbab1984 in #98
- Updated logo by @mwojtyczka in #96
- Release v0.1.6 by @mwojtyczka in #101
Full Changelog: v0.1.5...v0.1.6
v0.1.5
What's Changed
- Fix README on PyPi by using `hatch-fancy-pypi-readme` in the build by @alexott in #81
- Readme update by @mwojtyczka in #82
- add OIDC codecov by @sundarshankar89 in #83
- Release 0.1.5 by @mwojtyczka in #91
Full Changelog: v0.1.4...v0.1.5
v0.1.4
- Release v0.1.4 (#79). Updated release process.