Changelog
0.42.1 - (2025-04-10)
- Databricks - support session auto-stop
- Bug fixes (ignore Delta/Hudi file indexes, ignore post-query-end root updates).
0.42.0 - (2025-04-09)
- Databricks - bug fixes & support for 12.2, Python task sessions and Connect.
- Multi session apps - report default id.
- Bug fixes (definity stats, output diversion & Spark Connect).
0.41.0 - (2025-04-02)
- Support output diversion for files
- Metrics - fixes for task CPUs & new active tasks metric
- Improved agent logs
0.40.2 - (2025-03-21)
- Fix task status on YARN & cluster mode
- Support custom RDD inputs
0.40.0 - (2025-03-16)
- Added Events - skew, slow planning, slow load, broadcasts & stages
- Added metrics - dynamic allocation tracking, shuffle times, cache size & disk spill over time
- Databricks support & fixes
0.30.1 - (2025-03-09)
- Move to use async calls to the server
- Added metrics - driver GC time and task retries cost
- Added tracking for Python UDFs
- Fixed BigQuery output diversion issue
- Fixed empty catalog table bug with old Delta versions
0.20.3 - (2025-02-05)
- Support Databricks multi-task notebook jobs
- Heartbeat bug fix
0.20.2 - (2025-02-04)
- Databricks-related bug fixes
- Support volume metrics for old Delta version (0.4.x)
0.20.1 - (2025-02-02)
- Support BigQuery output diversion
- Expose task ID in the Spark session conf ("spark.definity.task.id")
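
The exposed key can be read back like any other session conf entry. A minimal sketch in Python, assuming a PySpark session whose `spark.conf` offers `get(key, default)`. The helper name `definity_task_id` and the sample values are hypothetical and not part of the agent:

```python
TASK_ID_KEY = "spark.definity.task.id"  # conf key named in the 0.20.1 entry


def definity_task_id(conf, default=None):
    """Return the task id from a conf object, or `default` if the key is unset.

    `conf` is anything with a `get(key, default)` method, such as PySpark's
    `spark.conf` or a plain dict (used below purely for illustration).
    """
    return conf.get(TASK_ID_KEY, default)


# A plain dict stands in for spark.conf; "example-task" is a made-up value.
sample_conf = {TASK_ID_KEY: "example-task"}
print(definity_task_id(sample_conf))
print(definity_task_id({}, default="unknown"))
```

With a live session the call would be `definity_task_id(spark.conf)`. When the agent populates the key is not stated in this changelog, so passing a default is the safer choice.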
0.20.0 - (2025-01-30)
- Server protocol v2 (minimum supported server version: 0.20.0)
0.11.0 - (2025-01-13)
- Add support for custom metrics
- Add average vCores used over time metric
0.10.2 - (2024-11-20)
- Skew keys detection
- Support BigQuery connector
- Code comparison v2 - clean plans
0.9.16 - (2024-10-21)
- Support Multi Session Apps
- Added new metrics:
  - Skew score
  - Broadcasts - count, failures, time & size
  - vCore time, CPU time & GC time
  - Heap & off-heap utilization metrics
  - Used executors over time
  - Input, output, shuffle write, shuffle read - records & bytes
  - Tasks/stages - retries & failures
  - Lost executors
  - Spill
  - Tasks result size
- Support NVIDIA GPU plugin (rapids-4-spark) for file inputs & outputs
- Added support for JDBC inputs
- Added retry mechanism on server errors
- Added driver allocated and used memory over time
- Added support for Databricks for Spark 3.5
- Support dynamic time-series metrics interval
- DALM:
  - Support pipelines mode
  - Support setting db suffix & location
0.8.1 - (2024-07-02)
- Added executor & driver JVM memory watermark metrics (on-heap, off-heap, total; used & allocated)
- Added "max memory utilization" metrics for driver and executors
- Added time-series metrics for total memory, used memory, total vCores, used vCores (max value per interval)
- Shade all external dependencies for independence from the runtime environment
- Added support for Spark 3.5