Spark Agent Changelog
0.74.0 - (2025-09-17)
- Support tfs batch
- Support params override by server
- Include session thread in driver thread dump
0.73.5 - (2025-09-10)
- GCP & Dataproc info
- Bug fixes (NPE in skew event when taskMetrics is null & broadcast event leak)
0.73.4 - (2025-09-07)
- Fix support for custom metrics in Databricks
0.73.3 - (2025-09-04)
- Fix non-heap memory & CPU usage metrics - use a custom implementation and add a Java/Python/other breakdown
- Custom metrics - increase default task run limit and report calculation duration
- Reduce logs
0.73.1 - (2025-08-24)
- Fix shuffle metadata retrieval for skipped stages
- Add S3 debug metrics
- Add plugin messages over time metric
- Use Java shutdown hooks directly (instead of the Spark wrapper)
0.73.0 - (2025-08-20)
- Add physical plan events
- Add automatic thread dump - for idle driver & skewed tasks
- Skew significance - on stage end, keep only significant skew events and delete existing insignificant ones
- Range partitioner events fix
- Executors event - add removed reason
- Report definity uncaught exceptions
- Reduce log spam
- Remove info field from tfs
- Limit both events & tfs per task run
0.72.1 - (2025-08-10)
- Stage metrics
- Skew detection improvements and fixes - skewed task metrics, single event, merged intervals & bug fixes
- Executors avg used memory metric
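
Note on "merged intervals" (0.72.1): the entry does not say how the agent merges skew intervals. The following is only a minimal sketch of the general technique, assuming that overlapping or touching (start, end) time ranges are collapsed into one range. The function name `merge_intervals` and the tuple representation are hypothetical and not part of the agent's API.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # hypothetical (start, end) representation


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend its end.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Starts after the previous interval ends: open a new one.
            merged.append((start, end))
    return merged
```

For example, `merge_intervals([(5, 8), (1, 3), (2, 4), (8, 10)])` yields two ranges, `(1, 4)` and `(5, 10)`.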
0.72.0 - (2025-07-23)
- Support Spark 2.4 with Scala 2.12
- Databricks & EMR on EC2 - cluster info
- Executors info
0.71.2 - (2025-07-23)
- Databricks - use job & task names by default
0.71.1 - (2025-07-20)
- Default cluster name
0.71.0 - (2025-07-17)
- Multi session support v2
0.70.2 - (2025-07-06)
- Lineage support for DSv2 & Delta files
- Support API token from env
- Add debug events
0.70.1 - (2025-06-26)
- Add FS & S3 metrics
0.70.0 - (2025-06-26)
- Support Spark 2.4 Plugin for memory & skew
0.60.5 - (2025-05-29)
- Add permissive load option to support missing paths
0.60.4 - (2025-05-26)
- Add index & cache refresh events
- Add shuffle info to stage events
- Bug fix for diversion with temp view
0.60.1 - (2025-04-29)
- Databricks - use cluster name by default
- Metrics - add executors vcore time, executor cores & driver cores
- Bug fixes for local mode
0.60.0 - (2025-04-21)
- Plugin - improved integration (don't require listener config) & allow disabling executor side plugin
- Bug fix (skew event for stage retry)
0.43.0 - (2025-04-20)
- Databricks - support for premature cluster termination
- Nested queries - avoid logging
- Params - add shuffle partitions & dynamic allocation details when enabled
0.42.1 - (2025-04-10)
- Databricks - support sessions auto-stop
- Bug fixes (ignore Delta/Hudi file indexes, ignore post-query-end root updates)
0.42.0 - (2025-04-09)
- Databricks - bug fixes & support for 12.2, Python task sessions and Spark Connect
- Multi session apps - report default id
- Bug fixes (definity stats, output diversion & Spark Connect)
0.41.0 - (2025-04-02)
- Support output diversion for files
- Metrics - fixes for task CPUs & new active tasks metric
- Improved agent logs
0.40.2 - (2025-03-21)
- Fix task status on YARN & cluster mode
- Support custom RDD inputs
0.40.0 - (2025-03-16)
- Added events - skew, slow planning, slow load, broadcasts & stages
- Added metrics - dynamic allocation tracking, shuffle times, cache size & disk spill over time
- Databricks support & fixes
0.30.1 - (2025-03-09)
- Move to use async calls to the server
- Added metrics - driver GC time and task retries cost
- Added tracking for Python UDFs
- Fixed BigQuery output diversion issue
- Fixed old Delta version empty catalog table bug
0.20.3 - (2025-02-05)
- Support Databricks multi-task notebook jobs
- Heartbeat bug fix
0.20.2 - (2025-02-04)
- Databricks related bug fixes
- Support volume metrics for old Delta version (0.4.x)
0.20.1 - (2025-02-02)
- Support BigQuery output diversion
- Expose task id on Spark session conf ("spark.definity.task.id")
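
Note on the exposed task id (0.20.1): a minimal sketch of reading the value from PySpark, assuming an active session on which the agent has already set the key. Only the key name "spark.definity.task.id" comes from this changelog. The helper `get_definity_task_id` is hypothetical, and the type and format of the returned value are not specified here.

```python
from typing import Optional

# Key name taken from the 0.20.1 changelog entry.
DEFINITY_TASK_ID_KEY = "spark.definity.task.id"


def get_definity_task_id(spark) -> Optional[str]:
    """Return the task id from the session conf, or None if it is not set.

    `spark` is expected to be a SparkSession; its `conf.get(key, default)`
    returns the default when the key is absent.
    """
    return spark.conf.get(DEFINITY_TASK_ID_KEY, None)
```

Usage, assuming `spark` is the active SparkSession: `task_id = get_definity_task_id(spark)`. A `None` result would suggest the agent is not attached or has not yet registered the task.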
0.20.0 - (2025-01-30)
- Server protocol v2 (minimum supported server version: 0.20.0)
0.11.0 - (2025-01-13)
- Add support for custom metrics
- Add avg vcores used over time metric
0.10.2 - (2024-11-20)
- Skew keys detection
- Support BigQuery Connector
- Code comparison v2 - clean plans
0.9.16 - (2024-10-21)
- Support Multi Session Apps
- Added new metrics:
  - Skew score
  - Broadcasts - count, failures, time & size
  - vCore time, CPU time & GC time
  - Heap & off-heap utilization metrics
  - Used executors over time
  - Input, output, shuffle write, shuffle read - records & bytes
  - Tasks/stages - retries & failures
  - Lost executors
  - Spill
  - Tasks result size
- Support NVIDIA GPU Plugin (rapids-4-spark) for file inputs & outputs
- Added support for JDBC inputs
- Added retry mechanism on server errors
- Added driver allocated and used memory over time
- Added support for Databricks for Spark 3.5
- Support dynamic time-series metrics interval
- DALM
- Support pipelines mode
- Support setting DB suffix & location
0.8.1 - (2024-07-02)
- Added executors & driver JVM memory watermark metric - on-heap, off-heap, total - used & allocated
- Added "max memory utilization" metrics for driver and executors
- Added time-series metrics for total memory, used memory, total vcores, used vcores (max value per interval)
- Shade all external dependencies so the agent is independent of the runtime environment
- Added support for Spark 3.5
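
Note on the 0.8.1 time-series metrics ("max value per interval"): a minimal sketch of that kind of aggregation, assuming samples arrive as (timestamp in seconds, value) pairs and are bucketed into fixed-width intervals. The function name `max_per_interval` and the data layout are hypothetical and do not describe the agent's actual implementation.

```python
from typing import Dict, Iterable, Tuple


def max_per_interval(
    samples: Iterable[Tuple[int, float]], interval_s: int
) -> Dict[int, float]:
    """Keep the maximum sampled value within each fixed-width time interval.

    Returns a dict keyed by interval start time, in ascending order.
    """
    buckets: Dict[int, float] = {}
    for ts, value in samples:
        # Align the timestamp to the start of its interval.
        start = (ts // interval_s) * interval_s
        if start not in buckets or value > buckets[start]:
            buckets[start] = value
    return dict(sorted(buckets.items()))
```

With a 10-second interval, samples at t=0 (1.0) and t=5 (3.0) fall into the same bucket, and only the larger value, 3.0, is kept for the interval starting at 0.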