# Release Notes

### Version 0.9.9

This release adds first-class support for native query accelerators - Apache Gluten/Velox, Apache DataFusion Comet, NVIDIA RAPIDS, and Databricks Photon - so you can now use DataFlint side-by-side with the accelerator of your choice and still see the full SQL plan, stage breakdown, and operator-level metrics.

What changed:

🆕 Native accelerator support in the DataFlint UI

The SQL plan view now recognizes accelerated operators and renders them as first-class nodes, with an accelerator badge so you can tell at a glance whether a node ran on Velox, Photon, RAPIDS, or DataFusion:

* Apache Gluten / Velox - WholeStageCodegenTransformer, VeloxResizeBatches, RowToVeloxColumnar, TakeOrderedAndProjectExecTransformer, and related boundary nodes are now classified, named, and assigned to stages correctly. Native Velox timing metrics - aggregation / filter / sort / window time, peak memory, and spill - are surfaced on the plan nodes.
* Apache DataFusion Comet - CometExchange, CometColumnarExchange, and CometHashAggregate are now recognized in stage assignment, shuffle metrics, aggregate naming, and Comet's Keys: / Functions: plan-description format.
* NVIDIA RAPIDS - GpuColumnarExchange is recognized and Gpu\* blocking operators are now treated as stage boundaries.
* Databricks Photon - Photon\* columnar exchanges and blocking operators are now treated as stage boundaries.

Columnar exchanges (ColumnarExchange, CometExchange, GpuColumnarExchange) are now split into separate write/read visual nodes across stage boundaries, the same way Spark's native Exchange already was.

🛠 DataFlint instrumentation now coexists with native accelerators

Previously, enabling spark.dataflint.instrument.spark.enabled = true on a cluster that also had Gluten / Comet / RAPIDS / Photon on the classpath would silently disable native acceleration - DataFlint's TimedExec wrapper hid Spark operators from the accelerator's plan-substitution rules, and the accelerator quietly fell back to the JVM path.

In 0.9.9, DataFlint now:

* Detects each supported accelerator via a classpath probe at startup and logs which ones are active.
* Runs its instrumentation pass after every accelerator's pre-columnar phase (via postColumnarTransitions, which Spark applies in reverse registration order), so accelerator-substituted nodes like FilterExecTransformer and the Comet/Gpu/Photon equivalents are never wrapped.

The result: instrumentation and native acceleration coexist out of the box, with no extra configuration and no manual skip list. You can leave spark.dataflint.instrument.spark.enabled on.
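As a rough illustration of that setup, the sketch below assembles the relevant settings and renders them as `spark-submit` arguments. The plugin class name and the `accelerator_plugins` parameter are assumptions for illustration (these notes do not state the plugin class; check the installation guide), and the order in which plugins are listed here is not a recommendation.

```python
from typing import Dict, List, Sequence

# Assumption: the DataFlint plugin class is not stated in these release notes.
DATAFLINT_PLUGIN = "io.dataflint.spark.SparkDataflintPlugin"


def dataflint_conf(
    accelerator_plugins: Sequence[str] = (),
    plugin_class: str = DATAFLINT_PLUGIN,
    instrument: bool = True,
) -> Dict[str, str]:
    """Build Spark settings that enable DataFlint next to optional accelerator plugins."""
    # spark.plugins takes a comma-separated list of plugin class names.
    plugins = list(accelerator_plugins) + [plugin_class]
    return {
        "spark.plugins": ",".join(plugins),
        # The instrumentation flag named in these notes.
        "spark.dataflint.instrument.spark.enabled": str(instrument).lower(),
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render settings as a flat list of `--conf key=value` arguments."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(dataflint_conf())))
```

Pass your accelerator's own plugin class, if it has one, through `accelerator_plugins`; the accelerator's documentation is the authority on its required settings.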

Bug fixes:

* Broadcast duration was under-counted. SqlReducer.calcBroadcastExchangeDuration was only summing the "time to broadcast" component - the "time to build" and "collect time" terms were being dropped due to a statement-termination bug. All three are now correctly summed into the reported broadcast exchange duration.
* Plan descriptions for nodes with spaces. SqlReducer.buildFallbackPlanDescriptions could not match nodes whose names contained spaces (Scan csv, WholeStageCodegenTransformer (1)), so their plan descriptions were silently dropped from the UI. Fixed.
* Fallback plan descriptions for native engines. Re-added parsing of the SQL-level planDescription so that nodes from native engines display their plan text even when DataFlint's custom endpoint returns empty.
* Stage propagation through Gluten boundary nodes - AQE codegen renumbering and Gluten's wrapped boundary operators no longer cause stages to be lost when walking the plan graph.
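To make the broadcast-duration fix concrete, here is a Python analogue of that kind of statement-termination slip next to a corrected sum. This is a sketch only: the actual `SqlReducer` code is not shown in these notes, and the function names and metric keys below are illustrative, taken from the component labels quoted above.

```python
from typing import Mapping, Optional

# Component labels as quoted in the bug-fix note; treated here as dictionary keys.
BROADCAST_COMPONENTS = ("time to broadcast", "time to build", "collect time")


def undercounted_duration_ms(metrics: Mapping[str, float]) -> float:
    """Analogue of the bug: the statement ends after the first term."""
    total = metrics["time to broadcast"]
    +metrics["time to build"]  # separate expression statement, result discarded
    +metrics["collect time"]  # same here
    return total


def broadcast_exchange_duration_ms(metrics: Mapping[str, Optional[float]]) -> float:
    """Sum all three components; missing or null values count as zero."""
    return sum((metrics.get(name) or 0.0) for name in BROADCAST_COMPONENTS)
```

With 10 ms, 20 ms, and 30 ms for the three components, the first function returns 10 and the second returns 60.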

Examples and getting started:

* New docker/gluten example - Spark 3.5 + Apache Gluten 1.2 + Velox, including a runnable GlutenVeloxExample and a run-gluten-example.sh that builds the UI, the plugin, and the example jar and brings everything up via Docker Compose.
* New docker/comet example - Spark 3.5.7 + Apache DataFusion Comet 0.4.0, mirroring the Gluten setup. The released Comet jar bundles both linux\_amd64 and linux\_aarch64 natives, so a single image works on Intel and Apple Silicon (via Rosetta).

Both examples enable DataFlint instrumentation by default, so they're the fastest way to see the coexistence path in action.

Compatible with Spark 3.0 → 4.x.

### Version 0.9.8 <a href="#version-0.9.6" id="version-0.9.6"></a>

**Bug fixes:**

* \#74: Fix `NullPointerException` in `Block.code` when `from_json` (or any other `CodegenFallback` expression) is used under whole-stage codegen with DataFlint instrumentation enabled. `TimedWithCodegenExec` now reports `supportCodegen = false` for wrapped operators that contain a `CodegenFallback`, mirroring Spark's own `CollapseCodegenStages` check that the transparent wrapper had been hiding.
* Compatible with Spark 3.0 → 4.x

**Hardening:**

* `TimedExec.postRddId` now overwrites the `rddId` metric instead of summing across re-executions of the same plan instance.
* `TimedExec` and `TimedWithCodegenExec` no longer compare equal, which fixes a corner case in plan canonicalization / AQE plan reuse.
* `executeCollect` write-path is now bounds-safe; it falls back to the standard path on unexpected plan shapes (vendor write commands, future Spark layouts).
* `rddId` metric switched from a "size" type to plain sum (no longer rendered as bytes, e.g. `"12 B"`, in the Spark UI).

### Version 0.9.7 <a href="#version-0.9.6" id="version-0.9.6"></a>

This release adds first-class support for **Databricks Runtime 17.3 LTS and newer**, and fixes a metric formatting issue that could blank parts of the DataFlint SQL plan view on Databricks.

**What changed:**

🆕 **New artifact: `dataflint-spark4-databricks_2.13`**

Databricks Runtime 17.3 is Spark 4–based but ships `javax.servlet` instead of the standard `jakarta.servlet`. The regular `dataflint-spark4_2.13` jar was crashing the cluster at startup with `NoClassDefFoundError: jakarta/servlet/Servlet`.

We now publish a separate jar built for Databricks runtimes:

```
io.dataflint:dataflint-spark4-databricks_2.13:0.9.7
```

It’s the same plugin, same `spark.plugins` class — only the jar coordinate changes.

🛠 **Duration metric now displays correctly on Databricks**

DataFlint’s `TimedExec` instrumentation wraps each operator with a "duration" timing metric. On Databricks runtimes, this was previously appearing as a bare number (e.g. `1058`) in the Spark UI rather than the expected `5s (1s, 2s, 3s)` formatting, which also crashed the DataFlint SQL plan view with `Unsupported time unit`. Fixed.

The DataFlint UI is also more defensive about unexpected metric formats — it now logs a warning and skips the value rather than blanking the page.
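The defensive behavior described above can be sketched as a parser that returns nothing, and logs a warning, when a metric string does not look like a duration. This is an illustration in Python rather than the UI's own code; the function name, the accepted units, and the exact string format are assumptions based on the `5s (1s, 2s, 3s)` example.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("metric_parser")

# Assumed units; the real metric format may include others.
_UNIT_TO_MS = {"ms": 1.0, "s": 1000.0, "m": 60_000.0, "h": 3_600_000.0}
# Leading total such as "5s" or "250 ms"; anything after it is ignored.
_DURATION = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\b")


def parse_duration_ms(raw: str) -> Optional[float]:
    """Return the leading duration in milliseconds, or None if the format is unexpected."""
    match = _DURATION.match(raw)
    if match is None:
        # Skip the value instead of raising, so one bad metric cannot blank the view.
        logger.warning("Skipping duration metric with unexpected format: %r", raw)
        return None
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MS[unit]
```

A bare number such as `1058` carries no unit, so this sketch skips it rather than guessing one.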

**Stock Spark 4.x (non-Databricks)**

Unchanged. `dataflint-spark4_2.13` continues to be the right artifact for stock Spark 4, EMR, and any other vanilla Spark 4 deployment.
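If you generate cluster configuration programmatically, the artifact choice for Spark 4 can be captured in a small helper. This is a sketch: the function name is hypothetical, and it assumes the stock artifact is published under the same `io.dataflint` group as the Databricks one.

```python
def dataflint_spark4_artifact(on_databricks: bool, version: str = "0.9.7") -> str:
    """Pick the Spark 4 Maven coordinate for Databricks Runtime 17.3+ or stock Spark 4."""
    if on_databricks:
        artifact = "dataflint-spark4-databricks_2.13"
    else:
        artifact = "dataflint-spark4_2.13"
    # Assumption: both artifacts share the io.dataflint group id.
    return f"io.dataflint:{artifact}:{version}"


if __name__ == "__main__":
    # The result can be used as a value for spark.jars.packages.
    print(dataflint_spark4_artifact(on_databricks=True))
```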

### Version 0.9.6 <a href="#version-0.9.6" id="version-0.9.6"></a>

* Fixed a startup crash on Databricks Runtime 17.3 and up (Spark 4).\
  The **DataFlint** tab will not be registered.

### Version 0.9.0 <a href="#version-0.9.5" id="version-0.9.5"></a>

#### UI and usability

* Added a **YouTube tutorial link** in the footer.
* Added a **new version notification**.
* DataFlint now fetches the latest version from Sonatype Central and shows a chip when a newer version is available. Version checks use semver comparison. The check fails silently if Sonatype Central is unreachable.
* Fixed duration display for zero values. Nodes with `duration=0` now show `0 ms` instead of being hidden.
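The version check above relies on comparing versions numerically rather than as strings. A minimal sketch of that idea follows, assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release tags; the function names are hypothetical and this is not DataFlint's implementation.

```python
from typing import Optional, Tuple


def parse_semver(version: str) -> Optional[Tuple[int, ...]]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints, or None if malformed."""
    parts = version.strip().lstrip("v").split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return None
    return tuple(int(part) for part in parts)


def newer_version_available(current: str, latest: Optional[str]) -> bool:
    """True only when `latest` parses and is strictly newer than `current`.

    `latest` is None when the lookup failed; that case stays silent.
    """
    if latest is None:
        return False
    current_parts = parse_semver(current)
    latest_parts = parse_semver(latest)
    if current_parts is None or latest_parts is None:
        return False
    # Tuple comparison is numeric per component, so 0.9.10 > 0.9.9.
    return latest_parts > current_parts
```

A plain string comparison would rank `0.9.9` above `0.9.10`, which is why the components are compared as integers.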

#### Instrumentation

* `DataFlintRDDUtils` now uses a custom RDD instead of `mapPartitions` for duration timing.
* The custom RDD captures `startTime` inside `compute()`, before `firstParent.iterator()`, so eager parent work is now captured correctly. That includes operators like `SortExec` and `HashAggregateExec`. Previously, `mapPartitions` started timing too late and missed full partition sort work and hash map build time.
* `TimedExec.executeCollect` now records **per-partition write duration**. It reconstructs `DataWritingCommandExec` with the data plan wrapped in `RDDTimingWrapper`. On Spark `3.4+`, this happens inside `WriteFilesExec`; on older Spark versions, it wraps the data plan directly. The write command then consumes the timed RDD via `sparkContext.runJob`, which captures both data production time and write I/O per partition. Previously, write duration used driver-side wall-clock timing, which was inconsistent with other per-partition metrics.
* `doProduce` now wraps child code in `try/finally` for blocking operators. Duration metrics now flush even when operators exit early through `shouldStop()` or `return`. This fixes `duration=0` in codegen paths for operators like `SortExec`.
* `doProduce` now sanitizes `ctx.freshNamePrefix` by stripping non-alphanumeric characters from generated code variable prefixes. This fixes invalid Java identifiers for nodes with spaces in `nodeName`, such as `Scan ExistingRDD` and `Execute InsertIntoHadoopFsRelationCommand`.
* Added `RDDScanExec` to instrumented nodes. `Scan ExistingRDD` nodes now get duration metrics on Spark `3` and Spark `4`.
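The timing change is easier to see in a small model. The Python generator below is an analogy, not DataFlint's Scala code: the clock starts before the parent iterator is requested, so eager parent work (a full sort, a hash map build) lands inside the measured window, and a `finally` block records the duration even if the consumer stops early. All names here are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def timed_partition(
    parent_compute: Callable[[], Iterable[T]],
    record_ms: Callable[[float], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[T]:
    """Yield the parent's rows while measuring the whole partition's duration."""
    # Start timing BEFORE asking the parent for its iterator: an eager parent
    # does most of its work inside this call, not while rows are being pulled.
    start = clock()
    try:
        for row in parent_compute():
            yield row
    finally:
        # Runs on normal completion and when the consumer closes the generator early.
        record_ms((clock() - start) * 1000.0)
```

Starting the clock only once rows begin to flow would miss whatever `parent_compute()` did up front, which is the gap the notes describe for `mapPartitions`.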

### Version 0.8.9 <a href="#version-0.8.9" id="version-0.8.9"></a>

#### Instrumentation

* A single generic `TimedExec` wrapper now replaces 19 per-type `DataFlint*Exec` classes. `TimedExec` adds a `duration` metric and an `rddId` metric.
* Existing Spark metrics stay intact on the wrapped node. The wrapper exposes `child.children` as its own children, so the SQL graph stays as one node and you do not get double nodes in Spark UI or DataFlint UI.
* `InMemoryTableScanExec` and all `Exchange` nodes are never wrapped.
* Version-specific nodes are matched by class name string. This avoids `NoClassDefFoundError` on older Spark versions.
* Join codegen is cancelled where codegen instrumentation does not work.
* `DataWritingCommandExec` now gets duration support through `doPrepare` delegation.

#### Stage grouping and duration attribution

* Stage grouping is now topology-based from the SQL plan graph. `Exchange` boundaries define the stage graph. The result is deterministic across live runs and history server.
* Stage view now supports **Inclusive** and **Exclusive** duration modes. **Inclusive** shows native Spark metrics as-is. **Exclusive** is the default and normalizes stage durations to `executorRunTime`.
* Attribution mode auto-enables when any instrumented node exists.
* Exchange read and write durations now come from shuffle metrics.
* Producer and consumer stages split those durations correctly.
* Metric reduction is now null-safe throughout the reducer.
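One way to picture topology-based grouping is as connected components of the plan graph once `Exchange` nodes are removed. The sketch below follows that reading; it is an assumption about the approach rather than DataFlint's algorithm, and the node names, ids, and function name are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_stages(nodes: Dict[int, str], edges: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Group plan node ids into stage-like groups, cutting the graph at Exchange nodes."""

    def is_exchange(node_id: int) -> bool:
        return "Exchange" in nodes[node_id]

    # Undirected adjacency over edges that do not touch an Exchange node.
    adjacency: Dict[int, Set[int]] = defaultdict(set)
    for a, b in edges:
        if is_exchange(a) or is_exchange(b):
            continue
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen: Set[int] = set()
    groups: List[List[int]] = []
    # Iterating ids in sorted order keeps the output deterministic.
    for start in sorted(nodes):
        if start in seen or is_exchange(start):
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            current = stack.pop()
            group.append(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups
```

For a linear plan `Scan csv -> Filter -> Exchange -> HashAggregate -> Sort`, this yields two groups, one on each side of the exchange. Because the result depends only on the plan graph, it is the same for a live run and a replayed event log.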

### Version 0.8.8 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Fix for instrumentation on Spark 3.1.
2. Better extraction for Iceberg and BigQuery save operations.

### Version 0.8.7 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. New Spark instrumentation, which adds duration metrics for more Python actions and window operation nodes (Spark versions 3.1+), enabled using:
   * `spark.dataflint.instrument.spark.window.enabled`
   * `spark.dataflint.instrument.spark.arrowEvalPython.enabled`
   * `spark.dataflint.instrument.spark.batchEvalPython.enabled`
   * `spark.dataflint.instrument.spark.flatMapGroupsInPandas.enabled`
   * `spark.dataflint.instrument.spark.flatMapCoGroupsInPandas.enabled`

### Version 0.8.3 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. New Spark instrumentation flags `spark.dataflint.instrument.spark.mapInArrow.enabled` and `spark.dataflint.instrument.spark.mapInPandas.enabled`, which add duration metrics for these operators on Spark 3.3+, so you can see exactly how long a UDF took.
2. Add full table name to iceberg table reads

### Version 0.8.3 <a href="#version-0.3.0" id="version-0.3.0"></a>

Fix to stage identification algorithm\
Fix to stage read visuals to show hashed/ranges fields\
Visual improvements to stage sidebar

### Version 0.8.2 <a href="#version-0.3.0" id="version-0.3.0"></a>

* Improvement to sql plan layout with stage nodes
  * Button to switch to plan without stage nodes
* Improvements to stage identification

### Version 0.8.1 <a href="#version-0.3.0" id="version-0.3.0"></a>

* Small fix for Spark 3.3.X support
* Small fix for shuffle write metrics real time update

### Version 0.8.0 <a href="#version-0.3.0" id="version-0.3.0"></a>

✨ New Features\
New Flow Graph UI with Stage Parent Nodes - Completely redesigned the SQL flow graph visualization with stage parent nodes for better understanding of query execution\
Task View in Stage Nodes - Added task progress indicators directly within stage nodes for real-time task monitoring\
"Rows Aggregated" Metric - Added a new metric to track aggregated row counts\
Exchange Node Separation - Shuffle read and write operations are now displayed separately when 2 stages exist, providing clearer visibility into data exchange operations

🎨 UI Improvements\
Enhanced visual consistency across UI components\
Replaced CheckIcon with CancelIcon for clearer error representation\
Updated FlowLegend to include task progress indicators\
Improved styling for node elements in SQL flow\
General SQL plan UI improvements

⚡ Performance Improvements\
SQLNodeStageReducer Optimization - Implemented O(1) access with lookup maps for nodes and stages, significantly improving efficiency in SQL node stage calculations\
Smarter Update Cycles - Skip OnCycleEnd calculations when there are no changes in SQL and stages\
Stage Map for Alerts - Use stage map for faster alert processing

🐛 Bug Fixes\
Fixed wall clock duration calculation\
Fixed duration node hover display issue\
Fixed idle cores bug where idle cores showed 0% when all executors closed (Spark incorrectly detected local mode)

🔧 Other Changes\
Removed Scala distribution from Spark distribution package

### Version 0.7.0 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Delta lake collector (experimental)
2. Stage identification improvements
3. UI Improvements

### Version 0.6.1 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Add better delta lake support - delta write command, optimize command, optimize shuffle before write
2. Add support for SortAggregate nodes
3. Fix Maven dependency issues with Spark 3 POM files

### Version 0.6.0 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Spark 4 support
2. Better stage identification using metrics with statistics
3. Shrinking metrics text in case of high number of metrics

### Version 0.5.1 <a href="#version-0.3.0" id="version-0.3.0"></a>

Visual improvements for the new SQL plan UI

### Version 0.5.0 <a href="#version-0.3.0" id="version-0.3.0"></a>

New and updated design for the sql plan nodes

<figure><img src="/files/hN0NSi2Z89Efj2uC2tu7" alt=""><figcaption></figcaption></figure>

### Version 0.4.4 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Improvement to query presentation
2. Support Extended node:

<figure><img src="/files/I7ilvPAE3JTK4k9wOp2o" alt=""><figcaption></figcaption></figure>

### Version 0.4.3 <a href="#version-0.3.0" id="version-0.3.0"></a>

Support query params, like sql-id and node-ids in link

### Version 0.4.2 <a href="#version-0.3.0" id="version-0.3.0"></a>

Supports partition pruning

Enriching filter/select nodes with the UDF python function names:

<figure><img src="/files/jqEJNRr1e9Q0vxgbCIOd" alt=""><figcaption></figcaption></figure>

Add support for Generate node - explode, inline and more!

<figure><img src="/files/gLfbyUBQ8Bhvj2HZxOri" alt=""><figcaption></figcaption></figure>

### Version 0.4.1 <a href="#version-0.3.0" id="version-0.3.0"></a>

Support newer DBX versions - 14 and up.

### Version 0.4.0 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Support map by pandas and arrow functions
2. Added a new flag to silence alerts for a job, `spark.dataflint.alert.disabled`, which accepts a comma-separated list of alerts such as:

   ```
   smallTasks,idleCoresTooHigh
   ```
3. Added short recommendation on top of alert
4. Updated DataFlint logo
5. Better stage identification for various readers
6. Shows stage failures with an orange V on sql node and list of complete stage failures
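For reference, a value such as `smallTasks,idleCoresTooHigh` for `spark.dataflint.alert.disabled` can be interpreted as a comma-separated set of alert names. The sketch below shows one tolerant way to read it; the function names are hypothetical and the whitespace handling is an assumption, not documented behavior.

```python
from typing import Set


def disabled_alerts(conf_value: str) -> Set[str]:
    """Split a comma-separated flag value into a set of alert names."""
    # Ignore surrounding whitespace and empty entries (assumed, for tolerance).
    return {name.strip() for name in conf_value.split(",") if name.strip()}


def should_show_alert(alert_name: str, conf_value: str) -> bool:
    """An alert is shown unless its name appears in the disabled list."""
    return alert_name not in disabled_alerts(conf_value)
```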

### Version 0.3.2 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. New alerts - cross joins, change join to broadcast, large partition size
2. Better support for cross joins
3. Additional metrics for joins and shuffles

<figure><img src="/files/IyBD1v6eMziFTKz50bGC" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/RMup6BJb84DJLyBn8EdO" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/mioBINIQbRCADBMj3mdL" alt=""><figcaption></figcaption></figure>

### Version 0.3.1 <a href="#version-0.3.0" id="version-0.3.0"></a>

1. Support Distinct, skewed join, coalesce nodes
2. Sort by duration by default in history server mode
3. Automatically filter out SQL queries without any job/stage, with a switch to show them.
4. Add rows filtered percentage for filter and distinct sql nodes

<figure><img src="/files/8LLNEvYRYWFNPeHYNTd1" alt="" width="563"><figcaption></figcaption></figure>

### Version 0.3.0

1. Support for window functions
2. Better support for databricks
3. Detecting resource configuration better in configuration tab
4. Bug and UI fixes

<figure><img src="/files/21jlBdGybeVhoeXXRBYv" alt="" width="292"><figcaption></figcaption></figure>

## Version 0.2.7

1. DataFusion Comet support
2. bug fixes

## Version 0.2.6

1. Nvidia RAPIDS for Spark support
2. bug fixes

## Version 0.2.5

1. Bug Fixes

## Version 0.2.4

1. New visibility & alerts - Driver's memory
2. Updated README
3. Bug Fixes

## Version 0.2.3

1. Driver memory monitoring & alert
2. Updated readme
3. Bug fixes

<figure><img src="/files/A0SqrSJuaWVNYicFc6cP" alt="" width="325"><figcaption></figcaption></figure>

## Version 0.2.3

1. New alert - Large data Broadcast, for requesting to broadcast large data sets with the broadcast() function
2. New alert - Large filter conditions, for writing long filter conditions instead of using join logic
3. UI Improvements

## Version 0.2.2

1. Support Spark 2.4 logs in a history server running a version later than 3.2. A limited feature set is available because the events carry less data than in Spark 3.0 and up.

## Version 0.2.1

1. Better Databricks stage to node support
2. Support spark.dataflint.runId in custom history server providers when appId is not the spark appId

## Version 0.2.0

1. Better support for Databricks Photon plans
2. Input nodes show partition filters and push-down filters
3. Stage Breakdown - press the blue down arrow on sql node to see stage information
4. New alert - large number of small tasks (see [Alerts](/dataflint-for-spark/advanced/alerts.md#large-number-of-small-tasks))

<figure><img src="/files/OSmw8HraLhSUSbnBwAoK" alt=""><figcaption></figcaption></figure>

## Version 0.1.7

1. Apache Iceberg alerts improvements
2. Add avg file size in read/write
3. More information when hovering on stage

<figure><img src="/files/pXNthiZ7kK7LXbocFcVR" alt=""><figcaption></figcaption></figure>

## Version 0.1.6

1. Apache Iceberg support
   1. Better node naming
   2. Read metrics and reading small files alerts
   3. Write metrics and overwriting most of table alerts
      1. Requires enabling the Iceberg metric reporter. This can be done for you by setting **spark.dataflint.iceberg.autoCatalogDiscovery** to true, or by setting the Iceberg metric reporter manually for each catalog, for example:

         ```
         spark.sql.catalog.[catalog name].metrics-reporter-impl org.apache.spark.dataflint.iceberg.DataflintIcebergMetricsReporter
         ```
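If you set the reporter manually for several catalogs, the per-catalog keys can be generated instead of typed by hand. The sketch below uses the key pattern and reporter class shown above; the catalog names and the function name are hypothetical.

```python
from typing import Dict, Iterable

# Reporter class as given in the configuration line above.
REPORTER_CLASS = "org.apache.spark.dataflint.iceberg.DataflintIcebergMetricsReporter"


def iceberg_reporter_conf(catalogs: Iterable[str]) -> Dict[str, str]:
    """Map each catalog name to its metrics-reporter-impl setting."""
    return {
        f"spark.sql.catalog.{catalog}.metrics-reporter-impl": REPORTER_CLASS
        for catalog in catalogs
    }


if __name__ == "__main__":
    # "prod" and "staging" are placeholder catalog names.
    for key, value in iceberg_reporter_conf(["prod", "staging"]).items():
        print(f"{key} {value}")
```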

<figure><img src="/files/JNjKU9Xu9fckTJtotc68" alt=""><figcaption><p>Replacing entire table only to change 1% of records (1 in a 100)</p></figcaption></figure>

## Version 0.1.5

1. Add support for history server with cluster-mode jobs (i.e. with attempt number)
2. Fix "wasted cores" calculation
3. Fix status tab SQL flickering when a SQL query has subqueries

## Version 0.1.4

Fix scala 2.13 support

## Version 0.1.3

1. DataFlint SaaS support
2. partition Skew Alert:

<figure><img src="/files/2zcrvPNaxxB4B4f3EAoR" alt=""><figcaption></figcaption></figure>

## Version 0.1.2

1. Scala 2.13 support
2. A Spark flag to disable web app Mixpanel telemetry - `spark.dataflint.telemetry.enabled` (true/false)
3. Renamed Core Activity Rate to Wasted Cores Ratio (which is 100 - Core Activity Rate), and added an alert for wasted cores too high

<figure><img src="/files/vjj4pkiOHKPXZViw3mZj" alt=""><figcaption></figcaption></figure>

## Version 0.1.1

1. Resources tab - see a graph of your cluster executors count over time, use it to tune your resource allocation settings and save cost!
2. Minor visual fixes

DataFlint Resource Tab:

<figure><img src="/files/KPvzhYDJ1EnR6KkcBOhV" alt=""><figcaption></figcaption></figure>

## Version 0.1.0

1. Small fix to platform identification

## Version 0.0.8

1. Databricks support
2. Visual improvements
3. public release

## Version 0.0.7

Heat map

<img src="/files/CbWLyVszPiKXO88y6C4E" alt="" data-size="original">

## Version 0.0.6

Flint Assistant, requires an OpenAI key

<figure><img src="/files/uhEE2XbvczHw5djiCexO" alt=""><figcaption></figcaption></figure>

## Version 0.0.5

#### Syntax highlighting for SQL plan parts

<figure><img src="/files/ruhAPd5DU7twYHGEzI7v" alt="" width="359"><figcaption><p>Show selected fields</p></figcaption></figure>

#### Calculating container memory usage and using it for GB memory/hour calculations

<figure><img src="/files/eTjFB2oASISGZQf049Hc" alt=""><figcaption></figcaption></figure>

## Version 0.0.4

1. Minor fix related to Spark operator and nginx

## Version 0.0.3

#### SQL plan modes

IO only, shows only input, joins and output:

<figure><img src="/files/9xbFNGB7keIXJXvTtE0X" alt=""><figcaption></figcaption></figure>

Basic mode (default), shows also transformations like filters, aggregations and selects:

<figure><img src="/files/R89agwZZWUr2fceijeNv" alt=""><figcaption></figcaption></figure>

Advanced, shows also repartitions, broadcasts and sorts:

<figure><img src="/files/YyA3XtjhqeHClUEjYk7H" alt=""><figcaption></figcaption></figure>

Plan information is also shown for:

1. Joins
2. Sorts
3. Selects
4. Repartitions

## Version 0.0.2

#### DBU calculation instead of core/hour in summary bar

<figure><img src="/files/Fdq5i1HOY5pCRtyNRS6x" alt="" width="563"><figcaption></figcaption></figure>

#### Add memory config to configuration tab

<figure><img src="/files/3PkSIISE71VCs6FQx8xV" alt=""><figcaption></figcaption></figure>

#### Filter nodes show their condition

<figure><img src="/files/b6SD9GQCNHO5f7w3yue1" alt="" width="239"><figcaption></figcaption></figure>

#### Advanced mode for SQL plan, that also presents shuffle nodes

<figure><img src="/files/nTemddcw5ppmnF1AJ2K4" alt=""><figcaption></figcaption></figure>

#### Additional changes

1. Support both http and https access, enabling mixed content only in https mode
2. Support for spark 3.5.X

## Version 0.0.1

Initial version, includes:

1. Status page
2. Summary page
3. Configuration Page
4. Alerts page


