🔓Security & Stability

DataFlint for Spark is highly secure and stable, and this page explains why:

DataFlint runs locally on your Spark driver or history server.
DataFlint uses the existing Spark UI endpoint, so no new endpoints or ports are exposed.
DataFlint is open source, so you can see what the plugin does. No black-box wizardry!
The DataFlint library JAR is a stable version published to the Maven Central OSS repository, and Maven Central does not allow a published stable version to be edited or replaced. This means the code that runs in your cluster cannot change underneath you.
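If you want to rely on that immutability explicitly, one option is to record the digest of the JAR you vetted and compare it at deploy time. The following is a minimal sketch, not part of DataFlint: the function names are hypothetical, and the JAR path and pinned digest are values you would supply yourself.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def jar_matches_pin(path: Path, pinned_sha256: str) -> bool:
    """True when the JAR on disk has the digest recorded for the pinned version."""
    return sha256_of(path) == pinned_sha256.strip().lower()
```

A deployment step could call `jar_matches_pin(jar_path, recorded_digest)` and refuse to submit the job when it returns `False`.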

If DataFlint fails on startup, it reports a warning and lets the app continue.
DataFlint runs code in Spark when you access the Web UI.
Errors in DataFlint on the driver happen in a separate thread and should not affect the app's runtime.
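The isolation described above follows a common pattern: catch failures at startup, downgrade them to a warning, and keep any ongoing work on its own thread. The sketch below illustrates that general pattern in Python under stated assumptions. It is not DataFlint's actual implementation, and every name in it is hypothetical.

```python
import logging
import threading

logger = logging.getLogger("plugin_isolation_sketch")


def start_plugin_safely(init_plugin) -> bool:
    """Run a plugin's startup hook; on failure, warn and let the host app go on."""
    try:
        init_plugin()
        return True
    except Exception as exc:  # deliberately broad: the host app must survive
        logger.warning("plugin failed to start, continuing without it: %s", exc)
        return False


def run_in_background(task) -> threading.Thread:
    """Run a task on a daemon thread so its errors never reach the main thread."""

    def guarded():
        try:
            task()
        except Exception as exc:
            logger.warning("background plugin task failed: %s", exc)

    thread = threading.Thread(target=guarded, daemon=True)
    thread.start()
    return thread
```

With this shape, a failing startup hook returns `False` instead of raising, and an exception inside a background task ends only that thread.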

Most of the compute is done on the DataFlint Web UI side.
DataFlint only runs compute on the driver, not on the executors
DataFlint runs compute on the driver only when the DataFlint Web UI is open and the tab is active
DataFlint queries the driver API every 1 second, so the performance impact is similar to looking at the existing Spark UI and refreshing it constantly.
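To make the load model concrete, here is a small Python simulation of "poll once per interval, but only while the tab is active". In the real product the polling happens in the browser, so this is only a model: `fetch` and `is_active` are hypothetical callables, and which driver endpoints DataFlint actually requests is not stated on this page.

```python
import time
from typing import Callable, List


def poll_while_active(
    fetch: Callable[[], object],
    is_active: Callable[[], bool],
    interval_s: float = 1.0,
    max_ticks: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Call fetch() once per tick, but only on ticks where is_active() is true."""
    results = []
    for _ in range(max_ticks):
        if is_active():
            results.append(fetch())
        sleep(interval_s)
    return results
```

Under this model, an inactive tab issues no requests at all, and an active tab issues at most one request per second for each resource it polls.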

We collect anonymous usage metrics for DataFlint via Mixpanel. We do not collect any data about your actual Spark job besides:

Last updated 9 months ago