DataFlint is implemented as a Spark plugin, and Spark plugins can be installed in a variety of ways; every installation option should take no more than a few minutes.
DataFlint installation is very similar to that of other Spark libraries such as Delta Lake and Iceberg.
If you have long conditions in your queries, consider increasing the config `spark.sql.maxMetadataStringLength` to 1000, so that Spark logs your filter/select/join conditions without truncating them.
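As a sketch, the config above can be set when building the SparkSession (the app name is a placeholder):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("MyApp") // placeholder name
  // Raise the limit so long filter/select/join conditions appear untruncated in query plans
  .config("spark.sql.maxMetadataStringLength", "1000")
  .getOrCreate()
```

The same setting can also be passed on the command line via `--conf spark.sql.maxMetadataStringLength=1000`.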
For Spark 4.0 users: replace `spark_2.12` in the artifact/package name with `dataflint_spark4_2.13`.
For example: `libraryDependencies += "io.dataflint" % "dataflint_spark4_2.13" % "0.8.6"` (note the single `%`, since the artifact name already includes the Scala version suffix).
As a package name: `io.dataflint:dataflint_spark4_2.13:0.8.6`
For Scala 2.13 users: replace the artifactId `spark_2.12` with `spark_2.13`.
For Iceberg support, set `spark.dataflint.iceberg.autoCatalogDiscovery` to `true` to enable Iceberg write metrics. For more details, see Apache Iceberg.
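A minimal sketch of enabling this flag alongside the plugin (the app name is a placeholder):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("IcebergApp") // placeholder name
  // Let DataFlint discover Iceberg catalogs automatically and collect write metrics
  .config("spark.dataflint.iceberg.autoCatalogDiscovery", "true")
  .getOrCreate()
```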
Option 1: With package installation and code changes (Scala only)
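As a hedged sketch of this option: add the dependency to your build, then register the DataFlint plugin in code. The plugin class name below follows the `io.dataflint` package naming from the examples above; confirm it against the DataFlint documentation for your version.

```scala
import org.apache.spark.sql.SparkSession

// Assumes the sbt dependency is already declared, e.g.:
// libraryDependencies += "io.dataflint" %% "spark" % "0.8.6"
val spark = SparkSession
  .builder()
  .appName("MyApp") // placeholder name
  // Register DataFlint as a Spark plugin (class name assumed; verify per docs)
  .config("spark.plugins", "io.dataflint.spark.SparkDataflintPlugin")
  .getOrCreate()
```

After the application starts, the DataFlint tab should appear in the Spark UI.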