Data-Driven Spark 1.8.0

@FRosner released this 03 Jun 09:35
· 272 commits to master since this release

Analysis and Visualization

  • Implement strategy for handling missing values in the Pearson correlation matrix function
  • Add three-color scale to heat maps
  • Allow manual adjustment of color scale range in heat maps
  • Add mutual information matrix function (not yet normalized, and numerical data is not yet binned)
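
To illustrate what "non-normalized" means for the mutual information matrix, here is a minimal Python sketch of the textbook formula I(X;Y) = Σ p(x,y) · log(p(x,y) / (p(x) · p(y))), applied to columns of discrete values. It is not the library's implementation: the function names `mutual_information` and `mutual_information_matrix`, the dictionary-of-columns input, and the choice of the natural logarithm are all assumptions made for this example. Every value is treated as its own category, which mirrors the "no binning" caveat above.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Non-normalized mutual information (in nats) of two discrete columns.

    Hypothetical helper for illustration only; not part of any library API.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("columns must not be empty")

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y

    # p(x,y) / (p(x) * p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (count / n) * log(count * n / (px[x] * py[y]))
        for (x, y), count in joint.items()
    )


def mutual_information_matrix(columns):
    """Pairwise mutual information for a dict mapping column name -> values."""
    names = list(columns)
    return {
        (a, b): mutual_information(columns[a], columns[b])
        for a in names
        for b in names
    }


# Usage: "carrier" and "delayed" are independent in this toy data.
matrix = mutual_information_matrix({
    "carrier": ["AA", "AA", "UA", "UA"],
    "delayed": [0, 1, 0, 1],
})
```

Because the values are not normalized, a diagonal entry equals the entropy of that column rather than 1, so entries are not directly comparable across columns with different numbers of categories.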

Example Data Sets

  • Flights data set with many numerical and nullable columns

Bugfixes

  • Median function requires a numerical RDD, but it threw an NPE when given a non-numeric one instead of reporting that an implicit numeric is required