Releases · FRosner/spawncamping-dds

03 Feb 10:04

FRosner

release/4.0.0-beta

10be1a5

Data-Driven Spark 4.0.0-beta Pre-release

Core

Scan for all @Help annotated methods when DDS.help() is called
Review and fix short and long descriptions in @Help annotations

Build

Fixed a problem where pull request builds failed due to a Travis CI misconfiguration

02 Jan 14:44

FRosner

release/4.0.0-alpha

e73304c

Data-Driven Spark 4.0.0-alpha Pre-release

This release introduces a completely new project structure. DDS now has sub modules (core, datasets and web-ui).

Core

new servables API
rework Z scale implementation of heatmap servable
support for Spark 1.5.x

Datasets

rework datasets creation to use SQLContext implicit conversions (toDF)
remove unnecessary Spark context argument from the DataFrame versions of the datasets
use java.sql.* instead of java.util.* for dates and timestamps

Web-UI

make servable titles and history browser more informative

22 Sep 16:05

FRosner

release/3.0.1

67fc89b

Data-Driven Spark 3.0.1 Latest

Bugfixes

Fix NPE when showing case classes that contain null values
Fix Travis CI build accidentally including the coveralls runtime dependency (S3 bucket)

10 Sep 12:39

FRosner

release/3.0.0

81e7bb9

Data-Driven Spark 3.0.0

Spark

Upgrade to Spark 1.4.0

Usability

Add a visualization REST interface and a visualization history browser (drop-down menu). This allows multiple users to access the same UI and to refresh the page. However, settings made in the UI are lost after a refresh (to be fixed in an upcoming release).

Additionally, the build script, DDS core, and Spark SQL functions were refactored extensively.

02 Sep 12:21

FRosner

release/2.3.1

24dcb1b

Data-Driven Spark 2.3.1

Bugfixes

Fix a problem where require.js would sometimes not load d3 correctly. This caused the parallel coordinates to break.

18 Aug 12:23

FRosner

release/2.3.0

71c52fd

Data-Driven Spark 2.3.0

Example Data Sets

Added a small data set description and source to the user guide for each data set
Added a GraphX example data set (Enron email network)

Bugfixes

Mutual information function crashed on double columns containing NaN values; NaNs are now binned separately
Fixed a problem where changing the heatmap scale changed black cells (null values) to white
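
The NaN fix above can be illustrated with a minimal Python sketch. It is not the DDS implementation: the function name `bin_values`, the equal-width binning scheme, and the `NAN_BIN` marker are assumptions made for illustration, and the input is assumed to contain no infinite values.

```python
import math

# Hypothetical marker for the separate NaN bin (not a DDS constant).
NAN_BIN = -1

def bin_values(values, num_bins):
    """Assign each value to an equal-width bin; NaNs get their own bin."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        # Only NaNs present: everything lands in the NaN bin.
        return [NAN_BIN for _ in values]
    lo, hi = min(finite), max(finite)
    width = (hi - lo) / num_bins

    def bin_of(v):
        if math.isnan(v):
            return NAN_BIN
        if width == 0:
            # All non-NaN values are identical, so a single bin suffices.
            return 0
        # Clamp so that the maximum value falls into the last bin.
        return min(int((v - lo) / width), num_bins - 1)

    return [bin_of(v) for v in values]
```

Keeping NaN in a bin of its own means missing values are counted as a separate category when the mutual information is computed, rather than breaking the bin-edge arithmetic.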

Build and Architecture

Remove git call in build file that caused the build to crash on systems with older versions of git (<= 1.7.x)
Use require.js as dependency management system for front-end code

01 Aug 10:40

FRosner

release/2.2.0

784fbf0

Data-Driven Spark 2.2.0

Analysis and Visualization

Heatmap draws black cells when values are NaN / null. This is especially useful when the normalized mutual information is not defined.
New key-value visualization for summary statistics
Nodes in force layout are movable
Add charge to force layout to visually separate connected components
Bin numerical columns before computing mutual information
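
To show why the normalized mutual information can be undefined, here is a minimal Python sketch over already-binned values. It assumes that "normalized" means dividing by the larger of the two marginal entropies; the function names are hypothetical and the code is not taken from DDS.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def normalized_mi(xs, ys):
    """Mutual information of two label sequences divided by max(H(X), H(Y)).

    Returns NaN when both entropies are zero, i.e. when both columns
    are constant and the ratio is undefined.
    """
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )
    max_h = max(entropy(xs), entropy(ys))
    return mi / max_h if max_h > 0 else float("nan")
```

A NaN result of this kind is the sort of cell the heatmap would draw in black.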

Misc

Completely recreate main content div after each visualization
Compute running covariance, mean and variance for correlation aggregation for better numerical stability
Log build information (version, revision, time) at DDS object initialization
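
The running covariance, mean and variance mentioned above can be sketched with a Welford-style update plus a pairwise merge step, which is the usual way to make such an aggregation numerically stable. This is an illustrative Python sketch under that assumption; the class `RunningCov` and its fields are hypothetical and do not reflect the actual DDS code.

```python
import math
from dataclasses import dataclass

@dataclass
class RunningCov:
    n: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0
    m2_x: float = 0.0   # sum of squared deviations of x
    m2_y: float = 0.0   # sum of squared deviations of y
    c_xy: float = 0.0   # sum of co-deviations of x and y

    def add(self, x, y):
        """Fold one (x, y) pair into the running statistics."""
        self.n += 1
        dx = x - self.mean_x          # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def merge(self, other):
        """Combine two partial aggregates, e.g. from two partitions."""
        n = self.n + other.n
        if n == 0:
            return RunningCov()
        dx = other.mean_x - self.mean_x
        dy = other.mean_y - self.mean_y
        w = self.n * other.n / n
        return RunningCov(
            n,
            self.mean_x + dx * other.n / n,
            self.mean_y + dy * other.n / n,
            self.m2_x + other.m2_x + dx * dx * w,
            self.m2_y + other.m2_y + dy * dy * w,
            self.c_xy + other.c_xy + dx * dy * w,
        )

    def correlation(self):
        """Pearson correlation from the accumulated moments."""
        return self.c_xy / math.sqrt(self.m2_x * self.m2_y)
```

Compared with accumulating raw sums of x, y, x², y² and xy, this form avoids subtracting two large, nearly equal numbers at the end.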

12 Jul 20:28

FRosner

release/2.1.0

181197c

Data-Driven Spark 2.1.0

General

Visualizations can now have a title
Mutual information is rescaled by maximum entropy of both variables to allow comparison of multiple MI values
Use Sturges' formula to compute the number of histogram bins when the user does not provide one
Fixed description of RDD summarize function
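
For reference, Sturges' formula picks the number of histogram bins as k = ceil(log2(n)) + 1 for n observations. The Python sketch below shows the textbook form; the function name `sturges_bins` is hypothetical, and whether DDS rounds in exactly this way is an assumption.

```python
import math

def sturges_bins(n: int) -> int:
    """Number of histogram bins for n observations (Sturges' formula)."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n)) + 1
```

For example, 100 observations give 8 bins and 1024 observations give 11.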

Spark SQL

Summary statistics function (summarize) for data frames
Bar chart for single data frame columns
Pie chart for single data frame columns
Histogram for single data frame columns
Median for single data frame columns
Dashboard now uses data frame summarize for column statistics
Dashboard provides useful titles for individual visualizations

19 Jun 08:43

FRosner

release/2.0.0

29e8289

Data-Driven Spark 2.0.0

Build

Upgrade from Spark 1.2 to Spark 1.3
- SchemaRDD to DataFrame
- Resolve SLF4J class path conflicts
- Avoid serialization bug in flights example data set in Spark shell
Change default Scala version for sbt build to 2.10 (was 2.11)

Analysis and Visualization

First version of dashboard function
- Visualizations are now drawn independently from each other using a document-wide cache to store configuration under their content id as a key
- Bootstrap CSS layout for columnar layout
- Dashboard shows a sample, column dependencies and summary statistics for each column

Bugfixes

Changing the upper bound of heatmap scales caused heatmap to ignore the selected colors and redraw with default

03 Jun 09:35

FRosner

release/1.8.0

ebefc08

Data-Driven Spark 1.8.0

Analysis and Visualization

Implement strategy for missing values in Pearson correlation matrix function
Add three color scale to heat maps
Allow manual adjustment of color scale range in heat maps
Add mutual information matrix function (non-normalized and no binning of numerical data, yet)

Example Data Sets

Flights data set having many numerical and nullable columns

Bugfixes

Median function requires a numerical RDD but threw an NPE for non-numeric RDDs instead of reporting that it requires an implicit numeric
