<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<meta name="layout" content="post"/>
<title>"supergroup"</title>
<meta name="comments" content="true"/>
<meta name="categories" content="[repo]"/>
<meta name="source" content="https://github.com/Sigfried/supergroup"/>
<link type="text/css" rel="stylesheet" href="./style.css"/>
<link type="text/css" rel="stylesheet" href="./assets/prism.css"/>
</head>
<body>
<script src="https://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.8.2/underscore.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.js"></script>
<script src="https://rawgit.com/Sigfried/supergroup/master/supergroup.js"></script>
<!-- more -->
<div>
<section>
<h1 id="supergroup.js">Supergroup.js</h1>
<p>Supergroup brings extreme convenience and understandability to the manipulation of
JavaScript data collections, especially in the context of D3.js visualization
programming.</p>
<p>As if in submission to the great programmer’s commandment, <em>Don’t
Repeat Yourself</em>, every time I find myself writing a piece of code
that solves basically the same problem I’ve solved a dozen times
before, a little piece of my soul dies.</p>
<p>Utilities for grouping record collections into maps or nests abound:
<a href="https://github.com/mbostock/d3/wiki/Arrays#-nest">d3.nest</a>,
<a href="https://github.com/mbostock/d3/wiki/Arrays#associative-arrays">d3.map</a>,
<a href="http://underscorejs.org/#groupBy">Underscore.groupBy</a>,
<a href="https://github.com/iros/underscore.nest">Underscore.Nest</a>, to name
a few. But after these tools relieve us of a certain amount of
repetitive stress, we’re often left with a tangle of hairy details
that fill us with a dreadful sense of deja vu. Supergroup may seem
like the kind of tacky wonder gadget you’d find on a late-night
Ronco ad, but, for the low, low price of free, it makes data-centric
JavaScript programming fun again. <strong>And</strong>, when you find yourself
in a D3.js callback routine holding a datum object that might have
come from anywhere (for instance, with a tooltip callback used on
disparate object types), everything you want to know about your
object and its associated metadata and records is right there at
your fingertips.</p>
<p>Just to be clear about the problem, you start with tabular data from a CSV
file, a SQL query, or some AJAX call:</p>
<p><span class="iframe">Some very fake hospital data in a CSV file…</span>
<iframe width="100%" height="70px" src="examples/examples.html?data">
</iframe></p>
<p><span class="iframe">...turned into a canonical array of objects (using d3.csv, for instance)</span>
<iframe width="100%" height="80px" src="examples/examples.html?json">
</iframe></p>
<p>Without Supergroup, you’d group the records on the values of one or more fields
with a standard grouping function, giving you data like:</p>
<p><span class="iframe">d3.nest().key(function(d) { return d.Physician; }).key(function(d) { return d.Unit; }).map(data)</span>
<iframe width="100%" height="150px" src="examples/examples.html?d3map">
</iframe></p>
<p><span class="iframe">d3.nest().key(function(d) { return d.Physician; }).key(function(d) { return d.Unit; }).entries(data)</span>
<iframe width="100%" height="150px" src="examples/examples.html?d3nest">
</iframe></p>
<p>To my mind, these are awkward data structures (not to mention the awkwardness
of the calling functions). The <code>map</code> version looks OK in the console, but
D3 wants data in arrays, not objects. The <code>entries</code> version gives us
arrays of key/value pairs, but at upper levels <code>values</code> is another array of
key/value pairs, while at the bottom level <code>values</code> is an array of records. In
both <code>entries</code> and <code>map</code>, you can’t tell from a node at any level which
dimension was grouped on at that level.</p>
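<p>The shape problem is easy to reproduce without D3. The sketch below uses a
hypothetical helper, <code>nestEntries</code> (my own name, not a D3 function), that builds the
same <code>{key, values}</code> structure in plain JavaScript. The sample rows and the
<code>Charges</code> field are made up for illustration.</p>

```javascript
// Hypothetical helper (not part of D3): groups records into the same
// [{key, values}] shape that d3.nest().entries() produces.
function nestEntries(records, fields) {
  if (fields.length === 0) return records; // bottom level: raw records
  const [field, ...rest] = fields;
  const groups = new Map(); // a Map keeps keys in first-seen order
  for (const rec of records) {
    const key = String(rec[field]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(rec);
  }
  return Array.from(groups, ([key, recs]) => ({
    key,
    values: nestEntries(recs, rest),
  }));
}

// Made-up sample rows, loosely modeled on the hospital data above.
const rows = [
  { Physician: "Adams", Unit: "preop", Charges: 100 },
  { Physician: "Adams", Unit: "recovery", Charges: 250 },
  { Physician: "Baker", Unit: "preop", Charges: 75 },
];

const entries = nestEntries(rows, ["Physician", "Unit"]);

// Upper level: `values` holds more {key, values} pairs.
const physicianNode = entries[0];
console.log(Object.keys(physicianNode.values[0])); // [ 'key', 'values' ]

// Bottom level: `values` holds the original records instead.
const unitNode = physicianNode.values[0];
console.log(unitNode.values[0] === rows[0]); // true

// Neither node records which field ("Physician" or "Unit") produced it.
console.log("dim" in physicianNode, "dim" in unitNode); // false false
```

<p>Any code that walks this structure has to know its depth in advance to tell
whether <code>values</code> contains groups or records.</p>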
<p>Supergroup gives you almost everything you’d want for every item in your nest
(or in your single array if you have a one-level grouping):</p>
<ul>
<li>An array of the values grouped on (could be strings, numbers, or dates) (<a href="#sgphysunit">example</a>)</li>
<li>The records associated with each group (<a href="#records">example</a>)</li>
<li>Information about the values at any level
<ul>
<li>Parent group if any</li>
<li>Immediate child groups if any</li>
<li>All descendant groups</li>
<li>Only descendant groups at the leaf level</li>
<li>Aggregate calculations on records for that group and its descendants</li>
<li>Path of group names from root to current group</li>
<li>Path of group dimension names from root to current group</li>
</ul></li>
<li>Information about the groupings at any level</li>
<li>For a group at any level, the name of the dimension (attribute, column, property, etc.) grouped on</li>
<li>Any of these in a format D3 or some other tool expects</li>
</ul>
<h2 id="supergroup">Supergroup</h2>
<p>Works as an Underscore (or Lo-Dash) mixin: </p>
<pre class="language-markup" data-src="mixin_example.html"></pre>
<h2 id="aplainarrayofstringsenhancedwithchildrenandrecords">A plain Array of Strings, enhanced with children and records</h2>
<p><code>_.supergroup(data, fieldname)</code> returns an array whose elements are the
distinct values of <code>&lt;fieldname&gt;</code> in the original data records. These elements,
or Values, can be String or Number objects (Dates to be implemented eventually).
Each Value holds a <code>.records</code> property, which is an array containing the subset of
original records matching that Value.</p>
<p>In the example below we do a multi-level grouping by Physician and Unit. So
<code>sg = _.supergroup(data,['Physician','Unit'])</code> returns a list of
physicians (the top-level grouping). The first item in this list,
<code>sg[0]</code>, is “Adams”, a String object. <code>sg[0].records</code> is an array
containing the records where Physician=“Adams”. <code>sg[0].children</code> is a
list of the Units (our second-level grouping) in the records where
Physician=“Adams”. <code>sg[0].children[0].records</code> would be the subset of
records where Physician=“Adams” and Unit=“preop”.</p>
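<p>To make the idea concrete, here is a minimal sketch of how an array of String
objects can carry <code>records</code> and <code>children</code>. It is not Supergroup’s actual
source: <code>groupAsValues</code> is a hypothetical helper, the <code>dim</code> property name is an
assumption, and the sample rows are invented.</p>

```javascript
// Hypothetical helper, for illustration only; not Supergroup's implementation.
function groupAsValues(records, fields) {
  const [field, ...rest] = fields;
  const byKey = new Map(); // a Map keeps keys in first-seen order
  for (const rec of records) {
    const key = rec[field];
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(rec);
  }
  return Array.from(byKey, ([key, recs]) => {
    const value = new String(key); // a String object can hold extra properties
    value.records = recs;          // the records that share this value
    value.dim = field;             // assumed property name for the grouped field
    if (rest.length > 0) value.children = groupAsValues(recs, rest);
    return value;
  });
}

// Made-up sample rows.
const rows = [
  { Physician: "Adams", Unit: "preop" },
  { Physician: "Adams", Unit: "recovery" },
  { Physician: "Baker", Unit: "preop" },
];

const sg = groupAsValues(rows, ["Physician", "Unit"]);

console.log(String(sg[0]));                       // Adams
console.log(sg[0].records.length);                // 2
console.log(sg[0].children.map(String));          // [ 'preop', 'recovery' ]
console.log(sg[0].children[0].records[0] === rows[0]); // true

// A String object equals the primitive loosely, but not strictly.
console.log(sg[0] == "Adams", sg[0] === "Adams"); // true false
```

<p>The last line shows the one catch with String objects: strict comparisons against
plain strings fail, so convert with <code>String(value)</code> where a primitive is needed.</p>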
<a id='sgphysunit'></a>
<p><span class="iframe">Supergroup on physician and unit</span>
<iframe width="100%" height="400px" src="examples/examples.html?sgphysunit">
</iframe></p>
<p>It does a bunch more I still need to document.</p>
<hr/>
<h2 id="everythingbelowisolddocumentationimtryingtoreplace">Everything below is old documentation I’m trying to replace</h2>
<p>Some records loaded from a CSV or JSON file:</p>
<pre><code class="javascript">var gradeBook = [
{lastName: "Gold", firstName: "Sigfried", class: "Remedial Programming", grade: "C", num: 2},
{lastName: "Gold", firstName: "Sigfried", class: "Literary Posturing", grade: "B", num: 3},
{lastName: "Gold", firstName: "Sigfried", class: "Documenting with Pretty Colors", grade: "B", num: 3},
{lastName: "Sassoon", firstName: "Sigfried", class: "Remedial Programming", grade: "A", num: 3},
{lastName: "Androy", firstName: "Sigfried", class: "Remedial Programming", grade: "B", num: 3}
];
</code></pre>
<p>Grouping on one dimension:</p>
<pre><code class="javascript">var byLastName = _.supergroup(gradeBook, "lastName"); // an Array of Strings: ["Gold","Sassoon","Androy"]
byLastName[0].records; // Array of Sigfried Gold's original 3 records
byLastName.rawValues(); // Array of native strings (easier to look at or use in contexts where you need a plain string)
</code></pre>
<p>Grouping by a calculated value:</p>
<pre><code class="javascript">var byName = _.supergroup(gradeBook, function(d) { return d.firstName + ' ' + d.lastName; });
// an Array of Strings: ["Sigfried Gold","Sigfried Sassoon","Sigfried Androy"]
</code></pre>
<p>It's a native Array, but you can treat it as a map, and then do cool stuff. Here's a GPA:</p>
<pre><code class="javascript">byName.lookup("Sigfried Gold").records.pluck("num").mean(); // 2.6666666666666665
</code></pre>
<p>The above example shows how Supergroup can chain Underscore methods (and mixins), functionality
it gets from <a href="../underscore-unchained">underscore-unchained</a>.</p>
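<p>For comparison, here is roughly the same GPA calculation in plain JavaScript, with
no Supergroup or Underscore. It is only a sketch of what the chained
<code>lookup</code>, <code>pluck</code>, and <code>mean</code> calls work out to, not how they are implemented.</p>

```javascript
const gradeBook = [
  { lastName: "Gold", firstName: "Sigfried", class: "Remedial Programming", grade: "C", num: 2 },
  { lastName: "Gold", firstName: "Sigfried", class: "Literary Posturing", grade: "B", num: 3 },
  { lastName: "Gold", firstName: "Sigfried", class: "Documenting with Pretty Colors", grade: "B", num: 3 },
  { lastName: "Sassoon", firstName: "Sigfried", class: "Remedial Programming", grade: "A", num: 3 },
  { lastName: "Androy", firstName: "Sigfried", class: "Remedial Programming", grade: "B", num: 3 },
];

// Stand-in for lookup("Sigfried Gold").records: keep the matching records.
const goldRecords = gradeBook.filter(
  (d) => d.firstName + " " + d.lastName === "Sigfried Gold"
);

// Stand-in for pluck("num"): pull one field out of each record.
const nums = goldRecords.map((d) => d.num);

// Stand-in for mean(): sum, then divide by the count.
const gpa = nums.reduce((sum, n) => sum + n, 0) / nums.length;

console.log(gpa); // 2.6666666666666665
```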
<p>Grouping hierarchically:</p>
<pre><code class="javascript">var byClassGrade = _.supergroup(gradeBook, ["class", "grade"]); // Array of top-level groups: ["Remedial Programming", "Literary Posturing", "Documenting with Pretty Colors"]
byClassGrade[0].children; // Children of a single group: ["C", "A", "B"]
byClassGrade[0].records; // Array of original records for a single group
byClassGrade.lookup("Remedial Programming"); // lookup a top-level group by name
byClassGrade.lookup(["Remedial Programming","B"]); // lookup a second-level group by name path
byClassGrade.lookup(["Remedial Programming","B"]).namePath(' -> '); // "Remedial Programming -> B"
byClassGrade.lookup(["Remedial Programming","B"]).dimPath(); // "class/grade"
</code></pre>
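<p>The two path methods amount to walking parent links back to the root. The sketch
below shows that idea with plain objects. The node shape
(<code>name</code>, <code>dim</code>, <code>parent</code>) and the helper <code>pathTo</code> are assumptions made for
illustration, not Supergroup’s internals.</p>

```javascript
// Hypothetical node shape for illustration: {name, dim, parent}.
// Collect the chain of nodes from the root down to `node`.
function pathTo(node) {
  const path = [];
  for (let n = node; n; n = n.parent) path.unshift(n);
  return path;
}

// Join the group names, or the grouped-on field names, along that chain.
const namePath = (node, sep = "/") => pathTo(node).map((n) => n.name).join(sep);
const dimPath = (node, sep = "/") => pathTo(node).map((n) => n.dim).join(sep);

const remedial = { name: "Remedial Programming", dim: "class", parent: null };
const gradeB = { name: "B", dim: "grade", parent: remedial };

console.log(namePath(gradeB, " -> ")); // Remedial Programming -> B
console.log(dimPath(gradeB));          // class/grade
console.log(namePath(remedial));       // Remedial Programming
```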
<p>Supergroup can flatten a tree into an array of nodes much like D3’s hierarchy layout, but in a way
that’s easier to use IMHO.</p>
<pre><code class="javascript">byClassGrade.flattenTree(); // ["Remedial Programming", "C", "A", "B", "Literary Posturing", "B", "Documenting with Pretty Colors", "B"]
byClassGrade.flattenTree().invoke('namePath'); // ["Remedial Programming", "Remedial Programming/C", "Remedial Programming/A", "Remedial Programming/B", "Literary Posturing", "Literary Posturing/B", "Documenting with Pretty Colors", "Documenting with Pretty Colors/B"]
// only want leaf nodes?
byClassGrade.leafNodes().invoke('namePath'); // ["Remedial Programming/C", "Remedial Programming/A", "Remedial Programming/B", "Literary Posturing/B", "Documenting with Pretty Colors/B"]
</code></pre>
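<p>Flattening is a depth-first walk that lists each parent before its children. Here is
a minimal sketch over plain objects. The <code>{name, children}</code> node shape and the
<code>flatten</code> and <code>leaves</code> helpers are hypothetical stand-ins, not Supergroup’s code.</p>

```javascript
// Hypothetical tree of plain objects: {name, children}.
const tree = [
  { name: "Remedial Programming", children: [{ name: "C" }, { name: "A" }, { name: "B" }] },
  { name: "Literary Posturing", children: [{ name: "B" }] },
  { name: "Documenting with Pretty Colors", children: [{ name: "B" }] },
];

// Depth-first walk: each node is followed by all of its descendants.
function flatten(nodes) {
  return nodes.flatMap((n) => [n, ...flatten(n.children || [])]);
}

// Leaf nodes are the flattened nodes that have no children.
function leaves(nodes) {
  return flatten(nodes).filter((n) => !n.children || n.children.length === 0);
}

console.log(flatten(tree).map((n) => n.name));
// [ 'Remedial Programming', 'C', 'A', 'B', 'Literary Posturing', 'B',
//   'Documenting with Pretty Colors', 'B' ]

console.log(leaves(tree).map((n) => n.name));
// [ 'C', 'A', 'B', 'B', 'B' ]
```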
<!--
{ old stuff % jsfiddle us9k9/2 %
}
In a SQL group by query you get one record for each resulting group and
you can calculate values based on the aggregate of the rows comprised by
each group. Another step is needed to go back from the group to
the individual rows in that group. D3's nest and Underscore's groupBy do
slightly better in that they offer a collection of groups where each group
is tied to its associated records.
To explain the advantages of supergroup over Underscore and D3's versions, let's compare the results:
``` javascript Underscore's groupBy
_.groupBy(gradeBook,'lastName')
=> {
Gold: [
{ firstName: "Sigfried", lastName: "Gold", class: "Remedial Programming", grade: "C", num: 2 },
{ firstName: "Sigfried", lastName: "Gold", class: "Literary Posturing", grade: "B", num: 3 },
{ firstName: "Sigfried", lastName: "Gold", class: "Documenting with Pretty Colors", grade: "B", num: 3 }
],
Else: [
{ firstName: "Someone", lastName: "Else", class: "Remedial Programming", grade: "B", num: 3 }
]
}
```
``` javascript D3's nest and map
d3.nest().key(function(d) { return d.lastName }).map(gradeBook) // same result as Underscore.
```
Because D3 visualizations depend so completely on arrays, D3 provides a way of structuring groups as arrays:
``` javascript D3's nest and entries
d3.nest().key(function(d) { return d.lastName }).entries(gradeBook)
=> [
{ key: "Gold",
values: [
{ firstName: "Sigfried", lastName: "Gold", class: "Remedial Programming", grade: "C", num: 2 },
{ firstName: "Sigfried", lastName: "Gold", class: "Literary Posturing", grade: "B", num: 3 },
{ firstName: "Sigfried", lastName: "Gold", class: "Documenting with Pretty Colors", grade: "B", num: 3 }
]
},
{ key: "Else",
values: [
{ firstName: "Someone", lastName: "Else", class: "Remedial Programming", grade: "B", num: 3 }
]
}
]
// making a list with this data in D3 might look like this:
gradeBookEntries = d3.nest()
.key(function(d) { return d.lastName })
.key(function(d) { return d.grade })
.entries(gradeBook)
_.rebind(console, 'log') // so console.log can be used as a callback
d3.select('div#main').append('ul').selectAll('li')
.data(gradeBookEntries)
.enter()
.append('li')
.text(function(d) { return d.key })
.on('click', console.log)
.append('ul').selectAll('li')
.data(function(d) { return d.values})
.enter()
.append('li')
.text(function(d) { return d.key + ': ' + d.values.map(function(r) { return r.class }).join(', ') })
.on('click', console.log)
gradeBookNames = _.supergroup(gradeBook,['lastName','grade']);
d3.select('div#main').append('ul').selectAll('li')
.data(gradeBookNames)
.enter()
.append('li')
.text(_.identity)
.on('click', console.log)
.append('ul').selectAll('li')
.data(function(d) { return d.children})
.enter()
.append('li')
.text(function(d) { return d + ': ' + d.records.pluck('class').join(', ') })
.on('click', console.log)
```
These produce identical results with fairly similar syntax, but when the visualization
becomes more complex, the supergroup nodes are much more useful. A common use case
is providing information about a node on mouseover.
One drawback of d3.nest above is a difference in datum types between parent and leaf
nodes: datum.values at a parent node is an array of {key:'...',values:[...]}, but at
the leaf node it's an array of raw records.
Supergroup does not mix up raw records and hierarchy children in this way. At every
level 'records' refers to raw records (which you can only access as leaf nodes in
d3.nest) and 'children' refers to nested children if there are any at that node.
gradeBookNames = _.supergroup(gradeBook,['lastName','grade']);
d3.select('div#main').append('ul').selectAll('li')
.data(gradeBookEntries)
.enter()
.append('li')
.text(_.identity)
.append('ul').selectAll('li')
.data(function(d) { return d.records})
.enter()
.append('li')
.text(function(d) { return d.namePath() })
d3.select('body').append('ul').selectAll('li')
.data(gradeBookEntries)
.enter()
.append('li')
.text(function(d) { return d.key })
.append('p')
.text(function(d) { return d.values.length + ' records in group ' + this.parentNode.__data__.key })
```
has the exact same result (with less pleasant syn
``` javascript
var gradeBook = [
{firstName: 'Sigfried', lastName: 'Gold', class: 'Remedial Programming', grade: 'C+', num: 2.2},
{firstName: 'Sigfried', lastName: 'Gold', class: 'Literary Posturing', grade: 'B', num: 3},
{firstName: 'Sigfried', lastName: 'Gold', class: 'Documenting with Pretty Colors', grade: 'B-', num: 2.7},
{firstName: 'Someone', lastName: 'Else', class: 'Remedial Programming', grade: 'A'}];
var gradesByLastName = enlightenedData.group(gradeBook, 'lastName')
```
``` javascript
var gradesByName = enlightenedData.group(gradeBook,
function(d) { return d.lastName + ', ' + d.firstName },
{dimName: 'fullName'})
var sigfried = gradesByName.lookup('Gold, Sigfried');
sigfried.records.length; // 3
var sigfriedGPA = sigfried.records.reduce(function(result,rec) { return result+rec.num }, 0) / sigfried.records.length;
(it does much much more, will explain below)
```
{ old % include_code supergroup-test.js %
}
-->
</section>
</div>
<script src="assets/prism.js"></script>
</body>
</html>