\chapter{Visualizing Large Models}
\label{chap:VisualizingLargeModels}
ParaView is used frequently at Sandia National Laboratories and other
institutions for visualizing data from large-scale simulations run on the
world's largest supercomputers, including the examples shown here.
\begin{inlinefig}
\begin{tabular}{cc}
\includegraphics[width=0.431\linewidth]{images/Asteroid} &
\includegraphics[width=0.469\linewidth]{images/PolarVortex} \\
\parbox[t]{0.431\linewidth}{\footnotesize CTH shock physics simulation
with over 1 billion cells of a 10 megaton explosion detonated at the
center of the Golevka asteroid.} &
\parbox[t]{0.469\linewidth}{\footnotesize SEAM Climate Modeling
simulation with 1 billion cells modeling the breakdown of the polar
vortex, a circumpolar jet that traps polar air at high latitudes.}
\end{tabular}
\begin{tabular}{cc}
\includegraphics[width=0.474\linewidth]{images/LargeAMR} &
\includegraphics[width=0.426\linewidth]{images/Crossflow} \\
\parbox[t]{0.474\linewidth}{\footnotesize A CTH simulation that
generates AMR data. We have used ParaView to visualize CTH
simulation AMR data comprising billions of cells, hundreds of thousands
of blocks, and eleven levels of hierarchy (not shown).} &
\parbox[t]{0.426\linewidth}{\footnotesize A PHASTA simulation of 3.3
billion tetrahedral cells involving the flow over a full wing where a
synthetic jet issues an unsteady crossflow jet.}
\end{tabular}
\begin{tabular}{cc}
\includegraphics[width=0.383\linewidth]{images/Crossflow} &
\includegraphics[width=0.517\linewidth]{images/WingWake}
\end{tabular}
\parbox{0.95\linewidth}{\footnotesize ParaView visualizations run in situ
with large-scale PHASTA simulations. On the left is a mesh of 3.3 billion
tetrahedral cells simulating the flow over a full wing where a synthetic
jet issues an unsteady crossflow jet (run on 160 thousand MPI
processes). On the right is a 1.3 billion element mesh simulating the
wake of a deflected wing flap (run on 256 thousand MPI processes).
Images courtesy of Michel Rasquin, Argonne National Laboratory.}
\end{inlinefig}
In this section we discuss visualizing large meshes like these using the
parallel visualization capabilities of ParaView. This section is less
``hands-on'' than the previous section. Instead, you will learn the
conceptual knowledge needed to perform large parallel visualization. We
present the basic ParaView architecture and parallel algorithms and
demonstrate how to apply this knowledge.
\section{ParaView Architecture}
ParaView is designed as a three-tier client-server architecture. The three
logical units of ParaView are as follows.
\index{ParaView Server}
\begin{description}
\item[Data Server] \index{data server} The unit responsible for data
reading, filtering, and writing. All of the pipeline objects seen in the
pipeline browser are contained in the data server. The data server can
be parallel.
\item[Render Server] \index{render server}The unit responsible for
rendering. The render server can also be parallel, in which case built-in
parallel rendering is also enabled.
\item[Client] \index{client}The unit responsible for establishing
visualization. The client controls the object creation, execution, and
destruction in the servers, but does not contain any of the data (thus
allowing the servers to scale without bottlenecking on the client). If
there is a GUI, that is also in the client. The client is always a
serial application.
\end{description}
These logical units need not be physically separated. Logical units are
often embedded in the same application, removing the need for any
communication between them. There are three modes in which you can run
ParaView.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/RunModeStandalone}
\end{inlinefig}
The first mode, which you are already familiar with, is
\keyterm{standalone} mode. In standalone mode, the client, data server,
and render server are all combined into a single serial application. When
you run the \progname{paraview} application, you are automatically connected
to a \keyterm{builtin} server so that you are ready to use the full
features of ParaView.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/RunModeClientServer}
\end{inlinefig}
The second mode is \keyterm{client-server} mode. In client-server mode,
you execute the \progname{pvserver} program on a parallel machine and
connect to it with the \progname{paraview} client application. The
\progname{pvserver} program has both the data server and render server
embedded in it, so both data processing and rendering take place there.
The client and server are connected via a socket, which is assumed to be a
relatively slow mode of communication, so data transfer over this socket is
minimized.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/RunModeClientRenderDataServer}
\end{inlinefig}
The third mode is \keyterm{client--render-server--data-server} mode. In this
mode, all three logical units are running in separate programs. As before,
the client is connected to the render server via a single socket
connection. The render server and data server are connected by many socket
connections, one for each process in the render server. Data transfer over
the sockets is minimized.
Although the client--render-server--data-server mode is supported, we almost
never recommend using it. The original intention of this mode is to take
advantage of heterogeneous environments where one might have a large,
powerful computational platform and a second smaller parallel machine with
graphics hardware in it. However, in practice we find any benefit is
almost always outstripped by the time it takes to move geometry from the
data server to the render server. If the computational platform is much
bigger than the graphics cluster, then use software rendering on the large
computational platform. If the two platforms are about the same size, just
perform all the computation on the graphics cluster.
\section{Setting up a ParaView Server}
Setting up standalone ParaView is usually trivial. You can download a
pre-compiled binary, install it on your computer, and go. Setting up a
ParaView server, however, is intrinsically harder. First, you will have to
compile the server yourself. Because there are so many versions of MPI,
the library that makes parallel programming possible, and each version of
MPI may be altered to match the communication hardware of a parallel
computer, it is impossible to reliably provide binary files to match every
possible combination.
To compile ParaView on a parallel machine, you will need the following.
\begin{itemize}
\item CMake cross-platform build setup tool
(\href{http://www.cmake.org}{www.cmake.org})
\item MPI
\item OpenGL (or use Mesa 3D \href{http://www.mesa3d.org}{www.mesa3d.org}
if otherwise unavailable)
\item Qt 4.7 (optional)
\item Python with NumPy and Matplotlib (optional)
\end{itemize}
Compiling without one of the optional libraries means that the
corresponding feature will not be available. Compiling without Qt means
that you will not have the GUI application, and compiling without Python
means that you will not have scripting available.
To compile ParaView, you first run CMake, which will allow you to set up
compilation parameters and point to libraries on your system. This will
create the make files that you then use to build ParaView. For more
details on building a ParaView server, see the ParaView Wiki.
{
\footnotesize
\href{http://www.paraview.org/Wiki/Setting_up_a_ParaView_Server#Compiling}{http://www.paraview.org/Wiki/Setting\_up\_a\_ParaView\_Server\#Compiling}
}
Running ParaView in parallel is also intrinsically more difficult than
running the standalone client. It typically involves a number of steps
that change depending on the hardware you are running on: logging in to
remote computers, allocating parallel nodes, launching a parallel program,
establishing connections, and tunneling through firewalls.
Client-server connections are established through the \texttt{paraview}
client application. You connect to servers and disconnect from servers
with the \connect and \disconnect buttons. When ParaView starts, it
automatically connects to the builtin server. It also connects to
builtin whenever it disconnects~\disconnect from a server.
When you hit the \connect button, ParaView presents you with a dialog box
containing a list of known servers you may connect to. This list of
servers can be both site- and user-specific.
\begin{inlinefig}
\includegraphics[width=.75\scw]{images/ChooseServer}
\end{inlinefig}
You can specify how to connect to a server either through the GUI by
pressing the \gui{Add Server} button or through an XML definition file.
There are several options for specifying server connections, but ultimately
you are giving ParaView a command to launch the server and a host to
connect to after it is launched. Consult the ParaView Wiki for more
information on establishing server connections.
{
\footnotesize
\href{http://www.paraview.org/Wiki/Setting_up_a_ParaView_Server#Running_the_Server}{http://www.paraview.org/Wiki/Setting\_up\_a\_ParaView\_Server\#Running\_the\_Server}
}
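As a rough sketch of what launching a server can look like, the following
assumes an MPI launcher named \texttt{mpirun} and a server started on eight
processes. Both are assumptions for illustration; your site may use a
different launcher (for example \texttt{mpiexec} or \texttt{srun}), a batch
scheduler, or a different port.

\begin{verbatim}
mpirun -np 8 pvserver --server-port=11111
\end{verbatim}

Once the server reports that it is waiting for a client, you would connect
to the corresponding host and port with the \connect button. Port 11111 is
the default used by \progname{pvserver}, so the option is shown here only to
make the port explicit.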
\section{Parallel Visualization Algorithms}
We are fortunate in that once you have a parallel framework, performing
parallel visualization tasks is straightforward. The data we deal with is
contained in a mesh, which means the data is already broken into little
pieces by the cells. We can do visualization on a distributed parallel
machine by first dividing the cells among the processes. For
demonstrative purposes, consider this very simplified mesh.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleMesh}
\end{inlinefig}
Now let us say we want to perform visualizations on this mesh using three
processes. We can divide the cells of the mesh as shown below with the
blue, yellow, and pink regions.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExamplePartitions}
\end{inlinefig}
Once partitioned, some visualization algorithms will work by simply
allowing each process to independently run the algorithm on its local
collection of cells. For example, take clipping (which is demonstrated in
multiple exercises including \ref{ex:UsingMultipleViews}). Let us say that
we define a clipping plane and give that same plane to each of the
processes.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleClip1}
\end{inlinefig}
Each process can independently clip its cells with this plane. The end
result is the same as if we had done the clipping serially. If we were to
bring the cells together (which we would never actually do for large data
for obvious reasons) we would see that the clipping operation took place
correctly.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleClip2}
\end{inlinefig}
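One way to state why this works, using notation introduced here only for
illustration: let $C$ be the set of all cells, let $C_1, \ldots, C_n$ be the
partitions held by the $n$ processes, and let $\mathrm{clip}(c)$ be the
result of clipping a single cell $c$ with the plane. Because the result for
each cell depends only on that cell and the plane,
\[
  \bigcup_{c \in C} \mathrm{clip}(c)
  \;=\;
  \bigcup_{p=1}^{n} \; \bigcup_{c \in C_p} \mathrm{clip}(c) ,
\]
so the union of the per-process results is the same as the serial result.
Any filter whose output for a cell depends only on that cell parallelizes in
this way.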
\section{Ghost Levels}
Unfortunately, blindly running visualization algorithms on partitions of
cells does not always result in the correct answer. As a simple example,
consider the \keyterm{external faces} algorithm. The external faces
algorithm finds all cell faces that belong to only one cell, thereby
identifying the boundaries of the mesh. What happens when we run external
faces independently on our partitions?
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleExternalFaces1}
\end{inlinefig}
Oops. We see that when all the processes ran the external faces algorithm
independently, many internal faces were incorrectly identified as being
external. This happens where a cell in one partition has a neighbor in
another partition. A process has no access to cells in other partitions,
so there is no way of knowing that these neighboring cells exist.
The solution employed by ParaView and other parallel visualization systems
is to use \keyterm{ghost cells} (sometimes also called
\keyterm{halo regions}). Ghost cells are cells that are held in one
process but actually belong to another. To use ghost cells, we first have
to identify all the neighboring cells in each partition. We then copy
these neighboring cells to the partition and mark them as ghost cells, as
indicated with the gray colored cells in the following example.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleExternalFaces2}
\end{inlinefig}
When we run the external faces algorithm with the ghost cells, we see that
we are still incorrectly identifying some internal faces as external.
However, all of these misclassified faces are on ghost cells, and each face
inherits the ghost status of the cell it came from. ParaView then strips
off the ghost faces and we are left with the correct answer.
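The role of the ghost cells can be written out as follows. The notation is
ours and serves only as a sketch. Let $C_p$ be the cells owned by process
$p$, let $G_p$ be its ghost cells, and for a face $f$ and a set of cells $S$
let $n_S(f)$ be the number of cells in $S$ that contain $f$. The serial
algorithm keeps the faces with $n_C(f) = 1$, where $C$ is the whole mesh.
Without ghost cells, process $p$ keeps the faces with
\[ n_{C_p}(f) = 1 , \]
which wrongly includes every face shared between a cell in $C_p$ and a cell
in another partition, since the local count is 1 while the true count is 2.
With ghost cells, process $p$ keeps only the faces that satisfy
\[
  n_{C_p \cup G_p}(f) = 1
  \quad \mbox{and} \quad
  f \mbox{ belongs to a cell in } C_p .
\]
If $G_p$ contains every neighbor of the cells in $C_p$, then
$n_{C_p \cup G_p}(f) = n_C(f)$ for every face of a cell in $C_p$, so the
union of the per-process results matches the serial result.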
In this example we have shown one layer of ghost cells: only those cells
that are direct neighbors of the partition's cells. ParaView also has the
ability to retrieve multiple layers of ghost cells, where each layer
contains the neighbors of the previous layer not already contained in a
lower ghost layer or the original data itself. This is useful when we have
cascading filters that each require their own layer of ghost cells. They
each request an additional layer of ghost cells from upstream, and then
remove a layer from the data before sending it downstream.
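The layers can be defined recursively. As a sketch in our own notation, let
$N(S)$ be the set of cells that are neighbors of at least one cell in $S$,
and let $C_p$ be the cells owned by process $p$. Then
\[
  G_p^{1} = N(C_p) \setminus C_p ,
  \qquad
  G_p^{k} = N\bigl(G_p^{k-1}\bigr) \setminus
            \bigl( C_p \cup G_p^{1} \cup \cdots \cup G_p^{k-1} \bigr)
  \quad \mbox{for } k > 1 .
\]
For a chain of $m$ filters that each consume one layer, the first filter
receives $m$ layers and passes $m-1$ downstream, the second receives $m-1$
and passes $m-2$, and so on, until the output of the last filter has no
ghost cells left. What counts as a neighbor (sharing a face, an edge, or a
point) is left open here.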
\section{Data Partitioning}
Since we are breaking up and distributing our data, it is prudent to
address the ramifications of how we partition the data. The data shown in
the previous example has a \keyterm{spatially coherent} partitioning. That
is, all the cells of each partition are located in a compact region of
space. There are other ways to partition data. For example, you could
have a random partitioning.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleRandomPartition1}
\end{inlinefig}
Random partitioning has some nice features. It is easy to create and is
friendly to load balancing. However, a serious problem exists with respect
to ghost cells.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelExampleRandomPartition2}
\end{inlinefig}
In this example, we see that a single level of ghost cells nearly
replicates the entire data set on all processes. We have thus removed any
advantage we had with parallel processing. Because ghost cells are used so
frequently, random partitioning is not used in ParaView.
\section{D3 Filter}
The previous sections described the importance of load balancing and ghost
levels for parallel visualization. This section describes how to achieve
both.
Load balancing and ghost cells are handled automatically by ParaView when
you are reading structured data (image data, rectilinear grid, and
structured grid). The implicit topology makes it easy to break the data
into spatially coherent chunks and identify where neighboring cells are
located.
It is an entirely different matter when you are reading in unstructured
data (poly data and unstructured grid). There is no implicit topology and
no neighborhood information available. ParaView is at the mercy of how the
data was written to disk. Thus, when you read in unstructured data there
is no guarantee about how well load balanced your data will be. It is also
unlikely that the data will have ghost cells available, which means that
the output of some filters may be incorrect.
Fortunately, ParaView has a filter that will both balance your unstructured
data and create ghost cells. This filter is called D3, which is short for
distributed data decomposition. Using D3 is easy; simply attach the filter
(located in \gui{Filters} \ra \gui{Alphabetical} \ra \gui{D3}) to whatever
data you wish to repartition.
\begin{inlinefig}
\includegraphics[height=.3\linewidth]{images/D3ExampleBefore}
\includegraphics[height=.3\linewidth]{images/D3ExampleAfter}
\end{inlinefig}
The most common use case for D3 is to attach it directly to your
unstructured grid reader. Regardless of how well load balanced the incoming
data might be, it is important to be able to retrieve ghost cells so that
subsequent filters will generate the correct data. The example above shows
a cutaway of the extract surface filter on an unstructured grid. On the
left we see that there are many faces improperly extracted because we are
missing ghost cells. On the right the problem is fixed by first using the
D3 filter.
\section{Matching Job Size to Data Size}
\emph{How many cores should I have in my ParaView server?} This is a
common question with many important ramifications. It is also an
enormously difficult question. The answer depends on a wide variety of
factors including what hardware each processor has, how much data is being
processed, what type of data is being processed, what type of visualization
operations are being done, and your own patience.
Consequently, we have no hard answer. We do, however, have several rules of thumb.
\textbf{If you are loading structured data} (image data, rectilinear grid,
structured grid), try to have a minimum of one core per 20 million
cells. If you can spare the cores, one core for every 5 to 10
million cells is usually plenty.
\textbf{If you are loading unstructured data} (poly data, unstructured
grid), try to have a minimum of one core per 1 million cells. If you
can spare the cores, one core for every 250 to 500 thousand cells
is usually plenty.
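As a worked example of these rules of thumb, take a hypothetical data set of
$N = 10^{9}$ cells (the size is chosen only for illustration). For
structured data the suggested minimum is
\[ \frac{10^{9}}{20 \times 10^{6}} = 50 \mbox{ cores}, \]
and the comfortable range is
\[
  \frac{10^{9}}{10 \times 10^{6}} = 100
  \quad \mbox{to} \quad
  \frac{10^{9}}{5 \times 10^{6}} = 200 \mbox{ cores}.
\]
For unstructured data the suggested minimum is
\[ \frac{10^{9}}{10^{6}} = 1000 \mbox{ cores}, \]
and the comfortable range is
\[
  \frac{10^{9}}{500 \times 10^{3}} = 2000
  \quad \mbox{to} \quad
  \frac{10^{9}}{250 \times 10^{3}} = 4000 \mbox{ cores}.
\]
The same mesh therefore calls for roughly twenty times as many cores when it
is stored as unstructured data.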
As stated before, these are just rules of thumb, not absolutes. You should
always experiment to gauge what your ratio of cores to data size should
be. And, of course, there will always be times when the data you want to
load will stretch the limit of the resources you have available. When this
happens, you will want to make sure that you avoid data explosion and that
you cull your data quickly.
\section{Avoiding Data Explosion}
\label{sec:AvoidingDataExplosion}
The pipeline model that ParaView presents is very convenient for
exploratory visualization. The loose coupling between components provides
a very flexible framework for building unique visualizations, and the
pipeline structure allows you to tweak parameters quickly and easily.
The downside of this coupling is that it can have a larger memory
footprint. Each stage of this pipeline maintains its own copy of the data.
Whenever possible, ParaView performs \keyterm{shallow copies} of the data
so that different stages of the pipeline point to the same block of data in
memory. However, any filter that creates new data or changes the values or
topology of the data must allocate new memory for the result. If ParaView
is filtering a very large mesh, inappropriate use of filters can quickly
deplete all available memory. Therefore, when visualizing large data sets,
it is important to understand the memory requirements of filters.
Please keep in mind that the following advice is intended \emph{only for
when dealing with very large amounts of data and the remaining available
memory is low}. When you are not in danger of running out of memory,
ignore all of the following advice.
When dealing with structured data, it is critically important to know which
filters will change the data to unstructured. Unstructured data has a much
higher memory footprint, per cell, than structured data because the
topology must be explicitly written out. There are many filters in
ParaView that will change the topology in some way, and these filters will
write out the data as an unstructured grid, because that is the only data
set that will handle any type of topology that is generated. The following
filters write out a new unstructured topology in their output
that is roughly equivalent to the input. These filters should \emph{never}
be used with structured data and should be used with caution on
unstructured data.
%TODO: there are surely more filters in each category now
\ifthenelse{\boolean{savetrees}}{\noindent\begin{minipage}{\linewidth}}{}
\begin{multicols}{2}
\begin{itemize}
\item \gui{Append Datasets}
\item \gui{Append Geometry}
\item \gui{Clean}
\item \gui{Clean to Grid}
\item \gui{Connectivity}
\item \gui{D3}
\item \gui{Delaunay 2D/3D}
\item \gui{Extract Edges}
\item \gui{Linear Extrusion}
\item \gui{Loop Subdivision}
\item \gui{Reflect}
\item \gui{Rotational Extrusion}
\item \gui{Shrink}
\item \gui{Smooth}
\item \gui{Subdivide}
\item \gui{Tessellate}
\item \gui{Tetrahedralize}
\item \gui{Triangle Strips}
\item \gui{Triangulate}
\end{itemize}
\end{multicols}
\ifthenelse{\boolean{savetrees}}{\end{minipage}}{}
Technically, the \gui{Ribbon} and \gui{Tube} filters should fall into this
list. However, as they only work on 1D cells in poly data, the input data
is usually small and of little concern.
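To get a feel for the difference in footprint between structured and
unstructured data, consider a back-of-the-envelope estimate. The numbers
below are assumptions chosen for illustration (hexahedral cells, 64-bit
point indices, double-precision coordinates, and roughly one point per
cell); actual figures depend on the cell types, the build options, and the
ParaView version. Uniform image data needs no per-cell topology or geometry
at all, since point positions follow from the origin, spacing, and extents.
An unstructured grid of hexahedra must store, per cell, about
\[
  \underbrace{8 \times 8}_{\mbox{\scriptsize connectivity}}
  + \underbrace{3 \times 8}_{\mbox{\scriptsize point coordinates}}
  = 88 \mbox{ bytes}
\]
before any field data is counted. By comparison, a single double-precision
scalar field costs 8 bytes per cell, so under these assumptions the explicit
topology and geometry alone are about eleven times the size of one field
array.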
This similar set of filters also outputs unstructured grids, but the filters
tend to reduce some of the data. Be aware though that this data reduction
is often smaller than the overhead of converting to unstructured data.
Also note that the reduction is often not well balanced. It is possible
(often likely) that a single process may not lose any cells. Thus, these
filters should be used with caution on unstructured data and extreme
caution on structured data.
\ifthenelse{\boolean{savetrees}}{\noindent\begin{minipage}{\linewidth}}{}
\begin{multicols}{2}
\begin{itemize}
\item \gui{Clip}~\clip
\item \gui{Decimate}
\item \gui{Extract Cells by Region}
\item \gui{Extract Selection}~\extractSelection
\item \gui{Quadric Clustering}
\item \gui{Threshold}~\threshold
\end{itemize}
\end{multicols}
\ifthenelse{\boolean{savetrees}}{\end{minipage}}{}
Similar to the items in the preceding list, \gui{Extract
Subset}~\extractSubset performs data
reduction on a structured data set, but also outputs a structured data set.
So the warning about creating new data still applies, but you do not have
to worry about converting to an unstructured grid.
This next set of filters also outputs unstructured data, but it also
performs a reduction on the dimension of the data (for example 3D to 2D),
which results in a much smaller output. Thus, these filters are usually
safe to use with unstructured data and require only mild caution with
structured data.
\ifthenelse{\boolean{savetrees}}{\noindent\begin{minipage}{\linewidth}}{}
\begin{multicols}{2}
\begin{itemize}
\item \gui{Cell Centers}
\item \gui{Contour}~\contour
\item \gui{Extract CTH Fragments}
\item \gui{Extract CTH Parts}
\item \gui{Extract Surface}
\item \gui{Feature Edges}
\item \gui{Mask Points}
\item \gui{Outline (curvilinear)}
\item \gui{Slice}~\slice
\item \gui{Stream Tracer}~\streamTracer
\end{itemize}
\end{multicols}
\ifthenelse{\boolean{savetrees}}{\end{minipage}}{}
These filters do not change the connectivity of the data at all. Instead,
they only add field arrays to the data. All the existing data is shallow
copied. These filters are usually safe to use on all data.
\ifthenelse{\boolean{savetrees}}{\noindent\begin{minipage}{\linewidth}}{}
\begin{multicols}{2}
\begin{itemize}
\item \gui{Block Scalars}
\item \gui{Calculator}~\calculator
\item \gui{Cell Data to Point Data}
\item \gui{Curvature}
\item \gui{Elevation}
\item \gui{Generate Surface Normals}
\item \gui{Gradient}
\item \gui{Level Scalars}
\item \gui{Median}
\item \gui{Mesh Quality}
\item \gui{Octree Depth Limit}
\item \gui{Octree Depth Scalars}
\item \gui{Point Data to Cell Data}
\item \gui{Process Id Scalars}
\item \gui{Python Calculator}
\item \gui{Random Vectors}
\item \gui{Resample with dataset}
\item \gui{Surface Flow}
\item \gui{Surface Vectors}
\item \gui{Texture Map to...}
\item \gui{Transform}
\item \gui{Warp (scalar)}
\item \gui{Warp (vector)}~\warp
\end{itemize}
\end{multicols}
\ifthenelse{\boolean{savetrees}}{\end{minipage}}{}
This final set of filters either adds no data to the output
(all data of consequence is shallow copied) or adds data that is
generally independent of the size of the input. These filters are almost
always safe to add under any circumstances (although they may take a lot of
time).
\ifthenelse{\boolean{savetrees}}{\noindent\begin{minipage}{\linewidth}}{}
\begin{multicols}{2}
\begin{itemize}
\item \gui{Annotate Time}
\item \gui{Append Attributes}
\item \gui{Extract Block}
\item \gui{Extract Datasets}
\item \gui{Extract Level}~\extractGroup
\item \gui{Glyph}~\glyph
\item \gui{Group Datasets}~\group
\item \gui{Histogram}~\histogram
\item \gui{Integrate Variables}
\item \gui{Normal Glyphs}
\item \gui{Outline}
\item \gui{Outline Corners}
\item \gui{Plot Global Variables Over Time}
\item \gui{Plot Over Line}~\plotOverLine
\item \gui{Plot Selection Over Time}~\plotSelectionOverTime
\item \gui{Probe Location}~\probe
\item \gui{Temporal Shift Scale}
\item \gui{Temporal Snap-to-Time-Steps}
\item \gui{Temporal Statistics}
\end{itemize}
\end{multicols}
\ifthenelse{\boolean{savetrees}}{\end{minipage}}{}
There are a few special case filters that do not fit well into any of the
previous classes. Some of the filters, currently \gui{Temporal
Interpolator} and \gui{Particle Tracer}, perform calculations based on
how data changes over time. Thus, these filters may need to load data for
two or more instances of time, which can double (or more) the amount of data
needed in memory. The \gui{Temporal Cache} filter will also hold data for
multiple instances of time. Also keep in mind that some of the temporal
filters, such as \gui{Temporal Statistics} and the filters that plot over
time, may need to iteratively load all data from disk. Thus, they may take
an impractically long amount of time even though they do not require any
extra memory.
The \gui{Programmable Filter}~\icon{pqProgrammableFilter24} is also a
special case that is impossible to classify. Since this filter does
whatever it is programmed to do, it can fall into any one of these
categories.
\section{Culling Data}
\label{sec:CullingData}
When dealing with large data, it is clearly best to cull out data whenever
possible, and the earlier the better. Most large data starts as 3D
geometry and the desired geometry is often a surface. As surfaces usually
have a much smaller memory footprint than the volumes that they are derived
from, it is best to convert to a surface as early as possible. Once you do
that, you can apply other filters in relative safety.
A very common visualization operation is to extract isosurfaces from a
volume using the \gui{Contour}~\contour filter. The \gui{Contour} filter
usually outputs geometry much smaller than its input. Thus, the
\gui{Contour} filter should be applied early if it is to be used at all.
Be careful when setting up the parameters to the \gui{Contour} filter
because it still is possible for it to generate a lot of data. This
obviously can happen if you specify many isosurface values. High
frequencies such as noise around an isosurface value can also cause a
large, irregular surface to form.
Another way to peer inside of a volume is to perform a \gui{Slice}~\slice
on it. The \gui{Slice}~\slice filter will intersect a volume with a plane
and allow you to see the data in the volume where the plane intersects. If
you know the relative location of an interesting feature in your large data
set, slicing is a good way to view it.
If you have little \emph{a priori} knowledge of your data and would like to
explore the data without paying the memory and processing time for the full
data set, you can use the \gui{Extract Subset}~\extractSubset filter to
subsample the data. The subsampled data can be dramatically smaller than
the original data and should still be well load balanced. Of course, be
aware that you may miss small features if the subsampling steps over them
and that once you find a feature you should go back and visualize it with
the full data set.
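The savings grow quickly with the sampling rate. As a sketch, suppose a
structured data set has $n$ points along each of its three axes and is
subsampled by keeping every $s$-th point along each axis (the values of $n$
and $s$ used here are hypothetical). The number of points kept along one
axis is
\[ \left\lfloor \frac{n-1}{s} \right\rfloor + 1 , \]
so for $n = 1000$ and $s = 4$ each axis keeps 250 points, and the output has
\[ 250^{3} = 15\,625\,000 \]
points instead of $1000^{3} = 10^{9}$, a reduction by a factor of
$4^{3} = 64$.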
There are also several filters that can pull out a subset of a volume:
\gui{Clip}~\clip, \gui{Threshold}~\threshold, \gui{Extract Selection}, and
\gui{Extract Subset}~\extractSubset can all extract cells based on some
criterion. Be aware, however, that the extracted cells are almost never
well balanced; expect some processes to have no cells removed. Also, all
of these filters with the exception of \gui{Extract Subset}~\extractSubset
will convert structured data types to unstructured grids. Therefore, they
should not be used unless the extracted cells are at least an order of
magnitude smaller than the source data.
When possible, replace the use of a filter that extracts 3D data with one
that will extract 2D surfaces. For example, if you are interested in a
plane through the data, use the \gui{Slice}~\slice filter rather than the
\gui{Clip}~\clip filter. If you are interested in knowing the location of
a region of cells containing a particular range of values, consider using
the \gui{Contour}~\contour filter to generate surfaces at the ends of the
range rather than extract all of the cells with the
\gui{Threshold}~\threshold filter. Be aware that substituting filters can
have an effect on downstream filters. For example, running the
\gui{Histogram}~\histogram filter after
\gui{Threshold}~\threshold will have an entirely different effect than
running it after the roughly equivalent \gui{Contour}~\contour filter.
\section{Keeping Track of Memory}
\index{memory inspector|(}
When working with very large models, it is important to keep track of
memory usage on your computer. One of the most common and frustrating
problems encountered with large models is running out of memory. This in
turn will lead to thrashing in the virtual memory system or an outright
program fault.
Sections \ref{sec:AvoidingDataExplosion} and \ref{sec:CullingData} provide
suggestions to reduce your memory usage. Even so, it is wise to keep an eye
on the memory available in your system. ParaView provides a tool called the
\keyterm{memory inspector} designed to do just that.
\begin{inlinefig}
\includegraphics[width=\scw]{images/MemoryInspector}
\end{inlinefig}
To access the memory inspector, select in the menu bar \gui{View} \ra
\gui{Memory Inspector}. The memory inspector provides information for both
the client you are running on and any server you might be connected to. It
will tell you the total amount of memory used on the system and the amount
of memory ParaView is using. For servers containing multiple nodes,
information both for the conglomerate job and for each individual node is
given. Note that a memory issue in any single node can cause a problem for
the entire ParaView job.
\index{memory inspector|)}
\section{Rendering}
\index{rendering|(}
Rendering is the process of synthesizing the images that you see based on
your data. The ability to effectively interact with your data depends
highly on the speed of the rendering. Thanks to advances in 3D hardware
acceleration, fueled by the computer gaming market, we have the ability to
render 3D quickly even on moderately priced computers. But, of course, the
time required to render grows with the amount of data being rendered.
As data gets bigger, the rendering process naturally gets slower.
\index{rendering!interactive|see{interactive render}}
\index{rendering!still|see{still render}}
To ensure that your visualization session remains interactive, ParaView
supports two modes of rendering that are automatically flipped as
necessary. In the first mode, \keyterm{still render}, the data is rendered
at the highest level of detail. This rendering mode ensures that all of
the data is represented accurately. In the second mode,
\keyterm{interactive render}, speed takes precedence over accuracy. This
rendering mode endeavors to provide a quick rendering rate regardless of
data size.
While you are interacting with a 3D view, for example rotating, panning, or
zooming with the mouse, ParaView uses an interactive render. A high frame
rate is necessary to make these operations usable, and because each frame is
immediately replaced by a new rendering, fine details are less important
during interaction. Whenever interaction with the 3D view is not taking
place, ParaView uses a still render so that the full detail of the data is
available as you study it. As you drag your mouse in a 3D view
to move the data, you may see an approximate rendering while you are moving
the mouse, but the full detail will be presented as soon as you release the
mouse button.
The interactive render is a compromise between speed and accuracy. As
such, many of the rendering parameters concern when and how lower levels of
detail are used.
\subsection{Basic Rendering Settings}
\label{sec:BasicRenderingSettings}
Some of the most important rendering options are the LOD parameters.
During interactive rendering, the geometry may be replaced with a lower
\keyterm{level of detail} (\keyterm{LOD}), an approximate geometry with
fewer polygons.
\begin{inlinefig}
\includegraphics[width=.28\linewidth]{images/GeometricLODFull}
\includegraphics[width=.28\linewidth]{images/GeometricLOD50}
\includegraphics[width=.28\linewidth]{images/GeometricLOD10}
\end{inlinefig}
The resolution of the geometric approximation can be controlled. In the
preceding images, the left image is the full resolution, the middle image
is the default decimation for interactive rendering, and the right image is
ParaView's maximum decimation setting.
The 3D rendering parameters are located in the settings dialog box which is
accessed in the menu from \gui{Edit} \ra \gui{Settings} (\gui{ParaView} \ra
\gui{Preferences} on the Mac). The rendering options in the dialog
are in the \gui{Render View} tab.
\begin{inlinefig}
\includegraphics[width=0.8\scw]{images/SettingsRendering}
\end{inlinefig}
The options pertaining to the geometric decimation for interactive
rendering are located in a section labeled \gui{Interactive Rendering
Options}. Some of these options are considered advanced, so to access
them you have to either toggle on the advanced options with the
\icon{pqAdvanced26} button or search for the option using the edit box at
the top of the dialog. The interactive rendering options include the
following.
\begin{itemize}
\item \index{LOD Threshold} Set the data size at which to use a decimated
geometry in interactive rendering. If the geometry size is under this
threshold, ParaView always renders the full geometry. Increase this value
if you have a decent graphics card that can handle larger data. Try
decreasing this value if your interactive renders are too slow.
\item \index{LOD Resolution} Set the factor that controls how large the
decimated geometry should be. This control is set to a value between 0
and 1. 0 produces a very small number of triangles but possibly with a
lot of distortion. 1 produces more detailed surfaces but with larger
geometry. \icon{pqAdvanced26}
\item \index{interactive render!delay} Add a delay between an interactive
render and a still render. ParaView usually performs a still render
immediately after an interactive motion is finished (for example,
releasing the mouse button after a rotation). This option can add a delay
that can give you time to start a second interaction before the still
render starts, which is helpful if the still render takes a long time to
complete. \icon{pqAdvanced26}
\item \index{interactive render!outline} Use an outline in place of
decimated geometry. The outline is an alternative for when the geometry
decimation takes too long or still produces too much geometry. However, it
is more difficult to interact with just an outline.
\end{itemize}
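
The way the threshold and outline options interact can be summarized in a short Python sketch. This is only an illustration of the behavior described in the list above, not ParaView's implementation: the function name, the use of megabytes, and the handling of a size exactly equal to the threshold are assumptions.

```python
def interactive_geometry(geometry_mb, lod_threshold_mb, use_outline=False):
    """Pick what to draw during an interactive render (illustrative only)."""
    # Geometry under the LOD threshold is always drawn at full resolution.
    if geometry_mb < lod_threshold_mb:
        return "full"
    # Otherwise a cheaper stand-in is drawn: an outline if requested,
    # or a decimated version of the geometry.
    return "outline" if use_outline else "decimated"


# Hypothetical sizes and threshold, in megabytes.
choices = [
    interactive_geometry(5, 20),
    interactive_geometry(500, 20),
    interactive_geometry(500, 20, use_outline=True),
]
```

Raising the threshold moves more data sets into the first branch, which is why a capable graphics card justifies a larger value.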
ParaView contains many more rendering settings. Here is a summary of some
other settings that can affect the rendering performance regardless of
whether ParaView is run in client-server mode or not. These options are
spread among several categories, and several are considered advanced.
\begin{description}
\item[\gui{Geometry Mapper Options}]~
\begin{itemize}
\item \index{immediate mode rendering} \index{display lists} Enable or
disable the use of display lists. Display lists are internal structures
built by graphics systems. They can potentially speed up rendering but
can also take up memory.
\end{itemize}
\item[\gui{Translucent Rendering Options}]~
\begin{itemize}
\item \index{depth peeling} Enable or disable depth peeling. Depth
peeling is a technique ParaView uses to properly render translucent
surfaces. With it, the top surface is rendered and then ``peeled away''
so that the next lower surface can be rendered and so on. If you find
that making surfaces transparent really slows things down or renders
completely incorrectly, then your graphics hardware may not be
implementing the depth peeling extensions well; try shutting off depth
peeling. \icon{pqAdvanced26}
  \item Set the maximum number of peels to use with depth peeling. Using
    more peels allows more depth complexity, but using fewer peels runs
    faster. You can try adjusting this parameter if translucent geometry
    renders too slowly or translucent images do not look correct.
    \icon{pqAdvanced26}
\end{itemize}
\item[\gui{Miscellaneous}]~
\begin{itemize}
  \item When creating very large datasets, default to the outline
    representation. Surface representations usually require ParaView to
    extract geometry of the surface, which takes time and memory. For data
    with size above this threshold, ParaView instead defaults to the
    outline representation, which has very little overhead.
\item \index{rendering!performance} Show or hide annotation providing
rendering performance information. This information is handy when
diagnosing performance problems. \icon{pqAdvanced26}
\end{itemize}
\end{description}
Note that this is not a complete list of ParaView rendering settings. We
have left out settings that do not significantly affect rendering
performance. We have also left out settings that are only valid for
parallel client-server rendering, which are discussed in
Section~\ref{sec:ParallelRenderParameters}.
\subsection{Basic Parallel Rendering}
\index{rendering!parallel|(}
When performing parallel visualization, we are careful to ensure that the
data remains partitioned among all of the processes up to and including
the rendering processes. ParaView uses a parallel rendering library called
\keyterm{IceT}. IceT uses a \keyterm{sort-last} algorithm for parallel
rendering. This parallel rendering algorithm has each process
independently render its partition of the geometry and then
\keyterm{composites} the partial images together to form the final image.
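
The compositing step can be sketched in a few lines of Python. The sketch assumes opaque geometry, so that the partial images can be merged by comparing a depth value per pixel; the tiny four-pixel images and the color names are hypothetical, and the real algorithms in IceT are considerably more elaborate than this serial loop.

```python
import math


def composite(partials):
    """Merge per-process (color, depth) images, keeping the nearest fragment."""
    num_pixels = len(partials[0][0])
    color = [None] * num_pixels
    depth = [math.inf] * num_pixels
    for part_color, part_depth in partials:
        for i in range(num_pixels):
            # A smaller depth means the fragment is closer to the viewer.
            if part_depth[i] < depth[i]:
                depth[i] = part_depth[i]
                color[i] = part_color[i]
    return color, depth


inf = math.inf
# Each process rendered only its own partition; empty pixels have depth inf.
proc0 = (["red", "red", None, None], [0.2, 0.6, inf, inf])
proc1 = ([None, "blue", "blue", None], [inf, 0.3, 0.5, inf])

final_color, final_depth = composite([proc0, proc1])
```

Note that the inner loop visits every pixel regardless of how much geometry each process rendered, which hints at why the cost of compositing depends on image size rather than data size.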
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelRendering}
\end{inlinefig}
The preceding diagram is an oversimplification. IceT contains multiple
parallel image compositing algorithms such as \keyterm{binary tree},
\keyterm{binary swap}, and \keyterm{radix-k} that efficiently divide work
among processes using multiple phases.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelRenderingDetail}
\end{inlinefig}
The wonderful thing about sort-last parallel rendering is that its
efficiency is completely insensitive to the amount of data being rendered.
This makes it a very scalable algorithm and well suited to large data.
However, the parallel rendering overhead does increase linearly with the
number of pixels in the image. Consequently, some of the rendering
parameters deal with the image size.
\begin{inlinefig}
\includegraphics[scale=\bbscale]{images/ParallelRenderingTiles}
\end{inlinefig}
IceT also has the ability to drive tiled displays, large, high-resolution
displays comprising an array of monitors or projectors. Using a sort-last
algorithm on a tiled display is a bit counterintuitive because the number
of pixels to composite is so large. However, IceT is designed to take
advantage of spatial locality in the data on each process to drastically
reduce the amount of compositing necessary. This spatial locality can be
enforced by applying the \gui{D3} filter to your data.
Because there is an overhead associated with parallel rendering, ParaView
has the ability to turn off parallel rendering at any time. When parallel
rendering is turned off, the geometry is shipped to the location where
display occurs. Obviously, this should only happen when the data being
rendered is small.
\subsection{Image Level of Detail}
The overhead incurred by the parallel rendering algorithms is proportional
to the size of the images being generated. Also, images generated on a
server must be transferred to the client, a cost that is also proportional
to the image size. To help increase the frame rate during interaction,
ParaView introduces a new LOD parameter that controls the size of the
images.
During interaction while parallel rendering, ParaView can optionally
\index{subsample}subsample the image. That is, ParaView will reduce the
resolution of the image in each dimension by a factor during interaction.
Reduced images will be rendered, composited, and transferred. On the
client, the image is inflated to the size of the available space in the
GUI.
\begin{inlinefig}
\includegraphics[width=.2\linewidth]{images/ImageLODFull}
\includegraphics[width=.2\linewidth]{images/ImageLOD2}
\includegraphics[width=.2\linewidth]{images/ImageLOD4}
\includegraphics[width=.2\linewidth]{images/ImageLOD8}
\end{inlinefig}
The resolution of the reduced images is controlled by the factor by which
the dimensions are divided. In the preceding images, the left image has
the full resolution. The remaining images were rendered with the
resolution reduced by a factor of 2, 4, and 8, respectively.
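
Because the factor is applied to each dimension, the number of pixels to render, composite, and transfer falls with the square of the factor. The following Python sketch works through the arithmetic for a hypothetical 1920 by 1080 view; rounding partial pixels up is an assumption of the sketch, not a statement about how ParaView rounds.

```python
def reduced_size(width, height, factor):
    """Image dimensions after dividing each dimension by `factor` (rounded up)."""
    return (-(-width // factor), -(-height // factor))


def pixel_fraction(width, height, factor):
    """Fraction of the original pixel count that remains after reduction."""
    w, h = reduced_size(width, height, factor)
    return (w * h) / (width * height)


full = (1920, 1080)  # hypothetical view size
sizes = {f: reduced_size(*full, f) for f in (1, 2, 4, 8)}
fractions = {f: pixel_fraction(*full, f) for f in (1, 2, 4, 8)}
```

At a factor of 8, only about 1.6 percent of the original pixels remain, which explains both the speedup and the visible blockiness in the rightmost image.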
ParaView also has the ability to compress images before transferring them
from server to client. Compression, of course, reduces the amount of data
transferred and therefore makes the most of the available bandwidth.
However, the time it takes to compress and decompress the images adds to
the latency.
ParaView contains two different image compression algorithms for
client-server rendering. The first is a custom algorithm called
\keyterm{Squirt}, which stands for Sequential Unified Image Run Transfer.
Squirt is a run-length encoding compression that reduces color depth to
increase run lengths. The second algorithm uses the \keyterm{Zlib}
compression library, which implements a variation of the Lempel-Ziv
algorithm. Zlib typically provides better compression than Squirt, but
takes longer to perform and hence adds to the latency.
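
To see why reducing color depth helps a run-length encoder, consider the following Python sketch. It is not the Squirt implementation; the function names, the 8-bit RGB tuples, and the choice of dropping three bits per channel are assumptions made for illustration. Masking off the low-order bits makes nearly identical neighboring pixels exactly equal, so they collapse into longer runs.

```python
def reduce_color(pixel, bits_dropped):
    """Zero the low-order bits of each 8-bit channel."""
    mask = 0xFF & ~((1 << bits_dropped) - 1)
    return tuple(channel & mask for channel in pixel)


def run_length_encode(pixels):
    """Collapse consecutive equal pixels into [pixel, count] runs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return runs


# Hypothetical scanline with small pixel-to-pixel variations.
scanline = [
    (200, 100, 50), (201, 101, 50), (202, 100, 51), (203, 102, 49),
    (90, 90, 90), (91, 90, 89),
]

exact_runs = run_length_encode(scanline)
lossy_runs = run_length_encode([reduce_color(p, 3) for p in scanline])
```

With full color depth every pixel is its own run, whereas after masking the six pixels collapse into two runs. The price is a loss of color fidelity, which is acceptable during interaction but not for a still render.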
\subsection{Parallel Render Parameters}
\label{sec:ParallelRenderParameters}
\begin{inlinefig}
\includegraphics[width=0.8\scw]{images/SettingsServer}
\end{inlinefig}
Like the other 3D rendering parameters, the parallel rendering parameters
are located in the settings dialog box, which is accessed in the menu from
\gui{Edit} \ra \gui{Settings} (\gui{ParaView} \ra \gui{Preferences} on the
Mac). The parallel rendering options in the dialog are in the \gui{Render
View} tab (intermixed with several other rendering options such as those
described in Section~\ref{sec:BasicRenderingSettings}). The parallel and
client-server options are divided among several categories, and several are
considered advanced.
\begin{description}
\item[\gui{Remote/Parallel Rendering Options}]~
\begin{itemize}
\item \index{remote render threshold} Set the data size at which to
render remotely in parallel or to render locally. If the geometry is
over this threshold (and ParaView is connected to a remote server), the
data is rendered in parallel remotely and images are sent back to the
client. If the geometry is under this threshold, the geometry is sent
back to the client and images are rendered locally on the client.
\item Set the sub-sampling factor for still (non-interactive) rendering.
Some large displays have more resolution than is really necessary, so
this sub-sampling reduces the resolution of all images displayed.
\icon{pqAdvanced26}
\end{itemize}
\item[\gui{Client/Server Rendering Options}]~
\begin{itemize}
\item \index{interactive render!subsample} \index{subsample} Set the
interactive subsampling factor. The overhead of parallel rendering is
proportional to the size of the images generated. Thus, you can speed
up interactive rendering by specifying an image subsampling rate. When
this box is checked, interactive renders will create smaller images,
which are then magnified when displayed. This parameter is only used
during interactive renders. \icon{pqAdvanced26}
\end{itemize}
\item[\gui{Image Compression}]~
\begin{itemize}
\item Before images are shipped from server to client, they optionally
can be compressed using one of two compression algorithms:
Squirt\index{Squirt} or Zlib\index{Zlib}. To make the compression more
effective, either algorithm can reduce the color resolution of the
image before compression. The sliders determine the amount of color
bits saved. Full color resolution is always used during a still
render. \icon{pqAdvanced26}
\item Suggested image compression presets are provided for several common
network types. When attempting to select the best image compression
options, try starting with the presets that best match your connection.
\icon{pqAdvanced26}
\end{itemize}
\end{description}
\index{rendering!parallel|)}
\subsection{Parameters for Large Data}
The default rendering parameters are suitable for most users. However,
when dealing with very large data, it can help to tweak the rendering
parameters. The optimal parameters depend on your data and the hardware
ParaView is running on, but here are several pieces of advice that you
should follow.
\begin{itemize}
\item Try turning off display lists. Turning this option off will
prevent the graphics system from building special rendering structures.
If you have graphics hardware, these rendering structures are important
for feeding the GPUs fast enough. However, if you do not have GPUs,
these rendering structures do not help much.
\item If there is a long pause before the first interactive render of a
particular data set, it might be the creation of the decimated
geometry. Try using an outline instead of decimated geometry for
interaction. You could also try lowering the factor of the decimation to
0 to create smaller geometry.
\item Avoid shipping large geometry back to the client. The remote
  rendering will use the power of the entire server to render and ship
  images to the client. If remote rendering is off, geometry is shipped back to
the client. When you have large data, it is always faster to ship images
than to ship data (although if your network has a high latency, this
could become problematic for interactive frame rates).
\item Adjust the interactive image sub-sampling for client-server rendering
as needed. If image compositing is slow, if the connection between
client and server has low bandwidth, or if you are rendering very large
images, then a higher subsample rate can greatly improve your interactive
rendering performance.
\item Make sure \gui{Image Compression} is on. It has a tremendous effect