Standardize code for HLS #42

lforg37 · 2021-04-14T16:55:42Z

Add monostate in front of each variant (#41), replace remaining float with real_t

keryell

There are some ideas but think generic & functional programming! :-)

include/constant_medium.hpp

keryell · 2021-04-15T00:56:54Z

The float_t will allow some fixed point artistic stuff by trying some famous bibliothèques lyonnaises. :-)

Ralender · 2021-04-15T10:35:14Z

include/rtweekend.hpp

-constexpr float infinity = std::numeric_limits<float>::infinity();
-constexpr float pi = 3.1415926535897932385f;
+constexpr real_t infinity = std::numeric_limits<real_t>::infinity();
+constexpr real_t pi = 3.1415926535897932385f;


literals are still assuming float which means that if we change real_t to something else we will get some conversions. maybe make our own user-defined literal could help abstract this.

real_t operator "" _r(long double d) { return d; }

this would force conversion to be on the literals directly such that they can be constant folded

Just use constinit.
I guess a lot of constexpr should be constinit for FPGA. To mention in the poster...

keryell

Great work!
But I am not convinced by the new design.
I cannot see a reason to use this explicit visitor pattern from a pre-modern C++ or Java world when there was no generic lambdas...
You have created a new explicit visitor with all the object-dependent implementation in it breaking the encapsulation by in-lining here all what was in the hit function before.
After looking at the code now with the increased complexity with new kitchen-sink visitor requiring direct access to public object members or über-friendship, I am even less convinced...
Having a distributed hit like before does prevent to add some hierarchy if you have 2 different implementations of sphere for example, by having a hit implemented in some sphere specific class which is inherited by the various spheres.
To avoid blocking this PR I suggest to split it in 2, by moving this complex refactoring in a new one where we can think more about it.
Thanks for this experimentation! It would be interesting if there is any impact on performance and QoR now we have it...

keryell · 2021-04-16T16:37:21Z

include/hit_record.hpp

+  }
+};
+} // namespace raytracer::visitor
+#endif


Missing end-of-line.
Yes, this is good to have this in its own file.

Curious... An IDE configuration bug?

keryell · 2021-04-16T16:40:03Z

include/hit_record.hpp

+  point p;         // hit point
+  vec normal;      // normal at hit point
+  bool front_face; // to check if hit point is on the outer surface
+  /*local coordinates for rectangles
+  and mercator coordinates for spheres */
+  real_t u;
+  real_t v;
+
+  // To set if the hit point is on the front face


While you are here, add more Doxygen

Suggested change

point p; // hit point

vec normal; // normal at hit point

bool front_face; // to check if hit point is on the outer surface

/*local coordinates for rectangles

and mercator coordinates for spheres */

real_t u;

real_t v;

// To set if the hit point is on the front face

point p; ///< Hit point

vec normal; ///< Normal at hit point

bool front_face; ///< To check if hit point is on the outer surface

/// Local coordinates for rectangles and mercator coordinates for spheres

real_t u;

real_t v;

/// To set if the hit point is on the front face

include/hitable_visitor.hpp

include/localrandom.hpp

include/primitives.hpp

lforg37 · 2021-04-16T18:26:25Z

Discussion about the refactoring can be continued on #43

keryell

Functional + generic C++ = Zen++ :-)

include/hitable.hpp

keryell

By looking at the code I wonder whether we cannot merge dev_visit(monostate_dispatch into a simpler monostate_visit including the visit in the implementation.

keryell · 2021-05-18T02:18:31Z

Where are we about this?

keryell · 2021-06-09T15:18:05Z

@lforg37 what's up with this?
Probably you should also at least make a WIP branch with the current version we discussed today with the random generator with low spatial quality.

lforg37 · 2021-06-09T18:58:05Z

The current version is based on this branch, commits have been added.
I still need to integrate some suggestion from the review.
I'll "undraft" the PR once its done.

keryell · 2021-06-09T20:14:39Z

include/box.hpp

@@ -26,7 +26,7 @@ class box {
  }

  /// Compute ray interaction with the box
-  bool hit(auto& ctx, const ray& r, real_t min, real_t max, hit_record& rec,
+  bool hit(const ray& r, real_t min, real_t max, hit_record& rec,
           material_t& hit_material_type) const {
    hit_record temp_rec;
    material_t temp_material_type;


I have the feeling that if we remove the std::monostate from the material it will always create a sphere here even when there is no intersection.

keryell · 2021-06-09T20:23:00Z

include/constant_medium.hpp

@@ -62,7 +60,8 @@ class constant_medium {
    /// Distance between the two hitpoints affect of probability
    /// of the ray hitting a smoke particle
    const auto distance_inside_boundary = (rec2.t - rec1.t) * ray_length;
-    const auto hit_distance = neg_inv_density * sycl::log(rng.float_t());
+    auto rng = LocalPseudoRNG { toseed(r.direction()) };


Ah, I realize you have removed the context passed everywhere and replaced basically the random generator with some local computation.
What was the rationale?

As getting a value from the RNG update its internal state, having a shared RNG creates a very long chain of read after write dependency that prevent the HLS compiler to parallelize the otherwise independent computation steps.

Perhaps we could have this context with some special HLS decorations to ignore some dependencies.

For example https://xilinx.github.io/Vitis_Accel_Examples/2020.2/html/dependence_inter.html

Otherwise, perhaps something like this pseudo HLS-SYCL code:

auto rng(auto local_seed = std::source_location::current()) { return [=] { static auto state = local_seed.line(); #pragma HLS dependence variable=state inter false state = crunch(state); return mix(state); }; }

or

auto rng(auto local_seed = std::source_location::current()) { return [state = local_seed.line()] mutable { #pragma HLS dependence variable=state inter false state = crunch(state); return mix(state); }; }

Perhaps we need extension intel/llvm#3746 to have the big picture working.

@aisoard, @yu810226 any feedback?

Above is basically a good idea, if I understand the intention, to separate the state into multiple independent RNGs. I'm not sure what you're trying to achieve with the false dependence pragma, however.

@stephenneuendorffer yes this is the idea to distribute the state here in a lazy way to even initialize the seed with something depending from the source location (here with the line number, we could also use the file name, etc.).
I has put probably too much here with the

#pragma HLS dependence variable=state inter false

as a way to have an II=1 without consuming too much latency at the price of a semantics change in the code and hoping we still have enough randomness...
But I agree this is a second order optimization if we are still in trouble...
Perhaps this code will not work with SYCL-HLS because it uses some function pointer. If so, we can try with a macro to avoid returning the lambda, or perhaps just have a macro which inline a random generator wherever we want one... Ahem... :-)

include/hitable.hpp

keryell

I made some comparisons between this version and the main one.
You have not update the README about how to run the program.
It looks like time ./sycl-rt 800 480 50 100 is a way to get the same behaviour as before.
It is a little bit faster than before, I guess because there are some parts of the variant which have been removed...
Also I wonder how it is possible to have the motion blur working without a random generator... by having the time or motion parameter included in the hash? It starts being painful...

include/rtweekend.hpp

src/main.cpp

keryell · 2021-06-15T14:29:20Z

include/rtweekend.hpp

-  uint32_t shifted4 = (y2 & 63) << 10;
-  uint32_t shifted5 = (z1 & 63) << 5;
-  uint32_t shifted6 = (z2 & 63);
-  return shifted1 ^ shifted2 ^ shifted3 ^ shifted4 ^ shifted5 ^ shifted6;


Ah I read too quickly and was thinking to 63 for 5 bits...
Actually that seems like a good idea then to have an xor. Would more than 6 bits better than? 127 or 255?

keryell

There are still some super old comments and a merge conflict

keryell · 2021-07-05T13:54:14Z

@lforg37 where are we on this?

keryell

Almost there!

keryell · 2021-07-06T12:52:26Z

include/hit_record.hpp

+  }
+};
+} // namespace raytracer::visitor
+#endif


Curious... An IDE configuration bug?

keryell · 2021-07-06T12:53:07Z

include/material.hpp

-    auto& rng = ctx.rng;
+  bool scatter(auto& ctx, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:53:55Z

include/material.hpp

-    auto& rng = ctx.rng;
+  bool scatter(auto&, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:54:13Z

include/material.hpp

    // Attenuation of the ray hitting the object is modified based on the color
    // at hit point
-    auto& rng = ctx.rng;
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

keryell · 2021-07-06T12:54:49Z

include/material.hpp

-    auto& rng = ctx.rng;
+  bool scatter(auto& ctx, const ray& r_in, const hit_record& rec, color& attenuation,
+               ray& scattered) const {
+    LocalPseudoRNG rng {  toseed(r_in.direction(), r_in.origin()) };


Suggested change

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

I do not know why I am getting bored here...

keryell · 2021-07-06T13:06:56Z

include/rtweekend.hpp

+  std::memcpy(&x2, &val2.x(), sizeof(uint32_t));
+  std::memcpy(&y2, &val2.y(), sizeof(uint32_t));
+  std::memcpy(&z2, &val2.z(), sizeof(uint32_t));
+  uint32_t shifted1 = x1 << 26;


Use auto everywhere

keryell · 2021-07-06T13:09:00Z

include/rtweekend.hpp

+uint32_t toseed(vec const& val1, vec const& val2) {
+  uint32_t x1, y1, z1, x2, y2, z2;
+  std::memcpy(&x1, &val1.x(), sizeof(uint32_t));
+  std::memcpy(&y1, &val1.y(), sizeof(uint32_t));


Suggested change

std::memcpy(&y1, &val1.y(), sizeof(uint32_t));

std::memcpy(&y1, &val1.y(), sizeof y1);

everywhere.
This is clearer to use the type of the destination. Also it is shorter since you can avoid () with sizeof expression. :-)

keryell · 2021-07-06T13:12:57Z

src/main.cpp

 #include <algorithm>
 #include <chrono>
 #include <cstdint>
+#include <ctime>


Curious, why?

keryell · 2021-07-06T13:21:10Z

include/rtweekend.hpp

+  @return uint32_t 
+ */
+uint32_t toseed(vec const& val1, vec const& val2) {
+  uint32_t x1, y1, z1, x2, y2, z2;


You can use the C++ version everywhere instead of the C version : std::uint32_t

keryell · 2021-07-06T13:24:06Z

src/main.cpp

+int main(int argc, char* argv[]) {
+  if (argc < 5 || argc > 7) {
+    std::cerr << "Usage: sycl-rt OUT_WIDTH OUT_HEIGHT DEPTH SAMPLES "
+                 "[SPHERE_INC [RAND_SEED]]"


Update the README to describe this new API

Replace float with real_t

ee32c11

lforg37 requested review from keryell and Ralender April 14, 2021 16:55

keryell requested changes Apr 15, 2021

View reviewed changes

include/constant_medium.hpp Show resolved Hide resolved

Ralender reviewed Apr 15, 2021

View reviewed changes

lforg37 force-pushed the issue41 branch from 6fff7f3 to 3979043 Compare April 16, 2021 09:07

Adding monostate in front of each variant

3979043

lforg37 force-pushed the issue41 branch from 63911e5 to 125e43e Compare April 16, 2021 15:50

keryell reviewed Apr 16, 2021

View reviewed changes

lforg37 force-pushed the issue41 branch from 125e43e to 3979043 Compare April 16, 2021 18:23

lforg37 mentioned this pull request Apr 16, 2021

Refactoring #43

Closed

keryell requested changes Apr 16, 2021

View reviewed changes

include/hitable.hpp Outdated Show resolved Hide resolved

lforg37 force-pushed the issue41 branch from fa67cae to 1893a47 Compare April 20, 2021 11:28

Luc Forget added 2 commits April 20, 2021 04:30

Replace visitor object by function

1893a47

Simplified version

1741496

lforg37 requested a review from keryell April 20, 2021 13:12

keryell reviewed Apr 21, 2021

View reviewed changes

Get rid of context, makeall dimension runtime values

8d3c225

Luc Forget added 2 commits May 25, 2021 06:59

WIP feature restauration

6ceb4ce

Restore Features

60c883a

Get rid of monostate

8a73547

lforg37 marked this pull request as draft June 9, 2021 18:56

keryell reviewed Jun 9, 2021

View reviewed changes

Luc Forget added 2 commits June 11, 2021 00:49

Formatting

4644777

Improving decentralised RNG seed

70c34f9

lforg37 marked this pull request as ready for review June 11, 2021 12:03

lforg37 marked this pull request as draft June 11, 2021 12:05

FPGA friendly local random seed generator

b92d9c2

lforg37 marked this pull request as ready for review June 14, 2021 13:26

keryell reviewed Jun 14, 2021

View reviewed changes

include/rtweekend.hpp Show resolved Hide resolved

include/rtweekend.hpp Show resolved Hide resolved

include/rtweekend.hpp Outdated Show resolved Hide resolved

src/main.cpp Outdated Show resolved Hide resolved

Luc Forget added 3 commits June 15, 2021 01:27

Address review comments

635037d

Restore motion blur

5e96817

Restore image texture and xy_rect

092b972

keryell reviewed Jun 15, 2021

View reviewed changes

keryell requested changes Jun 28, 2021

View reviewed changes

Merging main

8afb9cb

keryell requested changes Jul 6, 2021

View reviewed changes

	LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };
	LocalPseudoRNG rng { toseed(r_in.direction(), r_in.origin()) };

	std::memcpy(&y1, &val1.y(), sizeof(uint32_t));
	std::memcpy(&y1, &val1.y(), sizeof y1);

Standardize code for HLS #42

Are you sure you want to change the base?

Standardize code for HLS #42

Conversation

lforg37 commented Apr 14, 2021

keryell left a comment

Choose a reason for hiding this comment

keryell commented Apr 15, 2021

Choose a reason for hiding this comment

keryell Apr 16, 2021 • edited Loading

Choose a reason for hiding this comment

keryell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lforg37 commented Apr 16, 2021

keryell left a comment

Choose a reason for hiding this comment

keryell left a comment

Choose a reason for hiding this comment

keryell commented May 18, 2021

keryell commented Jun 9, 2021

lforg37 commented Jun 9, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keryell left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keryell left a comment

Choose a reason for hiding this comment

keryell commented Jul 5, 2021

keryell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

keryell Apr 16, 2021 •

edited

Loading

keryell left a comment •

edited

Loading