
host implementations of atomic functions? #6

Open
seibert opened this issue Jun 17, 2013 · 6 comments

seibert commented Jun 17, 2013

I think it would be useful to offer a simple implementation of functions like atomicAdd(), atomicInc, etc, that are used when compiling hemi code for CPU execution. Currently I have to use #ifdef HEMI_DEV_CODE to hide uses of atomicAdd() from the host compiler.
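For context, the workaround in question looks roughly like this (a sketch; the function and counter names are made up for illustration, and HEMI_DEV_CALLABLE_INLINE is Hemi's inline host/device qualifier macro):

HEMI_DEV_CALLABLE_INLINE void countHit(int *counter)
{
#ifdef HEMI_DEV_CODE
  atomicAdd(counter, 1);   // device build: real CUDA atomic
#else
  *counter += 1;           // host build: plain increment, since no atomicAdd exists
#endif
}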

If this sounds reasonable, I'm happy to do the implementation and issue a pull request.


harrism commented Jun 17, 2013

I would consider a pull request, definitely. But first: how would you implement the atomics on the CPU?

Also, nice to hear from a Hemi user. Can you tell me how you are using it?

Thanks!
Mark


seibert commented Jun 18, 2013

Since the CPU code path in Hemi is inherently single-threaded, I imagined that the atomic operations, when compiling for the CPU, would literally be things like:

inline int atomicAdd(int *address, int val)
{
  int old = *address;
  *address = old + val;
  return old;
}

This code has a race condition, of course, but I don't see generating multi-threaded CPU code as a use case for Hemi. (If I'm incorrect about that, then my proposed solution won't work.) This assumption would need to be clearly stated in the documentation, just in case someone tries to do something bizarre.
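The rest of the atomic family could follow the same pattern. For example, a sequential atomicInc mirroring CUDA's wrap-to-zero semantics might look like this (a sketch under the same single-threaded assumption):

inline unsigned int atomicInc(unsigned int *address, unsigned int val)
{
  unsigned int old = *address;
  // CUDA's atomicInc stores ((old >= val) ? 0 : (old + 1)) and returns the old value
  *address = (old >= val) ? 0 : (old + 1);
  return old;
}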

We're using Hemi to implement a new CUDA program where we want to preserve the option of compiling the code in CPU-only mode. We are working around cases that Hemi doesn't easily handle yet (like atomics, streams and shared memory) with #ifdefs in our code. The atomics have an easy solution, so that's why I proposed it first.

Eventually, it would be nice to be able to compile Hemi in CPU mode without the CUDA headers at all, but that would require stubbing out a few things, I think.


harrism commented Jun 18, 2013

I would like to preserve the ability to use, for example, OpenMP with Hemi.


seibert commented Jun 18, 2013

That makes this harder, since I can't think of a generic way to implement the atomics that works for any multi-threading situation. If Hemi specifically only supported OpenMP for multi-threading on the CPU, then I think this would work:

inline int atomicAdd(int *address, int val)
{
  int old;
  #pragma omp critical
  {
    old = *address;
    *address = old + val;
  }
  return old;
}
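For what it's worth, another option that isn't tied to OpenMP would be a compiler builtin (a sketch, assuming GCC/Clang; __sync_fetch_and_add already returns the previous value, matching CUDA's atomicAdd contract, though it isn't portable to every compiler):

inline int atomicAdd(int *address, int val)
{
  // atomically adds val to *address and returns the old value
  return __sync_fetch_and_add(address, val);
}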


harrism commented Jun 26, 2013

I would prefer to do this in as flexible and unobtrusive a way as possible. I would include it in a separate header, and just provide the sequential CPU implementation. omp critical could be added around the calls to atomicAdd rather than inside them (for example).
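A sketch of that arrangement (the header name and the histogram example are hypothetical; the atomicAdd is just the sequential fallback proposed above):

// hemi/atomic.h (hypothetical separate header): sequential host fallback only
inline int atomicAdd(int *address, int val)
{
  int old = *address;
  *address = old + val;
  return old;
}

// user code: the caller adds the guard when running under OpenMP
void binHit(int *histogram, int bin)
{
  #pragma omp critical (histogram_update)
  {
    atomicAdd(&histogram[bin], 1);
  }
}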

@lordofhyphens

Since OpenMP is compiler-based, there's no harm in leaving the pragmas for the critical section in Hemi. Note, though, that named critical sections in OpenMP are all global, so coming up with an efficient naming scheme is likely beyond the scope of Hemi. A lock/unlock pattern using the primitives is probably as good as we're going to get.
I'd probably overload these functions to allow passing in a mutex or some other platform-specific primitive.

Flexible, and it should be very unobtrusive (if you know you're using something special, you can pass in what you want). Even the overloads wouldn't need to duplicate code -- they'd just be a call to the basic sequential function wrapped in whatever guards are needed.
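Something along these lines, perhaps (a sketch, assuming C++11's std::mutex stands in for the "platform-specific" primitive; the guarded overload just forwards to the sequential version):

#include <mutex>

int atomicAdd(int *address, int val);              // the basic sequential version

inline int atomicAdd(int *address, int val, std::mutex &m)
{
  std::lock_guard<std::mutex> guard(m);            // released when guard goes out of scope
  return atomicAdd(address, val);
}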

The only "trick" I can think of right now is that we'd want to ensure we replicate the behavior of the device function, specifically because there isn't any waiting on the device if another thread tries simultaneous access. The operation simply fails.

lordofhyphens referenced this issue in lordofhyphens/hemi Jul 25, 2014