This repository has been archived by the owner on Jan 26, 2024. It is now read-only.
SWDEV-351980 - Fix a race condition when setting the callbacks

When acquiring the reader's lock (sem_sync()...sem_release()), it is
possible for a writer to squeeze by if the reader goes into sync_wait().
The writer could re-acquire the entry.sync between entry.sem > 0 and
entry.sync = 0.

    void sync_wait(uint32_t id) {
      sem_decrement(id);
      while (entry(id).sync.load()) {}
      // <--- HERE
      sem_increment(id);
    }

This could result in both the reader and the writer accessing
{ callback, arg } at the same time, and the reader could read
inconsistent data, for example: { new callback, old arg }.

The solution is to re-test entry.sync when returning from sync_wait():

    void sem_sync(uint32_t id) {
      sem_increment(id);
    - if (entry(id).sync.load() == true) sync_wait(id);
    + while (entry(id).sync.load() == true) sync_wait(id);
    }

(cherry picked from commit 7e9b355)

SWDEV-351980 - Use std::shared_mutex

Replace the custom reader/writer lock with the standard implementation
available in C++17.

(cherry picked from commit 7526b42)

Improve hip_prof_api.h's readability

- Don't pass uint32_t arguments by reference.
- Use nullptr instead of NULL.
- Don't add frivolous typedefs.
- Use correct types when available instead of generic integral types.
- Make all roctracer callbacks extern "C" to prepare for a future change
  that will be removing their declaration from hip_runtime_api.h.
- Rename cb/sem sync and release functions -> reader_lock/writer_lock
  acquire and release.

(cherry picked from commit 5a6a83e)

SWDEV-351980 - Move hip_api_data and record to the HIP function's stack

Since the hip_api_data and record are only needed at the HIP function's
scope, there is no need to allocate/free them in the ROCtracer activity
callback; they can reside on the HIP function's stack frame.

This solves an issue with the thread-local stacks of records the tracer
maintains: they are destroyed first (before any global destructor) on
process exit, making it impossible to use HIP functions in global
destructors when the profiler is enabled.

(cherry picked from commit 5ea0e6b)

SWDEV-351980 - Remove IS_PROFILER_ON

CallbacksTable::is_enabled() can simply be implemented by checking
whether enabled_api_count is > 0. The ROCclr does not use IS_PROFILER_ON
to report asynchronous activities.

(cherry picked from commit dae2ea8)

SWDEV-351980 - Consolidate registration tables in the roctracer library

Remove the api_callbacks_table_t that was holding the API activities and
user callbacks. Instead, use a single roctracer callback (TracerCallback)
to report both API activities and callbacks.

Remove the hipInitActivityCallback that was setting the ROCtracer
callback and memory pool for asynchronous activities, as it did not allow
distinct pools to be used for each activity. Instead, use
hipRegisterTracerCallback to set the single roctracer callback.

(cherry picked from commit b9cf518)

SWDEV-351980 - Remove the ROCtracer private interface from the public header

(cherry picked from commit 5d71ec1)

SWDEV-359838 - Add a phase data pointer to the hip_api_data_t

To avoid using the thread-local std::stack to remember the phase enter
timestamp, the tracer tool uses the phase data to store the timestamp.

(cherry picked from commit 928684d)

Fix a build error when compiling with clang

Fix the following error:

    hip_intercept.cpp:52:7: error: reinterpret_cast from 'const void *' to
    'decltype(activity_prof::report_activity.load())' (aka 'int (*)(activity_domain_t,
    unsigned int, void *)') casts away qualifiers
          reinterpret_cast<decltype(activity_prof::report_activity.load())>(function),
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by replacing the 'const void *function' argument with the correct type.

(cherry picked from commit be33ec5)

Change-Id: I0fd0121ea58eb8f6aa3f6511303d3f2baa1191ad
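
For illustration, the reader/writer pattern that std::shared_mutex gives in C++17 can be sketched as below. This is a minimal sketch and not the code from the commit: the names callback_t and CallbackEntry are hypothetical, and the only assumption taken from the message above is that a { callback, arg } pair must be read and written as one consistent unit.

```cpp
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Hypothetical callback signature, used only for this illustration.
using callback_t = void (*)(uint32_t id, void* arg);

// Hypothetical table entry: the { callback, arg } pair plus the lock guarding it.
class CallbackEntry {
 public:
  // Writer: replace both fields while holding the lock exclusively, so no
  // reader can observe { new callback, old arg }.
  void set(callback_t callback, void* arg) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    callback_ = callback;
    arg_ = arg;
  }

  // Reader: copy both fields while holding the lock in shared mode. Many
  // readers may hold it at once; a writer waits until all of them release it.
  std::pair<callback_t, void*> get() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return {callback_, arg_};
  }

 private:
  mutable std::shared_mutex mutex_;
  callback_t callback_ = nullptr;
  void* arg_ = nullptr;
};
```

A caller would take a snapshot with get() and invoke the callback on the copy, after the shared lock has been released. With the standard lock there is no hand-written wait loop, so the "re-test after waiting" step fixed by the first change no longer has to be maintained by hand.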