Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cuda: Improve CUDA component PAPI_read() overhead, issue 85 #99

Conversation

anustuvicl
Copy link
Contributor

  • Improves overhead from 15000ms to 2200ms for 100 PAPI_read() calls
  • Improves speed of papi_native_avail utility
  • Improvements over old component
    • PAPI_start() correctly resets counter values
    • Measurements consistent across reads

@anustuvicl anustuvicl requested a review from gcongiu October 6, 2023 18:02
@anustuvicl anustuvicl linked an issue Oct 6, 2023 that may be closed by this pull request
Copy link
Contributor

@gcongiu gcongiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is too hard to review. Please split it into simpler commits

@@ -22,6 +22,7 @@ typedef struct byte_array_s byte_array_t;
typedef struct cuptip_gpu_state_s cuptip_gpu_state_t;
typedef struct list_metrics_s list_metrics_t;
typedef struct NVPA_MetricsContext NVPA_MetricsContext;
typedef NVPW_CUDA_MetricsContext_Create_Params MCCP_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this adds nothing and makes code harder to read.

static int check_num_passes(struct NVPA_RawMetricsConfig *pRawMetricsConfig, int rmr_count,
NVPA_RawMetricRequest *rmr, int *num_pass);
static enum collection_method_e get_event_collection_method(const char *evt_name);

static int nvpw_cuda_metricscontext_create(cuptip_control_t state);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is static no need for name spacing

@@ -37,27 +38,30 @@ static int initialize_cupti_profiler_api(void);
static int finalize_cupti_profiler_api(void);
static int initialize_perfworks_api(void);
static int get_chip_name(int dev_num, char* chipName);
static int init_all_metrics(void);
static int find_same_chipname(int gpu_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name is confusing. Same chip name as what?

@anustuvicl
Copy link
Contributor Author

I already merged it from several commits into one. I dont have the original history due to force push. Please try to review as one commit.

Copy link
Contributor

@gcongiu gcongiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase onto master before merging

@gcongiu gcongiu force-pushed the 2023.08.15-papi_cuda_reduce_read_overhead branch from d3e7751 to 8d62d2c Compare October 10, 2023 14:13
@anustuvicl anustuvicl force-pushed the 2023.08.15-papi_cuda_reduce_read_overhead branch from 8d62d2c to 903c36b Compare October 10, 2023 15:03
@anustuvicl anustuvicl force-pushed the 2023.08.15-papi_cuda_reduce_read_overhead branch from 903c36b to e7307f0 Compare October 18, 2023 16:04
@anustuvicl anustuvicl force-pushed the 2023.08.15-papi_cuda_reduce_read_overhead branch from e7307f0 to d70c7cf Compare October 18, 2023 16:08
@anustuvicl anustuvicl merged commit 8f337c6 into icl-utk-edu:master Oct 18, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PAPI CUDA: PAPI_read performance degradation
2 participants