Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Snippets] SplitDimensionM: heuristic update #28180

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

v-Golubev
Copy link
Contributor

@v-Golubev v-Golubev commented Dec 23, 2024

Details:

  • SplitDimensionM heuristics are divided into several logic parts
  • Added new heuristic for big shapes which minimizes m_kernel

Tickets:

  • N/A

@v-Golubev v-Golubev requested review from a team as code owners December 23, 2024 10:15
@v-Golubev v-Golubev requested review from tsavina and removed request for a team December 23, 2024 10:15
@github-actions github-actions bot added category: CPU OpenVINO CPU plugin category: docs OpenVINO documentation labels Dec 23, 2024
@v-Golubev v-Golubev force-pushed the vg/snippets/split_m_update_heuristic branch 6 times, most recently from 205e16c to 78a43ee Compare December 23, 2024 19:20
@v-Golubev
Copy link
Contributor Author

@a-sidorova @IvanNovoselov could you please take a look? Thanks in advance

Comment on lines +70 to +76
if (divisor >= min_kernel_m)
return std::make_pair(m_dim / divisor, divisor);
const size_t m_kernel = m_dim / divisor;
if (m_kernel >= min_kernel_m) {
best_result.first = divisor;
best_result.second = m_kernel;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear from this code why we try to find maximal divisor? If divisor >= min_kernel_m we return, but if m_kernel >= min_kernel_m we continue. Why?
It seems to me that this problem is symmetrical and should be treated accordingly:

  1. if (m_dim % divisor == 0) => we can split M, great. There are 2 ways to do that:
    a. (m_dim / divisor, divisor)
    b. (divisor, m_dim / divisor)
  2. These 2 ways are absolutely identical except for that this always holds: divisor < (m_dim / divisor)
  3. It means that one way to split optimally is to start from the max divisor (=sqrt(m_dim)), go downward and return as soon as the parallel work is sufficient: if(batch_dim * m_dim/divisor >= optimal_parallelism_work_amount). This way we'll make sure that both the kernel WA is maximal (since we're going downwards) and the parallel WA is optimal.
  4. Alternatively, we can start from the minimal acceptable divisor, which should be min_kernel_m, go upward and return as soon as (m_dim % divisor == 0). This way we'll guarantee that the kernel work amount is larger than the minimal one (since we started from min_kernel_m) and the parallel work amount is maximal (since m_dim / divisor is deceasing). But that's not the case in this particular function, since we want to maximize the kernel WA.
  5. Sometimes these min_kernel_m and optimal_parallelism_work_amount can be mutually exclusive, so we should think carefully which is more important. I guess that the parallel work amount should be prioritized, so the approach from 3 should be used.
  6. It looks like we try to implement a mix of the above strategies here: we inspect both a and b splits, and return a if the min_kernel WA is achieved and b if parallel WA is satisfied. This may be not consistent in some circumstances, especially when the parallel & kernel WA limitations can't be fulfilled simultaneously.

Comment on lines +73 to +81
static std::pair<size_t, size_t> compute_ideal_cases_heuristic(size_t batch_dim, size_t m_dim, size_t optimal_parallelism_work_amount);
/**
* @brief Aggressively splits m_dim to minimize kernel_m in order to reduce waiting time for idle threads at the last parallel loop iteration.
*/
static std::pair<size_t, size_t> compute_aggressive_heuristic(size_t batch_dim, size_t m_dim, size_t optimal_parallelism_work_amount);
/**
* @brief Conservatively splits m_dim to get the batch in (optimal_parallelism_work_amount, 2 * optimal_parallelism_work_amount) interval
*/
static std::pair<size_t, size_t> compute_conservative_heuristic(size_t batch_dim, size_t m_dim, size_t optimal_parallelism_work_amount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments with respect to the methods' names:

  1. the functions try to split M dimension into two: batch_m and new_m, so it seems logical to call them smth like get_splitted_* or split_m_* or compute_m_split etc. Heuristic is used inside this functions to perform the split, so the functions do not compute heuristic, but rather use it to compute smth else (split).
  2. We should try to use more descriptive names instead of aggressive or conservative, because what seems conservative to one may seem aggressive to the other. Possible options include: split_ideally, split_max_kernel_work and split_max_parallel_work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin category: docs OpenVINO documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants