[Project Proposal] HPCToolkit #18

blue42u · 2024-11-10T16:09:22Z

1. Name of Project

HPCToolkit

2. Project Description

HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to GPU-accelerated supercomputers. By using statistical sampling of timers and hardware performance counters on CPUs, HPCToolkit collects accurate measurements of a program's CPU work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur. By monitoring GPU operations, gathering instruction-level metrics within GPU kernels, and attributing the costs of GPU work to heterogeneous calling contexts, HPCToolkit provides insight into the performance of GPU-accelerated codes. HPCToolkit works with multilingual, fully optimized applications that are dynamically linked. HPCToolkit is designed for use on large parallel systems. HPCToolkit's presentation tools enable rapid analysis of a program's execution costs, inefficiency, and scaling characteristics both within and across nodes of a parallel system. HPCToolkit supports measurement and analysis of serial codes, threaded codes (e.g. pthreads, OpenMP), MPI, and hybrid (MPI+threads) parallel codes, as well as GPU-accelerated codes that offload computation to AMD, Intel, or NVIDIA GPUs.

3. Statement on Alignment with High Performance Software Foundation's Mission

HPCToolkit is, and aims to remain, a best-in-class performance tool for leadership supercomputers. It is one of the few performance tools able to run at leadership scales with detailed instruction-level performance attribution. Its functionality rivals the performance tools provided by Nvidia, AMD, Intel, and Cray on their own hardware. These features make HPCToolkit a necessary piece of a future HPC ecosystem dominated by cloud and AI at scale.

HPCToolkit is committed to providing quality performance analysis for a wide range of languages and platforms, particularly targeting developers of large-scale HPC applications. HPSF provides HPCToolkit with a neutral home and safe stewardship for our stakeholders in government and academia, and opens HPCToolkit to future collaboration opportunities.

4. Project Website (please provide a link)

Project Website

5. Open Source License (please provide a link)

SPDX Identifier: BSD-3-Clause (considering a relicense to Apache-2.0)
LICENSE.md

Data artifacts are licensed under the CDLA Permissive 2.0 license (SPDX: CDLA-Permissive-2.0).

6. Code of Conduct (please provide a link)

We adopt the generic LF Code of Conduct.

7. Governance Practices (please provide a link)

Project Governance

8. Two Sponsors from the High Performance Software Foundation's Technical Advisory Committee

Todd Gamblin and Damien Lebrun-Grandie

9. What is the project's solution for source control?

GitLab.com, Git repositories under the @hpctoolkit group (e.g. hpctoolkit/hpctoolkit>).

10. What is the project's solution for issue tracking?

GitLab issues

11. Please list all external dependencies and their license

C/C++:

Meson (Apache-2.0)
Python (PSF-2.0)
TBB (Apache-2.0)
Elfutils (GPL-2.0-only OR LGPL-3.0-or-later)
Boost (BSL-1.0)
libunwind (MIT)
Xerces C/C++ (Apache-2.0)
XZ Utils (0BSD)
yaml-cpp (MIT)
libiberty (LGPL-2.1-only)
Intel Xed (Apache-2.0)
Perfmon2 / libpfm4 (MIT)
Dyninst (LGPL-2.1-only)
Valgrind (headers) (bzip2-1.0.6)
Optional:
- PAPI (BSD-3-Clause)
- An MPI implementation (licenses vary)
- Nvidia's CUDA Toolkit (closed-source)
- AMD's ROCm (MIT)
- OpenCL (headers) (Apache-2.0)
- Level Zero (MIT)
- Intel Graphics Compiler (MIT)
- Intel's GTPin (closed-source)

Java:

Maven (Apache-2.0)
Vavr.io (Apache-2.0)
org.json (public domain)
com.google.code.gson (Apache-2.0)
Jetty (Apache-2.0 AND EPL-2.0)
Apache Log4j (Apache-2.0)
JUnit 4.x (EPL-1.0)
Apache HTTPClient (Apache-2.0)
Snakeyaml (Apache-2.0)
Eclipse Collections (EPL-1.0 AND BSD-3-Clause)
Glazed Lists (LGPL-2.1-only AND MPL-2.0)
Apache Commons Math (Apache-2.0)
Eclipse platform (EPL-2.0)
Eclipse SWTChart (EPL-2.0)
Eclipse Nebula NatTable (EPL-2.0)
Logback (EPL-1.0 OR LGPL-2.1-only)
SLF4J (MIT)

12. Please describe your release methodology and mechanics

HPCToolkit is released roughly twice a year (summer and winter), although the schedule is often adjusted to meet customer needs. Releases are made as Git tags with corresponding GitLab releases and subsequently published as Spack package versions. Binary artifacts are produced automatically using Continuous Deployment practices (with minimal exceptions).

13. Please describe Software Quality efforts (CI, security, auditing)

All changes to the mainline must pass a series of automated tests and linter-style checks, run via GitLab CI. These tests cover major releases of 4 common Linux distributions (Ubuntu, RHEL, Fedora, SUSE Leap), multiple CPU architectures (amd64, aarch64, ppc64le), and multiple GPU architectures (CUDA/Nvidia, HIP/AMD). Builds include multiple GCC and Clang compiler versions.

We do not have security screening in place. This is an area we would like to improve under HPSF.

14. Please list the project's leadership team

The HPCToolkit Technical Steering Committee (@hpctoolkit/tsc) is made up of the following members:

Jonathon Anderson (@blue42u) -- Chair
John Mellor-Crummey (@jmellorcrummey)
Mark Krentel (@mwkrentel)
Laksono Adhianto (@adhianto)

15. Please list the project members with access to commit to the mainline of the project

Jonathon Anderson (@blue42u)
John Mellor-Crummey (@jmellorcrummey)
Mark Krentel (@mwkrentel)
Laksono Adhianto (@adhianto)
Marty Itzkowitz (@martyitz) -- only for hpctoolkit/qahpct>
Dragana Grbic (@draganaurosgrbic) -- only for hpctoolkit/hpc-analysis>

16. Please describe the project's decision-making process

We implement consensus-based decisions among our maintainers/committers, and we will resort to a fair vote of the TSC when consensus is not reached. These discussions happen primarily in GitLab issues/MRs or internally among the team.

Merge requests (MRs) must be approved and merged by a committer with sufficient access, although the review itself may be delegated to another contributor or reviewed in an informal meeting.

17. What is the maturity level of your project?

We aim to join the HPSF as an Established stage project.

The Established stage characterizes our project well. We are looking to create a plan for continued support for our users. We have a very small developer community and wish to expand it by leveraging the experience at LF and HPSF. We are also working toward the eventual goal of achieving Core project status.

18. Please list the project's official communication channels

GitLab issues and MRs
Email mailing list (hpctoolkit-forum =at= rice.edu, archive)

19. Please list the project's social media accounts

N/A

20. Please describe any existing financial sponsorships

Development on HPCToolkit is primarily funded from DOE grants and industry collaboration contracts via Rice University. The full list of sponsors is available on our website.

21. Please describe the project's infrastructure needs or requests

We are interested in expanding our CI system, to enable continuous testing on exotic compute hardware such as late-model AMD and Intel GPUs. We have a representative attending the CI working group to eventually facilitate this.
We are interested in assistance refreshing our website to meet modern design expectations. We are additionally interested in assistance creating/maintaining our public communication channels, such as a public chatroom for users (Slack/Discord/etc.).
We are interested in strengthening our contributor base within the HPC ecosystem. This is an area where we are immature and wish to leverage the experience of LF and the HPSF, as well as users' meetings and hackathons in HPC.

Criteria for Sandbox Stage

Meet all requirements to be a Linux Foundation project
- Evidence: Rice University signed HPCToolkit paperwork and became an LF project
Have 2 TAC sponsors to champion the project & provide mentorship as needed
- Evidence: @tgamblin, @dalg24
Submit a proposal for membership and present it at a meeting of the TAC
- Evidence: This proposal
Have a charter document with an intellectual property policy that leverages open licenses, including, in the case of contributions of code, the use of one or more licenses approved as “open” by the Open Source Initiative. The staff of the High Performance Software Foundation can assist projects in preparing a technical charter following the High Performance Software Foundation’s standard template.
- Evidence: Our charter, section 7
Have a code of conduct (part of default governance for LF – there is a template)
- Evidence: We adopt the LF CoC, see our contributing guidelines

Criteria for Established Stage

Document that it is being used successfully in production by at least three independent end users which, in the TAC’s judgment, are of adequate quality and scope.
- Evidence: HPCToolkit is deployed on supercomputers at LLNL, OLCF, and NERSC as part of E4S. We also have contracts with LLNL, ANL, and TotalEnergies to develop HPCToolkit for their needs.
Demonstrate development processes (e.g., use of pull requests, code review, testing, CI) that lower barriers to contribution and ensure software quality necessary for increased adoption.
- Evidence: See software quality efforts and decision-making processes above.
Demonstrate a substantial ongoing flow of commits and merged contributions.
- Evidence: See https://gitlab.com/hpctoolkit/hpctoolkit/-/graphs/develop?ref_type=heads

dalg24 · 2024-12-12T23:52:13Z

About that criterion

Demonstrate development processes (e.g., use of pull requests, code review, testing, CI) that lower barriers to contribution and ensure software quality necessary for increased adoption.

From what I can tell from browsing the history of the project over the last 2 years, most merge requests are approved by their own author, and there is no actual review from another developer.

slandath added the "Project Proposal" label (label for project proposals to HPSF) on Nov 12, 2024
