Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support adding atoms 'zero-copy' #361

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

support adding atoms 'zero-copy' #361

wants to merge 1 commit into from

Conversation

arvidn
Copy link
Contributor

@arvidn arvidn commented Jan 9, 2024

This adds a feature to the Allocator and its NodePtr where a NodePtr can point into a pre-existing, immutable, buffer. Currently, a NodePtr atom can only point into the (dynamically allocated) heap. The NodePtr representation has 6 bits to encode "type", which currently is either Bytes or Pair. This adds 3 additional types, indicating which external buffer the atom is allocated in.

The main rationale are to reduce memory footprint and avoid unnecessary memory copying when:

  1. parsing a program from an (immutable) buffer to run it
  2. passing in a (potentially) large blob as the argument to a program (e.g. a block reference when running a generator)

This patch doesn't in itself take us all the way there, but it adds the basic support for it in Allocator.

Lifetime annotation

With this change, Allocator needs a lifetime annotation, to ensure the buffers added to it (via add_external_buffer()) do in fact outlive the allocator itself.
This lifetime annotation propagates to LazyNode, which contains a reference to an Allocator. The problem is that pyo3 cannot expose types with lifetime annotation to python.

This is solved by "laundering" the Allocator when creating a LazyNode, by turning it into an ImmutableAllocator, which simply does not contain any references to the external buffers, and will just fail if any such nodes are part of the LazyNode tree. (None of the intended use cases for external buffers return LazyNodes, so this shouldn't be a problem).

ObjectType

The main reason to hard code the indices in ExternalBytes0, ExternalBytes1 is to keep the enum a plain int, to support casting it in the shift-expression when constructing the underlying value of a NodePtr. I suspect that we can make this a little bit nicer by detaching the enum's representation from the NodePtr representation.

as_index()

As we add more object types to NodePtr, the purpose of as_index() is probably increasingly defeated. It was introduced as a cheap and efficient alternative to a hash-table with NodePtr as the key.

Copy link

coveralls-official bot commented Jan 9, 2024

Pull Request Test Coverage Report for Build 7507336535

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.2%) to 93.425%

Totals Coverage Status
Change from base Build 7505735168: -0.2%
Covered Lines: 5556
Relevant Lines: 5947

💛 - Coveralls

@arvidn arvidn requested review from richardkiss and Rigidity January 9, 2024 11:23
@arvidn arvidn marked this pull request as ready for review January 9, 2024 15:25
@arvidn arvidn force-pushed the external-buffers branch 2 times, most recently from 3cb70df to e0c328c Compare January 10, 2024 12:34
@arvidn arvidn changed the title add support adding atoms 'zero-copy' support adding atoms 'zero-copy' Jan 11, 2024
@richardkiss
Copy link
Contributor

Have you considered creating a wrapping object PythonAllocator (or something) that has lifetime 'py? The idea is that the allocator now also has lifetime 'py and objects coming from python can have their refcounts bumped by jamming a clone so they are also 'py. You may not need the laundering allocator this way.

(That said, I've had lots of great ideas with lifetimes and python that end up not being feasible, so it's possible there are shortcomings here. Also wresting with lifetime error messages is sometimes just not worth it.)

@arvidn
Copy link
Contributor Author

arvidn commented Jan 11, 2024

you mean having the allocator contain PyBuffer objects, essentially, to keep python buffers alive?
I don't think there's a way to expose types with any lifetime annotations. But a wrapper that takes python buffers could possibly encapsulate the lifetime requirements.

One thing that would make me a little bit uneasy is that we can take both bytes and bytearray, and bytearray is mutable. So it's not just a matter of holding references.

@arvidn arvidn marked this pull request as draft January 21, 2024 17:14
@arvidn
Copy link
Contributor Author

arvidn commented Jan 21, 2024

in its current form, this patch causes a performance regression. When testing on RPi5:

run_program/block-2000  time:   [366.63 ns 366.75 ns 366.87 ns]
                        change: [+1.0676% +1.2314% +1.4306%] (p = 0.00 < 0.05)
                        Performance has regressed.
run_program/compressed-2000
                        time:   [1.7944 s 1.7947 s 1.7951 s]
                        change: [+8.1965% +8.2266% +8.2580%] (p = 0.00 < 0.05)
                        Performance has regressed.
run_program/large-block time:   [501.45 ms 501.74 ms 502.17 ms]
                        change: [+5.4493% +5.5198% +5.6198%] (p = 0.00 < 0.05)
                        Performance has regressed.

@arvidn
Copy link
Contributor Author

arvidn commented Feb 4, 2024

I think the main use case for this is block references being passed in to the generator. I will update this at some point with that use case in mind.

Copy link

github-actions bot commented Apr 4, 2024

'This PR has been flagged as stale due to no activity for over 60
days. It will not be automatically closed, but it has been given
a stale-pr label and should be manually reviewed.'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants