Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porting VMMAP to RawPOSIX [REVIEW for comments] #87

Closed
wants to merge 25 commits into from
Closed

Conversation

Yaxuan-w
Copy link
Member

@Yaxuan-w Yaxuan-w commented Nov 4, 2024

Description

Issue # 45

Type of change

Porting @pranav-bhatt and @ruchjoshi-nyu implementation of VMMAP to RawPOSIX. Request review for porting location and if more comments are needed.

I also added few comments to make things more clear. That would be great if @ruchjoshi-nyu could review their correctness.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Checklist:

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been added to a pull request and/or merged in other modules (native-client, lind-glibc, lind-project)

@@ -0,0 +1,32 @@
pub const PROT_READ: i32 = 0x1; /* Page can be read. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a lot of these are constants that will be used elsewhere, we probably want to figure out how to structure constants in general for RawPOSIX going forward

FileDescriptor(u64), // stores file descriptor addr
}

/// In the old native client based vmmap, we relied on the fd, shmid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where this comment pertains to.

pub removed: bool, /* flag set in fn Update(); */
pub file_offset: i64, /* offset into desc */
pub file_size: i64, /* backing store size */
pub cage_id: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a note here for why we store the cageid here would be good. I believe its because we need it for when we need to check file protections for a memory backing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is fine for now, but in the future, it may be that there are multiple cage_ids able to simultaneously access a part of memory. Perhaps add a comment indicating this may need to be rethought later, unless there is a simple structural change which would make sense in this case...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the cage_id to refer to for file backings. That probably should be noted like that.

If we wanted multiple cages to access the same memory, we'd probably have to keep the cage id per file backed memory, though doing something like that would probably require rethinking the whole vmmap.

};
}

fn max_prot(&self) -> i32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the function to find the max protection from a file backed memory region, and is currently incomplete? @pranav-bhatt @ruchjoshi-nyu can you confirm?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe it is incomplete

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change this function name to get_max_prot()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruchjoshi-nyu change args to (self, cageid: u64, fd, u64)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yaxuan-w can you change the contents of this function to call into fdtables with the cageid and fd, and return the file protection of that fd.

Copy link

@ruchjoshi-nyu ruchjoshi-nyu Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Args and function definition changed

// memory that has no memory backing
// things that are backed by fd -> represented by -1

// Leave todo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe this is? I'm not sure what the difference is here?

fn debug() {}
}

impl VmmapOps for Vmmap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of the functions here need comments explaining generally what they do, what the arguments are, and what they return

Copy link
Contributor

@rennergade rennergade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a great start. We need more comments on the individual functions and a bit more clarity.

@ruchjoshi-nyu Can you help with this? Should be a priority.

@rennergade
Copy link
Contributor

@qianxichen233 Can you give this review as well? mostly pointing out what needs clarification for now


fn find_map_space(&self, num_pages: u32, pages_per_map: u32) -> Option<Interval<u32>> {
let start = self.first_entry();
let end = self.last_entry();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is using the existing entry in vmmap to determine the current avaliable memory range. How should the memory initialization supposed to happen then? for example, if I want to tell vmmap that 0-2^20 pages are valid page and you should find space within the range, do I need to insert two empty entries (i.e. 0 num_pages entry) with page_num set to 0 and 2^20 respectively?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pranav-bhatt this is a question were wondering about

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok it seems like we need to change this to initialize in some way.

None
}

fn find_map_space(&self, num_pages: u32, pages_per_map: u32) -> Option<Interval<u32>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is pages_per_map doing here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to align the start and end of potential memory gaps to specific boundaries, to make sure that the found space is a multiple of pages_per_map and that it is large enough to accommodate the rounded-up num_pages

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder in which situation we want to find a map space that is multiple of pages_per_map(>1) pages, instead of just multiple of 1 page. Is this something also in nacl? @rennergade

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pages_per_map probably should be a constant for the min page multiple size we can use for maps in wasm


let gap_size = aligned_end_page - aligned_start_page;
if gap_size >= rounded_num_pages {
return Some(ie(aligned_end_page - rounded_num_pages, aligned_end_page));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why return the page from the end instead of the start? For sorting performance?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need to change this to return from start, need to figure out how to configure intervals

// pages x to y, y included
ie(
vmmap_entry_ref.page_num,
vmmap_entry_ref.page_num + vmmap_entry_ref.npages,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

passing an vmmap entry with npages equals to 0 would cause panic here. If npages is not supposed to be 0, then there should be a check at vmmap entry creation or here to prevent from this.

@ruchjoshi-nyu
Copy link

Thanks for the comments, I'll get on it

};
}

// Placeholder method to get maximum protection (currrently incomplete)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be great to have a more detailed comments here, like what kind of feature it suppose to achieve, what currently finished, and what's waiting to do

None
}

// Method to find space above a hint
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a more detailed description of this function

None
}

// Method to find map space with a hint, rounding page numbers up to the nearest multiple of pages_per_map
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a more descriptive comment

page_num: u32,
npages: u32,
prot: i32,
maxprot: i32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just set this on create since it shouldnt be able to be modified after

page_num,
npages,
prot,
maxprot,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so instead of maxprot being a parameter here it should be like

if backing == file { maxprot = getmaxprot()

@@ -1,7 +1,3 @@
// Author: Nicholas Renner
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move all the removing commented out code in these files to a different pr?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/safeposix/vmmap.rs Outdated Show resolved Hide resolved
src/safeposix/vmmap.rs Outdated Show resolved Hide resolved
if start == None || end == None {
return None;
} else {
let start_unwrapped = start.unwrap().0.start();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to walk through the logic here, should be from end of entry to start of next, but also need to configure 0 and 4GB as endpoints

// this is effectively whatever mode the file was opened with
// we need this because we shouldnt be able to change filed backed mappings
// to have protections exceeding that of the file
fn get_max_prot(&self, cage_id: u64, virtual_fd: u64 -> i32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this function, we need to change the rest to use it correctly.

@rennergade
Copy link
Contributor

Ported to monorepo

@rennergade rennergade closed this Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants