Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add blockchain sharding #169

Merged
merged 11 commits into from
Feb 24, 2018
Merged

Add blockchain sharding #169

merged 11 commits into from
Feb 24, 2018

Conversation

naterush
Copy link
Collaborator

@naterush naterush commented Feb 14, 2018

A working binary blockchain sharded data structure (what a mouthful). Currently, all validators see all shards (so we don't really get any scalability gains), but I think this is fine for a first version. The number of shards is currently limited to the number of validators (an artifact of how initial message creation works).

@naterush naterush requested a review from djrtwo February 14, 2018 02:46
@naterush naterush mentioned this pull request Feb 14, 2018
djrtwo
djrtwo previously requested changes Feb 15, 2018
Copy link
Collaborator

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes/comments.

Still need to review forkchoice and view. Coming up!

def is_valid_estimate(cls, estimate):
if not isinstance(estimate, dict):
return False
if not isinstance(estimate['prev_blocks'], set):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

estimate['prev_blocks'] will throw a KeyError if 'prev_blocks' is not set. Should probably check for inclusion of keys and then if the value is in fact a set. It's probably more likely (or at least just as likely) that the key is forgotten as the value being incorrect.

if block.on_shard(shard_id):
return block

raise KeyError("Should have found previous block on shard!")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to add why we should have found a previous block:

"shard_id in estimate['shard_ids']. Should have found previous block on shard!"

raise KeyError("No previous block on that shard")

for block in self.estimate['prev_blocks']:
if block is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a comment on what is going on here.


@property
def is_merge_block(self):
return len(self.estimate['shard_ids']) == 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we're in binary sharding so we are checking for == rather than >=, but if we aren't checking for >= here, we should probably enforce len(estimate['shard_ids']) in is_valid_estimate.

Maybe consider adding a CONSTANT to the class that enforces this max: MAX_MERGE_BLOCK_SIZE or something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add such a constant then this method would be checking that the len in range(2, MAX_MERGE_BLOCK_SIZE+1)

@@ -0,0 +1,79 @@
"""The forkchoice module implements the estimator function a blockchain"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"sharded blockchain"

@@ -0,0 +1,105 @@
"""The blockchain plot tool implements functions for plotting blockchain data structures"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"sharded blockchain"

curr_shard_idx = 0
curr_shard_ids = ['']

"""Shard ID's look like this:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sick comment 😸

return cls.shard_genesis_blocks['']

@classmethod
def get_new_shard_id(cls):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my experience, I see these functions often named with "next" so get_next_shard_id. That said... follow your heart.

shard_id = cls.get_new_shard_id()

estimate = {'prev_blocks': set([None]), 'shard_ids': set([shard_id])}
cls.shard_genesis_blocks[''] = Block(estimate, dict(), validator, -1, 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on right here? Why do we use '' as the key each time? Ultimately we are just storing the last initial_message and all the others are overwritten, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Yeah, meant to save the blocks (in case we don't want to number of shards to be equal to the number of validators :) ).


@classmethod
def initial_message(cls, validator):
"""Returns a dict from shard_id -> shard genesis block"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is incorrect. Returns a Block

djrtwo
djrtwo previously requested changes Feb 24, 2018
Copy link
Collaborator

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments. Looks good in general.

gotta figure out testing language and testing for this


def select_random_shards(self, shards_forkchoice):
"""Randomly selects a shard to build on, and sometimes selects another child shard"""
shards_to_build_on = [r.choice([key for key in self.starting_blocks.keys()])]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[r.choice(list(self.starting_blocks.keys()))] slightly less verbose but not sure what reads better..

def select_random_shards(self, shards_forkchoice):
"""Randomly selects a shard to build on, and sometimes selects another child shard"""
shards_to_build_on = [r.choice([key for key in self.starting_blocks.keys()])]
if (r.randint(0, 1) == 1):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like random.choice([True, False]). very readable.

"""Randomly selects a shard to build on, and sometimes selects another child shard"""
shards_to_build_on = [r.choice([key for key in self.starting_blocks.keys()])]
if (r.randint(0, 1) == 1):
child = str(r.randint(0, 1))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare child_shard_id = shards_to_build_on[0] + child so you don't have to dynamically spell it out twice below and for clarity.

self.check_forkchoice_atomicity(shards_forkchoice)

shards_to_build_on = self.select_shards(shards_forkchoice)
return {'prev_blocks': {shards_forkchoice[shard_id] for shard_id in shards_to_build_on},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we only returning the prev_blocks for the shards_to_build_on? Might a user want to see the entire shards_forkchoice as that is the full current estimate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is the previous block links, we only point to blocks that we are explicitly building on top of w/ this new block (at most, two shards as of now).

We can add other select_shards rules in the future to modify this slightly, but in all cases I think the prev_blocks themselves will just be those of the shards selected.


return set(shards_to_build_on)

def check_forkchoice_atomicity(self, shards_forkchoice):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay here right now, but might be more appropriate in some sort of testing utils when we add testing for sharding.

def update_safe_estimates(self, validator_set):
"""Checks safety on messages in views forkchoice, and updates last_finalized_block"""
# check the safety of the top shard!
pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, does oracle not work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Doesn't work yet. Have to do a bit of thinking on how to fix this up (don't think it should be bad).


tip = None

while tip and tip != self.starting_blocks['']:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is '' the key?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the root shard id. I'll define it as a constant so it's more clear.

Copy link
Collaborator Author

@naterush naterush Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, just going to remove unused code for now. Will add this constant when I figure out safety stuff.


max_score = max(scores.values())

assert max_score != 0, "max_score of a block should never be zero"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe max_score > 0, "max_score should be greater than zero"

@naterush
Copy link
Collaborator Author

Review has been addressed, thanks @djrtwo. Opened an issue to fix safety oracle on this here: #172. Merging!

@naterush naterush merged commit 5529bee into develop Feb 24, 2018
@naterush naterush deleted the feat/sharding branch February 24, 2018 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants