Drop immediates array #618

Merged
merged 8 commits into from
Nov 6, 2020

Collaborator

@chfast chfast commented Oct 21, 2020

No description provided.

codecov bot commented Oct 22, 2020

Codecov Report

Merging #618 into master will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #618      +/-   ##
==========================================
- Coverage   98.37%   98.36%   -0.01%     
==========================================
  Files          69       69              
  Lines        9654     9622      -32     
==========================================
- Hits         9497     9465      -32     
  Misses        157      157              

@axic axic changed the base branch from master to code_as_bytes October 22, 2020 14:06
lib/fizzy/execute.cpp (outdated, resolved)
@chfast chfast force-pushed the immediates_merge branch 3 times, most recently from b162348 to 656e46d Compare October 22, 2020 18:14
@chfast chfast force-pushed the immediates_merge branch 2 times, most recently from 923bd69 to ad8b7d3 Compare October 22, 2020 19:12
@chfast chfast marked this pull request as ready for review October 22, 2020 19:22
@chfast chfast requested review from gumb0 and axic October 22, 2020 19:22
@@ -736,8 +730,9 @@ parser_result<Code> parse_expr(const uint8_t* pos, const uint8_t* end, FuncIdx f

drop_operand(frame, operand_stack, find_local_type(func_inputs, locals, local_idx));

- push(code.immediates, local_idx);
- break;
+ code.instructions.push_back(opcode);
Member

Why use push_back for these and not push(code.instructions, opcode)?

Collaborator Author

This is a copy of the line after the switch.

But I can reverse the question: why use push(code.instructions, opcode) for these when code.instructions.push_back(opcode) is enough?

Collaborator

Maybe push should be renamed to push_immediate, and optionally a new helper push_opcode added.

Member

> Maybe push should be renamed to push_immediate, and optionally a new helper push_opcode added

That would be nice for readability, in a separate PR.
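
As a rough illustration of the helpers suggested in this thread, the sketch below separates appending an opcode from appending an immediate. The names `push_opcode` and `push_immediate` come from the suggestion; their signatures, the use of a plain `uint8_t` for the opcode, and the host-byte-order encoding are assumptions made for the example, not Fizzy's actual code.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: appends a single opcode byte to the instruction stream.
inline void push_opcode(std::vector<uint8_t>& code, uint8_t opcode)
{
    code.push_back(opcode);
}

// Hypothetical helper: appends the raw bytes of an immediate value
// (in host byte order) to the instruction stream.
template <typename T>
inline void push_immediate(std::vector<uint8_t>& code, T value)
{
    static_assert(std::is_trivially_copyable<T>::value, "immediate must be a plain value");
    uint8_t bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    code.insert(code.end(), bytes, bytes + sizeof(T));
}
```

With helpers like these, a call site would read `push_opcode(code, opcode); push_immediate(code, local_idx);`, which makes the opcode-then-immediate order visible at a glance.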

- break;
+ code.instructions.push_back(opcode);
+ push(code.instructions, uint32_t{0}); // Diff to the else instruction
+ continue;
Member

Why are many of these changed to continue?

Collaborator Author

The continue skips the code.instructions.push_back(opcode); after the switch because you do it here before pushing the immediates.

Collaborator

As opcode is always the first to be pushed, would it be better to move code.instructions.push_back(opcode); to before the switch?

Collaborator Author

Possible currently, but I'm not sure this will stand long term. In #622 I present a way to skip noop instructions in parsing. Also, we can replace return with br, etc. So I think control over which opcode is pushed, if any, at per-instruction granularity is useful.

Alternatively, we can push the opcode and then "pop" it or replace it in some cases.
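
The control flow discussed in this thread can be sketched with a toy translation loop. Instructions that carry immediates push the opcode themselves and `continue`, a no-op pushes nothing, and everything else reaches the common push after the switch via `break`. The function `translate`, the opcode constants, and the one-byte local index are assumptions for the example (real Wasm encodes indices as LEB128), so this is only an outline of the pattern, not the parser from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Opcode values from the Wasm binary format; the constant names are made up for this sketch.
constexpr uint8_t op_nop = 0x01;
constexpr uint8_t op_local_get = 0x20;

// Toy translator (hypothetical): assumes local.get is followed by a one-byte local index.
std::vector<uint8_t> translate(const std::vector<uint8_t>& input)
{
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const uint8_t opcode = input[i];
        switch (opcode)
        {
        case op_local_get:
        {
            if (i + 1 >= input.size())
                return out;  // Truncated input: stop translating.
            const uint32_t local_idx = input[++i];
            uint8_t imm[sizeof(local_idx)];
            std::memcpy(imm, &local_idx, sizeof(local_idx));
            out.push_back(opcode);                          // Opcode first,
            out.insert(out.end(), imm, imm + sizeof(imm));  // then its immediate.
            continue;  // Skip the common push after the switch.
        }
        case op_nop:
            continue;  // Emit nothing for a no-op.
        default:
            break;  // No immediates: use the common push below.
        }
        out.push_back(opcode);
    }
    return out;
}
```

Keeping the push inside each case costs a repeated line, but it lets a single case decide to emit a different opcode or none at all.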

Base automatically changed from code_as_bytes to master October 22, 2020 21:03
test/unittests/module_test.cpp (resolved)
ElementsAre(Instr::i32_const, 0, 0, 0, 0, Instr::loop, Instr::br, /*arity:*/ 0, 0, 0, 0,
/*code_offset:*/ 5, 0, 0, 0, /*stack_drop:*/ 0, 0, 0, 0, Instr::end, Instr::drop,
Instr::end));
// EXPECT_EQ(module_parent_stack->codesec[0].immediates,
Collaborator Author

Should be removed.
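
The expected byte sequence in this test interleaves each opcode with its immediates, so a reader of the stream has to decode them in place. A minimal sketch of such decoding for `br` follows. The field order (arity, code offset, stack drop, four bytes each) is taken from the annotations in the test expectation; `read_imm`, `BranchImmediates`, and `read_branch_immediates` are hypothetical names, and the host-byte-order `memcpy` is an assumption.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a value of type T at pc and advances pc past it.
template <typename T>
inline T read_imm(const uint8_t*& pc)
{
    T value;
    std::memcpy(&value, pc, sizeof(T));
    pc += sizeof(T);
    return value;
}

// Immediates of a br instruction, in the order the test annotations list them.
struct BranchImmediates
{
    uint32_t arity;
    uint32_t code_offset;
    uint32_t stack_drop;
};

// Decodes the three br immediates that follow the opcode byte.
inline BranchImmediates read_branch_immediates(const uint8_t*& pc)
{
    BranchImmediates imm;
    imm.arity = read_imm<uint32_t>(pc);
    imm.code_offset = read_imm<uint32_t>(pc);
    imm.stack_drop = read_imm<uint32_t>(pc);
    return imm;
}
```

After the call, `pc` points at the byte following the last immediate, which is where the next opcode starts.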

const auto frame_type = frame.type;
auto frame_br_immediate_offsets = std::move(frame.br_immediate_offsets);

control_stack.pop();
control_stack.emplace(Instr::else_, frame_type, static_cast<int>(operand_stack.size()),
- code.instructions.size(), code.immediates.size());
+ code.instructions.size());
// br immediates from `then` branch will need to be filled at the end of `else`
control_stack.top().br_immediate_offsets = std::move(frame_br_immediate_offsets);

// Placeholders for immediate values, filled at the matching end instructions.
Collaborator

Move this comment one line down, to before pushing the immediate.

Suggested change
- // Placeholders for immediate values, filled at the matching end instructions.
+ // Placeholder for immediate value, filled at the matching end instructions.

- push(code.immediates, uint32_t{0}); // Diff to the end instruction.
- push(code.immediates, uint32_t{0}); // Diff for the immediates
+ code.instructions.push_back(opcode);
+ push(code.instructions, uint32_t{0}); // Diff to the end instruction.

// Fill in if's immediates with offsets of first instruction in else block.
Collaborator

Suggested change
- // Fill in if's immediates with offsets of first instruction in else block.
+ // Fill in if's immediate with offset of first instruction in else block.

@@ -541,26 +534,21 @@ parser_result<Code> parse_expr(const uint8_t* pos, const uint8_t* end, FuncIdx f
const auto target_pc = control_stack.size() == 1 ?
static_cast<uint32_t>(code.instructions.size()) :
static_cast<uint32_t>(code.instructions.size() + 1);
- const auto target_imm = static_cast<uint32_t>(code.immediates.size());

if (frame.instruction == Instr::if_ || frame.instruction == Instr::else_)
{
// We're at the end instruction of the if block without else or at the end of
// else block. Fill in if/else's immediates with offsets of first instruction
Collaborator

Suggested change
- // else block. Fill in if/else's immediates with offsets of first instruction
+ // else block. Fill in if/else's immediate with offset of first instruction
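
The placeholder technique these comments describe can be sketched as two small steps: emit the opcode with a zeroed immediate and remember where it sits, then overwrite it once the target offset is known. The helper names `emit_with_placeholder` and `patch_placeholder` are hypothetical, and the four-byte host-order immediate is an assumption based on the `uint32_t{0}` placeholders in the diff.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: appends the opcode plus a zeroed 4-byte placeholder
// and returns the placeholder's offset in the instruction stream.
inline std::size_t emit_with_placeholder(std::vector<uint8_t>& code, uint8_t opcode)
{
    code.push_back(opcode);
    const std::size_t offset = code.size();
    code.insert(code.end(), sizeof(uint32_t), uint8_t{0});
    return offset;
}

// Hypothetical helper: overwrites a previously emitted placeholder
// with the final target offset.
inline void patch_placeholder(std::vector<uint8_t>& code, std::size_t offset, uint32_t target)
{
    std::memcpy(&code[offset], &target, sizeof(target));
}
```

A parser would store the returned offset in the control frame when it sees `if` and call `patch_placeholder` when it reaches the matching `else` or `end`.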

Collaborator Author

chfast commented Oct 23, 2020

My usual benchmarks: GCC 10 with LTO on Haswell at 4 GHz.

Parsing got slower. There can be two reasons for that: we allocate differently (a single array), and the parser loop is more complex.

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/parse/blake2b_mean                                           +0.1487         +0.1487            23            26            23            26
fizzy/instantiate/blake2b_mean                                     +0.1266         +0.1266            27            30            27            30
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      -0.2026         -0.2026            87            69            87            69
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     -0.2115         -0.2115          1317          1038          1317          1038
fizzy/parse/ecpairing_mean                                         +0.1372         +0.1372          1346          1531          1346          1531
fizzy/instantiate/ecpairing_mean                                   +0.1333         +0.1333          1401          1588          1401          1588
fizzy/execute/ecpairing/onepoint_mean                              -0.1711         -0.1711        417398        345998        417403        346001
fizzy/parse/keccak256_mean                                         +0.1362         +0.1362            42            48            42            48
fizzy/instantiate/keccak256_mean                                   +0.1084         +0.1084            47            52            47            52
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    -0.3545         -0.3545           110            71           110            71
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   -0.3565         -0.3565          1615          1039          1615          1039
fizzy/parse/memset_mean                                            +0.1195         +0.1195             6             6             6             6
fizzy/instantiate/memset_mean                                      +0.0739         +0.0739             9            10             9            10
fizzy/execute/memset/256_bytes_mean                                -0.1532         -0.1532             7             6             7             6
fizzy/execute/memset/60000_bytes_mean                              -0.1578         -0.1578          1623          1367          1623          1367
fizzy/parse/mul256_opt0_mean                                       +0.1160         +0.1160             8             9             8             9
fizzy/instantiate/mul256_opt0_mean                                 +0.0852         +0.0852            11            12            11            12
fizzy/execute/mul256_opt0/input1_mean                              -0.1456         -0.1456            29            25            29            25
fizzy/parse/ramanujan_pi_mean                                      +0.1479         +0.1479            24            27            24            27
fizzy/instantiate/ramanujan_pi_mean                                +0.1245         +0.1245            28            31            28            31
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.2710         -0.2710           140           102           140           102
fizzy/parse/sha1_mean                                              +0.1469         +0.1468            38            44            38            44
fizzy/instantiate/sha1_mean                                        +0.1419         +0.1418            42            48            42            48
fizzy/execute/sha1/512_bytes_rounds_1_mean                         -0.1973         -0.1973            99            79            99            79
fizzy/execute/sha1/512_bytes_rounds_16_mean                        -0.1984         -0.1984          1377          1104          1377          1104
fizzy/parse/sha256_mean                                            +0.1284         +0.1284            64            72            64            72
fizzy/instantiate/sha256_mean                                      +0.1238         +0.1238            68            76            68            76
fizzy/execute/sha256/512_bytes_rounds_1_mean                       -0.1614         -0.1614            92            77            92            77
fizzy/execute/sha256/512_bytes_rounds_16_mean                      -0.1619         -0.1619          1261          1057          1261          1057
fizzy/parse/taylor_pi_mean                                         +0.0225         +0.0225             3             3             3             3
fizzy/instantiate/taylor_pi_mean                                   +0.0052         +0.0052             6             6             6             6
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       -0.1427         -0.1427         42999         36865         43000         36865
fizzy/parse/micro/eli_interpreter_mean                             +0.0633         +0.0633             4             4             4             4
fizzy/instantiate/micro/eli_interpreter_mean                       +0.0282         +0.0282             8             8             8             8
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.0683         -0.0683             5             5             5             5
fizzy/parse/micro/factorial_mean                                   -0.0563         -0.0563             1             1             1             1
fizzy/instantiate/micro/factorial_mean                             -0.0198         -0.0198             1             1             1             1
fizzy/execute/micro/factorial/20_mean                              -0.0286         -0.0286             1             1             1             1
fizzy/parse/micro/fibonacci_mean                                   -0.0530         -0.0530             1             1             1             1
fizzy/instantiate/micro/fibonacci_mean                             -0.0207         -0.0207             1             1             1             1
fizzy/execute/micro/fibonacci/24_mean                              -0.1108         -0.1108          5312          4723          5312          4723
fizzy/parse/micro/host_adler32_mean                                -0.0161         -0.0161             2             2             2             2
fizzy/instantiate/micro/host_adler32_mean                          -0.0009         -0.0009             4             4             4             4
fizzy/execute/micro/host_adler32/1_mean                            +0.0013         +0.0013             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         -0.0390         -0.0390            30            29            30            29
fizzy/parse/micro/spinner_mean                                     -0.0311         -0.0311             1             1             1             1
fizzy/instantiate/micro/spinner_mean                               -0.0030         -0.0030             1             1             1             1
fizzy/execute/micro/spinner/1_mean                                 -0.0448         -0.0448             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.1880         -0.1880            11             9            11             9
fizzy/parse/stress/guido-fuzzer-find-1_mean                        +0.1356         +0.1356           121           138           121           138
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  +0.1057         +0.1057           151           167           151           167

@@ -363,10 +363,6 @@ struct Code
// The instructions bytecode without immediate values.
// https://webassembly.github.io/spec/core/binary/instructions.html
std::vector<uint8_t> instructions;
Collaborator Author

I also tried using bytes, but the performance is the same, so I'm happy to keep vector because it is simpler internally (string has SSO, which is useless in this case).

lib/fizzy/execute.cpp (outdated, resolved)
@@ -872,7 +872,7 @@ TEST(execute_control, if_else_smoke)

const auto module = parse(bin);

- for (const auto param : {0u, 1u})
+ for (const auto param : {0u})
Collaborator

Is this still failing?

"00000000"_bytes; // stack_drop

EXPECT_EQ(code.immediates.substr(br_table_imm_offset, expected_br_imm.size()), expected_br_imm);
Instr::local_get, 0, 0, 0, 0, Instr::br_table,
Collaborator

These tests are super annoying; I'm not checking all the offsets.

Collaborator Author

chfast commented Oct 26, 2020

AMD EPYC 7601, GCC10, no LTO

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/parse/blake2b_mean                                           +0.0281         +0.0280            52            53            52            53
fizzy/instantiate/blake2b_mean                                     +0.0181         +0.0184            57            58            57            58
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      -0.2144         -0.2145           264           207           264           207
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     -0.2191         -0.2191          3998          3122          3997          3122
fizzy/parse/ecpairing_mean                                         -0.0274         -0.0275          2848          2770          2848          2770
fizzy/instantiate/ecpairing_mean                                   -0.0345         -0.0347          2987          2884          2987          2883
fizzy/execute/ecpairing/onepoint_mean                              -0.1447         -0.1448       1413175       1208678       1412978       1208398
fizzy/parse/keccak256_mean                                         +0.0006         +0.0006            94            94            94            94
fizzy/instantiate/keccak256_mean                                   -0.0070         -0.0071           100           100           100           100
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    -0.1008         -0.1008           295           265           295           265
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   -0.1076         -0.1074          4355          3887          4354          3886
fizzy/parse/memset_mean                                            -0.0111         -0.0112            13            13            13            13
fizzy/instantiate/memset_mean                                      -0.0225         -0.0225            19            18            19            18
fizzy/execute/memset/256_bytes_mean                                -0.0046         -0.0047            21            21            21            21
fizzy/execute/memset/60000_bytes_mean                              +0.0164         +0.0165          4467          4541          4467          4540
fizzy/parse/mul256_opt0_mean                                       -0.0229         -0.0229            17            17            17            17
fizzy/instantiate/mul256_opt0_mean                                 -0.0285         -0.0287            23            22            23            22
fizzy/execute/mul256_opt0/input1_mean                              -0.1718         -0.1718           108            89           108            89
fizzy/parse/ramanujan_pi_mean                                      +0.0202         +0.0202            54            55            54            55
fizzy/instantiate/ramanujan_pi_mean                                +0.0071         +0.0070            59            60            59            60
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.0227         -0.0228           452           441           452           441
fizzy/parse/sha1_mean                                              +0.0044         +0.0043            86            86            86            86
fizzy/instantiate/sha1_mean                                        -0.0021         -0.0021            92            91            91            91
fizzy/execute/sha1/512_bytes_rounds_1_mean                         -0.2537         -0.2538           307           229           307           229
fizzy/execute/sha1/512_bytes_rounds_16_mean                        -0.2565         -0.2565          4295          3194          4295          3193
fizzy/parse/sha256_mean                                            +0.0069         +0.0067           142           143           142           143
fizzy/instantiate/sha256_mean                                      +0.0012         +0.0012           148           149           148           149
fizzy/execute/sha256/512_bytes_rounds_1_mean                       -0.2438         -0.2439           375           284           375           284
fizzy/execute/sha256/512_bytes_rounds_16_mean                      -0.2472         -0.2472          5253          3954          5253          3954
fizzy/parse/taylor_pi_mean                                         -0.0294         -0.0293             6             5             6             5
fizzy/instantiate/taylor_pi_mean                                   -0.0352         -0.0354            11            11            11            11
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       -0.1057         -0.1057        111817         99999        111806         99991
fizzy/parse/micro/eli_interpreter_mean                             -0.0228         -0.0230             8             8             8             8
fizzy/instantiate/micro/eli_interpreter_mean                       -0.0263         -0.0263            14            13            14            13
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.2182         -0.2183            17            13            17            13
fizzy/parse/micro/factorial_mean                                   -0.0737         -0.0737             2             2             2             2
fizzy/instantiate/micro/factorial_mean                             -0.0577         -0.0578             2             2             2             2
fizzy/execute/micro/factorial/20_mean                              -0.0502         -0.0501             1             1             1             1
fizzy/parse/micro/fibonacci_mean                                   -0.0809         -0.0811             3             2             3             2
fizzy/instantiate/micro/fibonacci_mean                             -0.0665         -0.0665             3             3             3             3
fizzy/execute/micro/fibonacci/24_mean                              -0.1922         -0.1924         12775         10319         12774         10317
fizzy/parse/micro/host_adler32_mean                                -0.0090         -0.0090             3             3             3             3
fizzy/instantiate/micro/host_adler32_mean                          -0.0057         -0.0059             6             6             6             6
fizzy/execute/micro/host_adler32/1_mean                            +0.0028         +0.0029             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         +0.0442         +0.0440            59            61            59            61
fizzy/parse/micro/spinner_mean                                     -0.0249         -0.0249             2             2             2             2
fizzy/instantiate/micro/spinner_mean                               -0.0100         -0.0102             2             2             2             2
fizzy/execute/micro/spinner/1_mean                                 -0.1111         -0.1111             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.1081         -0.1082            19            17            19            17
fizzy/parse/stress/guido-fuzzer-find-1_mean                        +0.0382         +0.0382           237           246           237           246
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  +0.0291         +0.0293           277           285           277           285

Collaborator Author

chfast commented Nov 5, 2020

Haswell 4 GHz, GCC10 LTO

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/parse/blake2b_mean                                           +0.1393         +0.1393            23            27            23            27
fizzy/instantiate/blake2b_mean                                     +0.1521         +0.1521            27            31            27            31
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      -0.2172         -0.2172            96            75            96            75
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     -0.2329         -0.2329          1442          1106          1442          1106
fizzy/parse/ecpairing_mean                                         +0.1382         +0.1382          1365          1554          1365          1554
fizzy/instantiate/ecpairing_mean                                   +0.1382         +0.1382          1414          1609          1414          1609
fizzy/execute/ecpairing/onepoint_mean                              -0.1308         -0.1308        393749        342233        393752        342237
fizzy/parse/keccak256_mean                                         +0.1440         +0.1440            42            48            42            48
fizzy/instantiate/keccak256_mean                                   +0.1382         +0.1382            46            53            46            53
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    -0.2166         -0.2166           124            97           124            97
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   -0.1182         -0.1182          1523          1343          1523          1343
fizzy/parse/memset_mean                                            +0.1079         +0.1079             6             6             6             6
fizzy/instantiate/memset_mean                                      +0.0620         +0.0620            10            10            10            10
fizzy/execute/memset/256_bytes_mean                                +0.0260         +0.0260             6             6             6             6
fizzy/execute/memset/60000_bytes_mean                              +0.0454         +0.0454          1279          1337          1279          1337
fizzy/parse/mul256_opt0_mean                                       +0.1282         +0.1282             8             9             8             9
fizzy/instantiate/mul256_opt0_mean                                 +0.0924         +0.0924            11            13            11            13
fizzy/execute/mul256_opt0/input1_mean                              +0.0217         +0.0217            23            24            23            24
fizzy/parse/ramanujan_pi_mean                                      +0.1396         +0.1396            24            27            24            27
fizzy/instantiate/ramanujan_pi_mean                                +0.1140         +0.1140            28            31            28            31
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.2339         -0.2339           133           102           133           102
fizzy/parse/sha1_mean                                              +0.1549         +0.1549            38            44            38            44
fizzy/instantiate/sha1_mean                                        +0.1443         +0.1443            42            48            42            48
fizzy/execute/sha1/512_bytes_rounds_1_mean                         -0.0511         -0.0511            82            78            82            78
fizzy/execute/sha1/512_bytes_rounds_16_mean                        -0.0157         -0.0157          1108          1091          1108          1091
fizzy/parse/sha256_mean                                            +0.1346         +0.1346            64            73            64            73
fizzy/instantiate/sha256_mean                                      +0.1237         +0.1237            69            77            69            77
fizzy/execute/sha256/512_bytes_rounds_1_mean                       -0.0887         -0.0887            84            77            84            77
fizzy/execute/sha256/512_bytes_rounds_16_mean                      -0.0845         -0.0845          1154          1057          1154          1057
fizzy/parse/taylor_pi_mean                                         +0.0308         +0.0308             3             3             3             3
fizzy/instantiate/taylor_pi_mean                                   +0.0065         +0.0065             6             6             6             6
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       -0.0682         -0.0682         39452         36759         39452         36760
fizzy/parse/micro/eli_interpreter_mean                             +0.0601         +0.0601             4             4             4             4
fizzy/instantiate/micro/eli_interpreter_mean                       +0.0318         +0.0318             8             8             8             8
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.0518         -0.0518             4             4             4             4
fizzy/parse/micro/factorial_mean                                   -0.0455         -0.0455             1             1             1             1
fizzy/instantiate/micro/factorial_mean                             -0.0548         -0.0548             1             1             1             1
fizzy/execute/micro/factorial/20_mean                              +0.0257         +0.0257             1             1             1             1
fizzy/parse/micro/fibonacci_mean                                   -0.0455         -0.0455             1             1             1             1
fizzy/instantiate/micro/fibonacci_mean                             -0.0372         -0.0372             1             1             1             1
fizzy/execute/micro/fibonacci/24_mean                              -0.0292         -0.0292          4872          4730          4872          4730
fizzy/parse/micro/host_adler32_mean                                +0.0015         +0.0015             2             2             2             2
fizzy/instantiate/micro/host_adler32_mean                          -0.0153         -0.0153             4             4             4             4
fizzy/execute/micro/host_adler32/1_mean                            +0.0080         +0.0080             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         -0.0666         -0.0666            30            28            30            28
fizzy/parse/micro/icall_hash_mean                                  -0.0215         -0.0215             3             3             3             3
fizzy/instantiate/micro/icall_hash_mean                            -0.0064         -0.0064             7             7             7             7
fizzy/execute/micro/icall_hash/1000_steps_mean                     -0.0085         -0.0085            63            63            63            63
fizzy/parse/micro/spinner_mean                                     -0.0331         -0.0331             1             1             1             1
fizzy/instantiate/micro/spinner_mean                               -0.0207         -0.0207             1             1             1             1
fizzy/execute/micro/spinner/1_mean                                 +0.0155         +0.0155             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.1676         -0.1676             9             7             9             7
fizzy/parse/stress/guido-fuzzer-find-1_mean                        +0.1534         +0.1534           122           141           122           141
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  +0.1220         +0.1220           152           170           152           170

@chfast
Collaborator Author

chfast commented Nov 5, 2020

AMD EPYC 7501 2 GHz, GCC10 no-LTO

fizzy/parse/blake2b_mean                                           +0.0336         +0.0337            54            56            54            56
fizzy/instantiate/blake2b_mean                                     +0.1171         +0.1171            60            68            60            68
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      -0.2213         -0.2214           294           229           294           229
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     -0.2296         -0.2296          4491          3460          4490          3459
fizzy/parse/ecpairing_mean                                         +0.0435         +0.0433          2824          2946          2823          2945
fizzy/instantiate/ecpairing_mean                                   +0.0554         +0.0546          3014          3181          3014          3178
fizzy/execute/ecpairing/onepoint_mean                              -0.1832         -0.1831       1491337       1218100       1490912       1217920
fizzy/parse/keccak256_mean                                         +0.0294         +0.0294            98           101            98           101
fizzy/instantiate/keccak256_mean                                   +0.0347         +0.0346           104           108           104           107
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    -0.1294         -0.1293           316           275           316           275
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   -0.1388         -0.1387          4676          4027          4675          4027
fizzy/parse/memset_mean                                            +0.0180         +0.0180            14            14            14            14
fizzy/instantiate/memset_mean                                      +0.0089         +0.0089            20            20            20            20
fizzy/execute/memset/256_bytes_mean                                -0.0374         -0.0374            21            21            21            21
fizzy/execute/memset/60000_bytes_mean                              -0.0128         -0.0131          4651          4592          4651          4590
fizzy/parse/mul256_opt0_mean                                       -0.0173         -0.0174            18            18            18            18
fizzy/instantiate/mul256_opt0_mean                                 +0.0107         +0.0108            24            24            24            24
fizzy/execute/mul256_opt0/input1_mean                              -0.1113         -0.1113           108            96           108            96
fizzy/parse/ramanujan_pi_mean                                      +0.0048         +0.0048            57            57            57            57
fizzy/instantiate/ramanujan_pi_mean                                +0.0050         +0.0050            63            63            63            63
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.0128         -0.0127           456           450           456           450
fizzy/parse/sha1_mean                                              +0.0317         +0.0318            88            91            88            91
fizzy/instantiate/sha1_mean                                        +0.0248         +0.0248            95            97            95            97
fizzy/execute/sha1/512_bytes_rounds_1_mean                         -0.2065         -0.2065           308           244           308           244
fizzy/execute/sha1/512_bytes_rounds_16_mean                        -0.2074         -0.2073          4315          3420          4314          3420
fizzy/parse/sha256_mean                                            +0.0329         +0.0329           146           150           146           150
fizzy/instantiate/sha256_mean                                      +0.0283         +0.0283           152           157           152           157
fizzy/execute/sha256/512_bytes_rounds_1_mean                       -0.3770         -0.3769           406           253           406           253
fizzy/execute/sha256/512_bytes_rounds_16_mean                      -0.3784         -0.3784          5703          3545          5702          3544
fizzy/parse/taylor_pi_mean                                         -0.0025         -0.0025             6             6             6             6
fizzy/instantiate/taylor_pi_mean                                   +0.0154         +0.0153            12            12            12            12
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       +0.3508         +0.3507         92994        125614         92979        125589
fizzy/parse/micro/eli_interpreter_mean                             -0.0128         -0.0128             9             8             9             8
fizzy/instantiate/micro/eli_interpreter_mean                       -0.0026         -0.0029            14            14            14            14
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.1816         -0.1816            16            13            16            13
fizzy/parse/micro/factorial_mean                                   -0.0522         -0.0522             2             2             2             2
fizzy/instantiate/micro/factorial_mean                             -0.0533         -0.0533             3             2             3             2
fizzy/execute/micro/factorial/20_mean                              -0.2704         -0.2704             2             1             2             1
fizzy/parse/micro/fibonacci_mean                                   -0.0490         -0.0490             3             3             3             3
fizzy/instantiate/micro/fibonacci_mean                             -0.0531         -0.0530             3             3             3             3
fizzy/execute/micro/fibonacci/24_mean                              -0.0220         -0.0219         14628         14306         14625         14304
fizzy/parse/micro/host_adler32_mean                                -0.0348         -0.0347             4             4             4             4
fizzy/instantiate/micro/host_adler32_mean                          -0.0313         -0.0312             7             7             7             7
fizzy/execute/micro/host_adler32/1_mean                            -0.0220         -0.0220             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         -0.0358         -0.0357            62            59            62            59
fizzy/parse/micro/icall_hash_mean                                  -0.0622         -0.0621             7             7             7             7
fizzy/instantiate/micro/icall_hash_mean                            -0.0470         -0.0470            13            13            13            13
fizzy/execute/micro/icall_hash/1000_steps_mean                     -0.0960         -0.0958           142           128           142           128
fizzy/parse/micro/spinner_mean                                     -0.0143         -0.0142             2             2             2             2
fizzy/instantiate/micro/spinner_mean                               -0.0232         -0.0231             2             2             2             2
fizzy/execute/micro/spinner/1_mean                                 -0.0320         -0.0319             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.0473         -0.0472            19            18            19            18
fizzy/parse/stress/guido-fuzzer-find-1_mean                        +0.0465         +0.0463           248           260           248           260
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  +0.0291         +0.0295           301           310           301           310
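
Assuming these tables are benchmark comparison output, with the first two columns giving the relative change in time and CPU and the remaining columns giving the old and new absolute values, the first column can be reproduced as (new - old) / old. A minimal sketch, using the taylor_pi execute row above as input; relative_change is a hypothetical helper name, not part of any tool:

```cpp
#include <cmath>

// Hypothetical helper: relative change between an old and a new measurement.
// Negative values mean the new measurement is smaller (faster).
double relative_change(double old_value, double new_value)
{
    return (new_value - old_value) / old_value;
}

// Checks a computed change against the value printed in the table,
// allowing for the rounding of the displayed numbers.
bool matches_table(double old_value, double new_value, double printed)
{
    return std::abs(relative_change(old_value, new_value) - printed) < 1e-3;
}
```

For example, matches_table(92994, 125614, 0.3508) holds for the taylor_pi row. Rows with small absolute values (such as 4 vs 4) cannot be checked this way, because the displayed numbers are rounded.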

@@ -320,33 +313,34 @@ TEST(parser_expr, instr_br_table)
(block
(block
(br_table 3 2 1 0 4 (get_local 0))
(return (i32.const 99))
(return (i32.const 0x41))
Member

Any particular reason these were renumbered? Just to make it easier to read below?

Collaborator Author

I first converted them to hex to make them easier to find in the bytecode, but this no longer seems relevant.


if (!module.has_memory())
throw validation_error{"memory instructions require imported or defined memory"};
-break;
+continue;
Member

Is this continue here jumping to the while body? I did not know you can jump out from within switches with it.

Collaborator Author

This is a bit hackish, but it is also a "minimal" change. A continue only affects loops, whereas a break affects both loops and switches.
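
The difference can be sketched in isolation. The following is an illustrative example only: the Op enum and count_tail_runs are hypothetical names, not Fizzy's parser code. It assumes a loop whose body is a switch followed by a common tail, where some cases want to skip that tail:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical opcodes, for illustration only (not Fizzy's Instr enum).
enum class Op : uint8_t { nop, load, add };

// Counts how often the common tail after the switch runs.
// Cases that break reach the tail; cases that continue skip it.
int count_tail_runs(const std::vector<Op>& ops)
{
    int tail_runs = 0;
    for (const auto op : ops)
    {
        switch (op)
        {
        case Op::load:
            // continue is not captured by the switch: it starts the next
            // iteration of the enclosing loop and skips the tail below.
            continue;
        case Op::nop:
        case Op::add:
            // break only leaves the switch; execution resumes at the tail.
            break;
        }
        ++tail_runs;  // Common tail shared by all cases that break.
    }
    return tail_runs;
}
```

With the input {nop, load, add, load}, only the two non-load opcodes reach the tail.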

@@ -861,11 +861,12 @@ parser_result<Code> parse_expr(const uint8_t* pos, const uint8_t* end, FuncIdx f

uint32_t offset;
std::tie(offset, pos) = leb128u_decode<uint32_t>(pos, end);
push(code.immediates, offset);
code.instructions.push_back(opcode);
Member

Btw, why is the old code (end of loop) using emplace_back(opcode) but the new commits are not?

Collaborator Author

std::string only has .push_back(), so using push_back() makes it easier to try std::string here instead of std::vector.
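
A small sketch of why push_back() keeps the container type swappable: std::basic_string provides push_back() but no emplace_back(), while std::vector provides both. The append_byte helper below is hypothetical, and plain std::string stands in for whatever byte-string type the project would use:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: appends one byte through push_back(), the only
// single-element append that std::vector and std::basic_string share.
template <typename Container>
void append_byte(Container& code, uint8_t byte)
{
    code.push_back(static_cast<typename Container::value_type>(byte));
}

// Returns the byte stored at index i, regardless of the container's
// element type (char for std::string, uint8_t for the vector).
template <typename Container>
uint8_t byte_at(const Container& code, std::size_t i)
{
    return static_cast<uint8_t>(code[i]);
}
```

The same call, append_byte(code, 0x28), then compiles whether code is a std::vector<uint8_t> or a std::string; a version written with emplace_back() would compile only for the vector.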

-ASSERT_EQ(module->codesec[0].immediates.substr(4), "00000000"_bytes); // load offset.
+auto* const load_instr = const_cast<uint8_t*>(&module->codesec[0].instructions[1]);
+ASSERT_EQ(*load_instr, Instr::i32_load);
+ASSERT_EQ(bytes_view(load_instr + 1, 4), "00000000"_bytes); // load offset.
Member

Is this bytes_view alternative better because it is faster, or does it result in a nicer error display?

Collaborator Author

I think it was easier for me to read it like that when modifying the test.

"00000000"_bytes; // stack_drop

-EXPECT_EQ(code.immediates.substr(br_table_imm_offset, expected_br_imm.size()), expected_br_imm);
+EXPECT_EQ(code.immediates.substr(0, expected_br_imm.size()), expected_br_imm);
Member

Why is this 0 now?

Member

Ah the old immediates table here skipped the local for some reason.

Collaborator Author

This test always takes only the br_table part of the immediates. Now this part starts at 0 because we removed the other immediates. I didn't bother to keep the br_table_imm_offset name. In the end, when all immediates are moved to instructions, we check all bytes anyway.
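
The idea of moving immediates into the instruction stream can be sketched as follows. This is an illustrative sketch, not the project's code: store_imm and load_imm are hypothetical names, and the example assumes each immediate is a 32-bit value stored in host byte order directly after its opcode byte.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: appends a 32-bit immediate to the instruction
// bytes, right after the opcode that was pushed before it.
void store_imm(std::vector<uint8_t>& instructions, uint32_t value)
{
    uint8_t storage[sizeof(value)];
    std::memcpy(storage, &value, sizeof(value));  // Host byte order.
    instructions.insert(instructions.end(), storage, storage + sizeof(value));
}

// Hypothetical helper: reads a 32-bit immediate at pc and advances pc
// past it, so pc points at the next opcode afterwards.
uint32_t load_imm(const uint8_t*& pc)
{
    uint32_t value;
    std::memcpy(&value, pc, sizeof(value));  // Safe for unaligned reads.
    pc += sizeof(value);
    return value;
}
```

A parser would push the opcode and then call store_imm(); an interpreter would read the opcode, call load_imm() with its program counter, and continue at the following opcode. With this layout there is no separate immediates array to keep in sync with the instructions.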

@@ -455,7 +455,7 @@ TEST(parser_expr, call_indirect_table_index)
const auto code1_bin = i32_const(0) + "1100000b"_bytes;
const auto [code, pos] = parse_expr(code1_bin, 0, {}, module);
EXPECT_THAT(code.instructions,
-ElementsAre(Instr::i32_const, 0, 0, 0, 0, Instr::call_indirect, Instr::end));
+ElementsAre(Instr::i32_const, 0, 0, 0, 0, Instr::call_indirect, 0, 0, 0, 0, Instr::end));
Member

Is there really a single test affected by call immediates?

Collaborator Author

It looks like it. All intermediate commits should pass all tests. Immediates for calls are rather simple, and at that time we were writing more wat2wasm tests.

@@ -1461,5 +1456,4 @@ TEST(parser, milestone1)
ElementsAre(Instr::local_get, 0, 0, 0, 0, Instr::local_get, 1, 0, 0, 0, Instr::i32_add,
Instr::local_get, 2, 0, 0, 0, Instr::i32_add, Instr::local_tee, 2, 0, 0, 0,
Instr::local_get, 0, 0, 0, 0, Instr::i32_add, Instr::end));
-EXPECT_EQ(c.immediates.size(), 0);
Member

Actually it would have been nice to have the final "remove immediates from struct" as a separate commit to see the impact of changes.

@axic axic force-pushed the immediates_merge branch from 9f8929a to 05023ef Compare November 6, 2020 14:18
@axic axic force-pushed the immediates_merge branch from 05023ef to 1bb954b Compare November 6, 2020 14:24
@chfast chfast merged commit 7c0562b into master Nov 6, 2020
@chfast chfast deleted the immediates_merge branch November 6, 2020 15:40