Add QoL instruction aliases (and other misc instruction syntax suggestions for discussion) #1568

Closed
12 tasks
SonoSooS opened this issue Dec 5, 2024 · 2 comments
Labels
enhancement Typically new features; lesser priority than bugs rgbasm This affects RGBASM

Comments

@SonoSooS

SonoSooS commented Dec 5, 2024

Considering how some archaic instruction aliases are being removed, it could be great to add some "modern" aliases for some instructions, to make the ASM syntax more pleasant to use, and/or easier to write in "autopilot" mode.

These are aliases for existing instructions, that fit in, and should nicely extend the instruction syntax in ways that should've been always there to begin with:

  • LDH A, [C] (allow specifying redundant "high" modifier, with, or without)
    Note: this is already implemented, but I'm mentioning it for the sake of completeness, in case there is any difference from gbz80.7.
    Valid permutations (in both directions):

    • LD A, [C]
    • LD A, [$FF00+C]
    • LD A, [C+$FF00]
    • LDH A, [C]
    • LDH A, [$FF00+C]
    • LDH A, [C+$FF00]

    Warning: LD A, [$FF00+$04] is however not the same as LDH A, [$FF00+$04] due to ambiguity, and especially as the integer addition resolves as an expression, and it works with labels as well. LD A, [$04] vs. LDH A, [$04] is an even bigger difference, so using the correct specifier is really important!

  • LDH A, [$04] (allow using only the LOW part as the LDH address, as long as the address HIGH part evaluates to $00 or $FF)
    Note: I don't think this is meant to be removed anytime soon, but I'm mentioning it just in case, to reinforce its existence.
    Keep this variant, so the programmer can think about actual bytes of the instruction, without happening to remember the opcode byte of their chosen LDH instruction, and so it has feature parity with the STOP n8 syntax.

  • CPL A (allow writing the implicit A as explicit)
    If XOR $0F and XOR A, $0F both exist, why not this? I keep typing it all the time, and it's annoying to have to go back and edit out the A parameter.
    CPL has an implied A parameter, so it should help code readability to make it possible to write out what was "always there".
    Note: this is apparently already supported, but not in rgbds-live as of writing.

  • SUB SP, e8 (allow negating the instruction (ADD --> SUB) instead of being forced to negate the e8 parameter)
    If you code in ASM for multiple CPU architectures, you could potentially be annoyed by the missing parallel to ADD SP.
    Word of caution though: ADD SP, e8 rules still apply, as this is just an alias, so the e8 range swaps to 128 to -127 when using this alias, instead of 127 to -128 when using ADD SP.
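
To make the LD vs. LDH warning and the LOW-byte rule from the list above concrete, here is a minimal Python sketch of the byte encodings involved. The function names are hypothetical and this is not how RGBASM is implemented; the opcode bytes ($FA for LD A, [a16], $F0 for LDH A, [a8]) are the standard SM83 ones, and the $00/$FF HIGH-byte check simply follows the rule as described in this issue.

```python
def encode_ld_a_abs(addr):
    """LD A, [a16]: opcode $FA followed by the 16-bit address, little-endian."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError(f"address out of range: {addr:#x}")
    return bytes([0xFA, addr & 0xFF, addr >> 8])


def encode_ldh_a_abs(addr):
    """LDH A, [a8]: opcode $F0 followed by the LOW byte of the address only.

    Assumption taken from the issue text: the HIGH byte of the
    operand must evaluate to $00 or $FF, otherwise it is rejected.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError(f"address out of range: {addr:#x}")
    if (addr >> 8) not in (0x00, 0xFF):
        raise ValueError(f"HIGH byte of {addr:#06x} is neither $00 nor $FF")
    return bytes([0xF0, addr & 0xFF])
```

With this model, LD A, [$04] becomes three bytes that read address $0004, while LDH A, [$04] and LDH A, [$FF04] both become the same two bytes that read $FF04, which is why picking the right mnemonic matters.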

These are not a "modern" alias, but an entirely new suggestion, albeit with somewhat dubious syntax:

  • ADD SP+e8 (allow using the same SP-relative syntax, instead of having to use commas with ADD/SUB)
    It potentially looks like ADD +3 visually, but in reality this instruction is ADD SP, SP, e8, with the last two parameters encoded in the SP-relative syntax and the "implicit" target missing.
    If this gets added, ADD SP, SP+e8 (and also with SUB) could be added too, for the sake of syntax completeness with ADD A, param, even if the two are not related to each other.

  • ADD HL, SP+e8 (allow using ADD/SUB instead of LD to "generate" stack-relative addresses into value of HL)
    I kind of advise against including this as-is, as it suffers from the same problem as JP [HL]. While technically true*[1] at a glance, if you consciously try to interpret what the ASM reads, it doesn't make sense.

    I tried ADD HL, SP, 3, but it's too many commas for me.
    We just removed LDHL, so adding LDSP is kind of pointless, especially since I could do that using just a macro.
    I always keep typing ADD HL, SP+e8 or SUB HL, SP+e8 (the SUB variant is extremely fringe, but still useful in rare cases involving in-RAM code)

And while I'm making instruction suggestions, these are the most likely to be rejected, but I'll still mention them just in case for future discussions:

  • DAA A (same as with CPL A)
    I read it as "Decimal Arithmetic Adjust", so in that case it makes sense for it to have a parameter. Only A is valid, but it could be an optional parameter that is checked for validity but otherwise ignored.

  • LD Cy, 1 (and other wild variations of it)
    On NEC 78K0R, you can modify the processor status register as a memory-mapped register. And so, you can use bit-modification instructions on it. Although, it also supports modifying the Cy bit in particular too, with dedicated instructions.
    Could possibly look better to use 78K0R-like syntax for SM83 in some cases.
    With that in mind, here are possible permutations:

    • SET Cy
    • SET 0, Cy
    • SET 4, F
    • SET F.4
    • OR F, 0x10 (pitfall: sets Cy instead of clearing it)

    Note: these are all assembly-time constants! If any other values are used, it's rejected as invalid.
    Since cf, F.c, and others are not standardized, and c conflicts with register C, I've been using Cy from the 78K0R, as it also has a name conflict with the C register, and so the carry flag name "Cy" is unambiguous that way. However, almost everyone else I have seen in this community calls the carry flag cf, so the Cy name is unlikely to pass.

  • CPL Cy (same as with LD Cy, 1)
    Permutations:

    • CPL F.4
    • XOR Cy, 1 (pitfall: Cy inverted instead of cleared)
    • XOR Cy (pitfall: same as above)
    • XOR F.4, 1 (pitfall: same as above)
    • XOR F.4 (pitfall: same as above)
    • XOR F, 0x10 (pitfall: same as above)

    Note: the numbers here are assembly-time constants! Any other value is invalid, especially since they do not assemble to valid instructions.

  • SET r8, bit_i (reversed parameters for instructions with bit index specifier)
    As there is no ambiguity with the instructions containing a bit index specifier (BIT, SET, RES), it should be safe to allow swapping the two parameters' order, and have it assemble either way.

  • SET r8.bit_i (same as above, but with dot syntax)
    I really like the bit dot syntax, as I find it really readable. Also from the 78K0R syntax.
    Examples:

    • SET F.4 (SCF alias)
    • RES A.7
    • BIT [HL].4
  • RETCC (allow extra ways to express condition flags)
    I've lost count how many times I tried to write cc and cs as a conditional..
    Examples (also applies to CALL cc, JP cc, and JR cc too):

    • RET zc (RET nz, "zero flag clear")
    • RET zs (RET z, "zero flag set")
    • RET cc (RET nc, "carry clear")
    • RET cs (RET c, "carry set")
    • RETEQ (RET z, "equals" / (A - param) == 0 / A == param)
    • RETNE (RET nz, "not equals" / (A - param) != 0 / A != param)
    • RETGE (RET nc, "greater or equal" / (A - param) >= 0 / A >= param)
    • RETLT (RET c, "less than" / (A - param) < 0 / A < param)
    • JGE e8 (JR nc)
    • JLT e8 (JR c)
    • JRAE e8 (JR nc, "above or equal")
    • JRB e8 (JR c, "below")
    • ...and the list is basically endless
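
The "equals" / "less than" readings in the RETCC list above follow from how CP sets the flags. A minimal Python sketch of that arithmetic, assuming 8-bit unsigned operands; the names `cp_flags`, `ALIASES` and `taken` are hypothetical and only mirror the aliases proposed in this issue:

```python
def cp_flags(a, param):
    """Model the Z and carry flags after CP param (A - param, result discarded)."""
    z = ((a - param) & 0xFF) == 0   # zero flag: the operands are equal
    cy = a < param                  # carry flag: a borrow occurred (unsigned)
    return z, cy


# Proposed alias -> existing SM83 condition code, as listed above.
ALIASES = {
    "zc": "nz", "zs": "z", "cc": "nc", "cs": "c",
    "eq": "z", "ne": "nz", "ge": "nc", "lt": "c",
}


def taken(alias, a, param):
    """Would a conditional RET/JP/JR/CALL with this alias be taken after CP param?"""
    z, cy = cp_flags(a, param)
    return {"z": z, "nz": not z, "c": cy, "nc": not cy}[ALIASES[alias]]
```

Note that "ge" and "lt" here are unsigned comparisons, since the carry flag after CP only reflects an unsigned borrow; for example `taken("ge", 0x80, 0x7F)` is true. That matches the "above or equal" / "below" wording of the JRAE and JRB variants.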

*[1] - the CPU does something similar to PC=HL, IR=[PC+] in the same M-cycle, so in that regard, the JP [HL] syntax does kind of make sense from an architectural standpoint. However, when reading the ASM as-is, it's bogus syntax, as this is not a 6502; the CPU is not capable of indirect jumps like that.

@Rangi42
Contributor

Rangi42 commented Dec 5, 2024

Firstly, thank you for your proposals, and explaining why you'd want them! Unfortunately I really don't think these are good candidates for new built-in syntax.

  1. Permutations of LDH A, [C]: I doubt we'll allow any of these. LD A, [$FF00+C] is allowed for historical reasons, not because it's preferred syntax. As such, permutations like LD A, [C] or LD A, [C+$FF00] should not be created; and also I'm pretty sure C + $FF00 would be a parser rule conflict.
  2. LDH A, [$xx]: This is already supported and I don't see a reason to remove it.
  3. CPL A: This is already supported in master and the latest release candidate.
  4. SUB SP, e8: Needing a range of -127 to 128 (-$7F to $80) sounds unusual enough that I wouldn't want this.
  5. ADD SP+e8: Absolutely not; A is the only implicit operand. LD BC with implicit HL was a bug/misfeature that we removed, not going to create more.
  6. ADD HL, SP + e8: What? ADD SP, e8 and LD HL, SP + e8 are real instructions, this is not. Not sure what you're imagining this would assemble as, and that's enough reason to not introduce it.
  7. DAA A: The A is already in the instruction name, just like RRA, RLCA, etc.
  8. LD Cy, 1 etc: This is not rcCar. :P
  9. CPL Cy: Cy is not and will not be a new token just to make this never-before-used syntax real. (Preemptively: Nor will CPL C, since that looks like the register C.)
  10. BIT/RES/SET r8, bit_i: Why on Earth would we start allowing this.
  11. BIT/RES/SET r8.bit_i: Again, no.
  12. RETCC etc: You are free to use DEF CC EQUS "NC", DEF CS EQUS "C", DEF RETNE EQUS "RET nz", etc. This was literally from today:

    historically rgbasm has chosen certain ones, but also has sometimes supported redundant ones (like ; and * comments, or BSS and ROM0)
    we usually prefer to remove the redundant ones and pick one modern-ish way


it could be great to add some "modern" aliases for some instructions

These are not modern aliases, they're either taken from other old architectures and their different styles of hand-written asm (78K0R? some random microcontroller from 1986?), or made-up never-before-seen syntaxes.

Overall, the whole direction RGBASM has been going in for years is removing redundant built-in expressions, while giving users more powerful tools to define their own. More powerful macros, {interpolation}, REDEFining strings, STRFMT... It's already enough to implement structs, regexes, lists/arrays, a Brainf!#k interpreter, and we haven't even added user-defined functions yet. :3 If you prefer a different lineage of asm mnemonics from the GB/Z80-style ones RGBASM has inherited, it should be possible to implement yourself.

Rangi42 closed this as not planned Dec 5, 2024
Rangi42 added the enhancement and rgbasm labels Dec 5, 2024
@SonoSooS
Author

SonoSooS commented Dec 5, 2024

I agree with your statement. I did expect most to be rejected (even though I want all of these), but it's better to bring this up before 1.0.0, so we can refer to this in case someone else would ask after 1.0.0 why X or Y decision is like it is.

Although, feels like some of my points in the extra text were missed, I'll try to re-word some of them.

Permutations of LDH A, [C]: I doubt we'll allow any of these.

So, does that mean that either LD A, [C] or LDH A, [C] will be removed for 1.0.0?

SUB SP, e8: Needing a range of -127 to 128 (-$7F to $80) sounds unusual enough that I wouldn't want this.

It is not unusual. I was just explicitly spelling out the potential parsing pitfall. Since the instruction turns into SUB, but the CPU only has ADD, the range for the signed operand changes, as this is just an alias. The CPU can only do +127 to -128 with ADD SP, and since SUB SP would have to negate the signed operand for it to assemble correctly, the accepted signed operand range for SUB SP turns into -127 to +128.
If the range is verified at bytecode emit time, then whatever I said is redundant, as it would error out at emit-time due to an invalid range passed to ADD SP.
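
The range argument above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not RGBASM's actual code: `encode_add_sp` and `encode_sub_sp` are hypothetical names, the range check is assumed to happen at emit time as described, and $E8 is the standard SM83 opcode for ADD SP, e8.

```python
def encode_add_sp(e8):
    """ADD SP, e8: opcode $E8 followed by a signed 8-bit offset."""
    if not -128 <= e8 <= 127:
        raise ValueError(f"e8 out of range for ADD SP: {e8}")
    return bytes([0xE8, e8 & 0xFF])  # two's-complement byte


def encode_sub_sp(e8):
    """Hypothetical SUB SP, e8 alias: negate the operand, then emit ADD SP.

    Because the only check is the one inside encode_add_sp, the
    accepted range for this alias ends up being -127 to +128.
    """
    return encode_add_sp(-e8)
```

So SUB SP, 128 would assemble to the same bytes as ADD SP, -128, while SUB SP, -128 would be rejected because ADD SP, 128 does not fit in a signed byte.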

ADD HL, SP + e8: What? ADD SP, e8 and LD HL, SP + e8 are real instructions, this is not. Not sure what you're imagining this would assemble as, and that's enough reason to not introduce it.

It would assemble as LD HL, SP+e8, and with SUB, it would assemble as LD HL, SP+-e8.
But yes, I did not expect this one to get passed at all, since the syntax is nonsense, and ADD HL, SP, e8 has too many commas while still being nonsense.

LD Cy, 1 etc: This is not rcCar. :P

Oops! 😛

CPL Cy: Cy is not and will not be a new token just to make this never-before-used syntax real. (Preemptively: Nor will CPL C, since that looks like the register C.)

I guess for Cy you also include cf that everyone else uses.

BIT/RES/SET r8, bit_i: Why on Earth would we start allowing this.

Right-to-left register order. Currently, bitfield instructions are the only ones that have their operands left-to-right, while every other instruction is written right-to-left, where it's obvious that the target is on the left. Can't really target an integer immediate this way as operation target output, so I instinctively always try to write it in this order, and get a nice assemble error 😅
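
As a rough illustration of why either operand order could be told apart, here is a minimal Python sketch of a BIT/RES/SET encoder. The function name and the string/int representation of operands are assumptions made for the example; a real assembler would work on parsed tokens and expressions instead. The CB-prefixed opcode layout (base | bit << 3 | register) is the standard SM83 one.

```python
# Register field values used by the CB-prefixed instructions.
R8 = {"b": 0, "c": 1, "d": 2, "e": 3, "h": 4, "l": 5, "[hl]": 6, "a": 7}
BASE = {"bit": 0x40, "res": 0x80, "set": 0xC0}


def encode_bit_op(mnemonic, x, y):
    """Encode BIT/RES/SET, accepting (bit, r8) or the proposed (r8, bit)."""
    # In this sketch an int can only be the bit index and a str can only
    # be the register, so the two orders cannot be confused.
    if isinstance(x, int) and isinstance(y, str):
        bit, reg = x, y
    elif isinstance(x, str) and isinstance(y, int):
        reg, bit = x, y
    else:
        raise ValueError("expected one bit index and one register")
    if not 0 <= bit <= 7:
        raise ValueError(f"bit index out of range: {bit}")
    return bytes([0xCB, BASE[mnemonic.lower()] | (bit << 3) | R8[reg.lower()]])
```

For example, `encode_bit_op("set", 7, "a")` and `encode_bit_op("set", "a", 7)` produce the same two bytes.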

RETCC etc: You are free to use DEF CC EQUS "NC", DEF CS EQUS "C", DEF RETNE EQUS "RET nz", etc.

Oops! I did not realize at the time of writing that these are macro-compatible (including EQUS) 😅


2 participants