Proposal: Data (and struct) syntax #2061

Open
wants to merge 5 commits into
base: master
from

Conversation

HoneyryderChuck
Contributor

This is a proposal for how to handle Data and Struct declarations, which currently have to be hand-stitched and are oftentimes incomplete (lacking #members, for example). Essentially it boils down to this:

class A < Data(a: Integer, b: String)
end

The generated type will have the corresponding attribute readers, correct initialize signatures, #members, and all that, as well as the correct class inheritance hierarchy (an anonymous dynamic class in between A and Data with all the aforementioned methods).
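For reference, here is a minimal sketch of the runtime shape that the proposed RBS declaration is meant to describe. It uses only core `Data` (Ruby 3.2+); the attribute names are taken from the snippet above, and the values are arbitrary.

```ruby
# Runtime counterpart of `class A < Data(a: Integer, b: String)`.
# Requires Ruby 3.2+, where Data was introduced.
class A < Data.define(:a, :b)
end

by_keyword    = A.new(a: 1, b: "x")  # keyword form of the constructor
by_position   = A.new(1, "x")        # positional form of the constructor

same_value    = by_keyword == by_position  # Data instances compare by value
members       = A.members                  # inherited from the anonymous class
anonymous     = A.superclass               # the class returned by Data.define
anonymous_top = anonymous.superclass       # Data itself

puts members.inspect         # [:a, :b]
puts anonymous.name.inspect  # nil, since the intermediate class has no name
puts anonymous_top           # Data
```

The anonymous class sits between `A` and `Data`, which is the hierarchy the generated type would need to reproduce.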

Implementation-wise, this patch reuses code from record attributes to parse the class members, and provides an exceptional path for classes inheriting from Data (a similar solution could be built for Struct).

I've purposely left this raw and incomplete to get some early feedback before I go too far down an unwanted path.

cc @soutaro

@ParadoxV5
Contributor

ParadoxV5 commented Oct 15, 2024

Thank you for the kickstart.


However, this design needs more innovation to be suitable for the anonymous superclass-less design that I and many others prefer.

Measure = Data.define(:amount, :unit) do
  self::TAU = 6.28r
end

If RBS < shall correspond to Ruby <, then that snippet’s corresponding RBS might have to be (note the unprecedented `=` _superclass_)

class Measure = Data(amount: Integer, unit: String)
  TAU: Rational
end

https://github.com/ruby/rbs/blob/master/docs/data_and_struct.md#type-checking-class-definitions-using-data-and-struct
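To make the block form concrete, the following sketch runs the `Measure` snippet from this comment and inspects the result. It relies only on core `Data` (Ruby 3.2+); the sample values passed to `Measure.new` are illustrative.

```ruby
# Block form: the class returned by Data.define is assigned to the constant
# directly, so no anonymous class sits between Measure and Data.
Measure = Data.define(:amount, :unit) do
  # The block is evaluated in the context of the new class,
  # so self::TAU defines the constant on Measure itself.
  self::TAU = 6.28r
end

distance = Measure.new(amount: 10, unit: "km")

puts Measure.superclass      # Data
puts Measure::TAU.inspect    # a Rational, written 6.28r above
puts distance.unit           # km
```

Here `Measure.superclass` is `Data` itself, which is why a `<` in RBS that mirrors Ruby's `<` does not quite fit this pattern.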


This idea of RBS DSLs is great 👍.

I suggest making this system extensible for OOP-based DSLs (e.g., Delegate, FFI/Fiddle) to tap in.

The RBS plugin (whatever form it is) provides handling code that generates the corresponding raw signatures (in the same spirit as soutaro/rbs-inline#76).
The user writes this for https://github.com/ffi/ffi/wiki/Structs/234eab91d0ee55ce103d4b92a96a98c6e8e46890#an-example:

class SimpleStruct = FFI::Struct(value: Float)
end
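As a rough illustration of what such handling code could look like, here is a sketch of a helper that expands a name plus a member map into raw signature text. Everything in it is an assumption for illustration: `expand_data_decl` is a hypothetical name, RBS has no such plugin hook today, and the emitted signature is only one plausible shape.

```ruby
# Hypothetical expander, NOT part of the RBS gem: turns a class name and a
# { member => type } map into the raw RBS text a plugin might generate.
def expand_data_decl(name, members)
  positional = members.map { |attr, type| "#{type} #{attr}" }.join(", ")
  keywords   = members.map { |attr, type| "#{attr}: #{type}" }.join(", ")
  symbols    = members.keys.map(&:inspect).join(", ")

  lines = ["class #{name} < Data"]
  members.each { |attr, type| lines << "  attr_reader #{attr}: #{type}" }
  lines << "  def self.new: (#{positional}) -> instance"
  lines << "              | (#{keywords}) -> instance"
  lines << "  def self.members: () -> [#{symbols}]"
  lines << "  def members: () -> [#{symbols}]"
  lines << "end"
  lines.join("\n")
end

rbs_text = expand_data_decl("A", a: "Integer", b: "String")
puts rbs_text
```

A library such as FFI or Delegate could, under this assumption, ship its own expander with the same input shape and different output.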

Note that type args (a.k.a. generics) are in a tuple list (Type,…) while this is in a record map (key: Type,…).
Could reusing square brackets be syntactically compatible?

class A < Data[a: Integer, b: String]
end

This can even open the door to “heterogeneous type args” by combining the thoughts above.
[Integer, String] would be syntax sugar for Array(0: Integer, 1: String) and {a: Integer, b: String} would be Hash(a: Integer, b: String), and they expand to:

# 0: Integer, 1: String
def []: (0) -> Integer
      | (1) -> String
# a: Integer, b: String
def []: (:a) -> Integer
      | (:b) -> String

@soutaro
Member

soutaro commented Oct 16, 2024

@HoneyryderChuck Thank you for this PR.

I think there are several things to consider:

  1. I was thinking the Data (and Struct) things are over because rbs-inline provides some support. (And the feature will be merged into RBS gem, hopefully in a few months.)
  2. Having the Data(...) syntax as an extension point may make sense, as @ParadoxV5 said. It's clearly an area to explore to provide better support for FFI, Delegate...

@HoneyryderChuck
Contributor Author

Thanks for the feedback so far!

However, this design needs more innovation to be suitable for the anonymous superclass-less design that I and many others prefer.

@ParadoxV5 Indeed, this syntax does not address that. I think it puts us a step closer by providing a "middle ground" with an anonymous class with dynamic method injection, but the thing you're asking should be its own separate contribution; namely, RBS needs a syntax for anonymous class which supports patterns such as Error = Class.new(StandardError).

I suggest making this system extensible for OOP-based DSLs (e.g., Delegate, FFI/Fiddle) to tap in.

I find that suggestion interesting. I don't know enough about RBS internals to propose such a plugin system (I don't think there is one now, right? 🤔 ), but I could see this being useful for cases such as dynamic method injection, i.e. the case you stated with Delegate, where RBS internally wouldn't need to fiddle with the concept of what a "delegated" class is, and that would be left for the library to provide. I'm not planning to solve the delegate issue yet, and data/struct are core classes (so they should be solved in RBS core), but that could be done, as long as there is agreement on which RBS syntax can be "bailed out" to external plugins (and how).

Could reusing square brackets be syntactically compatible?

It could, I just thought it'd be clearer to distinguish it from the generic declarations. Another idea would be to use curly brackets, which would map well to how records are typed (but could be confusing for Data/Struct? maybe?):

# generic example
class A < B[C]

# current proposal
class A < Data(a: Integer)

# proposal with brackets
class A < Data{a: Integer}

This can even open the door to “heterogeneous type args” by combining the thoughts above.

Perhaps, didn't want to go that far 😅 but that could be an avenue for tuples/records.

I was thinking the Data (and Struct) things are over because rbs-inline provides some support.

@soutaro rbs-inline does not (yet) support other injected methods beyond #initialize (#members, #deconstruct_keys...), at least that I could think of. It also doesn't seem to fix the anonymous class hierarchy nor the more popular A = Data.define(... syntax issues (whether that's desirable or not, that's a different question). I'd argue there's room for improvement in RBS syntax to support that.
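For context, this sketch lists some of the methods that `Data.define` injects besides `#initialize`, which are the ones a complete signature would have to cover. It uses only core `Data` (Ruby 3.2+); the `Point` class and its values are illustrative.

```ruby
# Methods injected by Data.define beyond #initialize (Ruby 3.2+).
Point = Data.define(:x, :y)
point = Point.new(x: 1, y: 2)

class_members = Point.members               # [:x, :y]
as_hash       = point.to_h                  # all members as a Hash
only_x        = point.deconstruct_keys([:x]) # subset used by pattern matching
moved         = point.with(y: 5)            # copy with one member replaced

# #deconstruct_keys is what makes hash-style pattern matching work.
sum =
  case point
  in { x:, y: }
    x + y
  end

puts class_members.inspect
puts as_hash.inspect
puts only_x.inspect
puts moved.inspect
puts sum
```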

@HoneyryderChuck HoneyryderChuck force-pushed the data-struct-special-syntax branch from bd70d45 to 131ad1d Compare October 25, 2024 16:56
@HoneyryderChuck HoneyryderChuck force-pushed the data-struct-special-syntax branch 2 times, most recently from d36cdb0 to f1425c8 Compare November 11, 2024 23:29
@HoneyryderChuck HoneyryderChuck force-pushed the data-struct-special-syntax branch 2 times, most recently from d6621ab to c0b010e Compare November 27, 2024 11:43
@HoneyryderChuck HoneyryderChuck force-pushed the data-struct-special-syntax branch from c0b010e to e1af69f Compare November 27, 2024 11:44
@HoneyryderChuck
Contributor Author

@soutaro I think the PR is at a point where I need some feedback on the direction, i.e. whether it's worth pursuing it or not.

I've converged on using curly brackets, i.e. Data{a: Integer, b: String}. I think it makes more sense considering the current syntax of RBS, where Data(...) is a bit undefined as an extension point for variables, and using {...} reads nicely next to records, which are themselves self-contained "value objects as hashes", making it easy to transition mentally to "value objects as data/struct".

I spent more time than I would have wished fixing the CLI "runtime" prototype generation. The main reason was that, because I'm using RBS::Types::Function::Param to declare value object attributes in the DataDecl and StructDecl classes, the default way to stringify them was ${type} ${name} (instead of ${name}: ${type}). I saw that the current code for functions works around that by defining param_to_s and switching on whether the param is optional or required, positional or keyword, which may fail when name does not contain a name character (should do "bar?: String" but instead does "bar?: String"). I thought about creating a class similar to RBS::Types::Function::Param but for class attributes, but that seemed like a design decision for you, so I decided to patch it in the current way instead. Overall, there's too much usage of .struct? and .data?, but in most cases I don't know of a nicer way to work around it.
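The two stringification orders can be sketched as follows. This is an illustration only: `AttrParam`, `as_function_param`, and `as_attribute` are hypothetical names, not the RBS gem's classes or the patch's actual code.

```ruby
# Hypothetical stand-in for a parameter object carrying a type and a name.
# It is NOT RBS::Types::Function::Param; it only mimics the two fields.
AttrParam = Struct.new(:type, :name)

# Function-parameter order: type first, then name.
def as_function_param(param)
  "#{param.type} #{param.name}"
end

# Attribute (record-like) order: name first, then type.
def as_attribute(param)
  "#{param.name}: #{param.type}"
end

params = [AttrParam.new("Integer", :a), AttrParam.new("String", :b)]

function_style  = params.map { |p| as_function_param(p) }.join(", ")
attribute_style = params.map { |p| as_attribute(p) }.join(", ")

puts "(#{function_style})"        # (Integer a, String b)
puts "Data{#{attribute_style}}"   # Data{a: Integer, b: String}
```

A dedicated attribute class would make the second formatter the default instead of a special case.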

The json lib adds a .json_create method to the Struct class (currently one of the reasons why the typecheck jobs fail). I think that this is one method that should be declared in StructDecl, but should be conditional on whether the json (sigs) are loaded. I don't know of an easy way of doing that, but if you do, that'd help me remove the declaration from the .sig file and into StructDecl.
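A small sketch of where that method comes from at runtime: it appears on Struct only after the json additions file is required, which is why a declaration tied to the loaded signatures would be the closer match. The `Pair` struct is illustrative.

```ruby
require "json"
require "json/add/struct"  # this file is what defines Struct.json_create

Pair = Struct.new(:left, :right)

# After the require above, every Struct subclass responds to .json_create.
has_json_create = Pair.respond_to?(:json_create)

# json_create rebuilds an instance from the "v" (values) entry of the hash.
rebuilt = Pair.json_create({ "v" => [1, 2] })

puts has_json_create
puts rebuilt.inspect
```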

@ParadoxV5
Contributor

ParadoxV5 commented Nov 28, 2024

I've converged on using curly brackets, i.e. Data{a: Integer, b: String}.

Curly braces look strange as an “operator”, maybe because no text-based language has ever used them that way.
I’m not against it, though.

Spent more time than I would have wished fixing the cli "runtime" prototype generation.

Could square brackets […] be easier to parse by extending the current type variables syntax?
As for generating, have you considered outsourcing to a templating system, such as the default gem ERB?

Could reusing square brackets be syntactically compatible?

It could, I just thought it'd be clearer to distinguish it from the generic declarations. Another idea would be to use curly brackets, which would map well to how records are typed (but could be confusing for Data/Struct? maybe?):

[P.S.] Well, this system also looks like generics to me – type kwargs, if you will 😁.


I think the PR is at a point where I need some feedback on the direction, i.e. whether it's worth pursuing it or not.

  1. I was thinking the Data (and Struct) things are over because rbs-inline provides some support. (And the feature will be merged into RBS gem, hopefully in a few months.)

Isn’t it the case that rbs-inline is not meant to replace manually-written/script-generated RBS files?


Indeed, this syntax does not address that. I think it puts us a step closer by providing a "middle ground" with an anonymous class with dynamic method injection, but the thing you're asking should be its own separate contribution; namely, RBS needs a syntax for anonymous class which supports patterns such as Error = Class.new(StandardError).

I partially retract my review by specifying my opinion:

RBS class A < Data(a: Integer, b: String) should be for Ruby A = Data.define(:a, :b) do/A = Data.define(:a, :b); class A – no extraneous anonymous intermediate class.
A = Data.define(:a, :b) do is in the spirit that Error = Class.new(StandardError) do is (mostly) equivalent to class Error < StandardError; both having no intermediate classes.

class A < Data.define(:a, :b) is comparable to class Error < Class.new(StandardError) – both inherit from the returned class rather than directly from Data/StandardError.
If RBS doesn’t allow skipping anonymous intermediate classes, then it’d be the user’s responsibility in including those intermediates in their RBS inheritances.
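The difference between the two spellings can be checked by counting superclass hops. The sketch below does that for the four forms discussed here; `steps_to` and the class names are hypothetical helpers for illustration (Ruby 3.2+ for `Data`).

```ruby
# Hypothetical helper: number of superclass hops from klass up to base,
# or nil if base is not an ancestor. Modules are filtered out of the chain.
def steps_to(klass, base)
  klass.ancestors.grep(Class).index(base)
end

# Assignment forms: the returned class is the named class.
DirectError = Class.new(StandardError)
DirectData  = Data.define(:a, :b)

# Inheritance forms: the named class inherits from the returned class.
class IndirectError < Class.new(StandardError); end
class IndirectData < Data.define(:a, :b); end

puts steps_to(DirectError, StandardError)    # 1
puts steps_to(IndirectError, StandardError)  # 2
puts steps_to(DirectData, Data)              # 1
puts steps_to(IndirectData, Data)            # 2
```

The extra hop in the inheritance forms is the anonymous intermediate class that an RBS declaration would either have to model or skip.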

@HoneyryderChuck
Contributor Author

Could square brackets […] be easier to parse by extending the current type variables syntax?

I'd say no. Consider that the parsing code reuses record parsing primitives. The same limitations I found with the initial proposal of Data(...) would apply equally to a Data[...] syntax.

As for generating, have you considered outsourcing to a templating system, such as the default gem ERB?

That's a question for @soutaro. I just tried to integrate my change with the existing system, which already had some support for Data/Struct value objects. The goal of the PR is not to propose an alternative system.

[P.S.] Well, this system also looks like generics to me – type kwargs, if you will

I believe the RBS notation for generics uses [], i.e. [T] 🤔

RBS class A < Data(a: Integer, b: String) should be for Ruby A = Data.define(:a, :b) do/A = Data.define(:a, :b); class A – no extraneous anonymous intermediate class.

I agree. However, current RBS supports neither A = Data.define... nor assigning an anonymous class to a constant, so I'd say that's a different proposal worth its own issue/PR.

3 participants