Displaying pairs and types constructed by external libraries #579

Closed
btzy opened this issue Apr 25, 2020 · 6 comments

@btzy
Contributor

btzy commented Apr 25, 2020

This is a somewhat open question, pertaining to the specification and behaviour of display(x).

Currently, pairs are displayed as [a, b]. There are two issues with this:

  1. Specification issue: The specification merely states that "the notation used for the display of values is consistent with JSON". But since we didn't define a JSON representation for pairs, a conforming implementation could have implemented it to produce something akin to JavaScript objects {head: a, tail: b} instead.
    • Proposal: The specification should explicitly specify something - either say that the displayed text is implementation defined (for non-primitive types at least), or demand a specific representation.
  2. Implementation issue: The currently displayed format hints that pairs are implemented as arrays under the hood. This leaks an implementation detail that shouldn't be known to the user. This is even worse since Source supports arrays (we can display([2, 3]) to get something that looks like a pair).
    • Proposal: Display pairs as pair(a, b) instead.
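
To make the implementation issue concrete, here is a minimal sketch in plain JavaScript. It assumes pairs are stored as two-element arrays, as in the current LIST library. The `show` helper is a hypothetical stand-in for the text that display produces, not the actual implementation.

```javascript
// Assumption: pairs are two-element arrays, as in the current LIST library.
function pair(x, y) {
    return [x, y];
}

// Hypothetical stand-in for the text that display(x) produces.
function show(x) {
    return Array.isArray(x)
        ? "[" + x.map(show).join(", ") + "]"
        : String(x);
}

const p = pair(2, 3);
const a = [2, 3];

console.log(show(p)); // [2, 3]
console.log(show(a)); // [2, 3]
```

Under these assumptions the two outputs are identical, so the displayed text alone cannot tell the user whether a value was built with pair or written as an array literal.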

Normative discussion:

For the specification issue, I don't think there is a need to stick to valid JSON, since the internal representation of pairs is supposed to be opaque to the user. Something like pair(a,b) is far better, and anyway we are breaking JSON validity for displaying lists already. The more difficult problem lies in the implementation — since pairs are indeed represented as 2-element arrays in the LIST library, it would be impossible to display pairs using a different syntax from arrays without rewriting the LIST library.

However, given that we now have a WebAssembly implementation and some VM implementation of Source, which are able to implement pairs differently from arrays, I think it is better to not constrain the display of pairs too much.

There's an easy way out, which is to specify that the display of pairs is implementation defined.

I don't like that very much. Instead, I propose the following:

As part of the core language, the specification defines the text representation only for displaying primitive types (undefined, boolean, number, function, string, maybe array). The LIST library then gets to define the text representation of a pair. With #568, the libraries (except perhaps MISC) are functionally separate from the core language, so alternative implementations of the LIST library meant to work with the WebAssembly or other implementations can specify the text representation of a pair (and a list too). After all, only the LIST library implementation actually knows how its pairs are implemented.

The same argument may be made for runes and other external libraries: currently the JSON that is shown when you try to display a rune is effectively garbage. Only the RUNES library knows what a nice way to display a rune would be, so the library is in the best position to define what display() should do when given a rune.
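
One possible shape for this split is sketched below in plain JavaScript. Everything in it is an assumption made for illustration: `register_formatter`, `to_text`, and the `Pair` class are hypothetical names that do not exist in Source or its libraries. The idea is that the core only knows how to print primitive values and arrays, while a library registers a formatter for the values it owns.

```javascript
// Hypothetical registry; none of these names exist in Source or its libraries.
const formatters = [];

// A library calls this to teach the display machinery about its own values.
function register_formatter(recognises, format) {
    formatters.push({ recognises, format });
}

// Core language: knows only primitive values and arrays.
function to_text(x) {
    for (const f of formatters) {
        if (f.recognises(x)) {
            return f.format(x, to_text);
        }
    }
    if (Array.isArray(x)) {
        return "[" + x.map(v => to_text(v)).join(", ") + "]";
    }
    if (typeof x === "string") {
        return JSON.stringify(x);
    }
    return String(x);
}

// Hypothetical alternative LIST library that does not store pairs as arrays.
class Pair {
    constructor(head, tail) {
        this.head = head;
        this.tail = tail;
    }
}

function pair(x, y) {
    return new Pair(x, y);
}

// The library, not the core, decides how a pair is shown.
register_formatter(
    x => x instanceof Pair,
    (p, show) => "pair(" + show(p.head) + ", " + show(p.tail) + ")"
);

console.log(to_text(pair(2, 3))); // pair(2, 3)
console.log(to_text([2, 3]));     // [2, 3]
```

With this arrangement a different LIST implementation could register a different formatter without any change to the core's `to_text`.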

Note about #379: My proposal isn't exactly at odds with it. My proposal, if accepted, would simply mean that a JavaScript-based implementation of Source can use libraries that specify that they display JSON-compliant data, while non-JavaScript implementations of libraries can display something else.

Then, if and when someone eventually gets around to updating the LIST library to store pairs differently, the library can specify to display pairs as pair(2,3) (or whatever else is appropriate) without changing the core language.

@martin-henz
Member

> we didn't define a JSON representation for pairs

At the moment, our specs are quite explicit that pairs are arrays. See
https://sicp.comp.nus.edu.sg/source/source_3.pdf page 8.

The issue is not just the display, but also equality and array access (which work on pairs) and the pair operations (head, tail, etc.), which work on arrays.
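
A small sketch of that interchangeability, in plain JavaScript. The definitions below are simplified assumptions, not the LIST library's actual code.

```javascript
// Simplified assumptions about the LIST primitives; not the library's real code.
function pair(x, y) {
    return [x, y];
}

function is_pair(x) {
    return Array.isArray(x) && x.length === 2;
}

function head(p) {
    if (!is_pair(p)) {
        throw new Error("head expects a pair");
    }
    return p[0];
}

function tail(p) {
    if (!is_pair(p)) {
        throw new Error("tail expects a pair");
    }
    return p[1];
}

const p = pair(1, 2);

console.log(p[0]);            // 1    (array access works on a pair)
console.log(head([1, 2]));    // 1    (head works on a two-element array)
console.log(tail([1, 2]));    // 2
console.log(is_pair([1, 2])); // true
```

Any change to how pairs are displayed would therefore leave these other points of contact between pairs and arrays in place.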

> a conforming implementation could have implemented it to produce something akin to JavaScript objects {head: a, tail: b} instead

The trouble is that SICP JS does not introduce objects, so introducing them just to have pairs would be quite odd.

[To be perfectly honest, SICP JS (following SICP) does not introduce arrays or loops. The addition of these things in Source §3 is driven by the SoC curriculum: CS1101S needs to cover arrays and loops, and it makes sense to do that in the context of imperative programming: mutable data.]

Regarding the display of runes: That's a known issue, see here #311

@btzy
Contributor Author

btzy commented Apr 26, 2020

> At the moment, our specs are quite explicit that pairs are arrays. See
> https://sicp.comp.nus.edu.sg/source/source_3.pdf page 8.

Turns out I didn't read the spec carefully enough. Nonetheless, I could still make the argument that pairs shouldn't be implemented with arrays in the first place (or at least, it should not feel like it from the user's perspective). Passing a pair into a function that accepts arrays should not work.

(As an aside, type inference currently cannot be extended to Source §3 if we want arrays to be homogeneous but pairs to be heterogeneous.)

> The trouble is that SICP JS does not introduce objects, so introducing them just to have pairs would be quite odd.

Well, Source §2 introduces pairs, but arrays are only introduced in Source §3. It is already quite odd when programming in Source §2.

@martin-henz
Member

martin-henz commented Apr 29, 2020

You are right: It hurts a bit to see pairs being able to play the role of arrays.

I can reply to these arguments with reference to the two origins of Source: (1) SICP and (2) JS

Regarding SICP: This is not an isolated incident: The languages Scheme and Source provide a small set of primitive types and let the user derive more types using those primitive types. It is up to the user to use these derived types correctly. In CS1101S we talk about "list discipline" and "tree discipline" to refer to these obligations. The languages don't help in enforcing this discipline. For example, lists are made up of pairs and null, and trees are made up of lists, but it's up to the user to do this correctly; the languages don't enforce this. So far the SICP heritage.
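
As an illustration of "list discipline", the sketch below (plain JavaScript, with simplified assumed definitions rather than the LIST library's own) builds one proper list and one chain of pairs that is not a list. Both are accepted without complaint; only an explicit check written or called by the user tells them apart.

```javascript
// Simplified assumptions; not the LIST library's real code.
function pair(x, y) {
    return [x, y];
}

function is_pair(x) {
    return Array.isArray(x) && x.length === 2;
}

// A list is either null or a pair whose tail is a list.
function is_list(xs) {
    while (is_pair(xs)) {
        xs = xs[1];
    }
    return xs === null;
}

const proper = pair(1, pair(2, pair(3, null)));
const improper = pair(1, pair(2, 3)); // the last tail is 3, not null

console.log(is_list(proper));   // true
console.log(is_list(improper)); // false
```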

The JavaScript heritage leads me to the following argument: JavaScript programmers will naturally ask: What is a pair? The answer: "It's an array with two elements." is the most economical answer I can think of. A further benefit of this representation is a satisfying representation in JSON, which strengthens our connection to JavaScript.

@btzy
Contributor Author

btzy commented Apr 29, 2020

> The JavaScript heritage leads me to the following argument: JavaScript programmers will naturally ask: What is a pair? The answer: "It's an array with two elements." is the most economical answer I can think of. A further benefit of this representation is a satisfying representation in JSON, which strengthens our connection to JavaScript.

I would probably think that this is natural to a TypeScript programmer, but not to a JavaScript programmer (and that {head: x, tail: y} would be more natural, and actually also be the more satisfying representation in JSON). Furthermore, Source is actually already more strongly typed than JavaScript (there are no implicit conversions so "a" + 1 is an error in Source), and making a separate pair type is in the same direction. Also, making pairs and arrays distinct does not violate the property of Source being a subset of JavaScript.
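
For comparison, this is how the two candidate representations of the list 1, 2 serialise with the standard `JSON.stringify`. The variable names are only for illustration.

```javascript
// The same two-element list under each candidate representation.
const as_arrays = [1, [2, null]];
const as_objects = { head: 1, tail: { head: 2, tail: null } };

console.log(JSON.stringify(as_arrays));
// [1,[2,null]]

console.log(JSON.stringify(as_objects));
// {"head":1,"tail":{"head":2,"tail":null}}

// Plain JavaScript converts implicitly here; Source rejects this expression.
console.log("a" + 1); // a1
```

Both forms are valid JSON; the object form names the two fields, while the array form is shorter and needs no objects in the language.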

As for the "list discipline" and "tree discipline", I guess whether the burden is placed on the programmer (as in Source) or the implementation (as in a strongly typed language) depends on pedagogical concerns. Type inference, for example, transfers some of the burden from the programmer to the implementation, which is in a similar direction to my proposal. There perhaps might be an argument (I don't know) for requiring students to self-enforce "list discipline" and "tree discipline".

@martin-henz
Member

> > The JavaScript heritage leads me to the following argument: JavaScript programmers will naturally ask: What is a pair? The answer: "It's an array with two elements." is the most economical answer I can think of. A further benefit of this representation is a satisfying representation in JSON, which strengthens our connection to JavaScript.

> I would probably think that this is natural to a TypeScript programmer, but not to a JavaScript programmer (and that {head: x, tail: y} would be more natural, and actually also be the more satisfying representation in JSON). Furthermore, Source is actually already more strongly typed than JavaScript (there are no implicit conversions so "a" + 1 is an error in Source), and making a separate pair type is in the same direction. Also, making pairs and arrays distinct does not violate the property of Source being a subset of JavaScript.
>
> As for the "list discipline" and "tree discipline", I guess whether the burden is placed on the programmer (as in Source) or the implementation (as in a strongly typed language) depends on pedagogical concerns. Type inference, for example, transfers some of the burden from the programmer to the implementation, which is in a similar direction to my proposal. There perhaps might be an argument (I don't know) for requiring students to self-enforce "list discipline" and "tree discipline".

The trouble is that objects are not part of Source. So a reduction of pairs to objects would force us to include objects in Source as the data structure that underlies pairs. We can avoid that with the representation of pairs with arrays. So basically, we have a minimal subset of JSON that supports all Source primitive data types.

@btzy
Contributor Author

btzy commented May 2, 2020

> > > The JavaScript heritage leads me to the following argument: JavaScript programmers will naturally ask: What is a pair? The answer: "It's an array with two elements." is the most economical answer I can think of. A further benefit of this representation is a satisfying representation in JSON, which strengthens our connection to JavaScript.

> > I would probably think that this is natural to a TypeScript programmer, but not to a JavaScript programmer (and that {head: x, tail: y} would be more natural, and actually also be the more satisfying representation in JSON). Furthermore, Source is actually already more strongly typed than JavaScript (there are no implicit conversions so "a" + 1 is an error in Source), and making a separate pair type is in the same direction. Also, making pairs and arrays distinct does not violate the property of Source being a subset of JavaScript.
> >
> > As for the "list discipline" and "tree discipline", I guess whether the burden is placed on the programmer (as in Source) or the implementation (as in a strongly typed language) depends on pedagogical concerns. Type inference, for example, transfers some of the burden from the programmer to the implementation, which is in a similar direction to my proposal. There perhaps might be an argument (I don't know) for requiring students to self-enforce "list discipline" and "tree discipline".

> The trouble is that objects are not part of Source. So a reduction of pairs to objects would force us to include objects in Source as the data structure that underlies pairs. We can avoid that with the representation of pairs with arrays. So basically, we have a minimal subset of JSON that supports all Source primitive data types.

I see your point. It would be more explainable to students if pairs are implemented using two-element arrays.

It still doesn't feel entirely satisfactory though; there are still at least two other related issues:

  • External libraries such as RUNES will display object syntax if you try to do display(rune)
  • Type inference cannot be extended to arrays in Source §3
