Eidola: Specification of legal identifiers

Specification of Legal Identifiers


The Question

From the point of view of the semantics, there is no reason to limit the set of identifiers (i.e. names of elements). Textual languages impose limits for the sake of parsing, but this isn't a concern for Eidola -- there is no theoretical prohibition against naming Eidola elements with non-alphanumeric characters, or spaces and control characters, or Korean characters, or even audio ("After the tone, please speak the name of your variable...").
Of course, there are practical limits to what a notation can reasonably display or ask a user to input. And while parsability isn't a problem for the semantics, it may still be a problem for a notation -- a programmer will have to type in code at some point! Added to this is the problem that certain kinds of identifiers may not be suitable for all representations (audio in XML?), or even all hardware. And names may carry semantic weight in storing references between namespaces, so it's important that all notations be able to deal meaningfully with all identifiers.
It makes sense, therefore, to place some limits on the set of identifiers. It seems a shame, however, to prevent programmers from using special symbols suited to very specific purposes which the semantics cannot anticipate. There are several possible approaches:

Keep it traditional: This is guaranteed to be workable, though it seems a shame to overcautiously rule out all the interesting possibilities Eidola offers. Taking this approach would be a concession to the limits of textual languages, and would make creative alternatives unsupported hacks.

Allow a broad but specific set of identifiers in the semantics, such as all Unicode characters: This still rules out some of the wilder possibilities (graphical names), but seems a good compromise. It is not unreasonable to ask all notations and representations to deal with Unicode. It may, however, make code entry completely unworkable.

Leave it out of the semantics: This would allow notations to be as creative as they wish, but it violates the principle of representation independence. The semantics should give a guarantee that any representation or notation can deal with any Eidola program, without any translation problems or loss of information.

Create identifier specifications separate from the semantics: This would at least limit the degree of representation/notation compatibility problems, but isn't a full solution. It creates the danger of having endless flavors of identifiers which, in practical terms, would be little better than the previous option.

Specify a standard required set, and allow optional alternate names: For example, specify that every name must have a Unicode form, but may also have a graphical icon form which an interface may show if it so desires. This threatens to create an unmaintainable mess of names in the code, but leaves open some of the really outrageous and interesting possibilities.
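
To make the last option concrete, here is a minimal Python sketch of what a name with a required Unicode form and optional alternate forms might look like. Nothing here is part of Eidola: the `ElementName` class, the `alternates` mapping, the kind labels such as "icon", and the choice to normalize the Unicode form to NFC are all assumptions made for illustration. The point it tries to show is that references could depend only on the required form, while a notation remains free to show an alternate it happens to support.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ElementName:
    """Hypothetical name record: one required Unicode form plus optional alternates."""

    unicode_form: str
    # Optional alternate forms keyed by a made-up kind label ("icon", "audio", ...).
    # Excluded from equality and hashing, so identity depends only on unicode_form.
    alternates: dict = field(default_factory=dict, compare=False, hash=False)

    def __post_init__(self):
        if not self.unicode_form:
            raise ValueError("every name needs a non-empty Unicode form")
        # Assumption: normalize to NFC so that visually identical spellings
        # (precomposed vs. combining characters) compare equal.
        normalized = unicodedata.normalize("NFC", self.unicode_form)
        object.__setattr__(self, "unicode_form", normalized)

    def display(self, supported_kinds=()):
        """Return the first alternate this notation supports, else the Unicode form."""
        for kind in supported_kinds:
            if kind in self.alternates:
                return self.alternates[kind]
        return self.unicode_form


# Two spellings of the same name: precomposed e-acute vs. "e" + combining accent.
with_icon = ElementName("caf\u00e9", {"icon": "cafe.png"})
plain = ElementName("cafe\u0301")

same_name = with_icon == plain                    # alternates are ignored
graphical_view = with_icon.display(["icon"])      # a notation that can show icons
text_only_view = with_icon.display(["audio"])     # falls back to the Unicode form
```

Under these assumptions, a text-only notation never needs to understand the icon, which addresses the compatibility worry, but the sketch does nothing about the maintenance problem of keeping several forms of one name in step.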

The Verdict

...will have to wait until code entry takes shape, which won't happen until the algorithmic semantics exist and we have at least some idea of how the notations will deal with them.