
Etymological programming


Etymological programming is a set of conventions that Pedro relies on to use a couple of thousand different functions fluently. By writing this document, Pedro hopes to gather the conventions in a single place, so they can mature.

Etymological programming deals chiefly with the naming of functions and variables, by analogy with the naming of words in every world language.

The ultimate goal is to save cognitive effort and thus increase the power and joy of bringing ideas to life, especially for small groups of programmers creating small codebases (below 100k lines or 10k functions).

At a basic level, Etymological programming entails knowing return and argument types plus argument positions just by reading the function name. At an intermediate level it means quickly finding the right function even after forgetting its name, or immediately checking whether such a function already exists. At an advanced level it could mean understanding an unknown function based on its name alone, before peeking at the documentation or source code, if at all.

When adding new aspects of Etymological programming, one should be guided by the question: what feels more natural? Let's bring the language back into the programming language!

Functions are always Capitalised and CamelCased even when used as variables. No other variable type may be capitalised.

The last word in a function always identifies its return type.

Should a function not return anything (only produce a side effect), the last word must be a verb instead, without any suffix added.

The order and names of all words in a function name, except the last word, match the order and names of all argument variables.
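
The naming rules above can be sketched as follows. This is only an illustration in TypeScript (the document does not prescribe a language), and the functions `KeyObjectValue` and `MessageLog`, as well as their argument names, are hypothetical names invented for this sketch.

```typescript
// Hypothetical example: the words before the last one ("Key", "Object")
// mirror the argument names in order; the last word ("Value") names what is returned.
function KeyObjectValue(key: string, object: { [name: string]: unknown }): unknown {
  return object[key];
}

// Hypothetical side-effect-only function: it returns nothing,
// so its name ends in a bare verb ("Log") instead of a return type.
const logged: string[] = [];
function MessageLog(message: string): void {
  logged.push(message);
}

// A function stays capitalised even when it is held in a variable.
const Lookup = KeyObjectValue;
```

Reading `KeyObjectValue` aloud already tells the caller to pass a key first, an object second, and to expect a value back.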

Whenever arguments are reordered, all words in a function name must be reordered accordingly. A find and replace tool is essential.

Reordering should occur whenever it would result in having fewer words, by allowing grouping and/or omitting.

Should several arguments have the exact same name, a repetition numeral prefix is applied instead of repeating them.

When the last argument name coincides with the name of the return variable, one of them can be simply omitted.
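
A small TypeScript sketch of the omission rule, under the assumption that it applies exactly as stated here. The helper `ItemArray` is hypothetical and exists only to illustrate the naming.

```typescript
// Hypothetical helper. Arguments: item, array. Return type: array.
// Written in full, the name would be ItemArrayArray; since the last argument
// and the return value are both named "Array", one of the two is dropped.
function ItemArray(item: number, array: number[]): number[] {
  return array.concat([item]); // returns a new array, the input is left untouched
}

const extended = ItemArray(3, [1, 2]);
```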

Idea: omitting it could cause ambiguity; maybe a suffix could be appended to resolve it? Also, what happens to the return type when omitting and grouping are combined?

Type capitals can be used to name variables that take a precise type, where no more specific name would be appropriate. They arise as arguments of low-level function libraries.

| Capital | Type |
| --- | --- |
| A | array |
| D | boolean (dual) |
| F | function |
| G | graph |
| N | number |
| O | object |
| R | regex |
| S | string |
| U | url |

Type capitals are UPPERCASE and may be combined.

Idea: uniformise type abbreviations and single-type suffixes.

Arguments of an overloaded function can have multiple types. This is expressed concisely by concatenating type capitals. For instance, one could define a function Length(SAO) which would calculate the length of a string S (number of characters), an array A (number of entries) or an object O (number of keys).
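
One possible way to write such a `Length(SAO)` function, sketched in TypeScript. The `Dictionary` type alias is an assumption of this sketch, and string length is counted in UTF-16 code units, which matches the number of characters only for simple text.

```typescript
// Assumed shape for a plain object (O) in this sketch.
type Dictionary = { [key: string]: unknown };

// SAO: the argument may be a string (S), an array (A) or an object (O).
function Length(SAO: string | unknown[] | Dictionary): number {
  if (typeof SAO === "string") return SAO.length; // number of characters (UTF-16 code units)
  if (Array.isArray(SAO)) return SAO.length;      // number of entries
  return Object.keys(SAO).length;                 // number of keys
}
```

The argument name alone documents the overload: a caller sees S, A and O and knows which three types are accepted.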

Type suffixes identify variable types concisely and can be appended to any chosen name.

For example, let's assign this glyph: ♫ to a variable named glyph. Then glyphs would be an array [♫, ♫, ♫], whereas glypht would be a textual representation such as "music", and glyphed would be true or false (a boolean) depending on whether something is a glyph.
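
The same example written out as TypeScript variables. How `glyphed` is computed is an assumption of this sketch (here: whether ♫ appears among the known glyphs); the text only says it is a boolean.

```typescript
const glyph: string = "♫";                         // the item itself
const glyphs: string[] = [glyph, glyph, glyph];    // -s: an array of glyphs
const glypht: string = "music";                    // -t: a textual representation
const glyphed: boolean = glyphs.indexOf("♫") >= 0; // -ed: a boolean (assumed test: is "♫" a known glyph?)
```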

There are single-type suffixes as well as multi-type suffixes.

| Suffix | Meaning | Mnemonic |
| --- | --- | --- |
| a | array | a is the first letter of array |
| s | array of | -s forms a plural in English, naturally meaning a collection of items |
| o | object | o is the first letter of object |
| v | dictionary with values of | v is the first letter of values |
| k | dictionary with keys of | k is the first letter of key |
| ed | boolean value | -ed forms a past participle, implying the result of a concluded action |
| n | number/name | n is the first letter of number (and of name) |
| t | text (string) | t is the first letter of text |
| l | html/xml (as text) | l is the last letter of html |
| el | html node (as element) | el is at the beginning of element |
| g | graph | g is the first letter of graph |
| u | url | u is the first letter of url |
| ev | event object | ev is the beginning of event |

The letter e should never be used as a suffix on its own, because it is already a part of other suffixes, and because it can be used as a linking element.

Multi-type suffixes are convenient when naming arguments of conveniently overloaded functions.

| Suffix | Meaning | Mnemonic |
| --- | --- | --- |
| i | item or array thereof | i is the first letter of "item", but also a plural in Italian |
| j | item or object thereof | j is similar to i |
| u | item or undefined | u is the first letter of "undefined" |
| x | item or text thereof | x is text but could be something else |
| yr | item or function that returns it | see function suffix cascade |

| Suffix | Meaning | Mnemonic |
| --- | --- | --- |
| h | not an argument | h is a silent letter in many languages |
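
To illustrate the multi-type suffixes -i and -u, here is a TypeScript sketch of two overloaded functions. Both function names and the "none" fallback are hypothetical choices made for this example.

```typescript
// glyphi: a glyph or an array of glyphs (-i). The result is always an array (-s).
function GlyphiGlyphs(glyphi: string | string[]): string[] {
  return Array.isArray(glyphi) ? glyphi : [glyphi];
}

// glyphu: a glyph or undefined (-u). The result is text (-t),
// with an assumed fallback when nothing was passed.
function GlyphuGlypht(glyphu: string | undefined): string {
  return glyphu === undefined ? "none" : "glyph " + glyphu;
}
```

The suffix on the argument name tells the caller up front that both a single item and its wrapped or missing form are acceptable.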

These suffixes represent a transformation.

| Suffix | Meaning | Mnemonic |
| --- | --- | --- |
| z | item out of array thereof | mirror of s |

The function suffix cascade is useful when naming functions that return other functions, several levels deep. Typical uses arise in currying, function factories and callback pyramids. The letter -r was chosen because it is strongly associated with verbs in Latin languages like French and Spanish.

| Suffix | Depth | Return type |
| --- | --- | --- |
| ar | 0 | item (could be a function) |
| er | 1 | function, that returns an item |
| ir | 2 | function, that returns a function, that returns an item |
| or | 3 | function, that returns a function, that returns a function, that returns an item |

By consistently naming all functions with a specific return type, suffix -ar need never be used, except internally. However, if one wishes to emphasise that a return type is not a function, then suffixing -ar is advised. In practice, suffix -er is the most used and -ir surfaces occasionally. Suffix -or may signal unnecessary complexity, which defeats the vision of *etymological programming*.
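
A TypeScript sketch of the -er and -ir levels of the cascade. The function names are hypothetical, and attaching the suffix directly to the return-type word (String + er) is an assumption of this sketch rather than a rule stated in the text.

```typescript
// -er (depth 1): a function that returns a function, that returns a string.
function PrefixStringer(prefix: string): (string: string) => string {
  return (string: string) => prefix + string;
}

// -ir (depth 2): a function, that returns a function, that returns a function,
// that returns a string.
function SeparatorStringir(separator: string): (prefix: string) => (string: string) => string {
  return (prefix: string) => (string: string) => prefix + separator + string;
}

// The returned function is capitalised too, since it is still a function.
const HelloStringer = PrefixStringer("Hello, ");
```

Each call peels off one level of the cascade: `SeparatorStringir("-")` yields something that behaves like a Stringer, and calling that yields a function returning a plain string.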

Nesting suffixes creates a possessive relation. So texts means an array (-s) containing items of type text, and textsv means a dictionary (-v) containing in each value a texts array. Conversely, textvs means an array of dictionaries, each containing in each value one text item.

With deeper nesting, words will be harder to pronounce. This can signal unnecessary complexity, but occasionally may be useful. Enter the linking e.

As another example, textses means an array containing another array containing text. Here a linking e was added to help pronunciation, even though textss would be valid.
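
The nested names map naturally onto nested types. The sketch below expresses them in TypeScript; the `ValuesOf` alias and the sample data are assumptions made for illustration.

```typescript
// Assumed alias for "dictionary whose values are of type T" (suffix -v).
type ValuesOf<T> = { [key: string]: T };

const texts: string[] = ["a", "b"];                             // array of text
const textsv: ValuesOf<string[]> = { x: ["a"], y: ["b", "c"] }; // dictionary whose values are texts arrays
const textvs: ValuesOf<string>[] = [{ x: "a" }, { y: "b" }];    // array of dictionaries of text
const textses: string[][] = [["a"], ["b", "c"]];                // array of arrays of text (with linking e)
```

Reading the suffixes from left to right gives the type from the inside out: the innermost item comes first, the outermost container last.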

| Prefix | Meaning | Example | Clarification |
| --- | --- | --- | --- |
| Pre | Before, beginning | PrefixString | prepends to start of string |
| Pos | After, ending | PosfixString | appends to end of string |
| Un | Contrary or complementary | UnPrefixString | removes from start of string |
| Ex | Outside | ExfixString | adds to the outside (both ends of string) |
| In | Inside | UnInfixString | removes from inside (not at the ends) |
| Re | Enforce | ReString(AFNO) | coerces arrays, functions, numbers, objects into strings |
| Dis | Distinct | DisArray(A) | deduplicates all items in an array |
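
A few of the example functions from the table, sketched in TypeScript. The table gives only names and one-line clarifications, so the argument names, argument order and exact behaviour below (for instance, leaving the string untouched when the prefix is absent) are assumptions.

```typescript
// Pre: prepends to the start of the string.
function PrefixString(prefix: string, string: string): string {
  return prefix + string;
}

// Pos: appends to the end of the string.
function PosfixString(posfix: string, string: string): string {
  return string + posfix;
}

// Un + Pre: removes the prefix from the start, if it is actually there.
function UnPrefixString(prefix: string, string: string): string {
  return string.indexOf(prefix) === 0 ? string.slice(prefix.length) : string;
}

// Ex: adds to the outside, that is, to both ends of the string.
function ExfixString(exfix: string, string: string): string {
  return exfix + string + exfix;
}

// Dis: keeps only the first occurrence of each item (compared with ===).
function DisArray<T>(A: T[]): T[] {
  return A.filter((item, index) => A.indexOf(item) === index);
}
```

Because `DisArray` relies on `indexOf`, which never finds `NaN`, this sketch drops `NaN` entries altogether; a production version would need to decide how to treat them.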

Idea: uniformise the letter immediately after a prefix. Is it always capital?

These prefixes specify a repeated number of arguments of the same type.

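| Prefix | Number | Usage |
| --- | --- | --- |
| si | 1 | probably not needed |
| bi | 2 | common |
| ter | 3 | rare |
| qua | 4 | rare |
| qui | 5 | (impractical) |
| sen | 6 | (impractical) |
| sep | 7 | (impractical) |
| oct | 8 | (impractical) |
| nov | 9 | (impractical) |
| den | 10 | (impractical) |
| pluri | many | variable number of arguments (could be as low as zero) |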

Obs: bi- and si- are shorthands for bin- and sin-, the actual distributive Latin numeral prefixes.
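
As a sketch of the repetition prefixes, here are a bi- and a pluri- variant of a hypothetical summing function in TypeScript. The names `BiNumberSum` and `PluriNumberSum` are invented for this example, and treating "Sum" as the return word is an assumption.

```typescript
// bi-: exactly two arguments of the same kind (two numbers), returning their sum.
function BiNumberSum(first: number, second: number): number {
  return first + second;
}

// pluri-: any number of arguments of the same kind, possibly none at all.
function PluriNumberSum(...numbers: number[]): number {
  return numbers.reduce((total, next) => total + next, 0);
}
```

Without the prefix, the first function would have to be called NumberNumberSum; the prefix keeps the name short while still stating how many arguments to pass.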

Actually, Un can also be used as a standalone function, e.g. Un(Arrayed) creates a new function that checks whether an item is not an array.
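
A minimal TypeScript sketch of such a standalone `Un`, assuming it simply negates a boolean-returning function. The argument name `Checked` is hypothetical.

```typescript
// Arrayed: the -ed ending marks a boolean result; true when the item is an array.
function Arrayed(item: unknown): boolean {
  return Array.isArray(item);
}

// Un takes a boolean-returning function and returns a new function
// that gives the opposite answer.
function Un(Checked: (item: unknown) => boolean): (item: unknown) => boolean {
  return (item: unknown) => !Checked(item);
}

const UnArrayed = Un(Arrayed);
```

Here `UnArrayed([1, 2])` gives false and `UnArrayed("text")` gives true.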

Idea: shouldn't Un be named Uner? Also, does this idea generalise to other prefixes?

These prefixes may be used freely, but never to refer to the number of arguments.

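| Prefix | Number |
| --- | --- |
| mono | 1 |
| di | 2 |
| tri | 3 |
| tetra | 4 |
| penta | 5 |
| hexa | 6 |
| hepta | 7 |
| octa | 8 |
| ennea | 9 |
| deca | 10 |
| poly | many |

All Greek and Latin prefixes: A-H, H-O and P-Z
Numeral prefixes: https://en.wikipedia.org/wiki/Numeral_prefix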