Cognition
By Preston Pan, 2024

1. The problem
2. Introduction
3. Baremetal Cognition
4. Bootstrapping Takeaways
5. Crank
6. The Stem Dialect, Improved
- 6.1. The Great Escape
7. The Brainfuck Dialect
- 7.1. The Dialect Dialect
8. Theoretical Musings
9. Conclusion

1. The problem

Lisp programmers claim that their system of s-expression code, in addition to its featureful macro system, makes it a metaprogrammable and generalized system. This is of course true, but there's something very broken with lisp: metaprogramming and programming aren't the same thing, meaning there will always be rigid syntax within lisp (its parentheses, or the fact that it needs to have characters that tell lisp to read ahead). The left parenthesis tells lisp that it needs to keep on reading until the right parenthesis in order to finish some process that allows it to stop and evaluate the whole expression. This makes the left and right parentheses unchangeable from within the language (not conceptually, but under some implementations it is not possible), and, more importantly, it makes the process of retroactively changing the sequence in which these tokens are delimited impossible without a heavy amount of string processing. Other languages have other ways in which they need to read ahead when they see a certain token in order to decide what to do. This process of having a program read ahead based on current input is called syntax.

And as long as you read ahead, or assume a default way of reading ahead, you fall into the trap of having some form of syntax. Cognition is different in that it uses an antisyntax that is fully postfix. This has similarities with concatenative programming languages, but concatenative programming languages also suffer from two main problems: first, the introduction of the left and right bracket characters (which are in fact prefix notation, as the left bracket needs to read ahead of the input stream), and second, the quote character for strings. This is unsuitable for such a general language. You can even see the same problem in lisp's C syntax implementation: escape characters everywhere, and awkward must-have spaces that delimit the start and end of certain tokens (and if not, it requires post-processing). The racket programming language has its macro system, but it is not runtime dynamic. It still utilizes preprocessing.

So, what's the precise solution to this conundrum? Well, it's beautiful; but it requires some cognition.

2. Introduction

Cognition is an active research project that Matthew Hinton and I have been working on for the past couple of months. Although my commit history for this project has not been impressive, we came up with a lot of the theory together, working alongside each other in order to achieve one of the most generalized systems of syntax we know of. Let's take a look at the conceptual reason why cognition needs to exist, as well as some baremetal cognition code (you'll see what I mean by this later). There's a paper about the language available in the repository, for those interested. Understanding cognition might require a lot of background in parsing, tokenization, and syntax, but I've done my best to write this in a very understandable way. The repository is available at https://github.com/metacrank/cognition-rust, for your information.

Figure 1: The Cognition programming language, logo designed by Matthew Hinton

3. Baremetal Cognition

Baremetal cognition has a couple of peculiar attributes, and it is remarkably like the Brainfuck programming language. But unlike its look-alike, it has the ability to do some serious metaprogramming. Let's take a look at what the bootstrapping code for a very minimal syntax looks like:

ldfgldftgldfdtgl
df 
 
dfiff1 crank f

And do note the whitespace (line 2 has a whitespace after df, line 3 has a whitespace, and the newlines matter). Erm, okay. What?

So, our goal in this post is to get from a syntax that looks like that to a syntax that looks like Stem. But how on earth does this piece of code even work? Well, we have to introduce two new ideas: delimiters, and ignores.

3.1. Tokenization

Delimiters allow the tokenizer to figure out when one token ends and another begins. The list of delimiter characters is public, allowing that list to be read and modified from within cognition itself. Ignored characters are characters that are completely ignored by the tokenizer in the first stage of every read-eval-print loop; that is, at the start of collecting a token, it first skips any ignored characters. By default, every single character is a delimiter, and no characters are ignored characters. The delimiter list and the ignored-character list each carry a flag that you can toggle to treat the given characters as a blacklist or a whitelist, adding brevity (and practicality) to the language.

Let's take the first line of code as an example:

ldfgldftgldfdtgl

Because of the delimiter and ignored rules set by default, every single character is read as a token, and no character is skipped. We therefore read the first character, l. By default, Cognition works off a stack-based programming language design. If you're not familiar, see the Stem blogpost for more detail (in fact if you're not familiar this won't work as an explanation for you, so you should see it, or read up on the Forth programming language). Though, we call them containers, as they are more general than stacks. Additionally, in this default environment, no word is executed except for special faliases, as we will cover later.

Therefore, the character l gets read in and is put on the stack. Then, the character d is read in and put on the stack. But f is different. In order to execute words in Cognition, we must take a look at the falias system.

3.2. Faliases

Faliases are a list of words that get executed when they are put on the stack, or container as we will call it in the future. All of them in fact execute the equivalent of eval in stem but as soon as they are put on their container. Meaning, when f, the default falias, is run, it doesn't go on the container, but rather executes the top of the container which is d. d changes the delimiter list to the string value of a word, meaning that it changes the delimiters to blacklist only the character l as a delimiter. Everything else by default is a delimiter because everything by default is parsed into single character words.
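To make the falias rule concrete, here is a rough Python model of it (a sketch, not Cognition's actual implementation). The function name run_faliases and the dictionary of builtins are made up for illustration; only d is modelled, and a word that is not a builtin is assumed to evaluate to itself.

```python
def run_faliases(tokens, faliases=("f",)):
    """Push every token, except faliases, which execute the top of the stack."""
    stack = []
    state = {"non_delims": set()}  # the blacklist of delimiters

    def d():
        # d consumes a word and uses its characters as the delimiter list.
        state["non_delims"] = set(stack.pop())

    builtins = {"d": d}  # hypothetical table; only d is modelled here

    for tok in tokens:
        if tok in faliases:
            word = stack.pop()
            if word in builtins:
                builtins[word]()
            else:
                stack.append(word)  # assumption: unknown words evaluate to themselves
        else:
            stack.append(tok)
    return stack, state["non_delims"]


stack, non_delims = run_faliases(["l", "d", "f"])
print(stack, non_delims)
```

Running it on the three tokens l, d, f leaves the stack empty and l as the only blacklisted (non-delimiter) character, which is the state described above.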

3.3. Delimiter Caveats

Delimiters have an interesting rule, and that is that the delimiter character is excluded from the tokenized word unless we have not ignored a character in the tokenization loop, in which case we collect the character as a part of the current token and keep going. This is in contrast to a third kind of tokenization category called the singlet, which includes itself into a token before skipping itself and ending the tokenization collection.

In addition, remember what I said about the blacklist? Well, you can toggle between blacklisting and whitelisting your list of delimiters, singlets, and ignored characters. By default, there are no blacklisted delimiters, no whitelisted singlets, and no whitelisted ignored characters.

We then also observe that all other characters will simply skip themselves while being collected as a part of the current token, without ending this loop, therefore collecting new characters until the loop halts via delimiter or singlet rules.
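The three character classes can be sketched as a small Python tokenizer. This is a model under assumptions, not the real tokenizer: the class names are invented, the first character of a token is always collected (the refinement about previously ignored characters is not modelled), and a character that is both a delimiter and a singlet is treated as a delimiter when it is not first.

```python
class CharSet:
    """A list of characters plus a flag: whitelist (True) or blacklist (False)."""

    def __init__(self, chars="", whitelist=True):
        self.chars = set(chars)
        self.whitelist = whitelist

    def __contains__(self, ch):
        # Whitelist: listed characters match. Blacklist: all other characters match.
        return (ch in self.chars) == self.whitelist


class Tokenizer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.delims = CharSet("", whitelist=False)   # nothing blacklisted: everything delimits
        self.singlets = CharSet("", whitelist=True)  # nothing whitelisted: no singlets
        self.ignores = CharSet("", whitelist=True)   # nothing whitelisted: nothing ignored

    def next_token(self):
        text = self.text
        # Stage 1: skip ignored characters at the start of a token.
        while self.pos < len(text) and text[self.pos] in self.ignores:
            self.pos += 1
        if self.pos >= len(text):
            return None
        # Assumption: the first character is always collected.
        token = text[self.pos]
        self.pos += 1
        if token in self.singlets:
            return token
        while self.pos < len(text):
            ch = text[self.pos]
            if ch in self.delims:
                break              # delimiter ends the token and stays in the input
            self.pos += 1
            token += ch            # ordinary characters and singlets are collected
            if ch in self.singlets:
                break              # a singlet includes itself, then ends the token
        return token

    def tokens(self):
        out = []
        tok = self.next_token()
        while tok is not None:
            out.append(tok)
            tok = self.next_token()
        return out


print(Tokenizer("ldf").tokens())
```

With the defaults, every character comes out as its own token. Blacklisting l as a delimiter (CharSet("l", whitelist=False)) is what lets gl be read as one word later in the bootstrap.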

3.4. Continuing the Bootstrap Code

So far, we looked at this part of the code:

ldf

which simply creates l as a non-delimiter. Now, for the rest of the code:

gldftgldfdtgl
df 
  
dfiff1 crank f

gldf puts gl on the stack due to d being a delimiter, and f is called on it, meaning that now g and l are the only non-delimiters. Then, tgl gets put on the stack and t, g, and l become the non-delimiters with df. dtgl gets put on the stack, and the newline becomes the only non-delimiter with \ndf (yes, the newline is actually a part of the code here, and spaces need to be as well in order for this to work). Then the space character (due to how delimiter rules work: if you don't ignore, the first character is parsed normally even if it is a delimiter) and \n get put on the stack together as a single word. Then, another \ \n word is tokenized (you might not see it, but there's another space on line 3). The current stack looks like this (bottom to top):

3. dtgl
2. [space char]\n
1. [space char]\n

df sets the non-delimiters to \ \n. if (that is, i followed by f) sets the ignores to \ \n, which ignores these characters at the start of tokenization. f executes dtgl, which is a word that toggles the dflag, the flag that stores the whitelist/blacklist distinction for delimiters. Now, all non-delimiters are delimiters and all delimiters are non-delimiters. Finally, we're put in an environment where spaces and newlines are the delimiters for tokens, and they are ignored at the start of tokenizing a token. Next, 1 is tokenized and put on the stack, and then the crank word, which is then executed by f (the 1 token is treated as a number in this case, but everything textual in cognition is a word). We are done with our bootstrapping sequence! Now, you might wonder what crank does. That we will explain in a later section.
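The whole bootstrap can be replayed in a few lines of Python. This is only a model built from the rules described so far: the function name bootstrap and the returned dictionary are invented, singlets are left out, and only the builtins d, i, dtgl and crank are handled. The tokenizer settings are re-read before every token, which is the property that makes the sequence work.

```python
BOOTSTRAP = "ldfgldftgldfdtgl\ndf \n \ndfiff1 crank f\n"


def bootstrap(text):
    pos = 0
    delims, d_whitelist = set(), False  # empty blacklist: every character delimits
    ignores = set()                     # empty whitelist: nothing is ignored
    stack, crank, seen = [], 0, []

    def is_delim(ch):
        return (ch in delims) == d_whitelist

    while True:
        # Skip ignored characters, then collect one token with the current rules.
        while pos < len(text) and text[pos] in ignores:
            pos += 1
        if pos >= len(text):
            break
        token = text[pos]               # first character is always collected
        pos += 1
        while pos < len(text) and not is_delim(text[pos]):
            token += text[pos]
            pos += 1
        seen.append(token)

        if token != "f":                # f is the only falias in this model
            stack.append(token)
            continue
        word = stack.pop()              # f executes the top of the stack
        if word == "d":
            delims = set(stack.pop())
        elif word == "i":
            ignores = set(stack.pop())
        elif word == "dtgl":
            d_whitelist = not d_whitelist
        elif word == "crank":
            crank = int(stack.pop())
        else:
            stack.append(word)

    return {"delims": delims, "delims_whitelist": d_whitelist,
            "ignores": ignores, "crank": crank, "stack": stack, "tokens": seen}


result = bootstrap(BOOTSTRAP)
print(result["tokens"])
print(result["delims_whitelist"], result["crank"])
```

In this model the run ends with space and newline as whitelisted delimiters and as ignored characters, crank set to 1, and an empty stack, matching the walkthrough.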

4. Bootstrapping Takeaways

From this, we see a couple principles: first, cognition is able to change how it tokenizes on the fly and it can do it programmatically, allowing you to program a program in cognition that would theoretically automate the process of changing these delimiters, singlets, and ignores. This is something impossible in other languages, being able to program your own tokenizer for some foreign language from within cognition, and have future code be tokenized exactly like how you want it to be. This is solely possible because the language is postfix and doesn't read ahead, so it doesn't require more than one token to be parsed before an expression is evaluated. Second, faliases allow us to execute words without having to have prefix words or any default execution of words.

5. Crank

The metacrank system allows us to set a default way in which tokens are executed on the stack. The crank word takes a number as its argument and by effect executes the top of the stack for every n words you put on the stack. To make this concept concrete, let's look at some code (running from what we call crank 1 as we set our environment to crank one at the end of the bootstrapping sequence):

5 crank 2crank 2 crank
1 crank unglue swap quote prepose def

The crank 1 environment allows us to stop using f in order to evaluate tokens. Instead, every single token that is tokenized is evaluated. Since we programmed in a newline and space-delimited syntax, we can safely interpret this code intuitively.

The code begins by trying to evaluate 5, which evaluates to itself as it is not a builtin. crank evaluates and puts us in 5 crank, meaning every 5th token evaluates from here on. 2crank, 2, crank, 1 are all put on the stack, leaving us with a stack that looks like so (notice that crank doesn't get executed even though it is a builtin because we set ourselves to using crank 5):

4. 2crank
3. 2
2. crank
1. 1

crank is the 5th word, so it executes. Note that this puts us back in crank 1, meaning every word is evaluated. unglue is a builtin that gets the value of the word at the top of the stack (as 1 is used up by the crank we evaluated), and so it gets the value of crank, which is a builtin. What that in effect does is it gets the function pointer associated with the crank builtin. Our new stack looks like this:

3. 2crank
2. 2
1. [CLIB]

Where CLIB is our function pointer that points to the crank builtin. We then swap:

3. 2crank
2. [CLIB]
1. 2

then quote, a builtin that quotes the top thing on the stack:

3. 2crank
2. [CLIB]
1. [2]

then prepose, a builtin like compose in stem, except that it preposes and that it puts things in what we call a VMACRO:

2. 2crank
1. ( [2] [CLIB] )

then we call def. This defines a word 2crank that puts 2 on the stack and then calls a function pointer pointing us to the crank builtin. Now, we still have to define what VMACROs are, and in order to do that we might have to explain some differences between the cognition stack and the stem stack.
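The counting part of crank can be modelled in Python as follows. It is a sketch under assumptions: the name run_crank is invented, only the crank builtin is modelled (unglue, swap, quote, prepose and def are left out), the counter restarts whenever a token is evaluated, and non-builtin words evaluate to themselves.

```python
def run_crank(tokens, crank=1):
    """Push each token; evaluate the top of the stack on every crank-th token."""
    stack = []
    state = {"crank": crank, "count": 0}

    def do_crank():
        state["crank"] = int(stack.pop())

    builtins = {"crank": do_crank}  # hypothetical table; only crank is modelled

    for tok in tokens:
        stack.append(tok)
        state["count"] += 1
        if state["crank"] and state["count"] >= state["crank"]:
            state["count"] = 0       # assumption: the counter restarts on evaluation
            word = stack.pop()
            if word in builtins:
                builtins[word]()
            else:
                stack.append(word)   # e.g. 5 evaluates to itself
    return stack, state["crank"]


stack, crank = run_crank("5 crank 2crank 2 crank 1 crank".split())
print(stack, crank)
```

Starting in crank 1, the model stops with 2crank, 2 and crank on the stack and crank back at 1, which is the point where unglue takes over in the walkthrough. Crank 0 never evaluates anything, so tokens just accumulate.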

5.1. Differences

In the stem stack, putting words on the stack directly is allowed. In cognition, words are put in containers when they are put on the stack and not evaluated. This means words like compose in stem work on words (or more accurately containers with a single word in them) as well as other containers, making the API for this language more consistent. Additionally, words like cd make use of this concept, as we will see.

5.1.1. Macros

Macros are another difference between stem quotes and cognition containers. When macros are evaluated, everything in the macro is evaluated, ignoring the crank. If bound to a word, evaluating that word evaluates the macro which will ignore the crank completely and will only increment the cranker by one, while evaluating each statement in the macro. They are useful for making crank-agnostic code, and expanding macros is very useful for the purpose of optimization, although we will actually have to write the word expand from more primitive words later on (hint: it uses recursive unglue).

5.2. More Code

Here is the rest of the code in bootstrap.cog in coglib/:

getd dup _ concat _ swap d i 
_quote_swap_quote_compose_swap_dup_d_i eval 

2crank ing 0 crank spc
2crank ing 1 crank swap quote def
2crank ing 0 crank endl
2crank ing 1 crank swap quote def
2crank ing 1 crank
2crank ing 3 crank load ../coglib/ quote
2crank ing 2 crank swap unglue concat unglue fread unglue evalstr unglue
2crank ing 1 crank compose compose compose compose VMACRO cast def
2crank ing 1 crank
2crank ing 1 crank getargs 1 split swap drop 1 split drop
2crank ing 1 crank
2crank ing 1 crank epop drop
2crank ing 1 crank INDEX spc OUT spc OF spc RANGE
2crank ing 1 crank concat concat concat concat concat concat =
2crank ing 1 crank
2crank ing 1 crank missing spc filename concat concat dup endl concat
2crank ing 1 crank swap quote swap quote compose
2crank ing 2 crank print compose exit compose
2crank ing 1 crank
2crank ing 0 crank fread evalstr
2crank ing 1 crank compose
2crank ing 1 crank
2crank ing 1 crank if

Okay, well, the syntax still doesn't look so good, and it's still pretty hard to get what this is doing. But the basic idea is that 2crank is a macro and is therefore crank agnostic, and we guarantee its execution with ing, another falias (because it's funny). Then, we execute an n crank, which standardizes what crank each line is in (you might wonder what ing and f's interaction is with the cranker. It actually just guarantees the evaluation of the previous thing, so if the previous thing already evaluated f and ing both do nothing). In any case, this defines words that are useful, such as load, which loads something from the coglib. It does this by compose-ing things into quotes and then def-ing those quotes.

The crank, and by extension, the metacrank system is needed in order to discriminate between evaluating some tokens and storing others for metaprogramming without having to use f, while also keeping the system postfix. Crank is just one word that allows for this type of behavior; the more general word, metacrank, allows for much more interesting kinds of syntax manipulation. We have examples of metacrank down the line, but for now I should explain the metacrank word.

5.3. Metacrank

n m metacrank sets a periodic evaluation m for an element n items down the stack. The crank word is therefore equivalent to 0 m metacrank. Only one token can be evaluated per tokenized token, although every metacrank is incremented per token, where lower metacranks get priority. This means that if you set two different metacranks, only one of them can execute per token tokenized, and the lower metacrank gets priority. Note that metacrank and, by extension, crank, don't just depend on tokenized words; they also work while evaluating word definitions recursively, meaning if a word is evaluated in 2 crank, one out of two words will execute in each level of the evaluation tree. You can play around with this in the repl to get a sense of how it works: run ../crank bootstrap.cog repl.cog devel.cog load in the coglib folder, and use stem like syntax in order to define a function. Then, run that function in 2 crank. You will see how the evaluation tree respects cranking in the same way that the program file itself does.
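A rough Python model of the metacrank rule is given here. Several details are assumptions rather than facts about the implementation: the name run_metacrank, the made-up builtin neg, counters that wrap modulo their period even when they lose priority, and the rule that an element evaluated n items down is removed from the stack before it runs.

```python
def run_metacrank(tokens, metacranks):
    """metacranks maps depth n -> period m, as in `n m metacrank`."""
    stack = []
    counters = {depth: 0 for depth in metacranks}

    def neg():
        # Hypothetical builtin for the demo: negate the number on top of the stack.
        stack.append(str(-int(stack.pop())))

    builtins = {"neg": neg}

    for tok in tokens:
        stack.append(tok)
        fired = False
        for depth in sorted(metacranks):       # lower metacranks get priority
            period = metacranks[depth]
            if period == 0:
                continue
            counters[depth] = (counters[depth] + 1) % period  # every counter advances
            if counters[depth] != 0 or fired or depth >= len(stack):
                continue
            fired = True                        # only one evaluation per token
            index = len(stack) - 1 - depth
            word = stack.pop(index)
            if word in builtins:
                builtins[word]()
            else:
                stack.insert(index, word)       # evaluates to itself: put it back
    return stack


print(run_metacrank(["neg", "3", "neg", "4"], {1: 2}))
print(run_metacrank(["neg", "3"], {0: 2, 1: 2}))
```

With only metacrank 1 set to period 2, neg behaves like a prefix operator: it runs once its argument has been read. With metacranks 0 and 1 both at period 2, the lower one wins and neg is never executed.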

Metacrank allows for not only metaprogramming in the form of code building, but also direct syntax manipulation (i.e. I want to execute this token once I have read n other token(s)). The advantages to this system compared to other programming languages' systems are clear: you can program a prefix word and undef it when you want to rip out that part of syntax. You can write a prefix character that doesn't stop at an ending character but always stops when you read a certain number of tokens. You can feed user input into a math program and feed the output into a syntax system like metacrank. The possibilities are endless! And with that, we will slowly build up the stem programming language, v2, now with macros and from within our own cognition.

6. The Stem Dialect, Improved

In this piece of code, we define the comment:

2crank ing 0 crank ff 1
2crank ing 1 crank cut unaliasf
2crank ing 0 crank 0
2crank ing 1 crank cut swap quote def
2crank ing 0 crank
2crank ing 0 crank #
2crank ing 0 crank geti getd gets crankbase f d f i endl s
2crank ing 1 crank compose compose compose compose compose compose compose compose compose
2crank ing 0 crank drop halt crank s d i
2crank ing 1 crank compose compose compose compose compose VMACRO cast quote compose
2crank ing 0 crank halt 1 quote ing 1 quote ing metacrank
2crank ing 1 crank compose compose compose compose VMACRO cast
2crank ing 1 crank def
2crank ing 2 crank # singlet # delim
2crank ing 1 crank #comment: geti getd gets crankbase '' d '' i '\n' s ( drop halt crank s d i ) halt 1 1 metacrank

and it is our first piece of code that builds something truly prefix. The comment character is a prefix that drops all the text before the newline character, which is a type of word that tells the parser to read ahead. This is our first indication that everything that we thought was possible within cognition truly is.

But before that, we can look at the first couple of lines:

2crank ing 0 crank ff 1
2crank ing 1 crank cut unaliasf
2crank ing 0 crank 0
2crank ing 1 crank cut swap quote def
2crank ing 0 crank

which simply unaliases f from the falias list, with ing being the only remaining falias. In cognition, even these faliases are changeable.

Since we can't put f directly on the stack (if we try by just using f, it would execute), we instead utilize some very minimal string processing to do it, putting ff on the stack and then cutting the string in half to get two copies of f. We then want f to mean false, which in cognition is just an empty word. Therefore, we make an empty word by calling 0 cut on this string, and then def-ing f to the empty string. The following code is where the comment is defined:
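The string trick can be followed step by step in Python. This is a model only: it assumes cut leaves the left part below the right part on the stack and that def takes the name below the body, which is consistent with the code shown but not confirmed here.

```python
def cut(word, n):
    """Split a word into the part before index n and the part from n onward."""
    return word[:n], word[n:]


stack = ["ff"]
faliases = {"f", "ing"}
definitions = {}

# ff 1 cut  -> two copies of f, without ever tokenizing a bare f
left, right = cut(stack.pop(), 1)
stack += [left, right]

# unaliasf  -> remove f from the falias list
faliases.discard(stack.pop())

# 0 cut     -> an empty word and the remaining f
left, right = cut(stack.pop(), 0)
stack += [left, right]

# swap quote def -> bind f to the empty word, which is false in cognition
stack[-1], stack[-2] = stack[-2], stack[-1]
body = stack.pop()
name = stack.pop()
definitions[name] = body

print(faliases, definitions)
```

After these steps ing is the only falias left, and f is an ordinary word defined as the empty string.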

2crank ing 0 crank #
2crank ing 0 crank geti getd gets crankbase f d f i endl s
2crank ing 1 crank compose compose compose compose compose compose compose compose compose
2crank ing 0 crank drop halt crank s d i
2crank ing 1 crank compose compose compose compose compose VMACRO cast quote compose
2crank ing 0 crank halt 1 quote ing 1 quote ing metacrank
2crank ing 1 crank compose compose compose compose VMACRO cast
2crank ing 1 crank def
2crank ing 2 crank # singlet # delim
2crank ing 1 crank #comment: geti getd gets crankbase '' d '' i '\n' s ( drop halt crank s d i ) halt 1 1 metacrank

Relevant: halt just puts you in 0 for all metacranks, and VMACRO cast just turns the top thing on the stack from a container to a macro. geti, getd, and gets get the ignores, delims, and singlets respectively as a string; drop is dsc in stem. singlet and delim set the singlets and delimiters. endl is defined within bootstrap.cog and just puts the newline character as a word on the stack. crankbase gets the current crank.

We call a lot of compose words in order to build this definition, and we make the # character a singlet delimiter in order to allow for spaces after the comment. In the # definition, we alter the tokenization rules so that everything until a newline is tokenized as a single token, and then put ourselves in 1 1 metacrank, so that the rest of the definition runs on said token: it drops the comment and gets us back into the original crank and metacrank. Thus, the brilliant # character is written, operating on a token that is tokenized in the future, with complete default postfix syntax. With the information above, one can work out the specifics of how it works; the point is that it does, and one can test that it does by going into the coglib folder and running ../crank bootstrap.cog repl.cog devel.cog load, which will load the REPL and load devel.cog, which will in turn load comment.cog.
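The shape of the comment mechanism can be sketched in Python: save the tokenizer settings, change them so the rest of the line is one token, drop that token, restore. The function name strip_comments is invented, and this model assumes # is both a singlet and a delimiter next to whitespace delimiters. The crank bookkeeping of the real definition is reduced to a single pending-restore slot.

```python
def strip_comments(text):
    """Tokenize whitespace-delimited text, dropping everything from # to end of line."""
    pos = 0
    delims, ignores, singlets = set(" \n#"), set(" \n"), set("#")
    stack = []
    saved = None  # tokenizer settings to restore after the comment token

    while True:
        while pos < len(text) and text[pos] in ignores:
            pos += 1
        if pos >= len(text):
            break
        token = text[pos]
        pos += 1
        if token not in singlets:
            while pos < len(text):
                ch = text[pos]
                if ch in delims:
                    break
                pos += 1
                token += ch
                if ch in singlets:
                    break

        if saved is not None:
            # This token is the comment body: drop it and restore the old rules.
            delims, ignores, singlets = saved
            saved = None
        elif token == "#":
            # Like `'' d '' i '\n' s`: no delimiters, no ignores, newline as singlet.
            saved = (delims, ignores, singlets)
            delims, ignores, singlets = set(), set(), set("\n")
        else:
            stack.append(token)
    return stack


print(strip_comments("1 2 # this is a comment\n3# another\n4"))
```

The word # never reads ahead itself. It only rearranges the tokenizer so that the next token happens to be the whole comment, and the deferred half of the definition throws that token away.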

6.1. The Great Escape

Here, we accelerate our way out of this primitive syntax, and it all starts with the great escape character. We make many great leaps in this section that aren't entirely explained for the sake of brevity, but you are free to play around with all of these things by using the repl. In any case, I hope you will enjoy this great leap in syntax technology; by the end, we will have reached something with real structure.

Here we define a preliminary prefix escape character. Also you will notice that 2crank ing 0 crank is used as padding between lines:

2crank ing 2 crank comment.cog load
2crank ing 0 crank
2crank ing 1 crank # preliminary escape character \
2crank ing 1 crank \
2crank ing 0 crank halt 1 quote ing crank
2crank ing 1 crank compose compose
2crank ing 2 crank VMACRO cast quote eval
2crank ing 0 crank halt 1 quote ing dup ing metacrank
2crank ing 1 crank compose compose compose compose
2crank ing 2 crank VMACRO cast
2crank ing 1 crank def
2crank ing 0 crank
2crank ing 0 crank

This allows for escaping so that we can put something on the stack even if it is to be evaluated, but we want to redefine this character eventually to be compatible with stem-like quotes. We're even using our comment character in order to annotate this code by now! Here is the full quote definition (once we have this definition, we can use it to improve itself):

2crank ing 0 crank [
2crank ing 0 crank
2crank ing 1 crank # init
2crank ing 0 crank crankbase 1 quote ing metacrankbase dup 1 quote ing =
2crank ing 1 crank compose compose compose compose compose
2crank ing 0 crank
2crank ing 1 crank # meta-crank-stuff0
2crank ing 3 crank dup ] quote =
2crank ing 1 crank compose compose
2crank ing 16 crank drop swap drop swap 1 quote swap metacrank swap crank quote
2crank ing 3 crank compose dup quote dip swap
2crank ing 1 crank compose compose compose compose compose compose compose compose
2crank ing 1 crank compose compose compose compose compose \ VMACRO cast quote compose
2crank ing 3 crank compose dup quote dip swap
2crank ing 1 crank compose compose compose \ VMACRO cast quote compose \ if compose
2crank ing 1 crank \ VMACRO cast quote quote compose
2crank ing 0 crank
2crank ing 1 crank # meta-crank-stuff1
2crank ing 3 crank dup ] quote =
2crank ing 1 crank compose compose
2crank ing 16 crank drop swap drop swap 1 quote swap metacrank swap crank
2crank ing 1 crank compose compose compose compose compose compose compose compose \ VMACRO cast quote compose
2crank ing 3 crank compose dup quote dip swap
2crank ing 1 crank compose compose compose \ VMACRO cast quote compose \ if compose
2crank ing 1 crank \ VMACRO cast quote quote compose
2crank ing 0 crank
2crank ing 1 crank # rest of the definition
2crank ing 16 crank if dup stack swap 0 quote crank
2crank ing 2 crank 1 quote 1 quote metacrank
2crank ing 1 crank compose compose compose compose compose compose compose compose
2crank ing 1 crank compose \ VMACRO cast
2crank ing 0 crank
2crank ing 1 crank def

Um, it's quite the spectacle how Matthew Hinton ever came up with this thing, but alas, it exists. Then, we use it in order to redefine itself, but better as the old quote definition can't do recursive quotes (we can do this because the definition is used before you redefine the word due to postfix def, a development pattern seen often in low level cognition):

\ [

[ crankbase ] [ 1 ] quote compose [ metacrankbase dup ] compose [ 1 ] quote compose [ = ] compose

[ dup ] \ ] quote compose [ = ] compose
[ drop swap drop swap ] [ 1 ] quote compose [ swap metacrank swap crank quote compose ] compose
[ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose
[ dup dup dup ] \ [ quote compose [ = swap ] compose \ ( quote compose [ = or swap ] compose \ \ quote compose [ = or ] compose
[ eval ] quote compose
[ compose ] [ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose [ if ] compose \ VMACRO cast
quote compose [ if ] compose \ VMACRO cast quote quote

[ dup ] \ ] quote compose [ = ] compose
[ drop swap drop swap ] [ 1 ] quote compose [ swap metacrank swap crank ] compose \ VMACRO cast quote compose
[ dup dup dup ] \ [ quote compose [ = swap ] compose \ ( quote compose [ = or swap ] compose \ \ quote compose [ = or ] compose
[ eval ] quote compose
[ compose ] [ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose [ if ] compose \ VMACRO cast
quote compose [ if ] compose \ VMACRO cast quote quote

compose compose [ if dup stack swap ] compose [ 0 ] quote compose [ crank ] compose
[ 1 ] quote dup compose compose [ metacrank ] compose \ VMACRO cast

def

Okay, so now we can use recursive quoting, just like in stem. But there are still a couple things missing that we probably want: a good string quote implementation, and probably escape characters that work in the brackets. Also, since Cognition utilizes macros, we probably want a way to notate those as well, and we probably want a way to expand macros. We can do all of that! First, we will have to redefine \ once more:

\ \
[ [ 1 ] metacrankbase [ 1 ] = ]
[ halt [ 1 ] [ 1 ] metacrank quote compose [ dup ] dip swap ]
\ VMACRO cast quote quote compose
[ halt [ 1 ] crank ] VMACRO cast quote quote compose
[ if halt [ 1 ] [ 1 ] metacrank ] compose \ VMACRO cast
def

This piece of code defines the bracket but for macros (split just splits a list into two):

\ (
\ [ unglue
[ 11 ] split swap [ 10 ] split drop [ macro ] compose
[ 18 ] split quote [ prepose ] compose dip
[ 17 ] split eval eval
[ 1 ] del [ \ ) ] [ 1 ] put
quote quote quote [ prepose ] compose dip
[ 16 ] split eval eval
[ 1 ] del [ \ ) ] [ 1 ] put
quote quote quote [ prepose ] compose dip
prepose
def

We want these macros to automatically expand because it's more efficient to bind already expanded macros to words, and they functionally evaluate identically (isdef just returns a boolean where true is a non-empty string, false is an empty string, if a word is defined):

\ (
( crankbase [ 1 ] metacrankbase dup [ 1 ] =
  [ ( dup \ ) =
      ( drop swap drop swap [ 1 ] swap metacrank swap crank quote compose ( dup ) dip swap )
      ( dup dup dup \ [ = swap \ ( = or swap \ \ = or
        ( eval )
        ( dup isdef ( unglue ) [ ] if compose ( dup ) dip swap )
        if )
      if ) ]
  [ ( dup \ ) =
      ( drop swap drop swap [ 1 ] swap metacrank swap crank )
      ( dup dup dup \ [ = swap \ ( = or swap \ \ = or
        ( eval )
        ( dup isdef ( unglue ) [ ] if compose ( dup ) dip swap )
        if )
      if ) ]
  if dup macro swap
  [ 0 ] crank [ 1 ] [ 1 ] metacrank ) def

and you can see that as we define more things, our language is beginning to look more or less like it has syntax! In this quote.cog file which we have been looking at, there are more things, but the bulk of it is pretty much done. From here on, I will just explain the syntax programmed by quote.cog instead of showing the specific code.

As an example, here is expand:

# define basic expand (works on nonempty macros only)
[ expand ]
( macro swap
  ( [ 1 ] split
    ( isword ( dup isdef ( unglue ) ( ) if ) ( ) if compose ) dip
    size [ 0 ] > ( ( ( dup ) dip swap ) dip swap eval ) ( ) if )
  dup ( swap ( swap ) dip ) dip eval drop swap drop ) def

# complete expand (checks for definitions within child first without copying hashtables)
[ expand ]
( size [ 0 ] > ( type [ VSTACK ] = ) ( return ) if ?
  ( macro swap
    macro
    ( ( ( size dup [ 0 ] > ) dip swap ) dip swap
      ( ( ( 1 - dup ( vat ) dip swap ( del ) dip ) dip compose ) dip dup eval )
      ( drop swap drop )
      if ) dup eval
    ( ( [ 1 ] split
        ( isword
          ( compose cd dup isdef
            ( unglue pop )
              ( pop dup isdef ( unglue ) ( ) if )
            if ) ( ) if
          ( swap ) dip compose swap ) dip
        size [ 0 ] > ) dip swap
      ( dup eval ) ( drop drop swap compose ) if ) dup eval )
  ( expand )
  if ) def

Which recursively expands word definitions inside a quote or macro, using the word unglue. We've used the expand word in order to redefine itself in a more general case.
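The idea behind expand can be shown with a small Python function. It is a sketch, not a translation of the Cognition code: macros are modelled as Python lists, the definition table as a dictionary, nested containers are assumed to keep their nesting, and recursive definitions are assumed not to occur (they would loop forever here).

```python
def expand(macro, defs):
    """Replace every defined word in a macro with its (expanded) definition."""
    out = []
    for item in macro:
        if isinstance(item, list):
            out.append(expand(item, defs))        # expand inside nested containers
        elif item in defs:
            out.extend(expand(defs[item], defs))  # splice the definition in place
        else:
            out.append(item)                      # undefined words stay as they are
    return out


# Hypothetical definitions for the demo.
defs = {
    "2crank": ["2", "crank"],
    "twice": ["2crank", "2crank"],
}
print(expand(["twice", ["2crank"], "dup"], defs))
```

After expansion no lookups are left to do at run time, which is why binding already expanded macros to words is the more efficient choice.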

7. The Brainfuck Dialect

And returning to whence we came, we define the Brainfuck dialect with our current advanced stem dialect:

comment.cog load
quote.cog load

[ ] [ ] [ 0 ]

[ > ] [[ swap [[ compose ]] dip size [ 0 ] = [ [ 0 ] ] [[ [ 1 ] split swap ]] if ]] def
[ < ] [[ prepose [[ size dup [ 0 ] = [ ] [[ [ 1 ] - split ]] if ]] dip swap ]] def
[ + ] [[ [ 1 ] + ]] def
[ - ] [[ [ 1 ] - ]] def
[ . ] [[ dup char print ]] def
[ , ] [[ drop read byte ]] def

[ pick ] ( ( ( dup ) dip swap ) dip swap ) def
[ exec ] ( ( [ 1 ] * dup ) dip swap [ 0 ] = ( drop ) ( dup ( evalstr ) dip \ exec ) if ) def

\ [ (
  ( dup [ \ ] ] =
    ( drop swap - [ 1 ] * dup [ 0 ] =
      ( drop swap drop halt [ 1 ] crank exec )
      ( swap [ \ ] ] concat pick )
      if )
    ( dup [ \ [ ] =
      ( concat swap + swap pick )
      ( concat pick )
      if )
    if )
  dup [ 1 ] swap f swap halt [ 1 ] [ 1 ] metacrank
) def

><+-,.[] dup ( i s itgl f d ) eval

Test with ../crank -s 2 bootstrap.cog helloworld.bf brainfuck.cog. You may of course load your favorite brainfuck file with this method. Note that brainfuck.cog isn't a brainfuck parser in the ordinary sense; it actually defines brainfuck words and tokenizes brainfuck, running it in the native cognition environment.
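The approach taken by the [ word (collect characters while counting bracket depth, then evaluate the collected string for as long as the current cell is nonzero) can be sketched in Python. This is a model, not a port of brainfuck.cog: the name run_bf is invented, the tape is kept as left list, current cell and right list like the three containers above, cells do not wrap, and brackets are assumed to be balanced.

```python
def run_bf(src, inp=""):
    left, cur, right = [], 0, []   # like [ ] [ ] [ 0 ] in the Cognition version
    out = []
    pending = list(inp)

    def eval_str(code):
        nonlocal cur
        i = 0
        while i < len(code):
            c = code[i]
            i += 1
            if c == ">":
                left.append(cur)
                cur = right.pop() if right else 0
            elif c == "<":
                right.append(cur)
                cur = left.pop() if left else 0
            elif c == "+":
                cur += 1
            elif c == "-":
                cur -= 1
            elif c == ".":
                out.append(chr(cur))
            elif c == ",":
                cur = ord(pending.pop(0)) if pending else 0
            elif c == "[":
                # Collect the loop body as a string, tracking nesting depth.
                depth, body = 1, ""
                while depth:
                    ch = code[i]
                    i += 1
                    if ch == "[":
                        depth += 1
                    elif ch == "]":
                        depth -= 1
                    if depth:
                        body += ch
                # Evaluate the collected string while the current cell is nonzero.
                while cur != 0:
                    eval_str(body)

    eval_str(src)
    return "".join(out)


print(run_bf("++++++++[>++++++++<-]>+."))
```

The sample program multiplies 8 by 8, adds 1, and prints the character with code 65.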

It's very profound, as well, how our current syntax allows us to define an alternate syntax with great ease. It might make you wonder if it's possible to specifically craft a syntax whose job is to write other syntaxes. Another interesting observation you might have is that Cognition defines syntax by defining a prefix character as a word that uses metacrank, rather than reading symbols and deciding what to do based on symbols. It's almost as if the syntax becomes inherent to the word that's being defined.

These two ideas synthesize to create something truly exciting, but that hasn't yet been implemented in the standard library (though we very much know that it is possible). Introducing: the dialect dialect of Cognition…

7.1. The Dialect Dialect

Imagine a word mkprefix, that takes two input words (say for example [ and ]), and an operation, and automatically defines [ to apply said operation until it hits a ] character. This is possible because constructs like metacrank and def are all just regular words, so it's possible to use them as words to metaprogram with. In fact, everything is just a word (even d, i, and s), so you can imagine a hyperabstract dialect that includes words like mkprefix, using syntax to automate the process of implementing more syntax. Such a construct I have not encountered in any other programming language. Yet, in your own Cognition, you can make nearly anything a reality.
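Since mkprefix does not exist in the standard library yet, the following Python sketch is purely hypothetical. It only shows the shape of the idea: a function that registers an opening word, a closing word and an operation, so that new prefix syntax is created by calling an ordinary function. All names are invented, nesting is not handled, and the operation is applied once to the collected tokens rather than incrementally.

```python
def make_interpreter():
    prefixes = {}

    def mkprefix(opener, closer, operation):
        """Register opener so it collects tokens up to closer and applies operation."""
        prefixes[opener] = (closer, operation)

    def run(tokens):
        stack = []
        it = iter(tokens)
        for tok in it:
            if tok in prefixes:
                closer, operation = prefixes[tok]
                collected = []
                for inner in it:          # keep reading from the same token stream
                    if inner == closer:
                        break
                    collected.append(inner)
                stack.append(operation(collected))
            else:
                stack.append(tok)
        return stack

    return mkprefix, run


mkprefix, run = make_interpreter()
mkprefix("[", "]", list)        # quote: keep the tokens as a container
mkprefix('"', '"', " ".join)    # string: glue the tokens back together
print(run('a [ b c ] " hello world " d'.split()))
```

Two calls to mkprefix are enough to give this toy language both quotes and strings, which is the kind of automation the dialect dialect is after.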

Such creative things Matthew Hinton and I have discussed as possibilities regarding the standard library. Right now, the standard library has metawords that generate abstract words automatically and call them. This is possible through string concatenation and using def in the definition of another word also (this is also possible in my prior programming language Stem). We have discussed the possibility of a word that searches for word-generators to abstract its current wordlist automatically, and we have talked about the possibility of directing this abstraction framework for the purpose of solving a problem. These are conceptually possible words to write within cognition, and this might give you an idea of how powerful this idea is.

8. Theoretical Musings

There are a couple of things about Cognition that make it interesting beyond its quirks. For instance, string processing in this language is equivalent to tokenizer postprocessing, which makes string operations inherently extremely powerful in this language. It also has potential applications in Symbolic AI and in syntax and grammar research, where prototypes of languages and metalanguages can be tested with ease. I'd imagine that anyone configuring a program that reads a configuration file would really want their configuration language to be something like this, where they can have full freedom over the syntax (and metasyntax) in which they program in (think about a Cognition based shell, or a Cognition based operating system!). Though, the point of working on this language was never its applications; its intrinsic beauty is its own philosophical statement.

9. Conclusion

You can imagine cognition can program basically any syntax you would want, and in this article, we demonstrate the power of the already existing code that makes cognition work. In short, the system allows for true syntax as code, as my friend Andrei put it; one can dynamically program and even automate the production of syntax. In this article, we didn't have the space to cover other important Cognition concepts like the Metastack and words like cd, but this can be done in a part 2 of this blog post. For now, let's leave off here, and we can meet here once more for a part two.
