OK, the title is a joke on the titles of most academic papers (with the obligatory colon), but, as with most of them, you might actually find the results useful, or at least interesting.


This force-directed graph visualizes most of the relationships among Beanish glyphs. It was made with Graphviz, from a source .dot file generated by a Python script. I used the results of the maximum-likelihood classifiers I ran yesterday, with 2 to 8 groups: each classifier adds or subtracts a score (the value of the uniform distribution for that classifier, for example 0.25 for the one with four classes) from each glyph-to-glyph edge. Only edges with a positive final score are shown.
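To make the scoring concrete, here is a minimal sketch of how such edge scores could be accumulated. It is not the original script. In particular, it assumes that a classifier with k classes adds 1/k to an edge when it puts both glyphs in the same class and subtracts 1/k otherwise; the function names, the `assignments` structure, and the toy labels are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def edge_scores(assignments):
    """Accumulate a score for every glyph pair.

    `assignments` maps k (the number of classes) to a dict of
    glyph -> class label produced by the k-class classifier.
    ASSUMPTION: same class adds 1/k, different class subtracts 1/k.
    """
    scores = defaultdict(float)
    for k, labels in assignments.items():
        weight = 1.0 / k  # value of the uniform distribution over k classes
        for a, b in combinations(sorted(labels), 2):
            if labels[a] == labels[b]:
                scores[(a, b)] += weight
            else:
                scores[(a, b)] -= weight
    return dict(scores)


def positive_edges(scores):
    """Keep only the edges whose final score is positive."""
    return {pair: s for pair, s in scores.items() if s > 0}


# Toy example with placeholder glyph names and only two classifiers.
toy = {
    2: {"A": 0, "B": 0, "C": 1},
    4: {"A": 0, "B": 0, "C": 0},
}
scores = edge_scores(toy)
edges = positive_edges(scores)
```

With the real data, `assignments` would hold seven entries, one for each classifier from 2 to 8 classes.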

The size of each glyph indicates its relative frequency. The proximity between glyphs indicates their statistical proximity in terms of grouping, which may or may not mirror a linguistic proximity: the closer the glyphs, the more alike they are. Glyphs not connected by an edge do not show much similarity: for example, ᖽ and ᘝ are part of the same group on the left, but, as there is no edge linking them (i.e., the final score was not positive), they don’t seem to be related. The colors of the nodes (i.e., glyphs) carry no meaning in themselves, but glyphs that were grouped together by the 7-class classifier (which I guess to be the one that most closely mirrors linguistic features) share the same color.
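A rough sketch of how these visual conventions could be written to a .dot file follows. Again, this is not the original script: the function name, the font-size range, the palette, and the choice of `len=1/score` to pull similar glyphs closer are assumptions made for illustration. `fontsize`, `fillcolor`, `style`, and `len` are standard Graphviz attributes (`len` is honored by the `neato` and `fdp` layout engines).

```python
def to_dot(edges, freq, group, palette):
    """Build the text of an undirected Graphviz graph.

    edges:   dict (glyph_a, glyph_b) -> score; only positive scores are drawn
    freq:    dict glyph -> frequency, used to scale the node label size
    group:   dict glyph -> class index from one classifier (e.g. the 7-class one)
    palette: list of Graphviz color names, one per class (arbitrary choice)
    """
    lines = ["graph beanish {", "  node [shape=circle, style=filled];"]
    top = max(freq.values())
    for glyph in sorted(freq):
        # ASSUMPTION: font size grows linearly from 10 to 40 points.
        size = 10 + 30 * freq[glyph] / top
        color = palette[group[glyph] % len(palette)]
        lines.append(f'  "{glyph}" [fontsize={size:.1f}, fillcolor="{color}"];')
    for (a, b), score in sorted(edges.items()):
        if score > 0:
            # ASSUMPTION: higher score -> shorter preferred edge length.
            lines.append(f'  "{a}" -- "{b}" [len={1.0 / score:.2f}];')
    lines.append("}")
    return "\n".join(lines)


# Toy example with placeholder glyph names.
dot = to_dot(
    edges={("A", "B"): 0.75, ("A", "C"): -0.25},
    freq={"A": 10, "B": 5, "C": 1},
    group={"A": 0, "B": 0, "C": 1},
    palette=["lightblue", "salmon"],
)
```

Saving `dot` to a file such as `glyphs.dot` and running `neato -Tpng glyphs.dot -o glyphs.png` would render it with one of Graphviz's force-directed engines. Which engine produced the graph shown here is not stated, so `neato` is only a guess.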

Some initial comments:

  • We immediately note three separate groups. It would be perfect if they mirrored three linguistic groups (for example, vowels, semivowels, consonants), but a quick check of the corpus shows that, unfortunately, this is not the case. Still, there are strong statistical indications that there are, in fact, three different groups of glyphs.
  • The diacritics probably do form a group or sub-group of glyphs, but the comma (the small, lunar lower diacritic) is probably not part of it. Maybe it is something like the iota subscript in Greek (i.e., an alternative representation for a glyph)?
  • The group on the left is far more common than the other two.