These tables collect statistics on the diacritics. Diacritics are loosely defined here as symbols that never show up alone and that are graphically shorter than the letters. The symbol c is ambiguous: it looks like a diacritic but does not seem to depend on any particular other symbol. If we explore the hypothesis that the diacritics represent vowels, this ambiguity suggests a further hypothesis: that c is a semivowel.

First, a table that lists, for each diacritic, the letters that bear it, with their counts and frequencies.

Diacritic  Letter  Count      Frequency
(          q       2 / 9      22.22 %
(          2       2 / 9      22.22 %
(          W       2 / 9      22.22 %
(          9       1 / 9      11.11 %
(          4       1 / 9      11.11 %
(          b       1 / 9      11.11 %
)          A       6 / 22     27.27 %
)          3       3 / 22     13.63 %
)          S       3 / 22     13.63 %
)          W       3 / 22     13.63 %
)          d       1 / 22      4.54 %
)          L       1 / 22      4.54 %
)          N       1 / 22      4.54 %
)          2       1 / 22      4.54 %
)          U       1 / 22      4.54 %
)          7       1 / 22      4.54 %
)          X       1 / 22      4.54 %
^          X       2 / 7      28.57 %
^          b       2 / 7      28.57 %
^          c       2 / 7      28.57 %
^          q       1 / 7      14.28 %
.          4       9 / 15 *   60.00 %
.          L       5 / 15     33.33 %
.          J       1 / 15      6.66 %
c          7       3 / 6      50.00 %
c          X       2 / 6      33.33 %
c          4       1 / 6      16.66 %

* One case, frame 2671, is a 4(. , unless I misread a Beanish question mark for a diacritic; if these are indeed two diacritics, and diacritics do represent vowels, it might be the only attested occurrence of a diphthong in Beanish.
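The frequency column can be recomputed from the counts alone. The sketch below is only a convenience check, not part of the original analysis: the counts are copied from the table above, and the helper name `truncated_percent` is my own. It truncates rather than rounds, which appears to be how the percentages in these tables were produced (1 / 15 is listed as 6.66 %).

```python
# Counts copied from the first table: diacritic -> {bearing letter: occurrences}.
counts = {
    "(": {"q": 2, "2": 2, "W": 2, "9": 1, "4": 1, "b": 1},
    ")": {"A": 6, "3": 3, "S": 3, "W": 3, "d": 1, "L": 1,
          "N": 1, "2": 1, "U": 1, "7": 1, "X": 1},
    "^": {"X": 2, "b": 2, "c": 2, "q": 1},
    ".": {"4": 9, "L": 5, "J": 1},
    "c": {"7": 3, "X": 2, "4": 1},
}


def truncated_percent(part, whole):
    """Percentage truncated (not rounded) to two decimals (hypothetical helper)."""
    return (part * 10000 // whole) / 100


for diacritic, letters in counts.items():
    total = sum(letters.values())  # all occurrences of this diacritic
    for letter, n in letters.items():
        print(f"{diacritic} {letter} {n} / {total} {truncated_percent(n, total):.2f} %")
```

Each single occurrence of ) out of 22 comes out at 4.54 %, and the single ^ on q at 14.28 %.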

We can already note some properties of the language, or at least of its script. Some diacritics, as the following table shows better, are strongly tied to particular letters: the only diacritic attested on A, for example, is ). The data do not settle whether c is a diacritic or a letter, but they lean towards treating it as a diacritic (possibly representing a rare vowel).

Regarding the possibility of ^ and c being allographs of ) and ( respectively, the distributions seem to rule this hypothesis out. The population is very small, but it seems rather unlikely: the letter X can bear both ^ and ), while the letter 4 can bear both c and ( (albeit with just one occurrence). An analysis of the corpus also reveals that ^ is only found at the end of a word.
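The overlap argument can be made explicit. If two marks were variants of the same diacritic, one would expect them not to appear on the same letter. The following sketch lists the letters shared by each suspected pair; the sets are copied from the first table and the function name `shared_bearers` is just an illustrative choice.

```python
# Letters attested with each diacritic, copied from the first table.
bearers = {
    "(": {"q", "2", "W", "9", "4", "b"},
    ")": {"A", "3", "S", "W", "d", "L", "N", "2", "U", "7", "X"},
    "^": {"X", "b", "c", "q"},
    "c": {"7", "X", "4"},
}


def shared_bearers(mark_a, mark_b):
    """Letters attested with both marks (hypothetical helper)."""
    return sorted(bearers[mark_a] & bearers[mark_b])


# Suspected allograph pairs: ^ with ) and c with (.
print(shared_bearers("^", ")"))  # ['X']
print(shared_bearers("c", "("))  # ['4']
```

A single shared letter per pair is thin evidence on its own, which is why the small population is worth keeping in mind.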

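The second table presents all characters along with any diacritic they may bear (this will later be extended to the contexts where each letter is found). Count is the number of occurrences with that diacritic out of the letter's occurrences with any diacritic; Frequency is relative to the letter's total count.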

Letter  Diacritic  Count     Total count  Frequency
3       )          3 / 3     21           14.28 %
2       )          1 / 3     25            4.00 %
2       (          2 / 3     25            8.00 %
4       c          1 / 10    22            4.54 %
4       (          1 / 10    22            4.54 %
4       .          8 / 10    22           36.36 %
7       c          3 / 4     16           18.75 %
7       )          1 / 4     16            6.25 %
6       none       0 / 0     3             0.00 %
g       none       0 / 0     5             0.00 %
9       (          1 / 1     8            12.50 %
A       )          6 / 6     19           31.57 %
J       .          1 / 1     8            12.50 %
M       none       0 / 0     7             0.00 %
L       )          1 / 6     16            6.25 %
L       .          5 / 6     16           31.25 %
N       )          1 / 1     15            6.66 %
Q       none       0 / 0     1             0.00 %
S       )          3 / 3     6            50.00 %
U       )          1 / 1     7            14.28 %
W       )          3 / 5     10           30.00 %
W       (          2 / 5     10           20.00 %
X       )          1 / 5     10           10.00 %
X       c          2 / 5     10           20.00 %
X       ^          2 / 5     10           20.00 %
Z       none       0 / 0     6             0.00 %
b       (          1 / 3     15            6.66 %
b       ^          2 / 3     15           13.33 %
d       )          1 / 1     5            20.00 %
g       none       0 / 0     9             0.00 %
j       none       0 / 0     1             0.00 %
q       (          2 / 3     4            50.00 %
q       ^          1 / 3     4            25.00 %

This table is very important. First of all, it all but rules out the theory that the diacritics are vowels: most letters appear bare more often than with a diacritic, and several never take one. The only way to save the theory would be for every single consonant in the script to have a “default” vowel, with a diacritic used only when the actual vowel differs from the default. One obvious new theory is that diacritics represent phonetic features, such as aspiration; the scene in which Cueball learns the word for “water” suggests that the script is likely phonetic, or at least alphabetic.
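To put a number on how rarely letters are marked, here is a small sketch that sums the second table. The rows are copied from it; the labels "g#1" and "g#2" are my own way of keeping apart the two rows that both display as g, and the variable names are hypothetical.

```python
# (letter, occurrences with any diacritic, total occurrences), from the second table.
# "g#1" and "g#2" are made-up labels for the two rows that both display as g.
rows = [
    ("3", 3, 21), ("2", 3, 25), ("4", 10, 22), ("7", 4, 16),
    ("6", 0, 3), ("g#1", 0, 5), ("9", 1, 8), ("A", 6, 19),
    ("J", 1, 8), ("M", 0, 7), ("L", 6, 16), ("N", 1, 15),
    ("Q", 0, 1), ("S", 3, 6), ("U", 1, 7), ("W", 5, 10),
    ("X", 5, 10), ("Z", 0, 6), ("b", 3, 15), ("d", 1, 5),
    ("g#2", 0, 9), ("j", 0, 1), ("q", 3, 4),
]

marked = sum(with_mark for _, with_mark, _ in rows)
total = sum(count for _, _, count in rows)
print(f"{marked} of {total} letter tokens bear a diacritic ({marked / total:.1%})")

# Letters that bear a diacritic in more than half of their occurrences.
mostly_marked = [letter for letter, with_mark, count in rows if 2 * with_mark > count]
print(mostly_marked)
```

By this count 56 of the 239 letter tokens carry a diacritic, and only q is marked in more than half of its occurrences, which would be a very low proportion of vowels unless a default vowel is assumed.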

What is now difficult is identifying which symbols represent vowels. If the diacritics don’t represent them, we would expect the letters that have no diacritics, such as Q and j, to have a high frequency, but we find the opposite. Another hypothesis would be for the diacritics to represent semivowels, but given that most letters bear at least one diacritic this seems unlikely.

We can at least list some combinations of letters and diacritics which we can assume to be “common” even with the small population: 3), 4., 7c, 7), A), L., S), W), W(, Xc, X^

We can also start dividing the letters into groups:

  • letters that take more than one diacritic: 2 4 7 L W X b q
  • letters that seem to take a single diacritic: 3 9 A J N S U d
  • letters that seem to take no diacritic: 6 g M Q Z g j

If the script is indeed alphabetic, the frequencies suggest that the first group is likely to contain mostly vowels and the third one mostly (rare) consonants.
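The grouping and the frequency comparison can be reproduced with a short sketch. The data come from the second table; as an assumption of mine, the two rows that both display as g are labelled "g#1" and "g#2", and the group names are hypothetical. It only compares average total counts and says nothing by itself about which letters are vowels.

```python
from collections import defaultdict

# (letter, diacritics attested on it, total occurrences), from the second table.
# "g#1" and "g#2" are made-up labels for the two rows that both display as g.
letters = [
    ("3", ")", 21), ("2", ")(", 25), ("4", "c(.", 22), ("7", "c)", 16),
    ("6", "", 3), ("g#1", "", 5), ("9", "(", 8), ("A", ")", 19),
    ("J", ".", 8), ("M", "", 7), ("L", ").", 16), ("N", ")", 15),
    ("Q", "", 1), ("S", ")", 6), ("U", ")", 7), ("W", ")(", 10),
    ("X", ")c^", 10), ("Z", "", 6), ("b", "(^", 15), ("d", ")", 5),
    ("g#2", "", 9), ("j", "", 1), ("q", "(^", 4),
]


def group_of(diacritics):
    """Name the group by how many distinct diacritics a letter takes."""
    if len(diacritics) > 1:
        return "several"
    return "one" if diacritics else "none"


totals = defaultdict(list)
for letter, diacritics, total in letters:
    totals[group_of(diacritics)].append(total)

# Average total count per group.
mean_total = {group: sum(values) / len(values) for group, values in totals.items()}
for group in ("several", "one", "none"):
    print(group, len(totals[group]), round(mean_total[group], 2))
```

On these figures the letters with several diacritics average 14.75 occurrences, those with one diacritic 11.125, and those with none about 4.57.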
