The first statistic to compute is the frequency of each symbol, both diacritics (which could be vowels) and letters. I am not including the three sentence-final marks: the ball (which is a question mark), the double horizontal line (which is an exclamation mark) and the single horizontal line (which is an ordinary full stop, i.e., a dot).
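A minimal sketch of how such a count could be produced, assuming the text has been transcribed to one character per symbol. The function name `symbol_frequencies`, the toy string and the use of `?` as the excluded sentence mark are all illustrative assumptions, not the actual corpus or transcription. Symbols with equal counts share a rank.

```python
from collections import Counter

def symbol_frequencies(text, excluded=frozenset()):
    """Return (rows, total); each row is (rank, symbol, count, percent).

    Whitespace and any symbol in `excluded` (e.g. sentence-final marks)
    are ignored. Tied counts share the same rank.
    """
    counts = Counter(
        ch for ch in text if not ch.isspace() and ch not in excluded
    )
    total = sum(counts.values())
    # Most frequent first; ties are ordered by symbol for reproducibility.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = []
    rank, prev_count = 0, None
    for position, (symbol, count) in enumerate(ordered, start=1):
        if count != prev_count:
            rank, prev_count = position, count
        rows.append((rank, symbol, count, round(100 * count / total, 2)))
    return rows, total

# Toy transcription (invented for illustration); "?" stands in for a
# sentence-final mark that should not be counted.
sample = "2)4 3A2 2)7 ?"
rows, total = symbol_frequencies(sample, excluded={"?"})
for rank, symbol, count, percent in rows:
    print(f"{rank:>3} {symbol} {count:>3} {percent:>6.2f} %")
print(f"Total {total}")
```

Passing the excluded marks as a parameter keeps the function independent of whichever characters the transcription uses for them.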

There are still some doubts about the diacritics; in particular, the ^ and the c could be just different versions of ) and (, respectively. If the hypothesis of the diacritics as vowels is right, this would indicate either a language with a restricted set of vowels or a script with a standard vowel for each consonant.

The distribution looks pretty normal for a human language. Considering how restricted our corpus is, it could have been much more skewed. The first thing to note is that it does not seem to confirm that the diacritics are vowels, or at least not that every vowel is represented by a diacritic.

Our next step is to study the distribution of letters at the beginning and at the end of words, and then the relationship between them (for example, you have probably noticed that Z seems to depend on L, like Q and U in most languages).
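As a rough sketch of that next step, the following counts word-initial and word-final symbols and tallies what stands immediately before and after a given symbol. It assumes each word is a string of one-character transcription symbols; the helper names and the toy word list are invented for illustration and are not taken from the corpus.

```python
from collections import Counter

def positional_counts(words):
    """Count the symbols that open and close each non-empty word."""
    initial = Counter(w[0] for w in words if w)
    final = Counter(w[-1] for w in words if w)
    return initial, final

def neighbour_counts(words, symbol):
    """Count the symbols found directly before and after `symbol`."""
    before, after = Counter(), Counter()
    for w in words:
        # Walk over every adjacent pair of symbols inside the word.
        for left, right in zip(w, w[1:]):
            if right == symbol:
                before[left] += 1
            if left == symbol:
                after[right] += 1
    return before, after

# Toy words, invented for illustration only.
words = ["LZA", "2LZ", "b7", "LZN", "A2"]
initial, final = positional_counts(words)
before_z, after_z = neighbour_counts(words, "Z")

print("initial:", initial.most_common())
print("final:", final.most_common())
print("before Z:", before_z.most_common())
print("after Z:", after_z.most_common())
```

If one symbol almost always has the same neighbour on one side, as in the toy data where every Z follows an L, that is the kind of dependency worth checking against the real text.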

Index  Symbol  Count  Frequency
    1  2          25     8.36 %
    2  )          22     7.36 %
    2  4          22     7.36 %
    4  3          21     7.02 %
    5  A          19     6.35 %
    6  7          16     5.35 %
    6  L          16     5.35 %
    8  b          15     5.02 %
    8  N          15     5.02 %
    8  .          15     5.02 %
   11  W          10     3.34 %
   11  X          10     3.34 %
   13  (           9     3.01 %
   13  g           9     3.01 %
   15  J           8     2.68 %
   15  9           8     2.68 %
   17  ^           7     2.34 %
   17  U           7     2.34 %
   17  c           7     2.34 %
   17  M           7     2.34 %
   21  S           6     2.01 %
   21  Z           6     2.01 %
   23  d           5     1.67 %
   23  G           5     1.67 %
   25  q           4     1.34 %
   26  6           3     1.00 %
   27  Q           1     0.33 %
   27  j           1     0.33 %
Total            299      100 %