Glyph classes by a Maximum-Likelihood-Criterion

I have finished my review of the corpus transliterated with the Canadian Aboriginal Syllabics; there were some inconsistencies, but I believe that it is now correct. I added a placeholder glyph (the question mark, [?]) at the beginning of what we suppose is “Ionian Sea”, since a single glyph is likely missing there.

I used the reviewed corpus to divide the glyphs into classes with a Maximum-Likelihood-Criterion. To do so, I adapted the corpus for use with “mkcls”, which is part of the Giza++ package frequently used in statistical machine translation (http://code.google.com/p/giza-pp/). The glyphs are divided into groups of 2 to 10 classes; my comments are below.
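A minimal sketch of what such an adaptation could look like, assuming mkcls treats each whitespace-separated token as one "word", so that every glyph has to become its own token. The helper names and file names are hypothetical and the actual adaptation used for this post may differ; the mkcls flags (-c classes, -n optimization runs, -p input, -V output, followed by "opt") follow the usage shown in the Giza++ documentation.

```python
import shlex

def glyphs_as_tokens(line: str) -> str:
    """Return the line with every non-space character as a separate token.

    Assumption: each character (diacritics and punctuation included)
    counts as one glyph; the original spacing is discarded.
    """
    return " ".join(ch for ch in line if not ch.isspace())

def mkcls_command(n_classes: int, corpus_path: str, out_path: str, runs: int = 10) -> str:
    """Build (but do not run) an mkcls command line for the given class count."""
    args = ["mkcls", f"-c{n_classes}", f"-n{runs}",
            f"-p{corpus_path}", f"-V{out_path}", "opt"]
    return " ".join(shlex.quote(a) for a in args)

if __name__ == "__main__":
    # Hypothetical file names; one command per division.
    for k in range(2, 11):
        print(mkcls_command(k, "beanish.tok", f"beanish.{k}.classes"))
```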

Please remember that these are not linguistic categories of any kind (phonological, morphological, syllabic, or otherwise). They are classes based, essentially, on the frequency of each glyph and the contexts in which it is found in the corpus; while they might mirror real categories, they are to be understood as statistical properties, not linguistic ones (not only because they were built by a statistical classifier, but also because, as I’ve been whining since the first post, the corpus we have is very limited).
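To make "frequency and context" concrete, here is a rough sketch of the kind of quantity a maximum-likelihood classifier can compare between candidate partitions, assuming a class-based bigram model in which the probability of a glyph given its predecessor is p(class of glyph | class of predecessor) times p(glyph | its class). This is an illustration only, not the mkcls implementation; the function name and the toy string are made up.

```python
import math
from collections import Counter

def class_bigram_log_likelihood(sentences, glyph_to_class):
    """Training-set log-likelihood of a class bigram model.

    All probabilities are relative frequencies taken from the same data,
    so a higher value means the partition explains the sequences better.
    """
    bigrams = [(s[i - 1], s[i]) for s in sentences for i in range(1, len(s))]
    pair_counts = Counter((glyph_to_class[p], glyph_to_class[c]) for p, c in bigrams)
    prev_class_counts = Counter(glyph_to_class[p] for p, _ in bigrams)
    cur_class_counts = Counter(glyph_to_class[c] for _, c in bigrams)
    cur_glyph_counts = Counter(c for _, c in bigrams)

    total = 0.0
    for p, c in bigrams:
        cp, cc = glyph_to_class[p], glyph_to_class[c]
        total += math.log(pair_counts[(cp, cc)] / prev_class_counts[cp])  # class transition
        total += math.log(cur_glyph_counts[c] / cur_class_counts[cc])     # glyph within class
    return total

# Toy data (made up): a strictly alternating sequence of two symbols.
toy = ["abababab"]
two_classes = {"a": 0, "b": 1}
one_class = {"a": 0, "b": 0}

if __name__ == "__main__":
    print(class_bigram_log_likelihood(toy, two_classes))
    print(class_bigram_log_likelihood(toy, one_class))
```

On this toy string the two-class partition predicts every next symbol perfectly (log-likelihood 0), while the single class does not, so a maximum-likelihood search would prefer the former. Nothing in the computation knows what the symbols mean, which is why the resulting classes are statistical rather than linguistic.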

Division into 2 classes of glyphs

Class 1: , ᐣ ᐧ ᑦ ᑫ ᒣ ᓄ ᔑ ᔭ ᖗ ᖚ ᘈ ᘖ ᙉ
Class 2: ? ᑕ ᓭ ᔪ ᕋ ᖉ ᖊ ᖽ ᘊ ᘛ ᘝ ᙐ

Division into 3 classes of glyphs

Class 1: , ᓄ ᔑ ᖗ ᖚ ᘖ
Class 2: ? ᑕ ᒣ ᓭ ᔪ ᕋ ᖉ ᖊ ᖽ ᘊ ᘛ ᘝ ᙐ
Class 3: ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ ᙉ

Division into 4 classes of glyphs

Class 1: ᑕ ᒣ ᔪ ᖽ ᘊ ᘛ ᙉ
Class 2: , ᓄ ᔑ ᖗ ᖚ ᘖ
Class 3: ? ᓭ ᕋ ᖉ ᖊ ᘝ ᙐ
Class 4: ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ

Division into 5 classes of glyphs

Class 1: , ᓄ ᖗ ᖚ ᘖ
Class 2: ᐣ ᑦ
Class 3: ? ᓭ ᔑ ᕋ ᖉ ᘝ ᙐ
Class 4: ᐧ ᑫ ᔭ ᘈ
Class 5: ᑕ ᒣ ᔪ ᖊ ᖽ ᘊ ᘛ ᙉ

Division into 6 classes of glyphs

Class 1: ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ ᙉ
Class 2: ᒣ ᘊ
Class 3: , ᓄ ᔑ ᖗ
Class 4: ? ᔪ ᘛ ᘝ ᙐ
Class 5: ᕋ ᖉ ᖚ ᖽ ᘖ
Class 6: ᑕ ᓭ ᖊ

Division into 7 classes of glyphs

Class 1: ᒣ ᘊ
Class 2: ᑕ ᖽ ᘖ
Class 3: , ᓄ ᖗ ᖚ
Class 4: ᓭ ᖊ
Class 5: ? ᔪ ᕋ ᖉ ᘛ ᘝ ᙉ ᙐ
Class 6: ᐣ ᑦ
Class 7: ᐧ ᑫ ᔑ ᔭ ᘈ

Division into 8 classes of glyphs

Class 1: ᑕ ᖚ ᖽ ᘖ
Class 2: ᓭ ᖊ ᙐ
Class 3: ᒣ ᘊ
Class 4: , ᓄ ᖗ
Class 5: ᐣ ᑦ
Class 6: ᐧ ᔑ
Class 7: ? ᔪ ᕋ ᖉ ᘛ ᘝ
Class 8: ᑫ ᔭ ᘈ ᙉ

My comments:

Assuming there are no unseen glyphs, the missing glyph at the beginning of “Ionian Sea” is probably one in the group ᔪ ᕋ ᖉ ᘛ ᘝ.
The same group ᔪ ᕋ ᖉ ᘛ ᘝ probably mirrors a true linguistic group; if the script is alphabetic, these are very likely a group of related consonants.
The ML classifier suggests that the diacritics are indeed a separate category of glyphs, and in particular that ᐣ and ᑦ are very similar (one is probably, as per the graphical representation, the inverse of the other). There are, however, doubts regarding the lower diacritic (the comma) and, to a lesser extent, the vertically centered dot.
As the affix study had suggested, ᒣ and ᘊ are probably very alike, and the same probably holds for ᓭ and ᖊ.
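One simple way to check how stable these pairings are is to count in how many of the seven listed divisions two glyphs land in the same class. The sketch below does just that with the class listings copied from above; the function names are hypothetical, and the count is only a summary of the listings, not additional evidence.

```python
# Class listings copied from the divisions above (glyphs separated by spaces).
DIVISIONS = {
    2: [", ᐣ ᐧ ᑦ ᑫ ᒣ ᓄ ᔑ ᔭ ᖗ ᖚ ᘈ ᘖ ᙉ", "? ᑕ ᓭ ᔪ ᕋ ᖉ ᖊ ᖽ ᘊ ᘛ ᘝ ᙐ"],
    3: [", ᓄ ᔑ ᖗ ᖚ ᘖ", "? ᑕ ᒣ ᓭ ᔪ ᕋ ᖉ ᖊ ᖽ ᘊ ᘛ ᘝ ᙐ", "ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ ᙉ"],
    4: ["ᑕ ᒣ ᔪ ᖽ ᘊ ᘛ ᙉ", ", ᓄ ᔑ ᖗ ᖚ ᘖ", "? ᓭ ᕋ ᖉ ᖊ ᘝ ᙐ", "ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ"],
    5: [", ᓄ ᖗ ᖚ ᘖ", "ᐣ ᑦ", "? ᓭ ᔑ ᕋ ᖉ ᘝ ᙐ", "ᐧ ᑫ ᔭ ᘈ", "ᑕ ᒣ ᔪ ᖊ ᖽ ᘊ ᘛ ᙉ"],
    6: ["ᐣ ᐧ ᑦ ᑫ ᔭ ᘈ ᙉ", "ᒣ ᘊ", ", ᓄ ᔑ ᖗ", "? ᔪ ᘛ ᘝ ᙐ", "ᕋ ᖉ ᖚ ᖽ ᘖ", "ᑕ ᓭ ᖊ"],
    7: ["ᒣ ᘊ", "ᑕ ᖽ ᘖ", ", ᓄ ᖗ ᖚ", "ᓭ ᖊ", "? ᔪ ᕋ ᖉ ᘛ ᘝ ᙉ ᙐ", "ᐣ ᑦ", "ᐧ ᑫ ᔑ ᔭ ᘈ"],
    8: ["ᑕ ᖚ ᖽ ᘖ", "ᓭ ᖊ ᙐ", "ᒣ ᘊ", ", ᓄ ᖗ", "ᐣ ᑦ", "ᐧ ᔑ", "? ᔪ ᕋ ᖉ ᘛ ᘝ", "ᑫ ᔭ ᘈ ᙉ"],
}

def class_of(glyph, classes):
    """Return the 1-based index of the class containing the glyph."""
    for index, members in enumerate(classes, start=1):
        if glyph in members.split():
            return index
    raise KeyError(glyph)

def times_together(a, b):
    """Number of listed divisions in which both glyphs share a class."""
    return sum(class_of(a, classes) == class_of(b, classes)
               for classes in DIVISIONS.values())

if __name__ == "__main__":
    for a, b in [("ᐣ", "ᑦ"), ("ᒣ", "ᘊ"), ("ᓭ", "ᖊ")]:
        print(a, b, times_together(a, b), "of", len(DIVISIONS))
```

For the pairs mentioned in the comments, ᐣ and ᑦ share a class in 7 of the 7 listed divisions, while ᒣ and ᘊ, and likewise ᓭ and ᖊ, do so in 6 of 7.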

