One of my most devastating afflictions is that I am a developer. Once I get an idea it gnaws at my brain until I either find another project or scratch that itch. Taggloo is that itch.

The Taggloo site was an experiment. It suffered the worst possible fate on the web: it was used by other people. What started out as a small site for me to combine my love of .NET and Manx Gaelic became a useful tool which cost me money but – worse – time. It conflicted with my life, family and worst of all, my role as Scout Leader. So I had to make the difficult decision to shut it down.

My head does not let sleeping dogs lie. I am reviving thoughts and ideas on how Taggloo could be useful. But not in the form it was in. By combining the dataset with AI and sticking that behind an API, I can scratch multiple itches.

I’m currently running through a course on FastAI. I am enthused by one of the myths they dispel: you do not need lots of data. This contradicts my previous belief that the Manx Gaelic corpus is just not big enough. Nor is it modern enough, with a lot of the available corpus being in “old Manx Gaelic”, like the Bible. One of Taggloo’s ambitions was to catalogue modern Manx Gaelic. It aimed to add to the corpus by mining social media like Facebook and Twitter. Zuck and Musk put a stop to that when they turned off their APIs, becoming less open.

This is me thinking aloud. Please comment to tell me I’m wrong (or right)!

My understanding is that machine learning works by learning the relationships between tokens in a dataset. Simple.

So in English, one could write:

I like living on the Isle of Man

Where the token relationships may be:

  • “I”
  • “I like”
  • “I like living”
  • “like living”
  • “on the”
  • “the Isle of Man”
  • “Isle of Man”

This is all simplistic and I’m sure this can be broken down even further.
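
To make that concrete for myself, here is a minimal sketch in Python of pulling those word groupings (usually called n-grams) out of a sentence. The function name `ngrams` and the `max_n` parameter are my own, and splitting on spaces is a simplifying assumption; real tokenisers break text down much further than whole words.

```python
def ngrams(sentence, max_n=3):
    """Return every run of 1..max_n consecutive words in the sentence."""
    words = sentence.split()  # assumption: one token per space-separated word
    found = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the sentence
        for start in range(len(words) - n + 1):
            found.append(" ".join(words[start:start + n]))
    return found


grams = ngrams("I like living on the Isle of Man")
print(grams)
```

With `max_n=3` this includes “I like”, “like living” and “Isle of Man”. Raising `max_n` to 4 also picks up “the Isle of Man”.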

In Manx Gaelic, this would be:

S’mie lhiam cummal ayns Mannin

So the tokens would be:

  • “S’”
  • “S’mie”
  • “S’mie lhiam”
  • “lhiam”
  • “lhiam cummal”
  • “cummal”
  • “cummal ayns”
  • “ayns Mannin”
  • “Mannin”

So, given that these words/tokens often go together, one could derive the next word, inferring it from the probability of words appearing alongside each other or within the same sentence as other words:

  • S’
    • mie
      • lhiam
        • cummal
          • ayns
            • Mannin
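
A rough sketch of that idea, assuming nothing cleverer than counting which word follows which (a bigram model, which is not how FastAI or modern language models actually do it). The names `train_bigrams` and `predict_next` are hypothetical, and the corpus here is just the one sentence above, so every prediction comes out with a probability of 1. Splitting on spaces also keeps “S’mie” as a single token rather than breaking out “S’”.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()  # assumption: space-separated tokens
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return (most likely next word, its probability), or None if unseen."""
    candidates = following.get(word)
    if not candidates:
        return None
    best, count = candidates.most_common(1)[0]
    return best, count / sum(candidates.values())


# Toy corpus: only the single sentence from this post
corpus = ["S’mie lhiam cummal ayns Mannin"]
model = train_bigrams(corpus)
print(predict_next(model, "lhiam"))
```

A word that never appears with a follower (such as “Mannin” at the end of the sentence) returns `None`. With a bigger corpus the counts would spread across several candidates and the probabilities would start to mean something.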

This looks possible for an auto-correct-like interface. It can predict the next word within the same language. You might consider inferring mutations found in Gaelic languages. For example, “Mannin” can become “Vannin” in “Ellan Vannin”. But what about where you need to translate between languages?

| Sentence | Language | Meaning | Literal meaning |
| --- | --- | --- | --- |
| I like living on the Isle of Man | English | I like living on the Isle of Man | I like living on the Isle of Man |
| S’mie lhiam cummal ayns Mannin | Manx Gaelic | I like living in Mannin (being the Isle of Man) | Is (emphatic) good with me live in Mannin |

In my FastAI learnings so far, I’ve been covering image recognition. I have been using machine learning to categorise images based on what is in the training set, which seems like it should be more complex than language. I just haven’t got there yet or the lightning bolt hasn’t struck.

My model is available at Hugging Face if you find the need to distinguish a photograph of a cat from a dog:

https://huggingface.co/spaces/programx360/fastai-chapter2-v2

Another challenge I’m working with is the prevalence/popularity of Python in AI. Except I’m a .NET developer. Once I figure more out, perhaps I’ll be able to roll my own. Or at least use an API and JavaScript. This podcast episode convinces me I’m not going to be all alone in C#.

That said, I am definitely liking the Jupyter Notebooks, which allow you to drop in Python scripts and annotate them with Markdown, providing an as-you-go development plan. My chapter 2 Notebook is at Kaggle.
