I have conducted research on NLP for endangered languages. I have publications on topics such as morphology, disambiguation and dictionary building. In my opinion, mainstream NLP research oversimplifies the problems of NLP for endangered languages by lumping them together with the ultimate I-didn’t-know-the-fuck-I-was-doing category of NLP called “low-resourced” NLP. Low-resourced NLP usually deals with extremely well-resourced languages that are called low-resourced because that makes a good research paper.
Computational Creativity and NLG
I have worked in the computational creativity research group at the University of Helsinki. There, I created a tool called Poem Machine. It boasts an AI capable of creativity in the form of poetry, and it makes it possible for humans to create poems together with a machine. Poem Machine has also been featured in the Finnish news.
Later, I went on to do my PhD on the topic. During this time and after it, I have worked on a variety of topics related to computational creativity and natural language generation, such as poem generation, humor generation, dialog generation and news headline generation.
Creative natural language generation is a fun research topic, because you can just throw in whatever random evaluation method you want and report meaningless results as science. Better yet, crappy unscientific evaluation is enough to get your paper into even the most “prestigious” NLP conference, ACL. Why try so hard if random garbage gets you there? 😀
Finnish NLP is quite well researched. I have done generative work with Finnish and, as a result, developed a tool called Syntax maker (Syntax maker Github) that can handle Finnish morphosyntax and inflect words correctly. I have also normalized, generated and identified different Finnish dialects, and I have developed a tool called Murre that specializes in Finnish dialects. My dialect research has made it to the news in Finland and, curiously, in Brazil too, not to mention that the news story was translated into Chinese as well.
I have worked on the following academic projects: