Luka Krsnik

Portfolio: Enhancing Deep Neural Networks with Morphological Information

BERT

Classification

Data Science

Fasttext

LSTMs

Large Language Models

Machine Learning

Named Entity Recognition

Natural Language Processing

Research

Project Description

Our project investigated the impact of adding morphological information to the input of deep neural networks. We evaluated these additions on three downstream tasks across several languages: named entity recognition (NER), dependency parsing (DP), and comment filtering (CF). We ran the experiments with two distinct architectures: an adapted BERT model, and LSTM models built on Fasttext embeddings.
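To make "adding morphological data to the input" concrete, here is a minimal sketch of one common way to do it: encoding a token's morphological features as a multi-hot vector and concatenating it with the word embedding before it reaches the network. This is an illustration only, not the project's actual code. The UD-style feature strings, the tiny `FEATURE_INVENTORY`, the function names, and the three-dimensional stand-in for a Fasttext vector are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical, deliberately tiny feature inventory. In practice it would be
# collected from the morphological annotations of the training data.
FEATURE_INVENTORY: List[str] = [
    "Case=Nom", "Case=Acc", "Number=Sing", "Number=Plur", "Tense=Past",
]
FEATURE_INDEX: Dict[str, int] = {feat: i for i, feat in enumerate(FEATURE_INVENTORY)}


def encode_morphology(feats: str) -> List[float]:
    """Turn a 'Case=Nom|Number=Sing' style string into a multi-hot vector."""
    vector = [0.0] * len(FEATURE_INVENTORY)
    if feats and feats != "_":  # "_" marks "no features" in this sketch
        for feat in feats.split("|"):
            idx = FEATURE_INDEX.get(feat)
            if idx is not None:  # features outside the inventory are ignored
                vector[idx] = 1.0
    return vector


def build_token_input(word_vector: List[float], feats: str) -> List[float]:
    """Concatenate a word embedding with the token's morphological vector."""
    return list(word_vector) + encode_morphology(feats)


# Stand-in for a pretrained word embedding (real ones have hundreds of dimensions).
word_vector = [0.12, -0.40, 0.33]
token_input = build_token_input(word_vector, "Case=Nom|Number=Sing")
print(token_input)
```

The resulting vector has the embedding dimensions followed by one slot per known feature, so a downstream model such as an LSTM sees both the lexical and the morphological signal for each token.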

Role

I was responsible for all data gathering, experiments, and testing related to NER, and worked alongside another developer who handled the DP and CF tasks. Our findings were published in Natural Language Engineering, a journal published by Cambridge University Press. We used Python and PyTorch for the machine learning work, with Scikit-Learn supporting data analysis.

Project Outcome

Our research showed that adding morphological features improves the performance of LSTM-based models on the NER and DP tasks, but yields no noticeable improvement on the CF task. For BERT-based models, the added features improved performance only on DP, and only when the features were of high quality (manually checked); automatically predicted features did not provide significant improvements.