Ba in Applied Mathematics.
- Outcome:
- Graduated 2022, special mention.
In this project, different classification models were built using machine and deep learning architectures. The classification task was done to newspaper articles and social media posts in Spanish whose subjects vary but mainly refer to Latin American national politics and the current coronavirus crisis. The purpose of the models built was to detect when a text constitutes fake, true or misleading news. Due to the models’ input, Natural Language Processing was studied.
In this work we tested the natural capabilities of transformers-based
models on classification tasks. Additionally, we tested the feature
representation of the transformer model (BERT Random Forests
and Boosted Machines
are state of the art on tabular data. As it is naturally an unbalanced
classification task, more questions arise as we are worried about bias
towards specific news outlets and the generalization of our predictive
models.
- Tags
- Machine learning, natural language processing, transformers, ensemble models.