


Subject: semantic search, deep learning, transformers, BERT


Year: 2022


Type: Proceedings



Title: Running Semantic Search Over Complete English Wikipedia on a Local Computer


Author: Tudjarski, Stojancho
Author: Madevska Bogdanova, Ana



Abstract: We implement a system that provides human-like answers to human-like questions, extracted from a considerable amount of data in a reasonable time, measured in seconds. To demonstrate that the knowledge base in which the answers are searched for can be large, we used a complete English Wikipedia dump on a local laptop running Windows 10, exposed to software that receives questions and returns the three most relevant answers. The entire technology stack of the implementation is the subject of this research. The main conclusion is that semantic search over a vast amount of text data can be implemented on a local computer with average hardware specifications, which is of utmost importance in developing different NLP systems.


Publisher:


Relation: The 19th International Conference on Informatics and Information Technologies – CIIT 2022



Identifier: oai:repository.ukim.mk:20.500.12188/25707
Identifier: http://hdl.handle.net/20.500.12188/25707


