Implementing Search for Swahili Text
We'll explore several techniques for implementing search, from simple keyword filtering to more sophisticated semantic search. We'll use data from the MasakhaNER project for demonstration.

First, set up the dependencies.

In [1]:
    !pip install -q requests pandas scikit-learn jupyter transformers tqdm
    !pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

In [2]:
    import pandas as pd
    import requests
    import io
    import numpy as np

Get the data

The data we will be using is Swahili news text from the masakhane-ner repository, which originates from the Swahili edition of Voice of America.
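As a rough illustration of where this is heading, the sketch below turns token-per-line text into sentences and then applies the simplest form of search, a case-insensitive keyword filter. It assumes the file uses a CoNLL-style layout (one token and its tag per line, with blank lines between sentences). The function names `conll_to_sentences` and `keyword_filter` and the short sample text are hypothetical and are not taken from the repository.

```python
import io


def conll_to_sentences(text):
    """Join CoNLL-style token lines into one string per sentence.

    Assumes each non-blank line starts with a token followed by a tag,
    and that a blank line marks the end of a sentence.
    """
    sentences, tokens = [], []
    for line in io.StringIO(text):
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if tokens:
                sentences.append(" ".join(tokens))
                tokens = []
            continue
        # Keep only the token; drop the tag column.
        tokens.append(line.split()[0])
    if tokens:
        sentences.append(" ".join(tokens))
    return sentences


def keyword_filter(sentences, keyword):
    """Return the sentences that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [s for s in sentences if needle in s.lower()]


# Made-up sample in the assumed format, for illustration only.
sample = (
    "Rais O\nwa O\nTanzania B-LOC\n"
    "\n"
    "Timu O\nya O\nKenya B-LOC\nimeshinda O\n"
)

sentences = conll_to_sentences(sample)
hits = keyword_filter(sentences, "kenya")
print(hits)
```

The resulting list can be wrapped in a DataFrame with `pd.DataFrame({"text": sentences})` if a tabular view is more convenient for the later steps.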