Job Description
Our client is building a high-quality dataset for African languages, focusing on Kenya (Luhya, Gusii, Oromo, and Kamba) and Nigeria (Ibibio and Ijaw).
The collected data will support AI model development for language processing, particularly in agriculture and health-related topics.
Roles and Responsibilities:
Technical Oversight: Lead the implementation of data recording, processing, and annotation workflows.
Machine Learning Support: Utilize Hugging Face Transformers for data processing and NLP model training.
System Operations: Ensure smooth operation of MacBook machines for audio recording and data processing.
Data Quality Assurance: Oversee the technical aspects of data collection to ensure accuracy and consistency.
Collaboration: Work closely with language coordinators and data processors to streamline technical workflows.
Troubleshooting & Optimization: Identify and resolve technical challenges in recording, storing, and processing data.
Qualifications:
Programming Skills: Strong proficiency in Python, with experience in NLP and data processing.
Machine Learning Tools: Hands-on experience with Hugging Face Transformers and similar NLP frameworks.
Technical Hardware Skills: Familiarity with MacBook systems and their advanced functionalities for audio recording.
Project Coordination: Ability to support data collection teams and ensure smooth technical execution.
Location Preference: Based in Kenya or Nigeria to facilitate on-the-ground operations.