Research
Research Projects
Vociply: Real-Time Voice-to-Voice Agentic System
Deep Learning Indaba 2025 (Poster) · Presented
Vociply is a real-time voice-to-voice agentic system that uses LLMs for African business automation. The work demonstrates multilingual real-time inference optimized for edge devices, addressing the practical challenges of deployment in low-resource environments.
Research Contributions
- Integration of LLMs with real-time speech processing for voice agents
- Optimization strategies for low-latency inference on edge devices
- Multilingual support for African languages in voice interfaces
- System architecture balancing performance and resource constraints
Adaptive Meta-Quantization via Hypernetworks for Ternary Neural Networks
CVPR 2026 · Under Review
This work investigates adaptive meta-quantization strategies that use hypernetworks for ternary neural networks. It addresses optimization challenges in extreme quantization through learned quantization policies that adapt dynamically during training, enabling efficient deployment on resource-constrained hardware.
Research Contributions
- Meta-learning framework for adaptive quantization in ternary networks
- Hypernetwork architecture for dynamic quantization policy generation
- Empirical evaluation demonstrating improved accuracy-efficiency tradeoffs
- Theoretical analysis of plasticity-stability dynamics in quantized networks
Publications
Adaptive Meta-Quantization via Hypernetworks for Ternary Neural Networks
Maroa Masese, C., Hussein, A., & Mbilinyi, A.
CVPR 2026 · 2026 · Under Review
Vociply: A Real-Time Voice-to-Voice Agentic System for African Business Automation Using LLMs
Maroa, C. et al.
Deep Learning Indaba 2025 (Poster) · 2025 · Presented
Conference Participation
- Deep Learning Indaba 2025 · Kigali, Rwanda, 2025
- AMLD Africa 2024 · USIU Kenya, 2024
Research Directions
- Continual learning algorithms and catastrophic forgetting mitigation
- Efficient neural architectures for edge computing
- Memory-efficient model adaptation and incremental learning
- Theoretical foundations of model compression and quantization