About Us:
We are a leading participant in the Gonka decentralized AI network, leveraging high-performance GPU infrastructure to maximize mining rewards. We’re seeking an ML Optimization Engineer to help us achieve superior efficiency and weight in the Gonka ecosystem.
Key Responsibilities:
- Implement advanced inference optimizations (speculative decoding, quantization, attention modifications, etc.) to maximize mining weight; other participants have already used these techniques to achieve double the weight on identical GPUs
- Fine-tune Docker configurations for various GPU models based on the images available in the registry
- Develop custom optimization strategies that balance throughput and quality
- Create and maintain custom Docker images optimized for specific GPU architectures
- Design and implement systems for stable and scalable mining of Gonka and other protocols
- Develop optimized images for Tenstorrent AI ASICs to expand our hardware ecosystem beyond current GPU deployment
- Migrate Python code and vLLM implementations to new vLLM images and adapt them for specific GPU cards
Required Qualifications:
- Proven experience with large language model optimization techniques
- Strong understanding of transformer architectures and attention mechanisms
- Proficiency with PyTorch, CUDA, and GPU optimization techniques
- Experience with vLLM, FlashInfer, or similar inference optimization frameworks
- Familiarity with Docker containerization and GPU workload management
Preferred Qualifications:
- Experience with Claude Code Max (will be provided if needed)
- Previous experience with Gonka or similar decentralized AI networks
- Background in competitive ML or distributed systems optimization
- Experience with NVIDIA GPU architectures (B200/B300/H200/H100/A100)
- Knowledge of Tenstorrent AI ASICs or other specialized AI hardware
What We Offer:
- Opportunity to work with cutting-edge AI infrastructure
- High performance-based bonuses tied to achieved weight improvements
- Potential for a full-time position with a percentage of mining profits
- Flexible remote work environment
- Access to high-end GPU hardware for experimentation
Application Process: Interested candidates should submit their resume along with a brief description of their relevant experience in ML optimization or their ideas for performance enhancement.