I'm a Computer Engineering student at the University of Toronto (Minor: AI Engineering) with a deep passion for AI/ML development and research, from designing custom attention mechanisms and fine-tuning large language models to pushing the limits of GPU-accelerated compute.
I love working at the intersection of systems engineering and machine intelligence: building things that are both fast and smart.
- BASc Computer Engineering @ UofT (PEY Co-op, Class of 2029)
- ML Researcher @ UTMIST, designing sparse attention for BART
- Software Developer @ Nodalli, building AI-native infrastructure
- Based in Toronto, ON
- mohamad.salman@mail.utoronto.ca
|
Content-Aware Sparse Attention for BART: Standard attention scales quadratically with sequence length, and existing fixes like Big Bird use hardcoded patterns that don't adapt to the input. We're building a mechanism that reads the content and dynamically decides which tokens to attend to.
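To make "decides which tokens to attend to" concrete, here is a minimal pure-Python sketch of one generic content-dependent scheme, top-k attention, where each query keeps only its k highest-scoring keys. This is an illustrative stand-in, not the mechanism we're building; the function name `topk_sparse_attention` and its arguments are assumptions made for this example.

```python
import math

def topk_sparse_attention(queries, keys, values, k):
    """Each query attends only to its k highest-scoring keys (illustrative)."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in keys]
        # Content-dependent selection: indices of the k largest scores.
        keep = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax over the kept scores only (subtract the max for stability).
        top = max(scores[j] for j in keep)
        weights = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(weights.values())
        # Weighted sum of the selected value vectors.
        outputs.append([
            sum(weights[j] / total * values[j][c] for j in keep)
            for c in range(len(values[0]))
        ])
    return outputs
```

Note that this sketch still scores every query-key pair, so it shows the selection rule rather than the compute savings; a practical mechanism would need a cheaper way to pick candidates.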
|
Unified Action Adapter Layer: Building the execution backbone of an AI platform, routing NLP-parsed commands across 4 platform APIs with field-level validation, Redis-backed OAuth, and automated credential management.
|
|
Fine-tuned LLaMA 3.1 8B with LoRA + DeepSpeed on 5K debate transcripts, achieving 87% agreement with GPT-4 judgements. Deployed a self-hosted inference pipeline with 70% latency reduction over baseline.
|
Multi-agent MCP pipeline orchestrating 5 forensic agents. Maps 20 LLM signals to a 0-100 fraud score, with an async FastAPI + SendGrid webhook backend dispatching replies in under 30s.
|
|
100× speedup over CPU across 5M simulations using CUDA on RTX 4060. Engineered 16-byte memory layouts for coalesced memory access with bit-for-bit CPU/GPU parity validation.
|
Trained a Binary Neural Network in PyTorch to classify "open sesame" from raw audio. Deployed fully on-chip to a DE1-SoC FPGA in C, interfaced with a servo motor for a physical gate.
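For readers unfamiliar with why binary networks suit small hardware, the sketch below shows the standard XNOR/popcount trick: once weights and activations are constrained to ±1, a dot product reduces to counting mismatched bits. It is written in Python for readability and is not the project's on-chip C code; `binarize` and `binary_dot` are hypothetical helper names.

```python
def binarize(xs):
    """Pack the signs of xs into an int: bit i is 1 for +1 (x >= 0), 0 for -1."""
    bits = 0
    for i, x in enumerate(xs):
        if x >= 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element +/-1 vectors packed as integers."""
    # XOR marks the positions where the two signs differ.
    mismatches = bin(a_bits ^ b_bits).count("1")
    # Each match contributes +1 and each mismatch -1.
    return n - 2 * mismatches
```

For example, `binary_dot(binarize([0.5, -1.2, 3.0, -0.1]), binarize([1.0, 2.0, -0.5, -4.0]), 4)` compares the sign patterns `+ - + -` and `+ + - -`, which agree in two positions and disagree in two.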
|
|
C++ city-scale mapping engine built on an adjacency list, loading datasets in under 2 seconds. Implemented A* pathfinding with cache-friendly data structures on GTK/EZGL.
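As a rough illustration of A* over an adjacency list, here is a compact Python sketch (the engine itself is C++). The graph format, the `coords` lookup, and the straight-line heuristic are assumptions chosen for this example, and the heuristic only guarantees shortest paths when no edge costs less than the straight-line distance it spans.

```python
import heapq
import math

def a_star(graph, coords, start, goal):
    """graph: node -> list of (neighbor, cost); coords: node -> (x, y)."""
    def h(node):
        # Straight-line distance to the goal.
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    best = {start: 0.0}               # cheapest known cost to reach each node
    parent = {}                       # back-pointers for path reconstruction
    frontier = [(h(start), start)]    # min-heap ordered by cost so far + heuristic
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            cost = best[goal]
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], cost
        for nxt, edge_cost in graph.get(node, []):
            g = best[node] + edge_cost
            if g < best.get(nxt, math.inf):
                best[nxt] = g
                parent[nxt] = node
                heapq.heappush(frontier, (g + h(nxt), nxt))
    return None, math.inf             # goal unreachable
```

The function returns the path as a list of nodes together with its total cost, or `(None, math.inf)` when the goal cannot be reached.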
|
No-code workflow automation platform on Cloudflare Workers. Managed state for 100+ distributed cron jobs via Durable Objects, with LLaMA-powered auto-generation of workflow steps.
|
"The best way to predict the future is to build it."


