Principal Research Software Engineer

Microsoft Research


Abdul Dakkak is a Principal Research Software Engineer at Microsoft Research AI (MSRAI), working on next-generation compilers for end-to-end machine learning. Before that, Abdul received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC). He was a senior compiler developer at Wolfram Research, leading the design and development of the Wolfram Compiler effort for over 6 years.

Abdul’s research interests lie at the intersection of machine learning, compilers, programming languages, and accelerated computing. His focus is compiling high-level languages into performant code that runs on different hardware. In the process, he has developed industry-grade tools for compiling, running, profiling, and introspecting real-world applications to optimize their performance across both the hardware and software stacks. As a primary developer of the Wolfram Compiler, Abdul developed the Wolfram type system and architected the Wolfram runtime. As a result, compiled Wolfram code matches the speed of hand-optimized C code and can target accelerators and multi-node systems.

Aside from his compiler work, Abdul has also been developing MLModelScope, a distributed platform that lets people deploy, profile, and experiment with ML/DL frameworks and models. Its tools are used to inform system design for deep learning model serving and to develop highly tuned GPU kernels for model inference.

Abdul has also been involved in teaching. He developed tools that enable teaching in large classrooms and is the author of WebGPU and RAI. Both WebGPU and RAI have over 100k users and are used at more than 14 universities (including the University of Michigan, BSC/UPC, UIC, the University of Tennessee, …) to evaluate over 2.5 million labs. He has helped teach the Coursera HPP course (3 times), the introductory and advanced CUDA courses (2 times), and the PUMPS summer school at BSC (4 times).


  • Compilers and Systems
  • Performance Optimizations
  • Artificial Intelligence
  • Accelerated Computing


  • PhD in Computer Science, 2013-2020

    University of Illinois Urbana-Champaign

  • B.A. in Pure Mathematics, 2009

    University of Toledo



Principal Research Software Engineer

Microsoft Research

Jul 2020 – Present, Redmond

Responsibilities include:

  • Design and implementation of next-gen machine learning compilers.

Senior Compiler Developer

Wolfram Research

Jan 2019 – Jul 2020, Illinois

Responsibilities include:

  • Co-led the Wolfram Compiler effort
  • Researched, designed, and developed the Wolfram type system, code generation, and optimizations
  • Prototyped paths to compile to accelerators and to JavaScript

Kernel Developer

Wolfram Research

Apr 2010 – Jan 2019, Illinois

Responsibilities include:

  • Led a team to develop GPU integration in Mathematica
  • Developed a domain-specific language for writing financial code for Wolfram Finance Platform
  • Optimized the core Mathematica engine and designed next-gen Wolfram runtime
  • Developed primitives for the Wolfram Geometry project

Junior Kernel Developer

Wolfram Research

Apr 2009 – Apr 2010, Illinois

Responsibilities include:

  • Developed and architected CUDALink and OpenCLLink
  • Optimized C foreign function interface path for Mathematica
  • Developed NVIDIA Compiler bindings for C compiler driver

Recent Publications

Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles. PACT, 2021.

FFT Blitz: the Tensor Cores Strike Back. PPoPP, 2021.

★ The Design and Implementation of a Scalable DL Benchmarking Platform. CLOUD, 2020.

Best paper.

DLSpec: A Deep Learning Task Exchange Specification. USENIX OpML, 2020.

Recognitions & Awards

Best Paper The Design and Implementation of a Scalable DL Benchmarking Platform

Best Paper XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs

ACM Artifact Evaluation Stamp for The Design and Implementation of the Wolfram Language Compiler

Best Paper and ACM Artifact Evaluation Stamp for Evaluating CUDA Communication Primitives on High-Bandwidth Interconnects

Best Poster

Top-20 Poster
