Abdul Dakkak

Senior Compiler Developer @ Wolfram Research

PhD candidate in CS @ UIUC

Biography

Abdul Dakkak is a Ph.D. candidate in Computer Science at the University of Illinois at Urbana-Champaign (UIUC), advised by Professor Wen-mei Hwu. He is a senior compiler developer at Wolfram Research, where he leads the Wolfram Compiler effort. Abdul's research lies at the intersection of programming languages and accelerated computing, with a focus on compiling high-level languages into performant code that runs on diverse hardware. In the process, he has developed industry-grade tools for compiling, running, profiling, and introspecting real-world applications to optimize their performance across both the hardware and software stack. As a primary developer of the Wolfram Compiler, Abdul designed the Wolfram type system and architected the Wolfram runtime. As a result, compiled Wolfram code matches the speed of hand-optimized C code and can target accelerators and multi-node systems.

Abdul has been active in teaching. He developed tools to support teaching in large classrooms and is the author of WebGPU and RAI. Both WebGPU and RAI have over 100k users and are used at more than 14 universities (including the University of Michigan, BSC/UPC, UIC, the University of Tennessee, …) to evaluate over 2.5 million labs. He has helped teach the Coursera HPP course (3 times), the introductory and advanced CUDA courses (2 times), and the PUMPS summer school at BSC (4 times).

Aside from the above, Abdul has also been developing MLModelScope, a distributed platform for deploying, profiling, and experimenting with ML/DL frameworks and models. The tools are used to inform system design for Deep Learning model serving and to develop highly tuned GPU kernels for model inference.

Interests

  • Compilers and Systems
  • Performance Optimizations
  • Artificial Intelligence

Education

  • PhD Candidate in Computer Science, 2013-

    University of Illinois Urbana-Champaign

  • B.A. in Pure Mathematics, 2009

    University of Toledo

Experience

Senior Compiler Developer

Wolfram Research

Jan 2019 – Present, Illinois
Responsibilities include:

  • Co-lead the Wolfram Compiler effort
  • Researched and developed the Wolfram type system, code generation, and optimizations
  • Prototyped paths to compile to accelerators and to JavaScript

Ph.D. Candidate in Computer Science

University of Illinois, Urbana-Champaign

Aug 2013 – Present, Illinois
Responsibilities include:

  • Performed research on cutting-edge compilers, GPUs, and AI
  • Developed widely used systems, including WebGPU, RAI, D4P, and MLModelScope
  • Mentored undergraduate, master's, and graduate students

Kernel Developer

Wolfram Research

Apr 2010 – Jan 2019, Illinois
Responsibilities include:

  • Led a team developing GPU integration in Mathematica
  • Developed a domain-specific language for writing financial code for the Wolfram Finance Platform
  • Optimized the core Mathematica engine and designed the next-generation Wolfram runtime
  • Developed primitives for the Wolfram Geometry project

Junior Kernel Developer

Wolfram Research

Apr 2009 – Apr 2010, Illinois
Responsibilities include:

  • Developed and architected CUDALink and OpenCLLink
  • Optimized the C foreign function interface path for Mathematica
  • Developed NVIDIA compiler bindings for the C compiler driver

Recent Publications

DLSpec: A Deep Learning Task Exchange Specification. To appear in USENIX OpML, 2020.

MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale. arXiv preprint, 2020.

DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs. To appear in ICPE, 2020.

The Design and Implementation of the Wolfram Language Compiler. CGO, 2020.

Projects

Wolfram Compiler

The Wolfram Compiler compiles the Wolfram Language into optimized native machine code.
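
As a rough illustration (a minimal sketch, not drawn from the author's own materials), the public FunctionCompile function in Wolfram Language 12+ exposes this compilation path: it takes a function with typed arguments and produces compiled native code.

    (* Minimal sketch: compile a typed Wolfram function to native code    *)
    (* using the documented FunctionCompile API (Wolfram Language 12.0+). *)
    squarePlusOne = FunctionCompile[
      Function[Typed[x, "MachineInteger"], x^2 + 1]
    ];
    squarePlusOne[10]   (* calls the compiled code; returns 101 *)

The resulting CompiledCodeFunction can be called like an ordinary Wolfram function while executing as native machine code.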

MLModelScope

An open-source, framework- and hardware-agnostic, extensible, and customizable distributed platform for evaluating and profiling ML models across datasets, frameworks, and systems.

Benanza

Automatic μBenchmark Generation to Compute “Lower-bound” Latency and Inform Optimizations of Deep Learning Models on GPUs.

RAI

A Scalable Project Submission System for Parallel Programming Courses.

WebGPU

A Scalable Lab Submission System for Parallel Programming Courses.

Recent & Upcoming Talks

SC 2019 - Across-Stack Profiling and Characterization of State-of-the-Art Machine Learning Models on GPUs

The past few years have seen a surge in the use of Machine Learning (ML) and Deep Learning (DL) algorithms for traditional HPC tasks such as feature detection, numerical analysis, and graph analytics. While ML and DL help solve HPC tasks, their adoption has been hampered in part by the difficulty of understanding ML/DL workloads and their interaction with system utilization. Optimizing these algorithms requires characterizing their performance and resource utilization across the hardware/software …

Tutorial at IISWC 2019 - Challenges and Solutions for End-to-End and Across Stack ML Benchmarking

The current landscape of Machine Learning (ML) and Deep Learning (DL) is rife with non-uniform models, frameworks, and system stacks, and it lacks standard tools and methodologies to evaluate and profile models or systems. Due to the absence of standard tools, the state of the practice for evaluating and comparing the benefits of proposed AI innovations (be they hardware or software) on end-to-end AI pipelines is both arduous and error-prone, stifling the adoption of the innovations in a rapidly …

Developing in the Wolfram Compiler

The Wolfram Language is a dynamic, untyped language with a 30-year history. The talk will describe current work on developing a compiler for the Wolfram Language. It will cover the compiler's architecture, along with techniques for lowering the language into low-level code. This includes a type system with advanced features such as type classes, ad hoc polymorphism, parametric polymorphism, and overloading. We will describe the multi-tiered pipeline we developed to optimize Wolfram code, along with the …

HotChips 2019 - MLModelScope: Evaluate and Profile ML Models at Scale and Across Stack

The current landscape of Machine Learning (ML) and Deep Learning (DL) is rife with non-uniform frameworks, models, and system stacks, but lacks standard tools to facilitate the evaluation and measurement of models. Due to the absence of such tools, the current practice for evaluating and comparing the benefits of proposed AI innovations (be they hardware or software) on end-to-end AI pipelines is both arduous and error-prone, stifling the adoption of the innovations. We propose MLModelScope …

Recognitions & Awards

Best Paper Candidate for XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs

ACM Artifact Evaluation Stamp for The Design and Implementation of the Wolfram Language Compiler

Best Paper and ACM Artifact Evaluation Stamp for Evaluating CUDA Communication Primitives on High-Bandwidth Interconnects

Best Poster

Top-20 Poster

Contact