CV

Contact Information

Name: Zhenyu Bai
Professional Title: ARTIC Fellow
Email: zhenyu.bai@nus.edu.sg

Experience

  • 2023 -

    Singapore

    Research Fellow (ARTIC Fellow from 2026)
    National University of Singapore, School of Computing
    PI: Tulika Mitra
    • Reconfigurable spatial-dataflow architecture and compiler design (collaborations with Tenstorrent & IBM).
    • Hardware accelerators for sparse and quantized AI workloads.
    • Compilers for Coarse-Grained Reconfigurable Arrays (CGRAs).
    • Dataflow architecture and software co-design for Spiking Neural Networks.
    • Heterogeneous FPGA-GPU systems for AI workloads (collaboration with AMD).
  • 2019 - 2023

    Toulouse, France

    PhD student
    IRIT, University of Toulouse
    • CPU micro-architecture modeling and program performance analysis for real-time systems.
  • 2019 - 2019

    Grenoble, France

    Research Intern
    Verimag, Grenoble Alpes University
    • CPU cache analysis and program analysis for real-time systems.
  • -

    Toulouse, France

    Teaching (Computer Architecture & Compilation)
    University of Toulouse
    • Computer Architecture and VHDL (≈80h)
    • Computer Architecture and ARM assembly (≈50h)
    • Compilation Theory (≈60h)
    • Advanced Compilation (≈10h)
    • Supervision of Master's student projects (3 months/year)

Education

  • 2019 - 2023

    Toulouse, France

    PhD
    IRIT lab, University of Toulouse
    Computer Science
    • Scholarship funded by the French Ministry of Higher Education and Research (awarded to top Master's students).
  • 2017 - 2019

    Toulouse, France

    Master
    University of Toulouse
    Embedded Computing Systems
  • 2014 - 2017

    Toulouse, France

    BS
    University of Toulouse
    Computer Science

Publications

  • 2026
    A Data-Driven Dynamic Execution Orchestration Architecture
    31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems
  • 2024
    SWAT: Scalable and efficient window attention-based transformers acceleration on FPGAs
    Proceedings of the 61st ACM/IEEE Design Automation Conference
  • 2024
    Zed: A generalized accelerator for variably sparse matrix computations in ML
    Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques
  • 2025
    TerEffic: Highly Efficient Ternary LLM Inference on FPGA
    arXiv preprint arXiv:2502.16473
  • 2025
    Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning
    Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1
  • 2024
    SparrowSNN: A Hardware/software Co-design for Energy Efficient ECG Classification
    arXiv preprint arXiv:2406.06543
  • 2024
    Reconsidering the energy efficiency of spiking neural networks
    arXiv preprint arXiv:2409.08290
  • 2025
    Data-aware Dynamic Execution of Irregular Workloads on Heterogeneous Systems
    arXiv preprint arXiv:2502.06304
  • 2020
    Improving the Performance of WCET Analysis in the Presence of Variable Latencies
    The 21st ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)
  • 2022
    A Framework for Calculating WCET Based on Execution Decision Diagrams
    ACM Transactions on Embedded Computing Systems
  • 2023
    Computing Execution Times With Execution Decision Diagrams in the Presence of Out-of-Order Resources
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • 2019
    PLRU cache analysis
    Proceedings of the 13th Junior Researcher Workshop on Real-Time Computing (JRWRTC 2019)
  • 2025
    TL: Automatic End-to-End Compiler of Tile-Based Languages for Spatial Dataflow Architectures
    arXiv preprint arXiv:2512.22168
  • 2024
    ASADI: Accelerating sparse attention using diagonal-based in-situ computing
    2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
  • 2023
    Modélisation du comportement temporel du pipeline pour le calcul de WCET
    PhD thesis, Université Paul Sabatier-Toulouse III
  • 2021
    Déterminer le WCET d’applications temps-réel en présence de latences d’exécution variables
    Conférence francophone d’informatique en Parallélisme, Architecture et Système (COMPAS 2021)

Languages

Chinese: Native
French: Near-native
English: Fluent