I am a senior software engineer at Google working on TensorFlow. I mostly work on TensorFlow's integration with partners, third-party device plug-ins (PluggableDevice), and improving sparsity support.
Recent updates
- 2022-11: I'm honored to join the Steering Committee of oneAPI Community Forum!
- 2022-10: The long-awaited Intel® Extension for TensorFlow has been released! Install the plug-in to run TensorFlow models on Intel GPUs (Linux/WSL2).
- 2022-09: Our paper on Compiler Support for Sparse Tensor Computations in MLIR has been accepted to ACM TACO!
- 2022-06: TensorFlow can now support DirectX-12-compatible GPUs on Windows through the TensorFlow-DirectML plug-in!
- 2021-10: I will appear in the Intel ON event on Oct 27-28, 2021. All talks are online. Free registration here.
- 10/27/21 Accelerate Deep Learning with Intel-optimized TensorFlow and a Sneak Peek at Next-Gen CPU with Google (co-presenter)
- 10/28/21 The Fast Path to Scale AI Everywhere (AI Tech Insight Session) (cameo)
- 2021-07: I was on the Industry Panel at SIAM Annual Meeting 2021.
- 2021-06: I gave a keynote talk at the oneAPI Developer Summit at ISC 2021 on TensorFlow and oneDNN's partnership over the past few years.
- 2021-06: We recently launched PluggableDevice, a new device plug-in mechanism for TensorFlow, in TF 2.5! If you're using Mac, check out the TensorFlow-Metal plug-in for Mac GPUs.
About
My background is in High Performance Computing (HPC) and Parallel Computing. I hold a B.Eng. in Computer Engineering from Kasetsart University. I came to the U.S. on a Fulbright scholarship and completed my Ph.D. in Computer Science at UC Berkeley under the supervision of Professor Kathy Yelick. My dissertation focused on avoiding communication in large-scale N-body algorithms and matrix computations on supercomputers to achieve highly scalable and efficient implementations. During my time at Berkeley, I was part of the Berkeley Benchmarking and Optimization Group (BeBOP). I was also affiliated with the Dynamic Exascale Global Address Space Programming Environments (DEGAS) project at the Lawrence Berkeley National Laboratory. Prior to my Ph.D., I enjoyed optimizing scientific applications on hardware accelerators such as GPUs and the Cell Broadband Engine.
Education
-
University of California, Berkeley
Ph.D. in Computer Science, 2017
Advisor: Prof. Katherine Yelick
Dissertation: Communication Avoidance for Algorithms with Sparse All-to-all Interactions
-
Kasetsart University, Bangkok, Thailand
B.Eng. in Computer Engineering, first-class honors, 2010
Advisor: Assoc. Prof. Putchong Uthayopas
Thesis: PlayCloud: A Middleware System for PlayStation Grid
Professional Services
- IPDPS 2023 Technical Program Committee: Parallel and Distributed Algorithms for Data Science Track
- SC 2021, 2022 Technical Program Committee: Machine Learning and HPC Track
- HiCOMB@IPDPS 2020-2021 Program Committee
- IPDPS 2021 Technical Program Committee: Algorithms Track
- AAAI 2020-2021 Reviewer
- NeurIPS 2019-2020 Reviewer
- ICML 2019-2020 Reviewer
- Euro-Par 2020 Program Committee: Parallel Numerical Methods and Applications Track
- INFOCOMP 2019 Technical Program Committee
- ICS 2019 External Review Committee
- SC 2018 Technical Program Committee: Algorithms Track
- PLDI 2018 Artifact Evaluation Committee
Past open-source projects
- HP-CONCORD: Massively Parallel Graphical Model Structure Learning [Webpage] [Code]
- SpDM3: Parallel Sparse-Dense Matrix-Matrix Multiplication Library [Webpage] [Code]
Presentations
- Industry Panel
- 07/20/21 Industry Panel: Industrial Secrets: Shedding Light on Opportunities for Mathematicians in BIG (Business-Industry-Government) Careers, SIAM Annual Meeting 2021, virtual.
- 11/12/20 oneAPI Spec & Industry Panel, oneAPI Developer Summit 2020, virtual.
- TensorFlow and oneDNN in Partnership [Recorded keynote talk]
- 06/22/21 oneAPI Developer Summit at ISC 2021 (Keynote), virtual.
- 05/20/21 oneAPI AI Technical Advisory Board meeting, virtual.
- Igniting the Next Generation of Deep Learning [Podcast]
- 01/06/21 oneAPI Code Together podcast series, virtual.
- Machine Learning to Solve Challenging Problems
- 01/15/2020 National Electronics and Computer Technology Center, Pathum Thani, Thailand.
- 01/14/2020 Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand.
- Sparse HPC Opportunities in Deep Neural Networks
- 01/14/2020 Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand.
- 06/12/2019 Minisymposium on Parallel High-Dimensional Approximation: Uncertainty Quantification and Machine Learning, Platform for Advanced Scientific Computing (PASC) 2019 Conference, Zurich, Switzerland.
- Introduction to Machine Learning and TensorFlow
- 04/03/2019 Metamedia Technology, Bangkok, Thailand.
- 03/28/2019 Department of Computer Engineering, Kasetsart University, Bangkok, Thailand.
-
Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Matrix Estimation [Paper] [arXiv] [BibTeX]
- 01/14/2020 Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand.
- 04/11/2018 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), Lanzarote, Spain
- 12/09/2017 NIPS BigNeuro Workshop, Long Beach, CA
- 12/08/2017 NIPS Advances in Modeling and Learning Interactions from Complex Data Workshop, Long Beach, CA
- 07/14/2017 Communication-Avoiding Algorithms Workshop at SIAM Annual Meeting 2017 (AN17), Pittsburgh, PA
- Communication Avoidance for Algorithms with Sparse All-to-all Interactions [Dissertation]
-
Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication [Paper] [Slides]
- 05/26/2016 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), Chicago, IL
-
A Computation- And Communication-Optimal Parallel Direct 3-Body Algorithm [Paper] [Slides]
- 10/27/2015 SIAM Conference on Applied Linear Algebra (LA) 2015, Atlanta, GA
- 11/18/2014 26th ACM/IEEE Supercomputing Conference (SC 2014), New Orleans, LA
-
A Communication-Optimal N-Body Algorithm for Direct Interactions [Paper] [Slides] [Poster]
- 01/07/2014 Department of Computer Engineering, Kasetsart University, Bangkok, Thailand
- 06/30/2013 DEGAS Summer Retreat 2013, Santa Cruz, CA
- 03/01/2013 SIAM Conference on Computational Science and Engineering (CSE) 13, Boston, MA
-
Implementation of Cloud Removal Processing (LMF) on Multi-Temporal Remote Sensing Data Using GPGPU
- 11/19/2010 The 2010 International Computer Science and Engineering Conference (ICSEC 2010), Chiang Mai, Thailand (Substituted for San Aksaranugraha, author)
-
Implementation Issues in Developing a Fluid Flow Solver on Cell Architecture
- 09/23/2009 NECTEC-Annual Conference and Exhibition 2009 (NECTEC-ACE'09), Pathumthani, Thailand
Publications
-
Compiler Support for Sparse Tensor Computations in MLIR
[Paper]
[PDF]
Aart Bik, Penporn Koanantakool, Tatiana Shpeisman, Nicolas Vasilache, Bixia Zheng, and Fredrik Kjolstad;
ACM Transactions on Architecture and Code Optimization Vol. 19, No. 4, Article 50, September 2022
-
Mesh-TensorFlow: Deep learning for supercomputers
[Paper]
[Supplemental]
[arXiv]
[BibTeX]
Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani,
Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, and Blake Hechtman;
32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, December 2018.
-
Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Matrix Estimation [Paper] [arXiv] [BibTeX]
Penporn Koanantakool, Alnur Ali, Ariful Azad, Aydın Buluç, Dmitriy Morozov, Leonid Oliker, Katherine Yelick, and Sang-Yun Oh;
21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), Lanzarote, Spain, April 2018.
-
Communication Avoidance for Algorithms with Sparse All-to-all Interactions
[PDF]
[BibTeX]
Penporn Koanantakool
Ph.D. Dissertation, UC Berkeley, December 2017.
-
Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication [Paper] [Slides] [BibTeX] [Webpage] [Code]
Penporn Koanantakool, Ariful Azad, Aydın Buluç, Dmitriy Morozov, Sang-Yun Oh, Leonid Oliker, and Katherine Yelick;
30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), Chicago, IL, USA, May 2016.
-
Write-Avoiding Algorithms [Paper] [TechReport] [BibTeX]
Erin Carson, James Demmel, Laura Grigori, Nick Knight, Penporn Koanantakool, Oded Schwartz and Harsha Vardhan Simhadri;
30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), Chicago, IL, USA, May 2016.
Also a UCB Technical Report: UCB/EECS-2015-163, June 2015
-
A Computation- And Communication-Optimal Parallel Direct 3-Body Algorithm [Paper] [Slides] [BibTeX]
Penporn Koanantakool and Katherine Yelick;
26th ACM/IEEE Supercomputing Conference (SC 2014), New Orleans, LA, USA, November 2014.
-
Scalable Multimedia Content Analysis on Parallel Platforms [Journal] [BibTeX]
Ekaterina Gonina, Gerald Friedland, Eric Battenberg, Penporn Koanantakool, Michael Driscoll, Evangelos Georganas, Kurt Keutzer;
ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) 2013.
-
A Communication-Optimal N-Body Algorithm for Direct Interactions [Paper] [Slides] [Poster] [BibTeX]
Michael Driscoll, Evangelos Georganas, Penporn Koanantakool, Edgar Solomonik, and Katherine Yelick;
27th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, USA, May 2013.
(First 3 authors contributed equally)
-
PlayCloud: A Middleware System for PlayStation Grid
Penporn Koanantakool
Bachelor's Thesis, Kasetsart University, 2010.
-
Implementation Issues in Developing a Fluid Flow Solver on Cell Architecture
Penporn Koanantakool, Supakit Prueksaaroon, and Sornthep Vannarat;
NECTEC-Annual Conference and Exhibition 2009 (NECTEC-ACE'09), Pathumthani, Thailand, September 2009.