Parallelizing Expectation Maximization (EM)

In this report, we explore the Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMMs) and its implementation in Python using various libraries. We start with a vanilla Python implementation, then optimize it using NumPy, Numba, and CuPy. We compare the performance of these implementations on synthetic datasets and discuss the trade-offs involved. The results show that both Numba and CuPy offer significant speedups: with access to a CUDA-capable GPU, CuPy runs more than 200 times faster than the base Python implementation. The report concludes with insights into the potential for substantial performance improvements in the EM algorithm through optimized and/or parallel computing techniques.
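As a rough sketch of the algorithm being benchmarked, the following is a minimal NumPy implementation of EM for a GMM. It is an illustrative assumption on my part, not the report's actual code: the function name `em_gmm`, the initialization scheme, and the iteration count are all choices made here for the example.

```python
import numpy as np

def em_gmm(X, k, n_iter=50, seed=0):
    """Minimal EM for a k-component Gaussian mixture (illustrative sketch).

    X: (n, d) data matrix; returns (weights, means, covariances)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialization (one simple choice): random data points as means,
    # identity covariances, uniform mixing weights.
    means = X[rng.choice(n, size=k, replace=False)]
    covs = np.stack([np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i),
        # computed in log space for numerical stability.
        log_r = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            inv = np.linalg.inv(covs[j])
            maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
            log_det = np.linalg.slogdet(covs[j])[1]
            log_r[:, j] = (np.log(weights[j])
                           - 0.5 * (maha + log_det + d * np.log(2 * np.pi)))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means, and covariances
        # from the soft assignments.
        nk = r.sum(axis=0)
        weights = nk / n
        means = (r.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = ((r[:, j, None] * diff).T @ diff) / nk[j] \
                      + 1e-6 * np.eye(d)  # small ridge to keep covs PSD
    return weights, means, covs
```

The inner loops over components are the natural targets for the optimizations the report compares: NumPy vectorization across components, Numba JIT compilation of the loops, or CuPy to move the array arithmetic onto the GPU.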
Categories: Machine Learning, Statistics

Author: Ilia Azizi

Published: June 24, 2023

Disclaimer: Please note that this work has NOT been peer-reviewed.

Citation

BibTeX citation:
@online{azizi2023,
  author = {Ilia Azizi},
  title = {Parallelizing {Expectation} {Maximization} {(EM)}},
  date = {2023-06-24},
  url = {https://iliaazizi.com/projects/em_parallelized},
  langid = {en}
}
For attribution, please cite this work as:
Ilia Azizi. 2023. “Parallelizing Expectation Maximization (EM).” June 24, 2023. https://iliaazizi.com/projects/em_parallelized.