Bookmarks tagged [batch-processing]
Apache Beam is an open-source, unified model and set of language-specific SDKs for defining and executing data processing workflows, as well as data ingestion and integration flows, supporting Enterpris...
https://pypi.python.org/pypi/pyspark/
Apache Spark Python API.
A flexible parallel computing library for analytic computing.
https://github.com/spotify/luigi
A module that helps you build complex pipelines of batch jobs.
Run MapReduce jobs on Hadoop or Amazon Web Services.
https://github.com/ray-project/ray/
A system for parallel and distributed Python that unifies the machine learning ecosystem.