DDS8530

Module 2 Required Resources

Introduction to Apache Beam

Josh Cummings. (Jan 2024). Introduction to Apache Beam, https://www.baeldung.com/apache-beam

  • This article introduces Apache Beam, a portable programming layer for batch and streaming data processing jobs. Beam supports many distributed processing backends, including Apache Spark and Google Cloud Dataflow, which we use in this course.

Revolutionizing Real-Time Stream Processing

Apache Beam. (n.d.). Revolutionizing Real-Time Stream Processing: 4 Trillion Events Daily at LinkedIn, https://beam.apache.org/case-studies/linkedin/ 

  • This case study shows how LinkedIn uses Apache Beam to process 4 trillion events daily. It offers a glimpse of how real-world stream processing pipelines are built and deployed.

The Many Faces of Publish Subscribe

Patrick Eugster et al. (June 2003). The Many Faces of Publish Subscribe, http://systems.cs.columbia.edu/ds2-class/papers/eugster-pubsub.pdf

  • This is an excellent introduction to the publish/subscribe design pattern and the architectural variants built on that paradigm.

Google Pub/Sub Messaging

Google. (n.d.). Google Pub/Sub Messaging, https://docs.cloud.google.com/pubsub/docs/overview 

  • This documentation provides a comprehensive overview of Pub/Sub, Google’s publish/subscribe messaging service, and how you can use it.

 

Module 2 Optional Resources

Mass Ad Bidding with Beam at Booking.com

Apache Beam. (n.d.). Mass Ad Bidding With Beam at Booking.com, https://beam.apache.org/case-studies/booking/ 

  • This is another case study, showing how Booking.com uses Apache Beam to build its ad bidding platform.