So you think you know pub/sub?
Udi Dahan in
Agenda: Basics, Patterns, Distribution
Publish/subscribe basics
- Enables one-to-many communication
- Should really be called “subscribe/publish”
- Not the same as multicast – it’s more reliable
- Is about logical, not physical data distribution
- Each event should be processed once
[Diagram labels: Pub, Sub1, Sub2, Sub3, with a “sub” arrow from each subscriber]
[Diagram labels: Publisher, Subscriber]
[Diagram labels: Publisher, Sub1, Sub2, Sub3, LB, Sub3_1, Sub3_2]
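A minimal in-memory sketch of "logical, not physical" distribution, reading "processed once" as once per logical subscriber. The `Broker` class and its method names are hypothetical and do not come from any real messaging library. Each logical subscriber may be backed by several instances, and round-robin selection stands in for the load balancer (LB) in the diagram.

```python
from collections import defaultdict
from itertools import cycle


class Broker:
    """Toy in-memory broker; every name here is illustrative."""

    def __init__(self):
        # event type -> {logical subscriber -> round-robin iterator over its instances}
        self._subs = defaultdict(dict)

    def subscribe(self, event_type, subscriber, instances):
        """Register one logical subscriber backed by one or more instance handlers."""
        handlers = list(instances)
        self._subs[event_type][subscriber] = cycle(handlers)

    def publish(self, event_type, payload):
        """Deliver the event exactly once to each logical subscriber."""
        for instance_picker in self._subs[event_type].values():
            handler = next(instance_picker)  # pick one instance, like a load balancer
            handler(payload)


received = defaultdict(list)


def make_handler(name):
    """Build a handler that records what a given physical instance received."""
    return lambda payload: received[name].append(payload)


broker = Broker()
broker.subscribe("OrderCancelled", "Sub1", [make_handler("Sub1")])
broker.subscribe("OrderCancelled", "Sub3",
                 [make_handler("Sub3_1"), make_handler("Sub3_2")])

broker.publish("OrderCancelled", {"order_id": 1})
broker.publish("OrderCancelled", {"order_id": 2})
```

Sub1 sees both events, while Sub3_1 and Sub3_2 each see one: the event is not broadcast to every machine, only to every logical subscriber.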
But what about data sync?
- Keeping in-proc caches synced in a web farm
- Use a distributed cache for that (Redis, etc.)
- Do not build your own distributed cache
  - Not unless you absolutely HAVE to
- More on this later
Subscribers can be publishers too
- Think peer-to-peer, not client/server
[Diagram labels: PS1, PS2, PS3, PS4]
Avoid shared resources
- Shared databases create tight coupling
[Diagram labels: PS1, PS2, PS3, PS4, DB, DB]
Seek out autonomy
- But preserve the “single source of truth”
[Diagram labels: PS1, PS2, PS3, PS4]
Basics, Patterns, Distribution
Events – not commands
- Always publish events – not commands
- Examples: OrderCancelled, AccountCreated
- Something that already happened – a fact
- Subscribers can’t invalidate events
- But what about failures?
Technological failures
- Deserialization failures
  - Move off to “error” queue for admin to handle
  - Likely to be returned for reprocessing later
- Transient failures (deadlocks & other exceptions)
  - Retry + backoff & escalate to error queue
- Process & server crashes
  - TX processing for complete rollback*
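The "retry + backoff & escalate to error queue" policy could be sketched roughly as follows. `TransientError`, `process_with_retry`, and the parameter defaults are assumptions made for illustration. A real queueing system would typically provide this behavior itself and would also wrap each attempt in a transaction, which this sketch omits.

```python
import time
from collections import deque


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a deadlock)."""


def process_with_retry(message, handler, error_queue, max_attempts=3,
                       base_delay=0.01, sleep=time.sleep):
    """Run the handler, backing off between attempts; park the message on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except TransientError:
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        except Exception:
            break  # non-transient (e.g. deserialization failure): retrying won't help
    error_queue.append(message)  # escalate for an admin to inspect and return later
    return False


error_queue = deque()
attempts = []


def flaky(message):
    """Fails twice with a transient error, then succeeds."""
    attempts.append(message)
    if len(attempts) < 3:
        raise TransientError("deadlock")


def broken(message):
    """Always fails in a way that retrying cannot fix."""
    raise ValueError("cannot deserialize")


ok = process_with_retry("msg-1", flaky, error_queue, sleep=lambda s: None)
failed = process_with_retry("msg-2", broken, error_queue, sleep=lambda s: None)
```

The `sleep` parameter is injectable only so the example runs instantly; the first message succeeds on its third attempt, and the second goes straight to the error queue.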
Insufficient transactionality risks
- Entity ID not in DB
- System gets out of sync
[Diagram labels: DB, DB, Q, Q]
Careful with XYZ_Updated events
- Simple CRUD domains less suitable for implementation on top of pub/sub
- In-order event processing usually not guaranteed
  - Can be mitigated with sequence numbers
  - … and logic which matches them to entity versions
- Consider “Valid-to/from” semantics
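One way to match sequence numbers to entity versions is sketched below. It assumes each event carries the full new state plus a `version` field, so a late or duplicate event can simply be dropped. The field names and the `apply_update` helper are hypothetical. If events carried deltas instead, gaps in the sequence would need to be buffered rather than skipped.

```python
def apply_update(store, event):
    """Apply an XYZ_Updated-style event only if it is newer than the stored version."""
    current = store.get(event["entity_id"])
    if current is not None and event["version"] <= current["version"]:
        return False  # stale or duplicate event: ignore it
    store[event["entity_id"]] = {"version": event["version"], "data": event["data"]}
    return True


store = {}
events = [
    {"entity_id": "acct-1", "version": 2, "data": {"email": "new@example.com"}},
    # Version 1 was published first but arrives late.
    {"entity_id": "acct-1", "version": 1, "data": {"email": "old@example.com"}},
]
results = [apply_update(store, e) for e in events]
```

The late version 1 event is rejected, so the subscriber keeps the newer email address instead of regressing to the old one.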
Auditing / Journaling
- Copy msg to another queue after processing
  - Supported out-of-the-box by most queues
- Extract to longer-term storage
  - So the queue doesn’t “explode”
- A central log of everything that happened
  - Can be difficult to interpret by itself
Leveraging message headers
- Maintain a conversation ID header for cross-endpoint message flows
[Diagram labels: Endpoint 1, Endpoint 2, Endpoint 3, Audit; Message ID: 1, Conversation ID: 1; Message ID: 2, Conversation ID: 1; Message ID: 3]
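A rough sketch of how a conversation ID header might flow across endpoints and make the audit log easier to interpret. The header names, the `new_message` and `handle` helpers, and the follow-up event names (`RefundIssued`, `CustomerNotified`) are invented for illustration. Real messaging frameworks usually stamp such headers automatically.

```python
import uuid


def new_message(body, caused_by=None):
    """Create a message; inherit the conversation ID of the message that caused it."""
    message_id = str(uuid.uuid4())
    if caused_by is not None:
        conversation_id = caused_by["headers"]["conversation_id"]
    else:
        conversation_id = message_id  # first message starts a new conversation
    return {"headers": {"message_id": message_id,
                        "conversation_id": conversation_id},
            "body": body}


audit_log = []


def handle(message, follow_up_body=None):
    """Process a message, copy it to the audit log, optionally publish a follow-up."""
    audit_log.append(message)
    if follow_up_body is None:
        return None
    return new_message(follow_up_body, caused_by=message)


first = new_message("OrderCancelled")       # published by Endpoint 1
second = handle(first, "RefundIssued")      # Endpoint 2 reacts and publishes
third = handle(second, "CustomerNotified")  # Endpoint 3 reacts and publishes
handle(third)                               # final consumer, no follow-up

# Reconstruct the whole flow from the audit log using the shared header.
conversation = [m["body"] for m in audit_log
                if m["headers"]["conversation_id"] == first["headers"]["conversation_id"]]
```

Every message has its own message ID, but all three share one conversation ID, so filtering the audit log by that header recovers the full cross-endpoint flow.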
Basics, Patterns, Distribution
Content-based “pub/sub”
- When subscriber-side filtering won’t scale
- User defines rules about what’s “interesting”
  - And that can change at runtime
- It’s primarily about physical data distribution
  - Not logical division of responsibilities
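To make the idea concrete, here is a small sketch of publisher-side, content-based routing with rules that can be replaced at runtime. `ContentRouter`, the user names, and the stock-tick fields are all hypothetical. A production system would more likely rely on the filtering features of its broker than on hand-written predicates.

```python
class ContentRouter:
    """Toy router: decides per event which users should physically receive it."""

    def __init__(self):
        self._rules = {}  # subscriber -> predicate over the event

    def set_rule(self, subscriber, predicate):
        """Add or replace a subscriber's rule; may be called at any time."""
        self._rules[subscriber] = predicate

    def remove_rule(self, subscriber):
        """Stop routing anything to this subscriber."""
        self._rules.pop(subscriber, None)

    def route(self, event):
        """Return the subscribers whose current rule matches the event."""
        return [s for s, matches in self._rules.items() if matches(event)]


router = ContentRouter()
router.set_rule("alice", lambda e: e["symbol"] == "ACME")
router.set_rule("bob", lambda e: e["price"] > 100)

tick = {"symbol": "ACME", "price": 50}
before = router.route(tick)  # only alice's rule matches

# bob changes what he finds "interesting" while the system is running.
router.set_rule("bob", lambda e: e["symbol"] in {"ACME", "INIT"})
after = router.route(tick)
```

Filtering happens before delivery, so uninteresting events never reach a subscriber at all, unlike subscriber-side filtering.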
Finance
- Subscribing to updates of specific stocks
Industrial / Internet of Things
- Subscribing to events about sensor states
Solutions – well, it depends
- For small numbers of users (internal employees)
  - Keep a single distributed cache up to date
  - Have user machines poll the cache every second
- Across multiple sites, have a cache at each site
  - User machines poll the cache of their site
- In short – no real use of pub/sub
“Clicks & mortar” Retail
- Distributing price changes / end-of-day orders
“Pub/sub” between geographic sites
- Also focused on data distribution
- Often want visibility into progress of distribution
  - Which sites haven’t received the data yet
- Geographic sites tend to have business meaning
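Visibility into distribution progress could be as simple as recording an acknowledgement per site, as in the sketch below. The `DistributionTracker` class, the site names, and the batch ID are assumptions for illustration only. How sites actually confirm receipt is left open.

```python
class DistributionTracker:
    """Toy tracker for which sites have confirmed receipt of a data batch."""

    def __init__(self, sites):
        self._sites = set(sites)
        self._acked = {}  # batch id -> set of sites that confirmed receipt

    def start(self, batch_id):
        """Begin tracking a new batch being pushed to all sites."""
        self._acked[batch_id] = set()

    def acknowledge(self, batch_id, site):
        """Record that a site confirmed it received the batch."""
        self._acked[batch_id].add(site)

    def pending(self, batch_id):
        """Return the sites that have not yet received the batch."""
        return sorted(self._sites - self._acked[batch_id])


tracker = DistributionTracker(["store-north", "store-south", "store-west"])
tracker.start("prices-batch-1")
tracker.acknowledge("prices-batch-1", "store-south")
still_waiting = tracker.pending("prices-batch-1")
```

Here the sender knows the full set of sites up front, which is exactly what classic pub/sub avoids: a publisher normally neither knows nor cares who its subscribers are.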
“Clicks & mortar” Retail
- Cross-site distribution done within a SOA service
- Not really pub/sub
Summary: Basics, Patterns, Distribution
Q&A
Thank you Udi Dahan in Particular