Integrate, grow, repeat. How WebSight real-time DXP provides a way to break out of the content-centric model

The presentation will offer an analysis of how WebSight DXP's event-driven architecture can be used to create a more agile and responsive digital experience platform.

We will showcase an event-driven WebSight DXP architecture and demonstrate how a CMS can be used as a data source for a composable platform that processes events in real time.

The live demo will focus on the critical capabilities that event-driven architectures bring to DXP business scenarios.

By the end of our presentation, participants will have a general understanding of how event-driven architecture can be utilized in the DXP context.

How do you determine which experience depends on which resources (templates, products, content)?

Maciej Laskowski

It really depends, and there are at least a couple of valid solutions, since the experience composition is made up of small functions. The functions we presented during the demo are configurable, and we can define which template is compatible with a particular (versioned) model and type of data (products/reviews/prices). For example, a template identified by the path `/templates/product-details-template.html` will use data of the types and versions pimProduct.v1, reviews.v1 and prices.v1.
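One way to picture such compatibility is a registry mapping template paths to the versioned data types they consume. This Python sketch is purely illustrative; the registry name, structure, and helper function are assumptions, not the actual WebSight configuration:

```python
# Hypothetical compatibility registry: which (versioned) data types a
# template consumes. The second entry is invented for the example.
TEMPLATE_DEPENDENCIES = {
    "/templates/product-details-template.html": {"pimProduct.v1", "reviews.v1", "prices.v1"},
    "/templates/product-list-template.html": {"pimProduct.v1", "prices.v1"},
}

def templates_affected_by(data_type: str) -> list[str]:
    """Return templates whose experiences must be recomposed when `data_type` changes."""
    return [path for path, deps in TEMPLATE_DEPENDENCIES.items() if data_type in deps]
```

With a registry like this, a data-update event for `reviews.v1` would only trigger recomposition of the product-details experiences.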

Michal

Pipelines are built out of functions. Functions subscribe to topics to read events from and (optionally) produce new events. The first function we created is the Router, which is responsible for re-writing each event type (page update, data update, template update, etc.) to a type-based topic. This way, downstream functions are informed only about the events that are relevant to them. We'll have diagrams of the pipelines from the demo at the playground session.
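As a rough illustration of the Router's job, here is a minimal Python sketch that maps each event type to a type-based topic. The topic names and event shape are assumptions made for the example, not the real implementation:

```python
# Minimal Router sketch: re-publish each incoming event to a type-based
# topic, so downstream functions only subscribe to what they care about.
def route(event: dict) -> tuple[str, dict]:
    topic_by_type = {
        "page-update": "topic.pages",
        "data-update": "topic.data",
        "template-update": "topic.templates",
    }
    # Unknown event types go to a catch-all topic instead of being dropped.
    topic = topic_by_type.get(event["type"], "topic.unrouted")
    return topic, event
```

A real router would publish the returned event to the returned topic via the event-streaming client rather than just returning it.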

What if an upstream system just provided snapshots and no events or delta updates?

Maciej Laskowski

Could you tell me more about what you mean by snapshots? I'm not sure I understand correctly. Pushing all the data once again is okay; there are techniques that enable filtering out data that is already in the system. Another strategy can also be used, such as keeping only the latest version of each product's data in the system. Both have trade-offs.
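The filtering technique mentioned above can be sketched with a content hash per product, so a re-pushed snapshot only produces events for data that actually changed. This is a minimal illustration under assumed names; the in-memory dict stands in for whatever state store a real ingestion function would use:

```python
import hashlib
import json

# Last-seen content hash per product id (stand-in for a real state store).
_last_seen: dict[str, str] = {}

def ingest(product_id: str, payload: dict) -> bool:
    """Return True if the payload is new or changed and an event should be emitted."""
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    if _last_seen.get(product_id) == digest:
        return False  # already in the system, filter out
    _last_seen[product_id] = digest
    return True
```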

Michal

There are a lot of techniques for retrieving data from non-event-based systems; it widely depends on the use case. Some options:

- Extend the source system with a module that produces events on change
- Create a module that periodically retrieves the data from the external system
- Use the low-level connectors that come with event-streaming platforms
- In the worst-case scenario, process the data in batches
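The "periodically retrieves" option can be sketched as a small polling loop that turns each fetched record into a published event. Here `fetch_all` and `publish` are hypothetical placeholders for the real connector and the event producer:

```python
import time

def poll_loop(fetch_all, publish, interval_seconds=60, iterations=None):
    """Periodically pull records from a non-event source and publish each as an event.

    `iterations=None` runs forever; a number bounds the loop (useful for tests).
    """
    n = 0
    while iterations is None or n < iterations:
        for record in fetch_all():
            publish(record)
        n += 1
        if iterations is None or n < iterations:
            time.sleep(interval_seconds)
```

A production version would also track a cursor or timestamp so each poll only fetches records changed since the previous one.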

Michael

If you ship a new code release, does that mean that all content is statically regenerated? Or do you detect which template changed?

Maciej Laskowski

It depends. If there is a moderate number of products/templates (let's say fewer than 1 million), it usually does not make sense to add sophisticated change-detection algorithms, because processing 1 million pages takes less than a couple of minutes. But, of course, event-driven architecture is made for workflows, and there is absolutely nothing wrong with placing some change detection before the experience-templating function to limit the number of compositions.

Michal

It's also recommended to use multiple templates and only notify the platform about the templates that changed. We could detect whether a template changed between publications; this would be a nice optimization (e.g. checking the hash of the template before triggering the updates). We haven't faced this problem, since template generation is fast (we haven't measured the templating engine module alone), but as you saw in the presentation, publishing 3500 products takes a few seconds.

Robin

How do you sync your data between all HTTP servers when autoscaling kicks in (a new instance is added)? If you clone an instance, some content updates could happen in the meantime. Do you create a queue where the updates are gathered before the instance is ready, and then push/pull them from the queue to the HTTP server, or is it a completely different solution?

Maciej Laskowski

Generally, it all depends on scale. On a relatively small scale (fewer than 100k unique pages) we use the simplest method, and the synchronisation is done "in place", meaning with the event-streaming platform: the new HTTP server syncs the latest version of each unique experience, which takes about 1-2 minutes. On a bigger scale (over 1M unique pages) you may use snapshot techniques if the sync time (~10-15 min) is too long.
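The "in place" sync can be pictured as replaying the experience topic and keeping only the latest version per unique path, which is essentially what a compacted event-streaming topic provides. A simplified Python sketch, with the event shape assumed:

```python
def sync_experiences(event_log: list[dict]) -> dict[str, str]:
    """Replay an ordered event log and keep only the newest HTML per unique path.

    Each event is assumed to look like {"path": "/a", "html": "..."}; later
    events for the same path overwrite earlier ones, mimicking topic compaction.
    """
    experiences: dict[str, str] = {}
    for event in event_log:  # events are ordered oldest to newest
        experiences[event["path"]] = event["html"]
    return experiences
```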

Maciej Laskowski

That would require an additional queue for the delta between the snapshot and "now", and this could be achieved with the event-streaming technique described earlier. The trigger for the autoscaling mechanism can be configured. By default it would be a quantity metric (e.g. the number of page views per HTTP server instance), but a resource-based metric (like CPU/network usage) would also be fine here.
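The snapshot-plus-delta idea can be sketched as restoring state from a snapshot taken at a known offset and then applying only the events that arrived after it. The offsets and event shape here are illustrative assumptions:

```python
def bootstrap(
    snapshot: dict[str, str],
    snapshot_offset: int,
    event_log: list[tuple[int, str, str]],
) -> dict[str, str]:
    """Restore state from a snapshot, then replay only events newer than it.

    `event_log` entries are (offset, path, html); events at or before the
    snapshot offset are already reflected in the snapshot and are skipped.
    """
    state = dict(snapshot)
    for offset, path, html in event_log:
        if offset > snapshot_offset:
            state[path] = html
    return state
```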

Robin

Is this an open-source solution? Building and maintaining your solution probably takes a lot of effort, right? For example, how are you handling security issues, bug fixes, and releases/updates? Is there already a community behind it?

Maciej Laskowski

The real-time composability we presented is not an open-source solution. However, it is built on top of Apache open-source projects. WebSight CMS is a separate project, available under the BSL license; in short, it is free to play with and for small commercial use cases. We already have a community behind the CMS, and we have just started building a community around the real-time composability.

Maciej Laskowski

As for releases, we follow Semantic Versioning principles and tend to release small, frequent updates. At the moment the product is at an early stage; as mentioned above, we have just started building a community around it. Feel free to contact us for details on how to participate in the early-preview phase.

Ovidiu

Is the templating language the Django template language?

Radu Cotescu

It looks like HTL - they run on Sling. Confirmed - see https://docs.websight.io/cms/developers/components/#rendering-script.

Maciej Laskowski

Hi, the templating engine we adopted for the demo is Pebble (https://pebbletemplates.io/). However, this is a small experience function, and any templating engine (or a combination of several, if you really need that) may be applied, either as a workflow or as a single (slightly bigger) function.
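To show the shape of such a small experience function, which takes a template and data and returns a rendered experience, here is a Python sketch that uses the stdlib `string.Template` purely as a stand-in for Pebble; the template syntax and names are assumptions for the example:

```python
from string import Template

def compose_experience(template_source: str, data: dict) -> str:
    """Render an experience from a template source and a data payload.

    string.Template uses $name placeholders; a real function would delegate
    to the configured engine (Pebble in the demo) instead.
    """
    return Template(template_source).substitute(data)
```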

Maciej Laskowski

Radu, the HTL is used in the CMS, which is a separate product. Our CMS is based on the Apache Sling stack. We are still working on public documentation for the real-time composition product. Thank you.

Peter Trumpp

How are the experiences stored? Database, cloud-based, or filesystem-based? Locally, near the web server?

Maciej Laskowski

That depends on the particular component of the system. There is great flexibility when your services are built in a microservice fashion: each of them can have its own database. However, the general rule we use here is event sourcing. At the moment we are using Apache Pulsar under the hood, which has event-sourcing capabilities (think of it as an event-dedicated database). HTTP servers can keep the experiences on their local hard drive (preferably some kind of SSD, to optimise I/O).
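Keeping experiences on the local drive can be as simple as writing each composed page under a document root keyed by its URL path, so the HTTP server can serve it as a static file. A sketch under assumed naming conventions (the `index.html` layout is an illustration, not the actual on-disk format):

```python
from pathlib import Path

def store_experience(doc_root: Path, url_path: str, html: str) -> Path:
    """Write a composed experience to local disk so it can be served statically.

    "/products/mug/" becomes <doc_root>/products/mug/index.html.
    """
    target = doc_root / url_path.strip("/") / "index.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target
```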

Michael

Do authors need to write a template language in their components? Are you only supporting a technical authoring audience?

Maciej Laskowski

We made the template language visible during the presentation on purpose, to let the audience know how it works. What components present fully depends on the component developers, who can hide all the templating language from the CMS authors. On the roadmap of our CMS (https://github.com/orgs/websight-io/projects/2/views/2), we have an item for author-friendly components that would enable drag-and-drop of template components with a preview.

Juan Sanchez

Can the client choose which cloud the solution is deployed in, or is it already provided?

Maciej Laskowski

Yes, the composability layer is based on a cloud-native stack and is built to run on Kubernetes. Most cloud providers offer a managed Kubernetes service, which is perfectly fine for running the solution.

Konrad Windszus

Is it using Apache Kafka under the hood?

Maciej Laskowski

There is an abstraction over the messaging provider. At the moment we are using Apache Pulsar (it was also used during the demo). Apache Kafka could be a fine replacement; however, Apache Pulsar is cloud-native and scales more smoothly than Apache Kafka.

Michal

Pulsar is cloud-native; it's designed to run on Kubernetes by default. It's a good match.