I have previously spoken about how raw data collected from operational sensors may lack standardisation and could benefit from transformation into a series of domain-based Common Data Formats (CDFs) prior to storage and processing. I have also discussed how separating ingress components from egress components can facilitate expansion across multiple operational sites without duplicating common componentry.
But how can we empower the business to leverage the effort invested in defining these CDFs while decreasing IT involvement in the data transformation process, and reducing the need for non-specialist personnel to understand the technical details of every eventual endpoint? This is where a few nifty features of the Reekoh Accelerate™ platform come in.
Recognise the value of your efforts
Let’s consider this in the context of a use case: the next step in our previous project to collect and standardise legacy information before depositing it into a common database. Now that this process is live, a new business project seeks to introduce a new technology component on site (perhaps augmenting existing sensing capability, or simply as the next generation of a common technology set). With this new component, the project team’s goal is to ensure the new data flows through into that same common database.
First and foremost, consider how we’ve previously separated ingress components from egress components, linking them through some form of internal communication. By doing so, we’ve ensured that the project team does not need to know the technical (and security) details of the eventual endpoint; they just need access to the internal endpoint. Testing requirements are also limited: as long as the data is produced in the relevant CDF and sent to the right internal endpoint, they can be confident it will flow through to the database and existing business processes.
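To make the decoupling concrete, here is a minimal sketch (not the Reekoh implementation) in which the internal endpoint is modelled as an in-memory queue and the common database as a plain list. The point it illustrates is that the ingress side publishes CDF messages without holding any credentials or connection details for the final destination:

```python
from queue import Queue

# Illustrative sketch only; names are hypothetical, not Reekoh APIs.
internal_endpoint = Queue()   # stands in for the internal communication layer
common_database = []          # stands in for the eventual endpoint

def ingress_publish(cdf_message: dict) -> None:
    """Project-team side: publish a CDF message to the internal endpoint.
    No knowledge of the downstream database is required here."""
    internal_endpoint.put(cdf_message)

def egress_consume() -> None:
    """Shared egress side: drain the internal endpoint and write each
    message to the common database; connection details stay hidden."""
    while not internal_endpoint.empty():
        common_database.append(internal_endpoint.get())

ingress_publish({"device_id": "sensor-1", "reading_value": 21.5})
egress_consume()
print(common_database)
```

Swapping the queue for a real broker or API gateway changes nothing on the ingress side, which is exactly the property that keeps the project team's scope small.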
That project team (or the internal IT department) will still need to build a new pipeline, perform the data transformation and configure those internal endpoints. The use of pipeline templates, which permit the basic pipeline structure to be pre-built and pre-configured, can massively reduce the required effort, with the resulting process becoming something like the following:
- Work out where your data is coming from, gain access and determine the format of the data
- Create a new data schema for the new system’s message format
- Create a mapping, dragging fields from your new schema across to the existing CDF
- Create a new pipeline from a pre-built pipeline template, configuring the data source (perhaps just an API URL and credentials), selecting your mapper and pointing it at your egress flow
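The mapping step above can be sketched as follows. This is an illustrative stand-in for Reekoh's drag-and-drop mapper, with hypothetical field names on both sides; it simply shows a new system's message format being translated onto an existing CDF:

```python
# Hypothetical mapping from the new system's fields to CDF fields.
FIELD_MAP = {
    "ts": "timestamp",
    "sensorId": "device_id",
    "val": "reading_value",
    "unit": "reading_unit",
}

def map_to_cdf(message: dict) -> dict:
    """Translate a raw message into the Common Data Format,
    skipping any source fields that are absent."""
    return {
        cdf_field: message[src_field]
        for src_field, cdf_field in FIELD_MAP.items()
        if src_field in message
    }

raw = {"ts": "2020-05-14T09:00:00Z", "sensorId": "pump-7",
       "val": 42.1, "unit": "kPa"}
print(map_to_cdf(raw))
```

Because the mapping is pure configuration, adding the next new device type means writing a new `FIELD_MAP` equivalent, not new pipeline code.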
That’s it! The project team is up and running, and the new data becomes available in your downstream systems in a normalised, useful format, ready to use within your business.
This article was originally posted on LinkedIn on May 14, 2020.