How IBM is helping build Data Infrastructure on Cloud faster

Ahmed Ahsan Khan
Published in Dev Genius
3 min read · Oct 15, 2020

Photo by Markus Spiske on Unsplash

In our previous article we defined who Data Engineers are and how they differ from Data Scientists and Data Analysts. We mentioned that Data Engineers are software engineers who design and build software infrastructure that helps integrate data from various sources and manage big data.

Then, in another article, we discussed why IBM's cloud solutions are not as popular as those of AWS, Google and Microsoft Azure.

In this article, we will discuss how IBM's Cloud offerings can help build data infrastructure on the cloud. Following its recent acquisition of Red Hat, IBM is standardizing on the Red Hat OpenShift container platform as its platform for cloud-native, container-based workloads with Kubernetes orchestration. With this standardization, IBM has announced its Cloud Paks on top of Red Hat OpenShift, which help Software Architects, Data Engineers and Cloud Developers design and build data infrastructure on the cloud.

What is Cloud Pak?

Cloud Pak is a Cloud Solution by IBM which is built on Red Hat OpenShift. It is an integrated set of AI-infused software solutions for hybrid cloud that help you fully implement intelligent workflows in your business to accelerate digital transformation.

IBM has six offerings under IBM Cloud Pak, which can be used individually or in combination with one another. They are:

  1. Cloud Pak for Applications
  2. Cloud Pak for Data
  3. Cloud Pak for Integration
  4. Cloud Pak for Automation
  5. Cloud Pak for Multicloud Management
  6. Cloud Pak for Security

Details of these six Cloud Paks are given in IBM's official documentation. In the years since cloud computing was introduced, IBM has learned from and observed other cloud providers and experimented with its own cloud platforms before coming up with these unified and specialized cloud offerings. These offerings can serve as a stepping stone and building block for organizations moving to the cloud.

Within these Cloud Pak offerings, IBM has embedded its own products and features. For example, Cloud Pak for Integration includes App Connect (previously known as IBM Integration Bus), MQ and API Connect (API Gateway), which were IBM's core integration solutions, and Cloud Pak for Applications has the Java EE platform embedded in it. (IBM has been an open source contributor to Java for the last 25 years, since its inception.) IBM has also learned that most organizations keep their data on multiple cloud platforms, so there was a need for multicloud management. Hence Cloud Pak for Multicloud Management was introduced to help organizations manage their data across multiple cloud platforms from a single platform.

As most data moves to the cloud, the roles of Data Engineer and Cloud Application Developer are blending.

Conclusion:

The introduction of these Cloud Paks means that data engineers can develop their data infrastructure on the cloud. Data infrastructure does not mean that only Cloud Pak for Data will be used. Data infrastructure is the infrastructure over which data travels and is processed along the way, and that includes keeping the data secure. Therefore the Cloud Paks can be used all at once or one at a time, depending on:

  1. the client’s business requirements (and budget of course),
  2. the amount of data they need to collect and process (Cloud Pak for Data, Cloud Pak for Applications, Cloud Pak for Automation),
  3. how many sources the data needs to be integrated from (Cloud Pak for Integration),
  4. how many other cloud vendors their data already resides with (Cloud Pak for Multicloud Management) and
  5. the level of security their data needs (Cloud Pak for Security).
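
The criteria above can be sketched as a small decision helper. This is only an illustration of the mapping described in this article, not an IBM tool or API: the `Requirements` fields, the `suggest_cloud_paks` function and the simple yes/no thresholds are all hypothetical, and budget (point 1) is left out because it depends entirely on the client.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Requirements:
    """Hypothetical summary of a client's needs (field names are illustrative)."""
    collects_and_processes_data: bool = True
    data_sources: int = 1              # number of sources the data comes from
    other_cloud_vendors: int = 0       # other clouds where data already resides
    needs_enhanced_security: bool = False


def suggest_cloud_paks(req: Requirements) -> List[str]:
    """Map requirements to Cloud Paks, following the list in this article."""
    paks: List[str] = []
    if req.collects_and_processes_data:
        # Point 2: collecting and processing data
        paks += [
            "Cloud Pak for Data",
            "Cloud Pak for Applications",
            "Cloud Pak for Automation",
        ]
    if req.data_sources > 1:
        # Point 3: data has to be integrated from several sources
        paks.append("Cloud Pak for Integration")
    if req.other_cloud_vendors > 0:
        # Point 4: data already lives with other cloud vendors
        paks.append("Cloud Pak for Multicloud Management")
    if req.needs_enhanced_security:
        # Point 5: the data needs an extra level of security
        paks.append("Cloud Pak for Security")
    return paks


if __name__ == "__main__":
    client = Requirements(data_sources=3, other_cloud_vendors=1,
                          needs_enhanced_security=True)
    for pak in suggest_cloud_paks(client):
        print(pak)
```

In a real engagement each of these checks would be a conversation with the client rather than a boolean, but the sketch shows how the Cloud Paks combine instead of being an either/or choice.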

Happy Learning

I am a Senior Integration Consultant at Systems Limited | Ex-IBMer.