How are you all doing in the AI hype? Is it slowing down or speeding up for you? I strongly believe that we are heading into a time of substantial change. How models and the business landscape will change is not clear to anyone. Humans are terrible at predictions, and change is not linear, so the truth is that no one knows. And although AI is all the rage and worth the hype, the real beauty lies in the data: your dear data. So, how are you preparing it for the new world ahead? My view is that if you’re looking to ride the wave of digital coolness, it all starts with organizing your data. But before you can do that, you need to modernize your data warehouse, ditch the old, and fly into the cloud. Ah, you just want to do AI, I know… but you can’t. Sorry… But read on; you can do AI too, but first you need to fix the data.
Now, if you’re already an expert on data warehouses, data lakes, and data lakehouses, and you’ve got cloud and legacy systems checked off in your notebook, feel free to skip ahead. But if you’re looking for some plain-language explanations, stick around.
Let’s Break It Down: What’s a Data Warehouse, Data Lake, Data Lakehouse, Legacy, and the Cloud?
Data Warehouse: Imagine it as a storage place for all your data, like a big filing cabinet where you neatly organize and store all your important documents. It’s designed for easy access and analysis. It has lots of brain.
Data Lake: Think of it like a huge pond where you can throw all kinds of data – structured, unstructured, messy, or neat. It’s a bit like a big jigsaw puzzle; you have all the pieces, but it takes some effort to put them together. It has lots of muscle and can handle large volumes but lacks the same brain as the warehouse.
Data Lakehouse: This is like an upgraded version of your data pond. It’s more organized, like a hybrid between a neat filing cabinet and the messy pond. It’s designed for both structured and messy data, making it easier to find and use. So, it offers both muscle and brain for you.
Legacy: This is the old way of doing things, like using a typewriter when modern computers are available. Legacy systems are outdated and less efficient than the newer, better options.
The Cloud: Think of it as your virtual office. Instead of storing everything on your own computer or server, you use someone else’s computer (in a data center) to store and access your data. It’s like renting office space but in the digital world.
So there you have some of the key components of data platforms today, and knowing them might even make you sound cool when talking to your friends. Today, many companies are building different kinds of Data Lakehouse concepts, combining the strengths of traditional data lakes and data warehouses.
So, common questions arise. “This is complex; I just want to do this AI use case, and I have direct access to the systems where the data is. Can’t I just use an API to get the data? That way, I can skip all this warehouse stuff.”
In some cases, the answer may be ‘yes’, especially when dealing with real-time data. However, as complexity increases, the number of pieces in your puzzle multiplies. This leads to data silos, islands of data, and a tangled web of information. Managing access and ensuring GDPR compliance becomes a significant challenge when data is scattered across many places. The situation can quickly evolve into a complex problem; I like to think of it as data spaghetti. Building a unified data platform is an essential step in scaling analytics for a mid-sized or large company.
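To put rough numbers on the data spaghetti, here is a minimal Python sketch. It assumes the worst case, where every use case builds its own API integration to every source system, and compares that with a platform where each source is loaded once and each use case reads once. The function names and the example counts are made up for illustration; real landscapes will sit somewhere in between.

```python
def direct_connections(sources: int, use_cases: int) -> int:
    """Worst case (assumption): every use case integrates with every source directly."""
    return sources * use_cases


def platform_connections(sources: int, use_cases: int) -> int:
    """Each source is loaded into the platform once; each use case reads from it once."""
    return sources + use_cases


# Hypothetical company sizes: (number of source systems, number of use cases)
for sources, use_cases in [(2, 1), (5, 5), (10, 20)]:
    direct = direct_connections(sources, use_cases)
    via_platform = platform_connections(sources, use_cases)
    print(f"{sources} sources, {use_cases} use cases: "
          f"{direct} direct integrations vs {via_platform} via a platform")
```

With one or two use cases, going direct is the smaller job, which is why ‘yes’ is sometimes the right answer. At 10 sources and 20 use cases, the same shortcut means up to 200 integrations to secure and audit instead of 30.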
Why the Cloud?
You might wonder why you should care about the cloud. After all, who cares where the servers are? Well, it makes a world of difference. Imagine growing your own vegetables versus buying exactly what you need, when you need it. The planning, the time to market, the total cost – it all becomes a breeze with the cloud. It’s one of those ‘you have to see it to believe it’ situations.
Capabilities and the True Value of Data
But here’s the thing: Before you can take advantage of the cloud’s magic, you’ve got to have your data sorted and ready. Migrating your data might sound like a headache, but how you do it will determine how quickly you can start reaping the rewards. What are your use cases? If you’re not sure, it’s time to go back to the basics and truly understand your data. Make the migration use-case-driven and test your capabilities end-to-end right from the start. Take it from me: a massive MVP can slow you down, while a smaller scope quickly reveals the right fit. Remember, small decisions and a use-case-driven approach work wonders when tied to your business objectives.
How to Close Legacy
So, you’ve built an incredible stack, moved some data, and, guess what – you still have legacy systems lurking around. That’s double the cost… Congratulations on a successful cloud transformation, but it’s time to stop investing in legacy systems and close those chapters. Otherwise, you’ll be dealing with a double stack, higher costs, and increased complexity for quite a while. A key component of legacy decommissioning, often out of focus but perhaps the most important one, is change management. How do you get people to come along with you on the decommissioning journey? The work is boring, and it seems pointless if it is not connected to a story. It is uncomfortable for the business to change if they do not see the win. People in IT are emotionally attached to existing systems, and decommissioning might take jobs away. Plan carefully, as you are likely to need their active involvement. Trust me; I have seen legacy decommissioning efforts take four to five times as long as needed.
New Problems and Guesswork in the Cloud
As one problem fades, another emerges. Remember the vegetable analogy? Growing your own vegetables might be cheaper per carrot if you get it right. But buying groceries every week, getting only what you need, is more flexible. Now, you need to become a cost predictor. That’s right; even those who never thought about costs before, like data engineers and architects, now decide your costs. Before, the investments were clear upfront; now you need to proactively train people, and monitor and estimate costs based on what you build and what you buy. The good news is that if you get it right, it’s like buying vegetables that are in season; it can be cheaper than growing them yourself.
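Becoming a cost predictor can start with very simple arithmetic. The Python sketch below assumes a pay-as-you-go bill made up of compute hours and stored terabytes, and compares it with a fixed monthly cost for running things yourself. All names, rates, and figures are hypothetical placeholders, not real prices from any provider, and real cloud bills have many more line items.

```python
def monthly_cloud_cost(compute_hours: float, rate_per_hour: float,
                       storage_tb: float, rate_per_tb: float) -> float:
    """Pay-as-you-go estimate: compute usage plus storage (simplified assumption)."""
    return compute_hours * rate_per_hour + storage_tb * rate_per_tb


def break_even_hours(fixed_monthly_cost: float, rate_per_hour: float,
                     storage_tb: float, rate_per_tb: float) -> float:
    """Compute hours per month at which the cloud bill equals the fixed cost."""
    remaining = fixed_monthly_cost - storage_tb * rate_per_tb
    # If storage alone already exceeds the fixed cost, there is no headroom left.
    return max(remaining, 0.0) / rate_per_hour


# Hypothetical figures for illustration only
FIXED_MONTHLY_COST = 1000.0   # the "grow your own vegetables" option
RATE_PER_HOUR = 2.0           # price of one compute hour
RATE_PER_TB = 20.0            # price of storing one terabyte for a month
STORAGE_TB = 10.0

for hours in (100, 400, 800):
    cost = monthly_cloud_cost(hours, RATE_PER_HOUR, STORAGE_TB, RATE_PER_TB)
    print(f"{hours} compute hours: cloud {cost:.2f} vs fixed {FIXED_MONTHLY_COST:.2f}")

limit = break_even_hours(FIXED_MONTHLY_COST, RATE_PER_HOUR, STORAGE_TB, RATE_PER_TB)
print(f"Break-even at {limit:.0f} compute hours per month")
```

The point is not the numbers but the habit: whoever designs a pipeline is also choosing the compute hours, so they should be able to say roughly where the break-even sits before building it.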
So, there you have it: Cloud transformation the ‘Dear’ Data Way. And guess what? We’re here to help you along the way. So get ready for the unknown and treat your data with some dear love and respect.