Source – techtarget.com
If you were to visit some data scientist’s home at night and peek in the window, you might see a computer running with its screensaver disabled. That’s because the data scientist had to bring home analytics work that takes too long to run in the corporate data center.
This story is shared in this edition of the Talking Data podcast, an episode centered on user stories about working with big data that appeared in SearchDataManagement's pages in 2017.
What is the upshot? If data professionals have learned anything, it is that they have to be innovative in this era of digital business and find new ways to move ideas into operations fast. Moving quickly, as discussed in this episode of the podcast, means deploying clusters automatically to stay ahead of processing capacity requirements, making wholesale changes in the way data is prepared, and providing greater interactivity for analytics on big data stores.
The tale of data scientists burning the midnight oil comes via Brock Noland, chief architect and co-founder of phData. This consultancy works to move Hadoop and Spark technologies into production.
These days, Noland works with Apache Impala, a distributed query engine, to improve the interactive response time that data scientists and business users experience when working with big data. It's a step toward a cure for late-night data processing at home, and it's part of a larger trend toward data democratization. More of Noland's commentary can be found in a piece on Impala's recent graduation to top-level project status at the Apache Software Foundation.
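To make the interactivity point concrete: Impala presents itself to analysts as a SQL engine, and from Python it can be reached through the standard DB-API (the impyla client is one common route, though the article doesn't name a specific stack). The sketch below, with hypothetical table and host names, shows the kind of ad hoc aggregate a data scientist would run interactively; it works against any DB-API cursor, so the demo uses an in-memory SQLite database as a stand-in for an Impala connection.

```python
# Sketch: an interactive ad hoc query through the Python DB-API.
# Impala's client library (impyla) exposes this same cursor interface.
def top_n(cursor, table, metric, n=10):
    """Run a small aggregate query and return rows as a list of tuples."""
    cursor.execute(
        f"SELECT {metric}, COUNT(*) AS cnt FROM {table} "
        f"GROUP BY {metric} ORDER BY cnt DESC LIMIT {n}"
    )
    return cursor.fetchall()

if __name__ == "__main__":
    # Against a real cluster this would be (hypothetical host/port):
    #   from impala.dbapi import connect
    #   conn = connect(host="impala-host", port=21050)
    # For a self-contained demo, use an in-memory SQLite database instead.
    import sqlite3
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE clicks (page TEXT)")
    cur.executemany("INSERT INTO clicks VALUES (?)",
                    [("home",), ("home",), ("pricing",)])
    print(top_n(cur, "clicks", "page", n=2))  # [('home', 2), ('pricing', 1)]
```

The appeal for interactive work is that the same short query loop replaces a batch job: results come back in seconds, so the exploration never has to move to a home machine overnight.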
Another take on moving quickly is conveyed in the story of Panera's analytics work. The big bread maker and restaurateur did a deep dive into the new world of digital business, and that meant closely monitoring transactional operations at lunchtime. It also meant finding an automatic way to spin up instances of Spark and Hadoop for quick analysis.
Panera used BlueData tools that stage analytics work in modern software containers. What happened is shared in this podcast, as well as in a recent report covering Panera’s big data menu.
Yet another story of working with big data today comes from a discussion of new approaches to extract, transform and load (ETL) for data preparation. It comes by way of Alasdair Anderson, executive vice president for technology at Nordea Bank.
According to Anderson, the need for speed required the bank’s technologists to enable data owners to manage their data themselves. So, Nordea turned to a collection of big data software, including data preparation software from Trifacta, to break the ETL bottleneck. Additional discussion of analytics work at Nordea can be found in a recent article that follows ETL processes as they evolve to include business analysts at the bank.
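The self-service pattern Anderson describes can be sketched without any vendor tooling (Trifacta's product is proprietary, and the rule names and fields below are hypothetical): data owners declare small, composable cleaning rules, and a shared pipeline applies them, so the central ETL team stops being the bottleneck for every transform.

```python
# Sketch of self-service data preparation: data owners register simple
# cleaning rules (plain functions); a shared pipeline applies them in order.
def clean_rows(rows, rules):
    """Apply each owner-declared rule to every row, in declaration order."""
    for rule in rules:
        rows = [rule(dict(row)) for row in rows]
    return rows

# Example rules a data owner might register (hypothetical fields):
def strip_whitespace(row):
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}

def normalize_currency(row):
    row["amount"] = float(str(row["amount"]).replace(",", ""))
    return row

if __name__ == "__main__":
    raw = [{"customer": "  Acme ", "amount": "1,200.50"}]
    print(clean_rows(raw, [strip_whitespace, normalize_currency]))
    # [{'customer': 'Acme', 'amount': 1200.5}]
```

The design choice is the point: because rules are data-owner-supplied functions rather than code buried in a central ETL job, new cleanup steps ship without a round trip through the technology team.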
Listen to this podcast to learn more about key trends among the data professionals who are on the front lines of working with big data, and to gain a view into how data management is evolving as we enter 2018.