MarTech Today

Martech: Analytics & Data

MarTech Landscape: What’s the difference between a data warehouse and a data lake?

For marketers, the difference is more than just the choice of metaphors.

Barry Levine on October 31, 2018 at 4:52 pm

It might seem odd to ask a marketer if they’d like their data in something described metaphorically as a building or a body of water.

In this article, part of our MarTech Landscape Series, we look at the characteristics of these two types of massive data storage.

Digital marketers are increasingly working with big data, the huge volumes of raw information pouring in from social media, contact centers, online behavioral tracking and other sources. Two of the most common kinds of storage for data at that scale are “data warehouses” and “data lakes.”

While marketers obviously involve IT in storage decisions, knowing which kind of storage sits behind your systems helps you understand their capabilities and costs.

Data warehouses

A data warehouse stores data that is typically already structured for databases when it enters. That data often comes from operational systems: transactions, customer records, human resources, customer relationship management systems, enterprise resource planning systems and so on. It is usually sifted and prepared carefully before it is stored in a warehouse, which is often the preferred mechanism if the information is legally binding and needs to be traceable.

A warehouse can also store unstructured data, such as body cam footage from police officers, said James D’Arezzo, CEO of storage performance provider Condusiv Technologies. Even though that kind of data is not typically structured for a database, it can enter as a list of files. But, like the physical structures they are named after, data warehouses are designed primarily for storing data that is properly sorted, filtered and packaged when it enters.
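As a rough illustration of "sorted, filtered and packaged when it enters," the sketch below uses Python's built-in sqlite3 module as a stand-in for a warehouse. The purchases table, its columns and the prepare and load helpers are hypothetical names chosen for this example, not part of any vendor's product.

```python
import sqlite3

# Hypothetical warehouse table: the structure is fixed before any data arrives.
SCHEMA = """
CREATE TABLE purchases (
    customer_id INTEGER NOT NULL,
    amount_usd  REAL    NOT NULL,
    channel     TEXT    NOT NULL
)
"""

def prepare(record):
    """Sift and normalize one raw record; return None if it does not fit the schema."""
    try:
        return (
            int(record["customer_id"]),
            float(record["amount_usd"]),
            str(record["channel"]).strip().lower(),
        )
    except (KeyError, ValueError, TypeError):
        return None

def load(conn, raw_records):
    """Insert only the records that survive preparation; return (loaded, rejected)."""
    rows = [prepare(r) for r in raw_records]
    good = [r for r in rows if r is not None]
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", good)
    conn.commit()
    return len(good), len(rows) - len(good)

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(SCHEMA)
raw = [
    {"customer_id": "17", "amount_usd": "42.50", "channel": " Email "},
    {"customer_id": 18, "amount_usd": 9.99, "channel": "social"},
    {"comment": "loved the ad!"},  # does not fit the table, so it is rejected
]
loaded, rejected = load(conn, raw)
```

The preparation step is where the up-front effort goes: anything that cannot be made to fit the predefined table never gets in.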

Data lakes

As the names imply, data lakes are more amorphous than warehouses. They store all kinds of data from any source, including video feeds, audio streams, facial recognition data, social media posts and the like.

Lakes sometimes use artificial intelligence to characterize the inflowing data, for example by naming it, but the formatting, processing and management of the data are usually undertaken when it is exported for a given need, not before it is stored. While warehouses are typically much more discriminating in what kinds of data they allow in, lakes accept virtually everything.

Although lakes aren’t necessarily faster for accepting or processing data, D’Arezzo told me, their data managers don’t have to create structures and incoming criteria to accept the data. For a marketer, he added, lakes mean a greater depth and breadth of data sources than in a warehouse.
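The store-first, structure-later pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the in-memory lake list, the ingest and export_mentions functions, and the sample records are all hypothetical, and a real lake would sit on bulk file or object storage rather than a list.

```python
import json

lake = []  # stand-in for bulk storage; every item is kept as it arrived

def ingest(source, payload):
    """Store the payload untouched, tagged only with where it came from."""
    lake.append({"source": source, "raw": json.dumps(payload)})

def export_mentions(brand):
    """Apply structure at read time: pull social posts that mention a brand."""
    rows = []
    for item in lake:
        if item["source"] != "social":
            continue
        post = json.loads(item["raw"])
        text = post.get("text", "")
        if brand.lower() in text.lower():
            rows.append({"user": post.get("user"), "text": text})
    return rows

# Nothing is rejected on the way in, whatever its shape.
ingest("social", {"user": "ana", "text": "Loving my new Acme shoes"})
ingest("iot", {"device": "door-7", "opened": True})
ingest("social", {"user": "raj", "text": "Rainy day"})

mentions = export_mentions("acme")
```

Here no incoming criteria are defined at all. The shaping happens only inside export_mentions, when a specific question is finally asked of the data.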

Why this matters to marketers

Data management systems can employ both warehouses and lakes, or they might focus on one type or another. D’Arezzo recommends that marketers understand the kind of storage where their data lives, the analytical tools available, the integration with systems that can act on the data, costs, any performance issues, and whether the storage resides on the company’s physical premises, in the public cloud, in the company’s private cloud, or in some combination.

In terms of costs, data preparation before storage in a warehouse can be expensive and time-consuming. Warehouses have traditionally stored their huge amounts of data on cheap but slow magnetic tape, while lakes often use commodity drives.

D’Arezzo also notes that marketers sometimes don’t know what they want to do with the data before it is stored, so preparing it for an unknown purpose can be limiting or difficult. Facial recognition data, social posts or data from Internet of Things devices, he said, can fall into that category, where it might be better to store first and decide later.

Warehouse vendors include IBM, Google, Microsoft, Teradata and SAP, while some lake vendors are AWS, Microsoft, Informatica and Teradata.



About The Author

Barry Levine
Barry Levine covers marketing technology for Third Door Media. Previously, he covered this space as a Senior Writer for VentureBeat, and he has written about these and other tech subjects for such publications as CMSWire and NewsFactor. He founded and led the web site/unit at PBS station Thirteen/WNET; worked as an online Senior Producer/writer for Viacom; created a successful interactive game, PLAY IT BY EAR: The First CD Game; founded and led an independent film showcase, CENTER SCREEN, based at Harvard and M.I.T.; and served over five years as a consultant to the M.I.T. Media Lab. You can find him at LinkedIn, and on Twitter at xBarryLevine.


© 2021 Third Door Media, Inc. All rights reserved.
