Data Archives - Software Engineering Daily

Databases and data engineering episodes of Software Engineering Daily

Location:

United States

Language:

English


Episodes

Iceberg at Netflix and Beyond with Ryan Blue

3/7/2024
Apache Iceberg is an open source high-performance format for huge data tables. Iceberg enables the use of SQL tables for big data, while making it possible for engines like Spark and Hive to safely work with the same tables, at the same time. Iceberg was started at Netflix by Ryan Blue and Dan Weeks, and ...

Duration:00:47:37

Building a Data Lake with Adam Ferrari

2/6/2024
Starburst is a data lake analytics platform. It’s designed to help users work with structured data at scale, and is built on the open source platform Trino. Adam Ferrari is the SVP of Engineering at Starburst. He joins the show to talk about Starburst, data engineering, and what it takes to build a data lake.

Duration:00:46:19

Rama with Nathan Marz

12/28/2023
Building scalable software applications can be complex and typically requires dozens of different tools. The engineering often involves handling many arcane tasks that are distant from actual application logic. In addition, a lack of a cohesive model for building applications can lead to substantial engineering costs. Nathan Marz is the creator of Rama, which is ...

Duration:00:42:03

Bonus Episode: SurrealDB with Tobie Morgan Hitchcock

12/25/2023
SurrealDB is the result of a long-time collaboration between brothers Tobie and Jaime Morgan Hitchcock. The project has modest origins and started merely to support other projects the brothers were working on. However, over time the project grew and in 2021 they started working on it full-time. Since then the project has gained serious adoption.

Duration:00:54:15

Tracking Drug Smugglers and Migrating Databases with Benny Keinan and Lior Resisi

12/7/2023
Maritime logistics is the process of organizing the movement of goods across the ocean. Historically, this has been a challenging problem because of the multinational nature of shipping, as well as piracy, smuggling, and legacy technology. It’s also profoundly important for security reasons, and because 90% of what we buy travels over the oceans. Ocean vessels ...

Duration:00:50:40

The Right to Be Forgotten with Gal Ringel

11/29/2023
Data breaches at major companies are now so common that they hardly make the news. The Wikipedia page on data breaches lists over 350 between 2004 and 2023. The Equifax breach in 2017 was especially notable because over 160 million records were leaked, and much of the data was acquired by Equifax without individuals’ knowledge ...

Duration:00:47:45

Sofascore with Josip Stuhli

11/28/2023
If you’re a sports fan and like to track sports statistics and results, you’ve probably heard of Sofascore. The website started in 2010 and ran on a modest single server. It now has 25 million monthly active users, covers 20 different sports, 11,000 leagues and tournaments, and is available in over 30 languages. Josip ...

Duration:00:49:48

Daytona with Ivan Burazin

11/23/2023
Cloud-based software development platforms such as GitHub Codespaces continue to grow in popularity. These platforms are attractive to enterprise organizations because they can be managed centrally with security controls. However, many, if not most, developers prefer a local IDE. Daytona is aiming to bridge that gap. It’s a layer between a local IDE and a ...

Duration:00:46:52

GraphAware with Luanne Misquitta

11/22/2023
Knowledge graphs are an intuitive way to define relationships between objects, events, situations, and concepts. Their ability to encode this information makes them an attractive database paradigm. Hume is a graph-based analysis solution developed by GraphAware. It represents data as a network of interconnected entities and provides analysis capabilities to extract insights from the data.

Duration:00:56:38

Chronosphere with Martin Mao

11/9/2023
Observability software helps teams actively monitor and debug their systems, and these tools are increasingly vital in DevOps. However, it’s not uncommon for the volume of observability data to exceed the amount of actual business data. This creates two challenges: how to analyze the large stream of observability data, and how to keep ...

Duration:00:48:08

Streamlit with Amanda Kelly

10/24/2023
The importance of data teams is undeniable. Most companies today use data to drive decision-making on anything from software feature development to product strategy, hiring, and marketing. In some companies data is the product, which can make data teams even more vital. But there’s a common problem: analyzing data is hard and time-consuming.

Duration:00:47:06

Modern Web Scraping with Erez Naveh

10/18/2023
Today it’s estimated there are over 1 billion websites on the internet. Much of this content is optimized to be viewed by human eyes, not consumed by machines. However, creating systems to automatically parse and structure the web greatly extends its utility, and paves the way for innovative solutions and applications. The industry of web ...

Duration:00:57:29

Observability with Eduardo Silva

10/12/2023
There are hundreds of observability companies out there, and many ways to think about observability, such as application performance monitoring, server monitoring, and tracing. In a production application, multiple tools are often needed to get proper visibility on the application. This creates some challenges. Applications can produce lots of different observability data, but how ...

Duration:00:44:47

AI and Business Analytics with John Adams

10/5/2023
It’s now clear that the adoption of AI will continue to increase, with nearly every industry working to rapidly incorporate it into their systems and applications to provide greater value to their users. Business analytics is a key domain that promises to be radically reshaped by AI. Alembic is an AI platform that integrates web ...

Duration:00:27:36

Highly Scalable NoSQL with Dor Laor

9/7/2023
ScyllaDB is a fast and highly scalable NoSQL database designed to provide predictable performance at a massive cloud scale. It can handle millions of operations per second at a scale of gigabytes or petabytes. It’s also designed to be compatible with Cassandra and DynamoDB APIs. Scylla is used by Zillow, Comcast, and for Discord’s 350M+ ...

Duration:00:36:27

Database Caching with Ben Hagan

8/8/2023
Database caching is a fundamental challenge in database management and there are hundreds of techniques to satisfy different caching scenarios. PolyScale is a fully automated database cache. It offers an innovative approach to database caching, leveraging AI and automated configuration to simplify the process of determining what should and should not be cached. Ben Hagan ...

Duration:00:35:36

Data-Centric AI with Alex Ratner

7/20/2023
Companies have high hopes for machine learning and AI to support real-time product offerings, prevent fraud, and drive innovation. But there is a catch: training models requires labeled data that machines can digest. As data volumes increase, the opportunity to get great ML results rises, but so does the problem of labeling all the ...

Duration:00:50:19

Making Data-Driven Decisions with Soumyadeb Mitra

7/11/2023
RudderStack is a warehouse-native customer data platform (CDP) that helps businesses collect, unify, and activate customer data from all their different sources. In today’s episode, we’re talking to Soumyadeb Mitra, the founder and CEO of RudderStack. We discuss the importance of activating all your data, how RudderStack can help you activate your data, the challenges ...

Duration:00:50:45

Customer-facing Analytics with Tyler Wells

6/30/2023
The state of data inside most companies is chaotic. It takes significant time and investment to tame this chaos. When you are a platform provider, you are gathering tons of data from the developers using your platform. These developers building products on your platform need insight into that data to better understand how their application ...

Duration:00:51:38

Data Reliability with Barr Moses and Lior Gavish

6/12/2023
As companies depend more on data to improve digital products and make informed decisions, it’s crucial that the data they use be accurate and reliable. Monte Carlo, the data reliability company, is the creator of the industry’s first end-to-end data observability platform. Barr Moses and Lior Gavish are the founders of Monte Carlo and they join ...

Duration:00:56:22