The Python Data & Science Podcast.__init__-logo

The Python Data & Science Podcast.__init__

Technology Podcasts

The podcast about how the Python language powers work in data and science

The podcast about how the Python language powers work in data and science

Location:

United States

Description:

The podcast about how the Python language powers work in data and science

Language:

English


Episodes

Finding The Core Of Python For A Bright Future With Brett Cannon

6/22/2021
Brett Cannon has been a long-time contributor to the Python language and community in many ways. In this episode he shares some of his work and thoughts on modernizing the ecosystem around the language. This includes standards for packaging, discovering the true core of the language, and how to make it possible to target mobile and web platforms.

Duration:01:13:42

Traversing The Challenges And Promise Of Graph Machine Learning

6/15/2021
The foundation of every ML model is the data that it is trained on. In many cases you will be working with tabular or unstructured information, but there is a growing trend toward networked, or graph data sets. Benedek Rozemberczki has focused his research and career around graph machine learning applications. In this episode he discusses the common sources of networked data, the challenges of working with graph data in machine learning projects, and describes the libraries that he has...

Duration:00:50:06

Keep Your Analytics Lint Free With SQLFluff

6/8/2021
The growth of analytics has accelerated the use of SQL as a first class language. It has also grown the amount of collaboration involved in writing and maintaining SQL queries. With collaboration comes the inevitable variation in how queries are written, both structurally and stylistically which can lead to a significant amount of wasted time and energy during code review and employee onboarding. Alan Cruickshank was feeling the pain of this wasted effort first-hand which led him down the...

Duration:01:10:28

Exploring The Patterns And Practices For Deep Learning With Andrew Ferlitsch

6/1/2021
Deep learning is gaining an immense amount of popularity due to the incredible results that it is able to offer with comparatively little effort. Because of this there are a number of engineers who are trying their hand at building machine learning models with the wealth of frameworks that are available. Andrew Ferlitsch wrote a book to capture the useful patterns and best practices for building models with deep learning to make it more approachable for newcomers ot the field. In this...

Duration:00:46:49

Automatically Generate Your Unit Tests From Scratch With Pynguin

5/24/2021
Unit tests are an important tool to ensure the proper functioning of your application, but writing them can be a chore. Stephan Lukasczyk wants to reduce the monotony of the process for Python developers. As part of his PhD research he created the Pynguin project to automate the creation of unit tests. In this episode he explains the complexity involved in generating useful tests for a dynamic language, how he has designed Pynguin to address the challenges, and how you can start using it...

Duration:01:00:15

Leveling Up Natural Language Processing with Transfer Learning

5/17/2021
Natural language processing is a powerful tool for extracting insights from large volumes of text. With the growth of the internet and social platforms, and the increasing number of people and communities conducting their professional and personal activities online, the opportunities for NLP to create amazing insights and experiences are endless. In order to work with such a large and growing corpus it has become necessary to move beyond purely statistical methods and embrace the...

Duration:00:48:49

Federated Learning For All With Flower

5/10/2021
Machine learning is a tool that has typically been performed on large volumes of data in one place. As more computing happens at the edge on mobile and low power devices, the learning is being federated which brings a new set of challenges. Daniel Beutel co-created the Flower framework to make federated learning more manageable. In this episode he shares his motivations for starting the project, how you can use it for your own work, and the unique challenges and benefits that this emerging...

Duration:01:02:55

Data Exploration and Visualization Made Effortless with Lux

5/3/2021
Data exploration is an important step in any analysis or machine learning project. Visualizing the data that you are working with makes that exploration faster and more effective, but having to remember and write all of the code to build a scatter plot or histogram is tedious and time consuming. In order to eliminate that friction Doris Lee helped create the Lux project, which wraps your Pandas data frame and automatically generates a set of visualizations without you having to lift a...

Duration:00:51:18

Extensible Open Source Authorization For Your Applications With Oso

4/26/2021
Any project that is used by more than one person will eventually need to handle permissions for each of those users. It is certainly possible to write that logic yourself, but you'll almost certainly do it wrong at least once. Rather than waste your time fighting with bugs in your authorization code it makes sense to use a well-maintained library that has already made and fixed all of the mistakes so that you don't have to. In this episode Sam Scott shares the Oso framework to give you a...

Duration:00:54:37

Teaching Geeks The Value And Skills Of Public Speaking

4/19/2021
Being able to present your ideas is one of the most valuable and powerful skills to have as a professional, regardless of your industry. For software engineers it is especially important to be able to communicate clearly and effectively because of the detail-oriented nature of the work. Unfortunately, many people who work in software are more comfortable in front of the keyboard than a crowd. In this episode Neil Thompson shares his story of being an accidental public speaker and how he is...

Duration:00:45:32

Let The Robots Do The Work Using Robotic Process Automation with Robocorp

4/12/2021
One of the great promises of computers is that they will make our work faster and easier, so why do we all spend so much time manually copying data from websites, or entering information into web forms, or any of the other tedious tasks that take up our time? As developers our first inclination is to "just write a script" to automate things, but how do you share that with your non-technical co-workers? In this episode Antti Karjalainen, CEO and co-founder of Robocorp, explains how Robotic...

Duration:00:53:01

Keep Your Code Clean And Maintainable Using Static Analysis With Flake8

4/5/2021
When you are writing code it is all to easy to introduce subtle bugs or leave behind unused code. Unused variables, unused imports, overly complex logic, etc. If you are careful and diligent you can find these problems yourself, but isn't that what computers are supposed to help you with? Thankfully Python has a wealth of tools that will work with you to keep your code clean and maintainable. In this episode Anthony Sottile explores Flake8, one of the most popular options for identifying...

Duration:00:56:19

Make Your Code More Readable With The Magic Of Refactoring Using Sourcery

3/29/2021
Writing code that is easy to read and understand will have a lasting impact on you and your teammates over the life of a project. Sometimes it can be difficult to identify opportunities for simplifying a block of code, especially if you are early in your journey as a developer. If you work with senior engineers they can help by pointing out ways to refactor your code to be more readable, but they aren't always available. Brendan Maginnis and Nick Thapen created Sourcery to act as a full time...

Duration:01:06:43

Be Data Driven At Any Scale With Superset

3/22/2021
Becoming data driven is the stated goal of a large and growing number of organizations. In order to achieve that mission they need a reliable and scalable method of accessing and analyzing the data that they have. While business intelligence solutions have been around for ages, they don't all work well with the systems that we rely on today and a majority of them are not open source. Superset is a Python powered platform for exploring your data and building rich interactive dashboards that...

Duration:00:55:08

Practical Advice On Using Python To Power A Business

3/15/2021
Python is a language that is used in almost every imaginable context and by people from an amazing range of backgrounds. A lot of the people who use it wouldn't even call themselves programmers, because that is not the primary focus of their job. In this episode Chris Moffitt shares his experience writing Python as a business user. In order to share his insights and help others who have run up against the limits of Excel he maintains the site Practical Business Python where he publishes...

Duration:01:04:30

Analyzing The Ecosystem of Python Data Companies With Tony Liu

3/8/2021
There are a large and growing number of businesses built by and for data science and machine learning teams that rely on Python. Tony Liu is a venture investor who is following that market closely and betting on its continued success. In this episode he shares his own journey into the role of an investor and discusses what he is most excited about in the industry. He also explains what he looks at when investing in a business and gives advice on what potential founders and early employees of...

Duration:00:39:31

Go From Notebook To Pipeline For Your Data Science Projects With Orchest

3/1/2021
Jupyter notebooks are a dominant tool for data scientists, but they lack a number of conveniences for building reusable and maintainable systems. For machine learning projects in particular there is a need for being able to pivot from exploring a particular dataset or problem to integrating that solution into a larger workflow. Rick Lamers and Yannick Perrenet were tired of struggling with one-off solutions when they created the Orchest platform. In this episode they explain how Orchest...

Duration:00:41:20

Write Your Python Scripts In A Flow Based Visual Editor With Ryven

2/22/2021
When you are writing a script it can become unwieldy to understand how the logic and data are flowing through the program. To make this easier to follow you can use a flow-based approach to building your programs. Leonn Thomm created the Ryven project as an environment for visually constructing a flow-based program. In this episode he shares his inspiration for creating the Ryven project, how it changes the way you think about program design, how Ryven is implemented, and how to get started...

Duration:00:40:04

CrossHair: Your Automatic Pair Programmer

2/15/2021
One of the perennial challenges in software engineering is to reduce the opportunity for bugs to creep into the system. Some of the tools in our arsenal that help in this endeavor include rich type systems, static analysis, writing tests, well defined interfaces, and linting. Phillip Schanely created the CrossHair project in order to add another ally in the fight against broken code. It sits somewhere between type systems, automated test generation, and static analysis. In this episode he...

Duration:00:44:11

Giving Your Data Science Projects And Teams A Home At DagsHub

2/8/2021
Collaborating on software projects is largely a solved problem, with a variety of hosted or self-managed platforms to choose from. For data science projects, collaboration is still an open question. There are a number of projects that aim to bring collaboration to data science, but they are all solving a different aspect of the problem. Dean Pleban and Guy Smoilovsky created DagsHub to give individuals and teams a place to store and version their code, data, and models. In this episode they...

Duration:01:02:15