Datacast
Episode 77: Delivering Modern Data Engineering with Einat Orr

Timestamps

  • (1:33) Einat described her experience earning her Bachelor’s, Master’s, and Ph.D. degrees in Mathematics from Tel Aviv University in the 90s and early 2000s.

  • (4:01) Einat went over her Ph.D. thesis on approximation algorithms for clustering problems.

  • (6:17) Einat discussed working as an algorithm developer for Compugen while being a Ph.D. student.

  • (8:43) Einat went over projects she contributed to as a senior algorithm developer at Flash Networks back in 2005.

  • (11:50) Einat mentioned achievements and lessons learned from her time as the VP of R&D at Correlix.

  • (17:51) Einat recalled lessons from hiring engineering talent at Correlix.

  • (19:24) Einat unpacked the engineering challenges of building SimilarWeb, a platform that gives a true 360-degree view of all digital activity across customers, prospects, partners, and competition.

  • (24:29) Einat discussed the responsibilities of her role as the CTO of SimilarWeb.

  • (27:40) Einat shared the founding story of Treeverse, whose mission is to simplify the lives of data engineers, data scientists, and data analysts who are transforming the world with data.

  • (29:52) Einat explained the pain points of working with the data lake architecture and the vision that lakeFS is built upon.

  • (34:31) Einat emphasized the importance of asking good questions to extract insights about customers’ pain points.

  • (37:57) Einat explained why data versioning as infrastructure matters.

  • (42:28) Einat shared the challenges of adopting a data mesh when developing a data-intensive application.

  • (46:33) Einat provided her take on how to ensure data quality in a data lake environment.

  • (51:02) Einat discussed roadmap prioritization for an open-source project.

  • (52:08) Einat went over the opportunities with the metadata store, data quality, compute, and data discovery components within the data engineering ecosystem.

  • (55:03) Einat outlined three trends that could shape the data engineering landscape in the near future.

  • (1:00:59) Einat emphasized the role of open-source development in the data tooling ecosystem.

  • (1:04:14) Einat fleshed out the recommended pricing strategy for open-source developers.

  • (1:06:09) Einat revisited how lakeFS got started, thanks to the Go community, and how it has evolved since.

  • (1:08:01) Einat shared valuable hiring lessons learned at Treeverse.

  • (1:10:05) Einat described the state of the data community in Israel.

  • (1:11:49) Closing segment.

Einat’s Contact Info

Mentioned Content

lakeFS

Blog Posts

People

  • Ali Ghodsi (Co-Creator of Apache Spark, Co-Founder and CEO of Databricks)

  • Shay Banon (Co-Founder and CEO of Elastic)

  • Gwen Shapira (Engineering Leader at Confluent)

Book

Notes

My conversation with Einat was recorded back in April 2021. Since the podcast was recorded, a lot has happened at Treeverse! I’d recommend:

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.

Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
