Dremio Team Authoring O’Reilly’s Definitive Guide on Apache Iceberg, Only Book of Its Kind

With initial chapter in preview, forthcoming O’Reilly book dives deep into Apache Iceberg’s uses and innovations, helps data engineers and architects understand Iceberg soup to nuts

Round-up from Subsurface LIVE keynote and panels highlights Iceberg’s meteoric rise as the open table format standard for big data analytics—enabling data lakehouse architectures

SANTA CLARA, Calif.--(BUSINESS WIRE)--Dremio, the easy and open data lakehouse, today announced the early release of its forthcoming O’Reilly book, Apache Iceberg: The Definitive Guide | Data Lakehouse Functionality, Performance, and Scalability on the Data Lake, which features the opening chapter in preview, with more to come. As the only book on Apache Iceberg, the comprehensive tome will include lessons for achieving interactive, batch, machine learning and streaming analytics, all without duplicating data into many different proprietary systems and formats.

Apache Iceberg provides the capabilities, performance, scalability and savings that fulfill the promise of an open data lakehouse. Dremio's recent Subsurface LIVE conference highlighted Apache Iceberg’s emergence as the high-performance open table format for a range of analytical workloads and as the clear standard enabling data lakehouse architecture.

“Open source Apache Iceberg, which originated in Netflix engineering, continues to evolve as the preferred option for enterprises creating a modern data infrastructure,” said Dremio Developer Advocate Dipankar Mazumdar, one of the book’s authors. “This O’Reilly book will cover the architecture of Iceberg, optimization techniques and hands-on, real-world use cases with Iceberg, as well as how to use Iceberg with popular compute engines such as Apache Spark, Apache Flink and Dremio Sonar. We’re excited to share the book’s preview.” The other authors include Dremio CPO Tomer Shiran, Director of Technical Advocacy Jason Hughes, and Developer Advocate Alex Merced.

Apache Iceberg was also the star of Subsurface LIVE 2023. The conference from Dremio, which recently wrapped, featured a keynote address and panels with technology leaders from companies such as Apple, Uber, Wayfair, Pinterest, Shell, Shopify and many others.

Keynote – The Year of the Data Lakehouse: Apache Iceberg, which enables data lakehouse architecture, has grown at a tremendous rate, reaching almost 40M downloads and emerging as the de facto standard open table format. Iceberg is currently supported by Amazon, Snowflake, Google, Tabular and Dremio, among others. The diverse community Iceberg now boasts helps sustain a rapid rate of innovation.

Panel – The State of Apache Iceberg: Iceberg developers shared exciting use cases enabled by new features and discussed the future of the project. Panel participants also explained the benefits of open table formats, scalability and cost savings through Apache Iceberg, as well as the state of the community and adoption.

“The driver is about bringing ACID to the data lake, bringing better confidence in the data that we have there,” said Tony Baer, recognized data industry expert and principal of dbInsight, who moderated the panel. “Iceberg has been rapidly gaining industry support and will become one of the major lakehouse table formats to make the cut.”

To learn more about data lakehouse performance with Apache Iceberg, check out Dremio’s blogs:

About Dremio

Dremio is the easy and open data lakehouse, providing self-service analytics with data warehouse functionality and data lake flexibility across all of your data. Use Dremio's lightning-fast SQL query service and any other processing engine on the same data. Dremio increases agility with a revolutionary data-as-code approach that enables Git-like data experimentation, version control, and governance. In addition, Dremio eliminates data silos by enabling queries across data lakes, databases, and data warehouses, and by simplifying ingestion into the lakehouse. Dremio's fully managed service helps organizations get started with analytics in minutes, and automatically optimizes data for every workload. As the original creator of Apache Arrow and committed to Arrow and Iceberg’s community-driven standards, Dremio is on a mission to reinvent SQL for data lakes and meet customers where they are on their lakehouse journey.

Hundreds of global enterprises like JPMorgan Chase, Microsoft, Regeneron, and Allianz Global Investors use Dremio to deliver self-service analytics on the data lakehouse. Founded in 2015, Dremio is headquartered in Santa Clara. CNBC recognized Dremio as a Top Startup for the Enterprise and Deloitte named Dremio to its 2022 Technology Fast 500. To learn more, follow the company on GitHub, LinkedIn, Twitter, and Facebook, or visit www.dremio.com.

Contacts

Dremio

