The fourth way: Ingesting data in Fabric using Parquet files

Introduction Welcome to my new blog, where I share my experience with data and analytics engineering and its technologies. Today I'm going to show you a fourth approach to loading on-prem data into a Microsoft Fabric Lakehouse, beyond the standard ones provided: Dataflow Gen2, Data Pipelines, and Jupyter Notebooks. I show you how to use Python to migrate a SQL Server database that is not exposed to the public network. This is a great way to leverage OneLake's file-explorer access, ingesting and loading data into the Fabric data warehouse....

July 17, 2023 · 3 min · 604 words · Riccardo Capelli

Two approaches to generating Parquet files from on-prem databases

In this guide I will first share my experience trying several approaches to creating Parquet files from SQL Server tables, then we'll explore in detail how to efficiently extract and convert data using PyArrow. All my attempts, with code, are available here: https://github.com/Riccardocapelli1/my_blog/tree/main/python My experience It took me several attempts to find the approach that suited me best. Here are the ways I tried to generate Parquet files: without clustering the output; clustering results with the standard Apache Arrow library; clustering results with concurrent threads (using ThreadPoolExecutor() from concurrent.futures); clustering results with DuckDB. Without clustering the output It is effective with small tables (less than 2–3 GB in size)....

July 18, 2023 · 5 min · 993 words · Riccardo Capelli