Creating Iceberg tables from a local notebook

Iceberg's open table format allows you to read and write Foundry Iceberg tables using external engines.

The code example below uses PyIceberg to create a Foundry table from a Jupyter® notebook running on your computer. You can create a Foundry table with any external engine that supports Iceberg REST catalogs.

```python
from getpass import getpass

import pyarrow.parquet as pq
from pyiceberg.catalog import load_rest

# Create catalog client to create, load, and explore Iceberg tables in Foundry
catalog = load_rest(
    'foundry',
    {
        'uri': 'https://<your_foundry_url>/iceberg',
        'token': getpass('Foundry token:')
    }
)

# Read local Parquet file into a PyArrow table
df = pq.read_table('/<local_filepath>/example_data.parquet')

# Create a new Iceberg table in Foundry using the PyArrow table's schema
table = catalog.create_table(
    'Namespace.Project.Folder.example_data',
    schema=df.schema
)

# List Iceberg tables in the namespace - your new empty Foundry table will appear
catalog.list_tables('Namespace.Project.Folder')

# Use `append` to insert the local PyArrow table into the Foundry Iceberg table
table.append(df)

# Use `scan()` to load the Iceberg table from Foundry - for example to read into a Pandas dataframe
table.scan().to_pandas()
```

Identifiers in PyIceberg, and in SQL more broadly, are dot-separated. Foundry honors this convention when mapping Iceberg namespaces to Compass paths. For example, the Iceberg namespace identifier Namespace.Project.Dir.Table maps to the Compass path Namespace/Project/Dir/Table.
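The mapping described above can be illustrated with a minimal sketch. The helper name `identifier_to_compass_path` is hypothetical and is not part of PyIceberg or Foundry; it assumes that no individual path segment contains a dot, and it only mirrors the example given here rather than Foundry's actual resolution logic.

```python
def identifier_to_compass_path(identifier: str) -> str:
    """Hypothetical helper: convert a dot-separated Iceberg identifier
    into the slash-separated Compass path form shown in the example."""
    parts = identifier.split(".")
    # Reject empty segments, such as those produced by a leading or trailing dot
    if any(part == "" for part in parts):
        raise ValueError(f"Identifier has an empty segment: {identifier!r}")
    return "/".join(parts)


print(identifier_to_compass_path("Namespace.Project.Dir.Table"))
```

A check like this can help catch a stray leading or trailing dot in an identifier before it is passed to a catalog call.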

Jupyter®, JupyterLab®, and the Jupyter® logos are trademarks or registered trademarks of NumFOCUS.

All third-party trademarks (including logos and icons) referenced remain the property of their respective owners. No affiliation or endorsement is implied.
