{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with the zea data format\n", "In this tutorial notebook we will show how to load a zea data file and how to access the data stored in it. There are three common ways to load a zea data file:\n", "\n", "1. Loading data from a single file with `zea.File`\n", "2. Loading data from a group of files with `zea.Dataset`\n", "3. Loading data in batches using the dataloading utilities, e.g. `zea.backend.tensorflow.make_dataloader`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/tue-bmd/zea/blob/main/docs/source/notebooks/data/zea_data_example.ipynb)\n", " \n", "[![View on GitHub](https://img.shields.io/badge/GitHub-View%20Source-blue?logo=github)](https://github.com/tue-bmd/zea/blob/main/docs/source/notebooks/data/zea_data_example.ipynb)\n", " \n", "[![Hugging Face dataset](https://img.shields.io/badge/Hugging%20Face-Dataset-yellow?logo=huggingface)](https://huggingface.co/datasets/zeahub/picmus)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "%pip install zea" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "config_picmus_iq = \"hf://zeahub/configs/config_picmus_iq.yaml\"" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "import os\n", "\n", "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n", "os.environ[\"ZEA_DISABLE_CACHE\"] = \"1\"\n", "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[38;5;36mzea\u001b[0m\u001b[0m: Using backend 'jax'\n" ] } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "import zea\n", "from zea import init_device, load_file\n",
"from zea.visualize import set_mpl_style\n", "from zea.backend.tensorflow import make_dataloader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will work on the GPU if available, using `init_device` to pick the best available device. Optionally, we also set the matplotlib style for plotting." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "init_device(verbose=False)\n", "set_mpl_style()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading a file with `zea.File`\n", "The zea data format works with HDF5 files. We can open a zea data file with `zea.File` and have a look at the contents using the `zea.File.summary()` function. Since these are regular HDF5 files, they can also be opened directly with the `h5py` package. You can see that every dataset element contains a corresponding description and unit. Note that we now pass a URL to a Hugging Face dataset, but you can also use a local file path to a zea data file. Here we will use an example from the [PICMUS](https://www.creatis.insa-lyon.fr/Challenge/IEEE_IUS_2016/home) dataset, converted to zea format and hosted on the [Hugging Face Hub](https://huggingface.co/datasets/zeahub/picmus).\n", "\n", "> *Tip:*\n", "> You can also use the [HDFView](https://www.hdfgroup.org/downloads/hdfview/) tool to view the contents of a zea data file without having to run any code. Alternatively, if you use VS Code, you can install the [HDF5 extension](https://marketplace.visualstudio.com/items?itemName=h5web.vscode-h5web) to view the contents of the file."
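,
"\n",
"Since a zea data file is a regular HDF5 file, you can also take a quick peek at one with `h5py` directly. This is only a sketch; the file path below is a placeholder:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import h5py\n", "\n", "# Walk the file and print every group and dataset it contains (placeholder path)\n", "with h5py.File(\"path/to/zea_file.hdf5\", \"r\") as f:\n", "    f.visititems(lambda name, obj: print(name, obj))"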
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can extract data and acquisition parameters (which are stored together with the data in the zea data file) as follows:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7b3770d770214b7099a5dd055a4b8089", "version_major": 2, "version_minor": 0 }, "text/plain": [ "contrast_speckle_expe_dataset_iq.hdf5: 0%| | 0.00/64.0M [00:00\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "data, scan, probe = load_file(file_path, \"raw_data\", indices=[[0], slice(0, 3)])\n", "\n", "print(\"Raw data shape:\", data.shape)\n", "print(scan)\n", "print(probe)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading data with `zea.Dataset`\n", "We can also load and manage a group of files (i.e. a dataset) using the `zea.Dataset` class. Instead of a path to a single file, we can pass a list of file paths or a directory containing multiple zea data files. The `zea.Dataset` class will automatically load the files and let you access the data in a similar way to `zea.File`."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading data in batches with `make_dataloader`\n", "Finally, we can load data in batches with the `make_dataloader` utility imported from `zea.backend.tensorflow` above." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "eb0f34e899b340f0b1d4d8d4a1c670fc", "version_major": 2, "version_minor": 0 }, "text/plain": [ "contrast_speckle_expe_dataset_rf.hdf5: 0%| | 0.00/128M [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "dataset_path = \"hf://zeahub/camus-sample/val\"\n", "dataloader = make_dataloader(\n", "    dataset_path,\n", "    key=\"data/image_sc\",\n", "    batch_size=4,\n", "    shuffle=True,\n", "    clip_image_range=[-60, 0],\n", "    image_range=[-60, 0],\n", "    normalization_range=[0, 1],\n", "    image_size=(256, 256),\n", "    resize_type=\"resize\",  # or \"center_crop\" or \"random_crop\"\n", "    seed=4,\n", ")\n", "\n", "for batch in dataloader:\n", "    print(\"Batch shape:\", batch.shape)\n", "    break  # Just show the first batch\n", "\n", "fig, _ = zea.visualize.plot_image_grid(batch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Processing an example\n", "We will now use one of the zea data files to demonstrate how to process the data. A full example can be found in the [zea_pipeline_example](../pipeline/zea_pipeline_example.ipynb) notebook. Here we will just show a simple example for completeness. We will start by loading a config file, which contains all the information required to initialize a processing pipeline." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config = zea.Config.from_path(config_picmus_iq)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can load the zea data file, extract data and parameters, and then process the data using the pipeline defined by the config file."
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "325fc3bd98254f7db473e28e45fbbf74", "version_major": 2, "version_minor": 0 }, "text/plain": [ "contrast_speckle_simu_dataset_iq.hdf5: 0%| | 0.00/36.4M [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "image = zea.display.to_8bit(images[0], dynamic_range=(-50, 0))\n", "plt.figure()\n", "# Convert xlims and zlims from meters to millimeters for display\n", "xlims_mm = [v * 1e3 for v in scan.xlims]\n", "zlims_mm = [v * 1e3 for v in scan.zlims]\n", "plt.imshow(image, cmap=\"gray\", extent=[xlims_mm[0], xlims_mm[1], zlims_mm[1], zlims_mm[0]])\n", "plt.xlabel(\"X (mm)\")\n", "plt.ylabel(\"Z (mm)\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.11" } }, "nbformat": 4, "nbformat_minor": 2 }