{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# How to: Create and Use a ModelNode\n",
    "\n",
    "A `ModelNode` is a computational node that wraps a machine-learning model for use within a `ModelGraph`. It:\n",
    "\n",
    "- Wraps a backend-specific model (PyTorch, TensorFlow, or scikit-learn) as a `BaseModel`\n",
    "- Receives data from a single upstream source (`FeatureSet` or another `ModelNode`)\n",
    "- Optionally holds an `Optimizer` for standalone training\n",
    "- Is composed into a `ModelGraph`, which is then used in an `Experiment`\n",
    "\n",
    "> **Note:** Users typically interact with `Experiment` at the top level. `ModelNode` is the\n",
    "> building block that `ModelGraph` orchestrates. This guide covers the full `ModelNode` API\n",
    "> for users who need fine-grained control.\n",
    "\n",
    "This notebook covers:\n",
    "\n",
    "- {ref}`02-create-modelnode-the-model-hierarchy`\n",
    "- {ref}`02-create-modelnode-built-in-models`\n",
    "- {ref}`02-create-modelnode-wrapping-custom-pytorch-models`\n",
    "- {ref}`02-create-modelnode-scikit-learn-models`\n",
    "- {ref}`02-create-modelnode-creating-a-modelnode`\n",
    "- {ref}`02-create-modelnode-the-optimizer`\n",
    "- {ref}`02-create-modelnode-building-and-running-a-modelnode`\n",
    "- {ref}`02-create-modelnode-chaining-nodes`\n",
    "- {ref}`02-create-modelnode-freezing-and-unfreezing`\n",
    "- {ref}`02-create-modelnode-serialization`\n",
    "- {ref}`02-create-modelnode-summary`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import torch\n",
    "\n",
    "from modularml import Experiment, FeatureSet, ModelNode, Optimizer\n",
    "\n",
    "# Note that we don't need to explicitly create an Experiment right away\n",
    "# We do it here so we can disable the warning raise when creating multiple\n",
    "# nodes with the same name (`registration_policy` is what controls this).\n",
    "exp = Experiment(label=\"create_modelnode\", registration_policy=\"overwrite\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "We'll use a simple synthetic dataset throughout this notebook: 500 samples of a 10-point voltage signal with a scalar state-of-health (SOH) target."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "rng = np.random.default_rng(42)\n",
    "\n",
    "fs = FeatureSet.from_dict(\n",
    "    label=\"SensorData\",\n",
    "    data={\n",
    "        \"voltage\": list(rng.standard_normal((500, 10))),\n",
    "        \"soh\": list(rng.standard_normal((500, 1))),\n",
    "    },\n",
    "    feature_keys=\"voltage\",\n",
    "    target_keys=\"soh\",\n",
    ")\n",
    "print(fs)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "Since FeatureSet can contains more columns than wanted for a certain models inputs, we need to specify which columns are intended to by input to the model.\n",
    "\n",
    "This is done with the `.reference()` method on FeatureSets.\n",
    "Going forward, our models will be trained on only the `voltage` feature, and estimate only the `soh` target."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "fs_ref = fs.reference(features=\"voltage\", targets=\"soh\")\n",
    "print(fs_ref)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "(02-create-modelnode-the-model-hierarchy)=\n",
    "## The Model Hierarchy"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "\n",
    "Before creating a `ModelNode`, it helps to understand the model abstraction layers:\n",
    "\n",
    "```\n",
    "BaseModel (abstract)\n",
    "├── TorchBaseModel          # Base for built-in PyTorch models\n",
    "│   ├── SequentialMLP       # Built-in MLP\n",
    "│   └── SequentialCNN       # Built-in 1D CNN\n",
    "├── TorchModelWrapper       # Wraps any torch.nn.Module\n",
    "├── TensorflowModelWrapper  # Wraps any tf.keras.Model\n",
    "└── ScikitModelWrapper      # Wraps any sklearn.BaseEstimator\n",
    "```\n",
    "\n",
    "All models used in a `ModelNode` must conform to the `BaseModel` interface. You can either:\n",
    "1. Use a **built-in model** (e.g., `SequentialMLP`)\n",
    "2. **Wrap** your own model with `TorchModelWrapper`, `TensorflowModelWrapper`, or `ScikitModelWrapper`\n",
    "3. Pass a raw model directly — it will be **auto-wrapped** via `wrap_model()`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "(02-create-modelnode-built-in-models)=\n",
    "## Built-In Models"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11",
   "metadata": {},
   "source": [
    "\n",
    "ModularML provides ready-to-use model architectures in `modularml.models`. \n",
    "There are currently only built-in models using the PyTorch backend `modularml.models.torch`.\n",
    "More will be added soon.\n",
    "\n",
    "These built in models inherit from `BaseModel` and support lazy shape inference; you can provide shapes at construction time or let `ModelGraph.build()` infer them automatically."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "### SequentialMLP\n",
    "\n",
    "A configurable multi-layer perceptron. Inputs are flattened, passed through `n_layers`\n",
    "fully-connected layers with activation and optional dropout, then reshaped to `output_shape`.\n",
    "\n",
    "| Parameter | Type | Default | Description |\n",
    "|-----------|------|---------|-------------|\n",
    "| `input_shape` | `tuple[int, ...]` | `None` | Input shape (no batch dim). Inferred at build if `None`. |\n",
    "| `output_shape` | `tuple[int, ...]` | `None` | Output shape (no batch dim). Inferred at build if `None`. |\n",
    "| `n_layers` | `int` | `2` | Number of linear layers. |\n",
    "| `hidden_dim` | `int` | `32` | Hidden units per layer. |\n",
    "| `activation` | `str` | `\"relu\"` | Activation function (`\"relu\"`, `\"gelu\"`, `\"tanh\"`, etc.). |\n",
    "| `dropout` | `float` | `0.0` | Dropout rate (0 = no dropout). |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.models.torch import SequentialMLP\n",
    "\n",
    "# Option A: Provide both shapes up front (builds immediately)\n",
    "mlp_eager = SequentialMLP(\n",
    "    input_shape=(1, 10),\n",
    "    output_shape=(1, 1),\n",
    "    n_layers=3,\n",
    "    hidden_dim=64,\n",
    "    activation=\"gelu\",\n",
    "    dropout=0.1,\n",
    ")\n",
    "print(f\"Eager MLP built: {mlp_eager.is_built}\")\n",
    "print(f\"  input_shape:  {mlp_eager.input_shape}\")\n",
    "print(f\"  output_shape: {mlp_eager.output_shape}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Option B: Defer shapes (lazy build - ModelGraph.build() will fill them in)\n",
    "mlp_lazy = SequentialMLP(\n",
    "    output_shape=(1, 1),\n",
    "    n_layers=2,\n",
    "    hidden_dim=32,\n",
    ")\n",
    "print(f\"Lazy MLP built: {mlp_lazy.is_built}\")\n",
    "print(f\"  input_shape:  {mlp_lazy.input_shape}\")\n",
    "print(f\"  output_shape: {mlp_lazy.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15",
   "metadata": {},
   "source": [
    "### SequentialCNN\n",
    "\n",
    "A 1D convolutional network.\n",
    "Stacks `Conv1d` layers with optional pooling, dropout, and a final linear projection to `output_shape`.\n",
    "\n",
    "| Parameter | Type | Default | Description |\n",
    "|-----------|------|---------|-------------|\n",
    "| `input_shape` | `tuple[int, ...]` | `None` | Input shape as `(num_channels, length)`. |\n",
    "| `output_shape` | `tuple[int, ...]` | `None` | Output shape (no batch dim). |\n",
    "| `n_layers` | `int` | `2` | Number of Conv1d layers. |\n",
    "| `hidden_dim` | `int` | `16` | Output channels per Conv1d layer. |\n",
    "| `kernel_size` | `int` | `3` | Convolution kernel size. |\n",
    "| `padding` | `int` | `1` | Convolution padding. |\n",
    "| `stride` | `int` | `1` | Convolution stride. |\n",
    "| `activation` | `str` | `\"relu\"` | Activation function. |\n",
    "| `dropout` | `float` | `0.0` | Dropout rate. |\n",
    "| `pooling` | `int` | `1` | MaxPool1d kernel size (1 = no pooling). |\n",
    "| `flatten_output` | `bool` | `True` | Whether to flatten and project to output shape. |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.models.torch import SequentialCNN\n",
    "\n",
    "cnn = SequentialCNN(\n",
    "    input_shape=(1, 10),  # 1 channel, 10-length signal\n",
    "    output_shape=(1, 1),\n",
    "    n_layers=2,\n",
    "    hidden_dim=16,\n",
    "    kernel_size=3,\n",
    "    pooling=2,\n",
    ")\n",
    "print(f\"CNN built: {cnn.is_built}\")\n",
    "print(f\"  input_shape:  {cnn.input_shape}\")\n",
    "print(f\"  output_shape: {cnn.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18",
   "metadata": {},
   "source": [
    "(02-create-modelnode-wrapping-custom-pytorch-models)=\n",
    "## Wrapping Custom PyTorch Models"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19",
   "metadata": {},
   "source": [
    "\n",
    "For models not provided by ModularML, use `TorchModelWrapper` to wrap any `torch.nn.Module`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20",
   "metadata": {},
   "source": [
    "### Wrapping an Instantiated Model\n",
    "\n",
    "If you already have a constructed `torch.nn.Module`, pass it directly. The wrapper\n",
    "validates input/output shapes with a dummy forward pass during `build()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.core.models import TorchModelWrapper\n",
    "\n",
    "\n",
    "# Define a custom PyTorch model\n",
    "class MyEncoder(torch.nn.Module):\n",
    "    def __init__(self, in_features, out_features):\n",
    "        super().__init__()\n",
    "        # Storing constructor args as same-named attributes\n",
    "        # allows TorchModelWrapper to auto-infer them for serialization.\n",
    "        self.in_features = in_features\n",
    "        self.out_features = out_features\n",
    "        self.fc = torch.nn.Linear(in_features, out_features)\n",
    "\n",
    "    def forward(self, x):\n",
    "        return self.fc(x)\n",
    "\n",
    "\n",
    "# Wrap an already-instantiated model\n",
    "raw_model = MyEncoder(in_features=10, out_features=4)\n",
    "wrapped = TorchModelWrapper(model=raw_model)\n",
    "\n",
    "print(f\"Wrapped model built: {wrapped.is_built}\")\n",
    "print(f\"  backend: {wrapped.backend}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "22",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Build to validate shapes\n",
    "wrapped.build(input_shape=(10,), output_shape=(4,))\n",
    "print(f\"After build: {wrapped.is_built}\")\n",
    "print(f\"  input_shape:  {wrapped.input_shape}\")\n",
    "print(f\"  output_shape: {wrapped.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23",
   "metadata": {},
   "source": [
    "Note that custom models cannot be serialized unless they are defined in a separate Python file.\n",
    "\n",
    "Example use case:\n",
    "```python\n",
    "    from my_scipt import MyModel\n",
    "\n",
    "    model = MyModel(...)\n",
    "```\n",
    "\n",
    "More details on this are provided in the [Serialization]`serialization` section."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24",
   "metadata": {},
   "source": [
    "### Lazy Construction from a Class\n",
    "\n",
    "If you want `ModelGraph.build()` to handle instantiation (injecting the correct\n",
    "`input_shape` and `output_shape`), pass a class and kwargs instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25",
   "metadata": {},
   "outputs": [],
   "source": [
    "lazy_wrapped = TorchModelWrapper(\n",
    "    model_class=MyEncoder,\n",
    "    model_kwargs={\"in_features\": 10, \"out_features\": 4},\n",
    ")\n",
    "print(f\"Lazy wrapped built: {lazy_wrapped.is_built}\")\n",
    "\n",
    "# Build later (or let ModelGraph do it)\n",
    "lazy_wrapped.build(input_shape=(10,), output_shape=(4,))\n",
    "print(f\"After build: {lazy_wrapped.is_built}\")\n",
    "print(f\"  output_shape: {lazy_wrapped.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "26",
   "metadata": {},
   "source": [
    "### Injecting Shapes into Custom Constructors\n",
    "\n",
    "By default, `TorchModelWrapper` injects the inferred `input_shape` and `output_shape`\n",
    "into your model class constructor during lazy build. If your constructor uses different\n",
    "parameter names, specify them with `inject_input_shape_as` and `inject_output_shape_as`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "27",
   "metadata": {},
   "outputs": [],
   "source": [
    "class CustomModel(torch.nn.Module):\n",
    "    \"\"\"A model whose constructor uses non-standard shape parameter names.\"\"\"\n",
    "\n",
    "    def __init__(self, in_shape, out_shape):\n",
    "        super().__init__()\n",
    "        self.in_shape = in_shape\n",
    "        self.out_shape = out_shape\n",
    "        self.fc = torch.nn.Linear(int(np.prod(in_shape)), int(np.prod(out_shape)))\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = x.view(x.size(0), -1)\n",
    "        x = self.fc(x)\n",
    "        return x.view(x.size(0), *self.out_shape)\n",
    "\n",
    "\n",
    "# Tell the wrapper to inject shapes using your parameter names\n",
    "custom_wrapped = TorchModelWrapper(\n",
    "    model_class=CustomModel,\n",
    "    model_kwargs={},\n",
    "    inject_input_shape_as=\"in_shape\",\n",
    "    inject_output_shape_as=\"out_shape\",\n",
    ")\n",
    "custom_wrapped.build(input_shape=(10,), output_shape=(4,))\n",
    "print(f\"Custom wrapped output_shape: {custom_wrapped.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "28",
   "metadata": {},
   "source": [
    "### Auto-Wrapping with `wrap_model()`\n",
    "\n",
    "When you pass a raw `torch.nn.Module` directly to `ModelNode`, it is automatically\n",
    "wrapped via `wrap_model()`. This is the simplest path but offers less control over\n",
    "serialization and shape injection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.core.models import wrap_model\n",
    "\n",
    "raw_module = MyEncoder(in_features=10, out_features=4)\n",
    "auto_wrapped = wrap_model(raw_module)\n",
    "\n",
    "print(f\"Type: {type(auto_wrapped).__name__}\")\n",
    "print(f\"Backend: {auto_wrapped.backend}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "31",
   "metadata": {},
   "source": [
    "(02-create-modelnode-scikit-learn-models)=\n",
    "## Scikit-Learn Models"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "32",
   "metadata": {},
   "source": [
    "The `ScikitModelWrapper` wraps any `sklearn.base.BaseEstimator` for use in a `ModelNode`.\n",
    "It supports both batch-fit models (e.g., `RandomForestRegressor`) and incremental models\n",
    "(e.g., `SGDRegressor`) via the `training_mode` parameter.\n",
    "\n",
    "| Parameter | Type | Default | Description |\n",
    "|-----------|------|---------|-------------|\n",
    "| `model` | `BaseEstimator` | (required) | A scikit-learn estimator instance. |\n",
    "| `training_mode` | `str` | `\"auto\"` | `\"auto\"`, `\"partial_fit\"`, or `\"batch_fit\"`. |\n",
    "| `output_method` | `str` | `\"auto\"` | `\"auto\"`, `\"predict\"`, `\"predict_proba\"`, or `\"decision_function\"`. |\n",
    "| `partial_fit_kwargs` | `dict` | `None` | Extra kwargs passed to every `partial_fit()` call (e.g., `{\"classes\": [0, 1]}`). |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "33",
   "metadata": {},
   "source": [
    "### Batch-Fit Models\n",
    "\n",
    "Most scikit-learn models are trained on the full dataset at once. These are used with\n",
    "`FitPhase` in an `Experiment` rather than `TrainPhase`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.ensemble import RandomForestRegressor\n",
    "\n",
    "from modularml.core.models import ScikitModelWrapper\n",
    "\n",
    "sklearn_model = ScikitModelWrapper(\n",
    "    model=RandomForestRegressor(n_estimators=50, random_state=42),\n",
    ")\n",
    "\n",
    "print(f\"Backend: {sklearn_model.backend}\")\n",
    "print(f\"Supports partial_fit: {sklearn_model.supports_partial_fit}\")\n",
    "print(f\"Training mode: {sklearn_model.resolved_training_mode}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35",
   "metadata": {},
   "source": [
    "### Incremental Models\n",
    "\n",
    "Models that support `partial_fit()` can be used with `TrainPhase` for mini-batch\n",
    "training, similar to neural network workflows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.linear_model import SGDRegressor\n",
    "\n",
    "incremental_model = ScikitModelWrapper(\n",
    "    model=SGDRegressor(random_state=42),\n",
    ")\n",
    "\n",
    "print(f\"Supports partial_fit: {incremental_model.supports_partial_fit}\")\n",
    "print(f\"Training mode: {incremental_model.resolved_training_mode}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37",
   "metadata": {},
   "source": [
    "### Auto-Wrapping\n",
    "\n",
    "Like PyTorch models, raw scikit-learn estimators passed to `ModelNode` are automatically\n",
    "wrapped via `wrap_model()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "38",
   "metadata": {},
   "outputs": [],
   "source": [
    "auto_sklearn = wrap_model(RandomForestRegressor(n_estimators=10))\n",
    "print(f\"Type: {type(auto_sklearn).__name__}\")\n",
    "print(f\"Backend: {auto_sklearn.backend}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40",
   "metadata": {},
   "source": [
    "(02-create-modelnode-creating-a-modelnode)=\n",
    "## Creating a ModelNode"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41",
   "metadata": {},
   "source": [
    "A `ModelNode` combines a model with an upstream data source and an optional optimizer.\n",
    "\n",
    "```python\n",
    "    ModelNode(\n",
    "        label: str,\n",
    "        model: BaseModel | Any,\n",
    "        upstream_ref: ExperimentNode | ExperimentNodeReference,\n",
    "        optimizer: Optimizer | None = None,\n",
    "    )\n",
    "```\n",
    "\n",
    "| Parameter | Description |\n",
    "|-----------|-------------|\n",
    "| `label` | Unique name for this node within the graph. |\n",
    "| `model` | A `BaseModel` instance, or any raw model (auto-wrapped via `wrap_model()`). |\n",
    "| `upstream_ref` | The data source: a `FeatureSetReference`, `FeatureSet`, or another `ModelNode`. |\n",
    "| `optimizer` | Optional `Optimizer` for standalone training. When using `ModelGraph`, the graph optimizer is typically used instead. |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42",
   "metadata": {},
   "source": [
    "### With a Built-In Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "43",
   "metadata": {},
   "outputs": [],
   "source": [
    "node_mlp = ModelNode(\n",
    "    label=\"MyMLP\",\n",
    "    model=SequentialMLP(output_shape=(1, 1), n_layers=2, hidden_dim=32),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "print(node_mlp)\n",
    "print(f\"  is_built: {node_mlp.is_built}\")\n",
    "print(f\"  backend:  {node_mlp.backend}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44",
   "metadata": {},
   "source": [
    "### With a Custom `torch.nn.Module` (Auto-Wrapped)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "45",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pass a raw torch.nn.Module - it is automatically wrapped by wrap_model()\n",
    "node_custom = ModelNode(\n",
    "    label=\"MyCustomEncoder\",\n",
    "    model=MyEncoder(in_features=10, out_features=4),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "print(node_custom)\n",
    "print(f\"  model type: {type(node_custom.model).__name__}\")\n",
    "print(f\"  backend:    {node_custom.backend}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46",
   "metadata": {},
   "source": [
    "### With a Scikit-Learn Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "47",
   "metadata": {},
   "outputs": [],
   "source": [
    "node_rf = ModelNode(\n",
    "    label=\"RandomForest\",\n",
    "    model=RandomForestRegressor(n_estimators=50, random_state=42),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "print(node_rf)\n",
    "print(f\"  model type: {type(node_rf.model).__name__}\")\n",
    "print(f\"  backend:    {node_rf.backend}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49",
   "metadata": {},
   "source": [
    "(02-create-modelnode-the-optimizer)=\n",
    "## The Optimizer"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50",
   "metadata": {},
   "source": [
    "\n",
    "The `Optimizer` class wraps backend-specific optimizers with a consistent API.\n",
    "\n",
    "```python\n",
    "    Optimizer(\n",
    "        opt: str | type | None = None,\n",
    "        *,\n",
    "        opt_kwargs: dict[str, Any] | None = None,\n",
    "        factory: Callable | None = None,\n",
    "        backend: Backend | None = None,\n",
    "    )\n",
    "```\n",
    "\n",
    "There are three ways to specify the optimizer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1. By name string (most common)\n",
    "opt_by_name = Optimizer(\"adam\", opt_kwargs={\"lr\": 1e-3}, backend=\"torch\")\n",
    "\n",
    "# 2. By optimizer class\n",
    "opt_by_class = Optimizer(\n",
    "    torch.optim.AdamW,\n",
    "    opt_kwargs={\"lr\": 1e-3, \"weight_decay\": 1e-4},\n",
    ")\n",
    "\n",
    "# 3. By factory callable\n",
    "opt_by_factory = Optimizer(\n",
    "    factory=lambda params: torch.optim.SGD(params, lr=0.01, momentum=0.9),\n",
    "    backend=\"torch\",\n",
    ")\n",
    "\n",
    "print(f\"By name:    {opt_by_name.name}\")\n",
    "print(f\"By class:   {opt_by_class.cls}\")\n",
    "print(f\"By factory: {opt_by_factory}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "52",
   "metadata": {},
   "source": [
    "### Attaching an Optimizer to a ModelNode\n",
    "\n",
    "An optimizer on a `ModelNode` enables standalone `train_step()` / `eval_step()` calls.\n",
    "\n",
    "If creating a ModelGraph with models all from the same backend (e.g., all PyTorch models), it's easier to just use a graph-wise optimizer (set during ModelGraph init)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53",
   "metadata": {},
   "outputs": [],
   "source": [
    "node_with_opt = ModelNode(\n",
    "    label=\"TrainableMLP\",\n",
    "    model=SequentialMLP(output_shape=(1, 1), n_layers=2, hidden_dim=32),\n",
    "    upstream_ref=fs_ref,\n",
    "    optimizer=Optimizer(\"adam\", opt_kwargs={\"lr\": 1e-3}, backend=\"torch\"),\n",
    ")\n",
    "print(f\"Has optimizer: {node_with_opt._optimizer is not None}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55",
   "metadata": {},
   "source": [
    "(02-create-modelnode-building-and-running-a-modelnode)=\n",
    "## Building and Running a ModelNode"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56",
   "metadata": {},
   "source": [
    "\n",
    "Normally `ModelGraph.build()` handles building all nodes. But for debugging or\n",
    "standalone use, you can build and forward through a `ModelNode` directly."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "57",
   "metadata": {},
   "source": [
    "### Building Manually"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "58",
   "metadata": {},
   "outputs": [],
   "source": [
    "# build_model() takes explicit shapes\n",
    "node_mlp.build_model(input_shape=(1, 10), output_shape=(1, 1))\n",
    "\n",
    "print(f\"is_built:     {node_mlp.is_built}\")\n",
    "print(f\"input_shape:  {node_mlp.input_shape}\")\n",
    "print(f\"output_shape: {node_mlp.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "59",
   "metadata": {},
   "source": [
    "### Forward Pass with `SampleData`\n",
    "\n",
    "The `forward_single()` method (also available as `__call__`) accepts `SampleData`,\n",
    "`RoleData`, or `Batch`. It passes features through the model, preserving targets and tags."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "60",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.core.data.sample_data import SampleData\n",
    "from modularml.utils.data.data_format import DataFormat\n",
    "\n",
    "# Create SampleData from the FeatureSet reference\n",
    "fsv = fs_ref.resolve()\n",
    "sample_data = SampleData(\n",
    "    features=fsv.get_features(fmt=DataFormat.TORCH),\n",
    "    targets=fsv.get_targets(fmt=DataFormat.TORCH),\n",
    ")\n",
    "print(f\"Input features: {sample_data.features.shape}\")\n",
    "\n",
    "# Forward pass\n",
    "with torch.no_grad():\n",
    "    output = node_mlp(sample_data)\n",
    "    print(f\"Output features: {output.features.shape}\")\n",
    "    print(f\"Targets passed through: {output.targets.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61",
   "metadata": {},
   "source": [
    "### Auto-Build on First Forward Pass\n",
    "\n",
    "If a `ModelNode` has a `FeatureSetReference` as its upstream, calling `forward_single()`\n",
    "on an unbuilt node will attempt to auto-build by inferring shapes from the upstream\n",
    "`FeatureSet`.\n",
    "\n",
    "Note that `output_shape` will be determined in the following sequence:\n",
    "- If provided, that output shape is used\n",
    "- If the node has no downstream connections, the target shape of the referenced FeatureSet will be used\n",
    "- Otherwise, the hidden layer shape (if using a built model) will be used."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "62",
   "metadata": {},
   "outputs": [],
   "source": [
    "# This node is not built yet\n",
    "auto_build_node = ModelNode(\n",
    "    label=\"AutoBuild\",\n",
    "    model=SequentialMLP(output_shape=(1, 1), n_layers=1, hidden_dim=16),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "print(f\"Before forward: is_built={auto_build_node.is_built}\")\n",
    "\n",
    "# First forward pass triggers auto-build\n",
    "with torch.no_grad():\n",
    "    output = auto_build_node(sample_data)\n",
    "\n",
    "print(f\"After forward:  is_built={auto_build_node.is_built}\")\n",
    "print(f\"  input_shape:  {auto_build_node.input_shape}\")\n",
    "print(f\"  output_shape: {auto_build_node.output_shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "63",
   "metadata": {},
   "source": [
    "We could've omitted `output_shape`, which results in the same `(1,1)` shape (because the FeatureSet `'soh'` data has shape (1,1)).\n",
    "\n",
    "It is generally best practice to explicitly define the output shape of any models you create."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65",
   "metadata": {},
   "source": [
    "(02-create-modelnode-chaining-nodes)=\n",
    "## Chaining Nodes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66",
   "metadata": {},
   "source": [
    "\n",
    "A `ModelNode` can take another `ModelNode` (or any `ComputeNode`) as its upstream,\n",
    "enabling multi-stage pipelines."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "67",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Encoder -> Regressor chain\n",
    "encoder = ModelNode(\n",
    "    label=\"Encoder\",\n",
    "    model=SequentialMLP(output_shape=(1, 8), n_layers=2, hidden_dim=32),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "\n",
    "regressor = ModelNode(\n",
    "    label=\"Regressor\",\n",
    "    model=SequentialMLP(output_shape=(1, 1), n_layers=1, hidden_dim=16),\n",
    "    upstream_ref=encoder,  # Receives output from Encoder\n",
    ")\n",
    "\n",
    "print(f\"Encoder upstream:   {encoder.upstream_ref.resolve()}\")\n",
    "print(f\"Regressor upstream: {regressor.upstream_ref.resolve()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69",
   "metadata": {},
   "source": [
    "(02-create-modelnode-freezing-and-unfreezing)=\n",
    "## Freezing and Unfreezing\n",
    "\n",
    "Freezing a node prevents its parameters from being updated during training.\n",
    "This is useful for transfer learning or multi-stage training events.\n",
    "\n",
    "The below `requires_grad` property is PyTorch-specific, but similar gradient blocking is enforced for TensorFlow models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"Frozen: {node_mlp.is_frozen}\")\n",
    "\n",
    "node_mlp.freeze()\n",
    "print(f\"After freeze:   {node_mlp.is_frozen}\")\n",
    "\n",
    "# Verify PyTorch parameters are frozen\n",
    "param = next(node_mlp.model.parameters())\n",
    "print(f\" - requires_grad:  {param.requires_grad}\")\n",
    "\n",
    "node_mlp.unfreeze()\n",
    "print(f\"After unfreeze: {node_mlp.is_frozen}\")\n",
    "print(f\" - requires_grad:  {param.requires_grad}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "72",
   "metadata": {},
   "source": [
    "(02-create-modelnode-serialization)=\n",
    "## Serialization"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73",
   "metadata": {},
   "source": [
    "\n",
    "`ModelNode` supports full config and state serialization via `get_config()` / `from_config()`\n",
    "and `get_state()` / `set_state()`. The underlying `BaseModel` handles weight serialization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configuration (structure, no weights)\n",
    "config = node_mlp.get_config()\n",
    "print(\"Config keys:\", list(config.keys()))\n",
    "\n",
    "# State (includes learned weights)\n",
    "state = node_mlp.get_state()\n",
    "print(\"State keys:\", list(state.keys()))\n",
    "print(\"Model weight keys:\", list(state[\"model\"][\"weights\"].keys())[:3], \"...\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75",
   "metadata": {},
   "source": [
    "Models can also be saved to and loaded from disk independently of the `ModelNode`.\n",
    "\n",
    "Note that `save` and `load` methods are not provided on ModelNodes, only on the `BaseModel` itself.\n",
    "This is intentional.\n",
    "ModelNodes are not useful outside of its parent `Experiment` (their upstream and downstream connections have no meaning on their own). \n",
    "However, the underlying model is useful to share independently."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "76",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from tempfile import TemporaryDirectory\n",
    "\n",
    "from modularml import BaseModel\n",
    "\n",
    "SAVE_DIR = TemporaryDirectory()\n",
    "\n",
    "# Save and reload a built-in model\n",
    "save_path = node_mlp.model.save(Path(SAVE_DIR.name) / \"my_mlp\", overwrite=True)\n",
    "reloaded = BaseModel.load(save_path)\n",
    "print(f\"Models equal: {reloaded == node_mlp.model}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77",
   "metadata": {},
   "source": [
    "For custom (non-built-in) models, the model's source code is packaged alongside the weights. This requires that the custom model be defined in a standalone python file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save a wrapped custom model\n",
    "node_custom.build_model(input_shape=(10,), output_shape=(4,))\n",
    "try:\n",
    "    save_path_custom = node_custom.model.save(\n",
    "        Path(SAVE_DIR.name) / \"my_custom\",\n",
    "        overwrite=True,\n",
    "    )\n",
    "except RuntimeError as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79",
   "metadata": {},
   "outputs": [],
   "source": [
    "from utils.my_model import MyEncoder\n",
    "\n",
    "# After moving the MyEncoder class to an external python file, we can save\n",
    "custom_node = ModelNode(\n",
    "    label=\"imported_model\",\n",
    "    model=MyEncoder(in_features=10, out_features=4),\n",
    "    upstream_ref=fs_ref,\n",
    ")\n",
    "\n",
    "save_path_custom = custom_node.model.save(\n",
    "    Path(SAVE_DIR.name) / \"my_custom\",\n",
    "    overwrite=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80",
   "metadata": {},
   "source": [
    "Packaging source code is the only way to ensure full reproducibility of custom code.\n",
    "However, you should always inspect unknown code below executing it. \n",
    "\n",
    "If you try to load a saved file that contains packaged code, an error will be thrown unless you intentionally set `allow_packaged_code=True`.\n",
    "\n",
    "The following procedure is recommended when loading unknown serialized `mml` files:\n",
    "1. Call `load(..., allow_packaged_code=False)` (it will always defaul to False)\n",
    "2. If an error occurs, indicating packaged code, use the following utility to inspect the source code before importing as an executable: `modularml.utils.inspect_packaged_code`\n",
    "3. After verifying the nature of the source code, you can retry `load` with `allow_packaged_code=True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reload with packaged code\n",
    "try:\n",
    "    reloaded_custom = BaseModel.load(save_path_custom, allow_packaged_code=False)\n",
    "except RuntimeError as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "82",
   "metadata": {},
   "outputs": [],
   "source": [
    "from modularml.utils import inspect_packaged_code\n",
    "\n",
    "# Inspect the file before reloadinge\n",
    "# The method returns a dict with keys for each class that would need to be loaded\n",
    "res = inspect_packaged_code(save_path_custom)\n",
    "for k, code in res.items():\n",
    "    print(k)\n",
    "    print(code)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now that we've verified the code is safe to run, we can try loading again\n",
    "reloaded_custom = BaseModel.load(save_path_custom, allow_packaged_code=True)\n",
    "print(f\"Models equal: {reloaded_custom == custom_node.model}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "85",
   "metadata": {},
   "source": [
    "(02-create-modelnode-summary)=\n",
    "## Summary"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "86",
   "metadata": {},
   "source": [
    "\n",
    "### Model Classes\n",
    "\n",
    "| Class | Module | Backend | Description |\n",
    "|-------|--------|---------|-------------|\n",
    "| `BaseModel` | `modularml.core.models` | Abstract | Base interface for all models. |\n",
    "| `TorchBaseModel` | `modularml.core.models` | PyTorch | Base for built-in PyTorch models. |\n",
    "| `TorchModelWrapper` | `modularml.core.models` | PyTorch | Wraps any `torch.nn.Module`. |\n",
    "| `TensorflowModelWrapper` | `modularml.core.models` | TensorFlow | Wraps any `tf.keras.Model`. |\n",
    "| `ScikitModelWrapper` | `modularml.core.models` | scikit-learn | Wraps any `sklearn.BaseEstimator`. |\n",
    "| `SequentialMLP` | `modularml.models.torch` | PyTorch | Built-in multi-layer perceptron. |\n",
    "| `SequentialCNN` | `modularml.models.torch` | PyTorch | Built-in 1D convolutional network. |\n",
    "\n",
    "### ModelNode Properties and Methods\n",
    "\n",
    "| Property / Method | Description |\n",
    "|-------------------|-------------|\n",
    "| `.model` | The underlying `BaseModel` instance. |\n",
    "| `.backend` | Backend enum (`Backend.TORCH`, `Backend.TENSORFLOW`, etc.). |\n",
    "| `.is_built` | Whether the model has been built with input/output shapes. |\n",
    "| `.input_shape` | Input shape tuple (no batch dim), or `None`. |\n",
    "| `.output_shape` | Output shape tuple (no batch dim), or `None`. |\n",
    "| `.upstream_ref` | The single upstream reference (read-only property). |\n",
    "| `.is_frozen` | Whether training is disabled for this node. |\n",
    "| `build_model(input_shape, output_shape)` | Build the model and optimizer manually. |\n",
    "| `forward_single(x)` / `__call__(x)` | Forward pass on `SampleData`, `RoleData`, or `Batch`. |\n",
    "| `freeze()` / `unfreeze()` | Toggle parameter trainability. |\n",
    "| `get_config()` / `from_config()` | Config serialization (structure only). |\n",
    "| `get_state()` / `set_state()` | State serialization (includes weights). |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87",
   "metadata": {},
   "source": [
    "### Next Steps\n",
    "\n",
    "- **ModelGraph:** Compose multiple `ModelNode`s (and `MergeNode`s) into a computational\n",
    "  graph that handles build order, shape inference, and forward pass routing.\n",
    "\n",
    "- **Experiment:** Use `Experiment` to combine a `ModelGraph` with training phases,\n",
    "  loss functions, and evaluation — the primary user-facing entry point.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv (3.10.18)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}