Conversation

rostan-t (Collaborator) commented Jan 2, 2026

Category: Bug fix (non-breaking change which fixes an issue)

Description:

The TFRecord reader returns a dictionary keyed by feature name, which dynamic mode didn't handle properly. This PR fixes that.
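For context, the TFRecord reader is dictionary-shaped by design: its outputs are keyed by the feature names passed to it. A minimal sketch using the classic graph API (paths and feature names are placeholders; the dynamic-mode call that this PR fixes looks different):

from nvidia.dali import pipeline_def, fn
import nvidia.dali.tfrecord as tfrec

@pipeline_def(batch_size=8, num_threads=2, device_id=0)
def tfrecord_pipe():
    # Placeholder paths; the features dict determines the keys of the result.
    inputs = fn.readers.tfrecord(
        path="data.tfrecord",
        index_path="data.idx",
        features={
            "image/encoded": tfrec.FixedLenFeature((), tfrec.string, ""),
            "image/class/label": tfrec.FixedLenFeature([1], tfrec.int64, -1),
        },
    )
    # inputs is a dict keyed by feature name, not a tuple of outputs
    return inputs["image/encoded"], inputs["image/class/label"]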

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
      • test_reader_decoder.py: test_tfrecord
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
greptile-apps bot commented Jan 2, 2026

Greptile Summary

This PR adds support for dictionary return values in DALI's dynamic mode, specifically fixing TFRecord reader compatibility. The changes propagate dictionary handling through the entire invocation pipeline:

  • Core Changes: Modified Invocation, Operator, and Reader classes to detect when operators return dictionaries (via _output_names) and preserve the key-value structure throughout the execution path
  • Type Signatures: Updated TFRecord's next_epoch return type annotations to reflect dict[str, Tensor] or dict[str, Batch] instead of tuples
  • Testing: Added comprehensive test coverage for TFRecord in both single-sample and batch modes
  • Bug Fix: Corrected an f-string formatting error in _tensor.py (line 70)

The implementation is clean and follows the existing architecture patterns. The changes are well-contained and don't affect operators that return tuples.
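For readers following along, the dict-vs-tuple dispatch described above can be pictured roughly like this. The helper below is an illustrative sketch, not the actual DALI code; only the _output_names name is taken from the summary:

def _wrap_outputs(results, output_names=None):
    # Normalize whatever the backend produced into a tuple first.
    if isinstance(results, dict):
        results = tuple(results.values())
    elif not isinstance(results, (tuple, list)):
        results = (results,)
    # Operators with named outputs (e.g. TFRecord features) get a dict back;
    # everything else keeps the existing tuple behaviour.
    if output_names is not None:
        return dict(zip(output_names, results))
    return tuple(results)

For example, _wrap_outputs((img, lbl), ["image/encoded", "image/class/label"]) yields a dict keyed by feature name, while _wrap_outputs((img, lbl)) stays a tuple.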

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are well-structured, maintain backward compatibility for tuple-returning operators, include comprehensive tests, and fix a clear bug. The implementation correctly handles both dictionary and tuple return types throughout the pipeline.
  • No files require special attention

Important Files Changed

Filename | Overview
dali/python/nvidia/dali/experimental/dynamic/_invocation.py | Added __iter__ method and dict handling in _run_impl to support dictionary return values from operators
dali/python/nvidia/dali/experimental/dynamic/_op_builder.py | Modified build_call_function to return dicts when the operator has _output_names, unifying batch/tensor handling
dali/python/nvidia/dali/experimental/dynamic/_ops.py | Added _output_names tracking and dict return handling throughout the Operator and Reader classes
dali/python/nvidia/dali/ops/_signatures.py | Updated type signatures for TFRecord's next_epoch to return dict[str, Tensor/Batch]
dali/test/python/experimental_mode/test_reader_decoder.py | Added a comprehensive test for the TFRecord reader covering both single-sample and batch modes

Sequence Diagram

sequenceDiagram
    participant User
    participant TFRecord as TFRecord Reader
    participant Operator as Operator._run()
    participant Invocation as Invocation._run_impl()
    participant OpBuilder as build_call_function()
    participant Reader as Reader._samples()/_batches()

    User->>TFRecord: next_epoch(batch_size)
    TFRecord->>Reader: _samples() or _batches()
    Reader->>Operator: _run(ctx, batch_size)
    Operator->>Invocation: Create Invocation
    Invocation->>Invocation: _run_impl(ctx)
    
    Note over Invocation: Execute operator backend
    Invocation->>Invocation: Check result type
    alt Result is dict
        Invocation->>Invocation: Convert to tuple(r.values())
    else Result is tuple/list
        Invocation->>Invocation: Keep as tuple
    else Result is single value
        Invocation->>Invocation: Wrap in tuple
    end
    
    Invocation-->>Operator: Return tuple results
    
    Note over Operator: Check _output_names
    alt _output_names is set
        Operator->>Operator: zip(names, results) → dict
    else _output_names is None
        Operator->>Operator: Return tuple as-is
    end
    
    Operator-->>Reader: dict or tuple
    
    alt Result is dict
        Reader->>Reader: Iterate over dict.values()
        Reader->>Reader: yield dict(zip(names, tensors))
    else Result is tuple
        Reader->>Reader: Iterate over tuple
        Reader->>Reader: yield tuple(tensors)
    end
    
    Reader-->>User: dict[str, Tensor/Batch]
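Purely as an illustration of the diagram's final alt block, the per-sample regrouping could look like the sketch below (a hypothetical helper; the real Reader._samples in _ops.py differs):

def _samples(results, output_names=None):
    # results holds one batch per output; regroup them sample by sample.
    if output_names is not None:
        batches = [results[name] for name in output_names]
        for tensors in zip(*batches):
            yield dict(zip(output_names, tensors))   # keep the key/value structure
    else:
        for tensors in zip(*results):
            yield tuple(tensors)                     # tuple-returning operators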

greptile-apps bot commented Jan 2, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

rostan-t (Collaborator, Author) commented Jan 2, 2026

!build

dali-automaton (Collaborator)

CI MESSAGE: [41073266]: BUILD STARTED

dali-automaton (Collaborator)

CI MESSAGE: [41073266]: BUILD PASSED

Review thread on dali/python/nvidia/dali/experimental/dynamic/_invocation.py:

        self.run(self._eval_context)
        return self._results[result_index].layout()

    def __iter__(self):
Contributor:

Out of curiosity: what is this needed for? You can iterate an object based on __len__/__getitem__.

rostan-t (Collaborator, Author):

It's true that __getitem__ is enough to make an object iterable, but that adds extra overhead, and this can be called in a hot path.
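For background: without __iter__, Python falls back to the legacy protocol and calls __getitem__ with 0, 1, 2, ... until IndexError, which costs one indexed lookup per element. A toy sketch (not the actual Invocation class):

class Results:
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        return self._items[i]       # legacy iteration calls this once per index

    def __iter__(self):
        return iter(self._items)    # direct iteration, no repeated index lookups

print(list(Results(["a", "b", "c"])))   # ['a', 'b', 'c'], via __iter__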

Comment on lines 137 to 142:

        self._num_outputs = self._operator._infer_num_outputs(*self._inputs, **self._args)
        assert self._num_outputs is not None

mzient (Contributor) commented Jan 5, 2026:

Suggested change (drop the assert):

        self._num_outputs = self._operator._infer_num_outputs(*self._inputs, **self._args)

No need to run the assert all the time, especially right after we've checked that the requested condition is met.

rostan-t (Collaborator, Author):

True. I did it because type checkers infer the return type as Any | int | None, and they really hate that __len__ can return None, but there's actually a less expensive way to fix it.
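One cheaper pattern, shown only as an example and not necessarily what 155b391 does, is to resolve the value once and annotate the attribute as int, so __len__ needs neither an assert nor a cast:

from typing import Optional

class Invocation:
    def __init__(self, declared_outputs: Optional[int]) -> None:
        num_outputs = declared_outputs
        if num_outputs is None:
            num_outputs = 2              # placeholder for the real inference
        # annotated as int, so type checkers accept __len__ without an assert
        self._num_outputs: int = num_outputs

    def __len__(self) -> int:
        return self._num_outputs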

rostan-t (Collaborator, Author):

Fixed in 155b391.

return tuple(
    Batch(invocation_result=invocation[i]) for i in range(len(invocation))
)
cls = Batch if is_batch else Tensor
Contributor:

I'd recommend something more precise than cls, which might be confused for the operator class.

Suggested change:
- cls = Batch if is_batch else Tensor
+ ResultType = Batch if is_batch else Tensor

rostan-t (Collaborator, Author):

Fixed in 64d0c4a.

mzient self-assigned this Jan 5, 2026
if is_batch:

if self._output_names is not None:
    return dict(zip(self._output_names, tuple(out)))
Contributor:

Does it work with non-batch outputs?
Shouldn't you rather convert out to tensors based on is_batch and make a dictionary afterwards?
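What the question suggests, sketched with placeholder types (not the actual fix in ce374de): wrap each raw output according to is_batch first, then build the dictionary, so named outputs work in both sample and batch mode:

class Tensor:                       # stand-in for the dynamic-mode sample wrapper
    def __init__(self, data): self.data = data

class Batch:                        # stand-in for the dynamic-mode batch wrapper
    def __init__(self, data): self.data = data

def wrap_outputs(out, output_names, is_batch):
    ResultType = Batch if is_batch else Tensor
    wrapped = tuple(ResultType(o) for o in out)     # convert first...
    if output_names is not None:
        return dict(zip(output_names, wrapped))      # ...then key by name
    return wrapped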

rostan-t (Collaborator, Author) commented Jan 5, 2026:

Fixed in ce374de.

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
rostan-t (Collaborator, Author) commented Jan 5, 2026

!build

dali-automaton (Collaborator)

CI MESSAGE: [41161266]: BUILD STARTED

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
rostan-t (Collaborator, Author) commented Jan 5, 2026

!build

dali-automaton (Collaborator)

CI MESSAGE: [41163669]: BUILD STARTED

dali-automaton (Collaborator)

CI MESSAGE: [41163669]: BUILD FAILED

rostan-t assigned banasraf and unassigned stiepan Jan 5, 2026
dali-automaton (Collaborator)

CI MESSAGE: [41163669]: BUILD PASSED

rostan-t merged commit 4aeb8a2 into NVIDIA:main Jan 5, 2026
8 checks passed
rostan-t deleted the ndd-tfrecord branch January 5, 2026 16:24