For the "pixels" inputs, we need to allow for:

* ~multiple cues~
* ~a way to name them~ (naming split to #133)
* a way to transition between them, either immediately, manually, or linearly
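
To make the remaining transition item concrete, here is a rough sketch of one way the three modes could be modelled. It is only my reading of the requirement, not a settled design: it assumes a cue is a flat list of RGB pixels, that "manually" means an operator-driven crossfade position (for example a fader), and that "linearly" means a timed fade. The names `Transition`, `progress`, and `blend` are hypothetical.

```python
from enum import Enum
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each 0-255; assumed cue representation


class Transition(Enum):
    IMMEDIATE = "immediate"  # snap straight to the next cue
    MANUAL = "manual"        # assumed: position driven by an operator control
    LINEAR = "linear"        # assumed: position driven by elapsed time


def progress(mode: Transition, elapsed: float = 0.0,
             duration: float = 1.0, fader: float = 0.0) -> float:
    """Return how far the transition has advanced, clamped to 0.0..1.0."""
    if mode is Transition.IMMEDIATE:
        return 1.0
    if mode is Transition.MANUAL:
        raw = fader
    else:
        # A zero or negative duration is treated as an immediate switch.
        raw = elapsed / duration if duration > 0 else 1.0
    return max(0.0, min(1.0, raw))


def blend(src: List[Pixel], dst: List[Pixel], t: float) -> List[Pixel]:
    """Interpolate each channel from the current cue to the next cue."""
    if len(src) != len(dst):
        raise ValueError("cues must have the same number of pixels")
    return [
        tuple(round(a + (b - a) * t) for a, b in zip(p, q))
        for p, q in zip(src, dst)
    ]
```

With this shape, all three modes share the same `blend` step and differ only in where the position comes from, e.g. `blend(cue_a, cue_b, progress(Transition.LINEAR, elapsed=2.0, duration=4.0))` for a fade that is halfway through.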