Merged
32 commits
37df47d
initial outline
LinoGiger Nov 24, 2025
d4df49b
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Nov 24, 2025
d2a91d3
added some more stuff
LinoGiger Nov 25, 2025
2dd05c6
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Nov 26, 2025
49efbab
fixed merge stuff
LinoGiger Nov 26, 2025
013c913
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Nov 26, 2025
0da3ab4
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Nov 26, 2025
b7fb545
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Nov 27, 2025
6c81d03
added endpoint implementation for the audience
LinoGiger Nov 28, 2025
cbe8cef
added different instruction to recruitment order
LinoGiger Nov 28, 2025
b30209f
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 4, 2025
ef4d175
small adjustments
LinoGiger Dec 5, 2025
9c8d27e
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 5, 2025
79cf308
adjustments from merge
LinoGiger Dec 8, 2025
0d94e75
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 8, 2025
511f15a
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 8, 2025
58bbf00
initial changes with new version
LinoGiger Dec 9, 2025
6419414
adjusted typo
LinoGiger Dec 9, 2025
e539f9e
resetting client credentials of autherror (#411)
LinoGiger Dec 12, 2025
7462f68
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 16, 2025
265d3a9
added job api
LinoGiger Dec 17, 2025
b5f0b49
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 17, 2025
a06b147
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 17, 2025
dc5d17f
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Dec 18, 2025
4011767
adjusted filters
LinoGiger Dec 18, 2025
e0debe2
many updates
LinoGiger Dec 19, 2025
29a1129
Merge branch 'main' into feat/RAPID-6301-add-audience-flow
LinoGiger Jan 5, 2026
e261a5c
job definition only saves necessary things, some renamings
LinoGiger Jan 7, 2026
c489519
added the preview method
LinoGiger Jan 12, 2026
ffa9936
fixed creating job definition
LinoGiger Jan 12, 2026
8d24eb0
adjusted the docs
LinoGiger Jan 13, 2026
bfba18d
slight typo fix
LinoGiger Jan 13, 2026
3 changes: 2 additions & 1 deletion .claude/settings.local.json
@@ -5,7 +5,8 @@
"Bash(tree:*)",
"Bash(find:*)",
"Bash(cat:*)",
"Bash(.venv/Scripts/python.exe:*)"
"Bash(.venv/Scripts/python.exe:*)",
"Bash(git ls-tree:*)"
],
"deny": [],
"ask": []
61 changes: 39 additions & 22 deletions docs/confidence_stopping.md
@@ -18,8 +18,8 @@ Early Stopping addresses this by:
The Early Stopping feature leverages the trustworthiness of labelers, quantified through their `userScores`, to calculate the confidence level of each category for any given datapoint.

### Confidence Calculation
- **UserScores**: Each annotator has a `userScore` between 0 and 1, representing their reliability. [More information](/understanding_the_results/#understanding-the-user-scores)
- **Aggregated Confidence**: By combining the userScores of annotators who selected a particular category, the system computes the probability that this category is the correct one.
- **UserScores**: Each labeler has a `userScore` between 0 and 1, representing their reliability. [More information](understanding_the_results.md#understanding-the-user-scores)
- **Aggregated Confidence**: By combining the userScores of labelers who selected a particular category, the system computes the probability that this category is the correct one.
- **Threshold Comparison**: If the calculated confidence exceeds your specified threshold, the system stops collecting further responses for that datapoint.
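
As a rough illustration of the aggregation above, the sketch below treats each `userScore` as the probability that a labeler answers correctly and combines the votes for a two-category task under a uniform prior. This is a simplified, hypothetical model for intuition only, not the exact formula used by the platform.

```python
import math

def binary_confidence(user_scores: list[float], votes: list[str],
                      category: str, other: str) -> float:
    """Toy Bayesian combination for a two-category task.

    Assumes each userScore is the probability that the labeler answers
    correctly; this is an illustrative assumption, not the production formula.
    """
    def likelihood(true_label: str) -> float:
        # Probability of observing these votes if `true_label` were correct.
        return math.prod(s if v == true_label else 1 - s
                         for s, v in zip(user_scores, votes))

    l_category, l_other = likelihood(category), likelihood(other)
    # Uniform prior over the two categories.
    return l_category / (l_category + l_other)

# Four labelers with userScore 0.8 all vote "Dog":
confidence = binary_confidence([0.8] * 4, ["Dog"] * 4, "Dog", "Cat")
print(round(confidence, 4))  # ~0.9961, which would clear a 0.99 confidence_threshold
```

Under this toy model, four agreeing responses from reasonably reliable labelers are enough to cross a 99% threshold, which matches the intuition behind the example further down.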

## Understanding the Confidence Threshold
@@ -28,51 +28,68 @@ We've created a plot based on empirical data aided by simulations to give you an

There are a few things to keep in mind when interpreting the results:

- **Unambiguous Scenario**: The graph represents an ideal situation such as in the [example below](#using-early-stopping-in-your-order) with no ambiguity which category is the correct one. A counter-example would be subjective tasks like "Which image do you prefer?", where there's no clear correct answer.
- **Unambiguous Scenario**: The graph represents an ideal situation such as in the [example below](#using-early-stopping-in-your-job) with no ambiguity which category is the correct one. A counter-example would be subjective tasks like "Which image do you prefer?", where there's no clear correct answer.
- **Real-World Variability**: Actual required responses may vary based on task complexity.
- **Guidance Tool**: Use the graph as a reference to set realistic expectations for your orders.
- **Guidance Tool**: Use the graph as a reference to set realistic expectations for your jobs.
- **Response Overflow**: The number of responses per datapoint may exceed the specified amount due to multiple users answering simultaneously.


<div style="width: 780px; height: 650px; overflow: hidden;">
<iframe src="/plots/confidence_threshold_plot_with_slider_darkmode.html"
width="100%"
height="100%"
frameborder="0"
width="100%"
height="100%"
frameborder="0"
scrolling="no"
style="overflow: hidden;">
</iframe>
</div>

>**Note:** The Early Stopping feature is supported for the Classification and Comparison workflows. The number of categories is the number of options in the Classification task. For the Comparison task, the number of categories is always 2.

## Using Early Stopping in Your Order
## Using Early Stopping in Your Job

Implementing Early Stopping is straightforward. You simply add the confidence threshold as a parameter when creating the order.
Implementing Early Stopping is straightforward. You simply add the confidence threshold as a parameter when creating the job definition.

### Example: Classification Order with Early Stopping
### Example: Classification Job with Early Stopping

```python
order = rapi.order.create_classification_order(
name="Test Classification Order with Early Stopping",
from rapidata import RapidataClient

client = RapidataClient()

# Create audience with qualification example
audience = client.audience.create_audience(name="Animal Classification Audience")
audience.add_classification_example(
instruction="What do you see in the image?",
answer_options=["Cat", "Dog"],
datapoint="https://assets.rapidata.ai/cat.jpeg",
truth=["Cat"]
)

# Create job definition with early stopping
job_definition = client.job.create_classification_job_definition(
name="Test Classification with Early Stopping",
instruction="What do you see in the image?",
answer_options=["Cat", "Dog"],
datapoints=["https://assets.rapidata.ai/dog.jpeg"],
responses_per_datapoint=50,
confidence_threshold=0.99,
).run()

order.display_progress_bar()
result = order.get_results()
print(result)
)

# Preview and run
job_definition.preview()
job = audience.assign_job_to_audience(job_definition)
job.display_progress_bar()
results = job.get_results()
print(results)
```

In this example:

- responses_per_datapoint=50: Sets the maximum number of responses per datapoint.
- confidence_threshold=0.99: Specifies that data collection for a datapoint should stop once a 99% confidence level is reached.
- `responses_per_datapoint=50`: Sets the maximum number of responses per datapoint.
- `confidence_threshold=0.99`: Specifies that data collection for a datapoint should stop once a 99% confidence level is reached.

We'd expect this to take roughtly 4 responses to reach the 99% confidence level.
We'd expect this to take roughly 4 responses to reach the 99% confidence level.

## When to Use Early Stopping

@@ -83,7 +100,7 @@ We recommend using Early Stopping when:

## Analyzing Early Stopping Results

When using Early Stopping, the [results](/understanding_the_results/) will additionally include a `confidencePerCategory` field for each datapoint. This field shows the confidence level for each of the categories in the task.
When using Early Stopping, the [results](understanding_the_results.md) will additionally include a `confidencePerCategory` field for each datapoint. This field shows the confidence level for each of the categories in the task.

Example:
```json
@@ -117,7 +134,7 @@ Example:
"Cat": 0.0
},
# this only appears when using early stopping
"confidencePerCategory": {
"confidencePerCategory": {
"Dog": 0.9943,
"Cat": 0.0057
},
49 changes: 49 additions & 0 deletions docs/examples/classify_job.md
@@ -0,0 +1,49 @@
# Classification Job Example

To learn about the basics of creating a job, please refer to the [quickstart guide](../quickstart.md).

In this example, we want to rate different images based on a Likert scale to assess how well generated images match their descriptions. The `NoShuffle` setting ensures answer options remain in order since they represent a scale.

```python
from rapidata import RapidataClient, NoShuffle

IMAGE_URLS = [
"https://assets.rapidata.ai/tshirt-4o.png",
"https://assets.rapidata.ai/tshirt-aurora.jpg",
"https://assets.rapidata.ai/teamleader-aurora.jpg",
]

CONTEXTS = ["A t-shirt with the text 'Running on caffeine & dreams'"] * len(IMAGE_URLS)

client = RapidataClient()

# Create audience with qualification example
audience = client.audience.create_audience(name="Likert Scale Audience")
audience.add_classification_example(
instruction="How well does the image match the description?",
answer_options=["1: Not at all", "2: A little", "3: Moderately", "4: Very well", "5: Perfectly"],
datapoint="https://assets.rapidata.ai/tshirt-4o.png",
truth=["5: Perfectly"],
context="A t-shirt with the text 'Running on caffeine & dreams'"
)

# Create job definition
job_definition = client.job.create_classification_job_definition(
name="Likert Scale Example",
instruction="How well does the image match the description?",
answer_options=["1: Not at all", "2: A little", "3: Moderately", "4: Very well", "5: Perfectly"],
contexts=CONTEXTS,
datapoints=IMAGE_URLS,
responses_per_datapoint=25,
settings=[NoShuffle()]
)

# Preview the job definition
job_definition.preview()

# Assign to audience and get results
job = audience.assign_job_to_audience(job_definition)
job.display_progress_bar()
results = job.get_results()
print(results)
```
46 changes: 0 additions & 46 deletions docs/examples/classify_order.md

This file was deleted.

57 changes: 57 additions & 0 deletions docs/examples/compare_job.md
@@ -0,0 +1,57 @@
# Compare Job Example

To learn about the basics of creating a job, please refer to the [quickstart guide](../quickstart.md).

In this example, we compare images from two image generation models (Flux and Midjourney) to determine which more accurately follows the given prompts.

```python
from rapidata import RapidataClient

PROMPTS = [
"A sign that says 'Diffusion'.",
"A yellow flower sticking out of a green pot.",
"hyperrealism render of a surreal alien humanoid.",
"psychedelic duck",
"A small blue book sitting on a large red book."
]

IMAGE_PAIRS = [
["https://assets.rapidata.ai/flux_sign_diffusion.jpg", "https://assets.rapidata.ai/mj_sign_diffusion.jpg"],
["https://assets.rapidata.ai/flux_flower.jpg", "https://assets.rapidata.ai/mj_flower.jpg"],
["https://assets.rapidata.ai/flux_alien.jpg", "https://assets.rapidata.ai/mj_alien.jpg"],
["https://assets.rapidata.ai/flux_duck.jpg", "https://assets.rapidata.ai/mj_duck.jpg"],
["https://assets.rapidata.ai/flux_book.jpg", "https://assets.rapidata.ai/mj_book.jpg"]
]

client = RapidataClient()

# Create audience with qualification example
audience = client.audience.create_audience(name="Prompt Alignment Audience")
audience.add_compare_example(
instruction="Which image follows the prompt more accurately?",
datapoint=[
"https://assets.rapidata.ai/flux_sign_diffusion.jpg",
"https://assets.rapidata.ai/mj_sign_diffusion.jpg"
],
truth="https://assets.rapidata.ai/flux_sign_diffusion.jpg",
context="A sign that says 'Diffusion'."
)

# Create job definition
job_definition = client.job.create_compare_job_definition(
name="Example Image Prompt Alignment Job",
instruction="Which image follows the prompt more accurately?",
datapoints=IMAGE_PAIRS,
responses_per_datapoint=25,
contexts=PROMPTS
)

# Preview the job definition
job_definition.preview()

# Assign to audience and get results
job = audience.assign_job_to_audience(job_definition)
job.display_progress_bar()
results = job.get_results()
print(results)
```
49 changes: 0 additions & 49 deletions docs/examples/compare_order.md

This file was deleted.

21 changes: 0 additions & 21 deletions docs/examples/draw_order.md

This file was deleted.
