Draft
103 commits
4518b36
Add initial specification for file column type
claude Dec 20, 2025
ba3c66b
Revise file type spec: unified storage backend with fsspec
claude Dec 20, 2025
965a30f
Update file type spec to use existing datajoint.json settings
claude Dec 20, 2025
667e740
Add filename collision avoidance and transaction handling to spec
claude Dec 20, 2025
9d3e194
Major spec revision: files/folders, transactions, fetch handles
claude Dec 20, 2025
93559a4
Update path structure: field after PK, add partition pattern
claude Dec 20, 2025
dc1c899
Add PK value encoding rules for paths
claude Dec 20, 2025
5f27b75
Clarify orphan cleanup as separate maintenance procedure
claude Dec 20, 2025
4f15c90
Add legacy type deprecation notice
claude Dec 20, 2025
af6cef2
Add store metadata and client verification mechanism
claude Dec 20, 2025
ec2e737
Simplify store metadata - remove schema tracking
claude Dec 20, 2025
b32ef8d
Rename type from 'file' to 'object'
claude Dec 20, 2025
93ce01e
Add Zarr compatibility: staged insert and fsspec access
claude Dec 20, 2025
997d992
Finalize staged_insert1 API for direct object storage writes
claude Dec 20, 2025
36806cc
Simplify object naming: field name as base, extension from source
claude Dec 20, 2025
6c6349b
Restructure store paths: objects/ after table, rename store config
claude Dec 20, 2025
0ea880a
Make content hashing optional, add folder manifests
claude Dec 21, 2025
c340ec7
Clarify folder manifest storage location and rationale
claude Dec 21, 2025
6cd9b9c
Add optional database_host and database_name to store metadata
claude Dec 21, 2025
38844f1
Highlight no hidden tables - key architectural difference
claude Dec 21, 2025
d65ece7
Refactor external storage to use fsspec for unified backend
claude Dec 21, 2025
4b7e7bd
Fix unused imports (ruff lint)
claude Dec 21, 2025
949b8a6
Fix ruff-format: add blank lines after local imports
claude Dec 21, 2025
0019109
Implement object column type for managed file storage
claude Dec 21, 2025
b45df2c
Fix ruff lint: line length and unused imports
claude Dec 21, 2025
adf4305
Fix unused imports (ruff lint)
claude Dec 21, 2025
095753f
Add documentation for object column type
claude Dec 21, 2025
08838f6
Fix ruff-format: code formatting adjustments
claude Dec 21, 2025
3da69fd
Add pytest tests for object column type
claude Dec 21, 2025
944c9be
Fix E402: move schema_object import to top of file
claude Dec 21, 2025
752248c
Fix unused imports (ruff lint)
claude Dec 21, 2025
7ef4e61
Fix ruff-format: add blank lines after local imports
claude Dec 21, 2025
2be5f11
Introduce AttributeType system to replace AttributeAdapter
claude Dec 21, 2025
055c9c6
Update documentation for new AttributeType system
claude Dec 21, 2025
af9bd8d
Apply ruff-format fixes to AttributeType implementation
claude Dec 21, 2025
9bd37f6
Add DJBlobType and migration utilities for blob columns
claude Dec 21, 2025
c8d8a22
Clarify migration handles all blob type variants
claude Dec 21, 2025
61db015
Fix ruff linter errors: add migrate to __all__, remove unused import
claude Dec 21, 2025
78e0d1d
Remove serializes flag; longblob is now raw bytes
claude Dec 21, 2025
c173356
Remove unused blob imports from fetch.py and table.py
claude Dec 21, 2025
106f859
Update docs: use <djblob> for serialized data, longblob for raw bytes
claude Dec 21, 2025
e293fec
Merge branch 'claude/add-file-column-type-LtXQt' into claude/upgrade-…
dimitri-yatsenko Dec 21, 2025
15418c3
Address Zarr reviewer feedback: optional metadata fields
claude Dec 22, 2025
fb8c0cb
Add Augmented Schema vs External References section
claude Dec 22, 2025
a9447e7
Rename file-type-spec.md to object-type-spec.md
claude Dec 22, 2025
5170ab1
Fix ruff-format: single line error message
claude Dec 22, 2025
3e32188
Simplify ExternalTable storage initialization
claude Dec 22, 2025
4e90c1e
Clarify staged insert compatibility: Zarr/TileDB yes, HDF5 no
claude Dec 22, 2025
5a727d2
Add remote URL support for copy insert
claude Dec 22, 2025
4bdc882
Remove redundant self.spec attribute from ExternalTable
claude Dec 22, 2025
cc96f03
Fix ruff-format: single line error message in upload_filepath
claude Dec 22, 2025
b2bc219
Merge branch claude/add-type-aliases-6uN3E
claude Dec 22, 2025
d66f76e
Merge claude/add-file-column-type-LtXQt into upgrade-adapted-type
claude Dec 22, 2025
2f9b2be
Merge remote-tracking branch 'origin/claude/upgrade-adapted-type-1W3a…
claude Dec 22, 2025
9ad4830
Add Autopopulate 2.0 specification document
claude Dec 22, 2025
df94fcc
Add foreign-key-only primary key constraint to spec
claude Dec 22, 2025
9110515
Remove FK constraints from jobs tables for performance
claude Dec 22, 2025
4637708
Add table drop/alter behavior and schema.jobs list API
claude Dec 22, 2025
68d876d
Clarify ignore status is manual, not automatic transition
claude Dec 22, 2025
f0b7cd8
Simplify job reset mechanism and migration path
claude Dec 22, 2025
6b986ae
Simplify job reservation: no locking, rely on make() transaction
claude Dec 22, 2025
8900fea
Clarify per-key reservation flow in populate()
claude Dec 22, 2025
1f56102
Merge pre/v2.0 into spec-issue-1243
claude Dec 22, 2025
7c22b6d
Update state diagram to Mermaid, consolidate scheduling into refresh()
claude Dec 22, 2025
3018b8f
Add (none)->ignore transition, simplify reserve description
claude Dec 22, 2025
7eda583
Add success->pending transition via refresh()
claude Dec 22, 2025
bab7e10
Use explicit (none) state in Mermaid diagram
claude Dec 22, 2025
586effa
Simplify diagram notation, remove clear_completed()
claude Dec 22, 2025
5b1e3e8
Refine jobs spec: priority, delete, populate logic
claude Dec 22, 2025
2e0a3d9
Clarify stale vs orphaned job terminology
claude Dec 22, 2025
77c7cf5
Remove FK-only PK requirement, add hazard analysis
claude Dec 23, 2025
86e21f4
Clarify conflict resolution and add pre-partitioning pattern
claude Dec 23, 2025
314ad0a
Fix incorrect statement about deleting reserved jobs
claude Dec 23, 2025
61cc759
Use relative delay (seconds) instead of absolute scheduled_time
claude Dec 23, 2025
7b11d65
Clarify that only make() errors are logged as error status
claude Dec 23, 2025
086de07
Implement Autopopulate 2.0 job system
claude Dec 23, 2025
53bd28d
Drop jobs table when auto-populated table is dropped
claude Dec 23, 2025
428c572
Add tests for Autopopulate 2.0 jobs system
claude Dec 23, 2025
e89e064
Fix ruff linting errors and reformat
claude Dec 23, 2025
0f98b18
Remove legacy schema-wide jobs system
claude Dec 23, 2025
956fa27
Rename jobs_v2.py to jobs.py
claude Dec 23, 2025
608020a
Improve jobs.py: use update1, djblob, cleaner f-string
claude Dec 23, 2025
8430e2a
Simplify reserve() to use update1
claude Dec 23, 2025
34c302a
Use update1 in complete() method
claude Dec 23, 2025
e0d6fd9
Simplify: use self.proj() for jobs table projections
claude Dec 23, 2025
83b7f49
Simplify ignore(): only insert new records, cannot convert existing
claude Dec 23, 2025
080b6c0
Use insert1 in _insert_job_with_status instead of explicit SQL
claude Dec 23, 2025
84ba4b7
Remove AutoPopulate._job_key - no longer needed
claude Dec 23, 2025
6ef2de7
Remove AutoPopulate.target property
claude Dec 23, 2025
55d7f32
Remove legacy _make_tuples callback support - use self.make exclusively
claude Dec 23, 2025
7b28c64
Eliminate _jobs_to_do method
claude Dec 23, 2025
d28fa7c
Simplify jobs variable usage in populate()
claude Dec 23, 2025
7d595fb
Inline _get_pending_jobs into populate()
claude Dec 23, 2025
0a5f3a9
Remove order parameter and consolidate limit/max_calls
claude Dec 23, 2025
61bb2b6
Add comprehensive documentation update plan
claude Dec 23, 2025
2979769
Revise documentation plan to focus on API, minimize theory
claude Dec 23, 2025
df682a5
Restructure documentation navigation and file organization
claude Dec 23, 2025
46f362d
Fix internal links after documentation restructure
claude Dec 23, 2025
ea90062
Add job management documentation
claude Dec 23, 2025
7328cbe
Remove documentation update plan (restructure complete)
claude Dec 23, 2025
9aa2a13
Enhance documentation with detailed examples
claude Dec 23, 2025
f77f32f
Enhance populate and blob documentation with detailed examples
claude Dec 23, 2025
f483466
Fix master-part example to show proper nested class indentation
claude Dec 23, 2025
115 changes: 54 additions & 61 deletions docs/mkdocs.yaml
@@ -4,78 +4,71 @@ site_name: DataJoint Documentation
repo_url: https://github.com/datajoint/datajoint-python
repo_name: datajoint/datajoint-python
nav:
- DataJoint Python: index.md
- Quick Start Guide: quick-start.md
- Home: index.md
- Quick Start: quick-start.md
- Concepts:
- Principles: concepts/principles.md
- Data Model: concepts/data-model.md
- Data Pipelines: concepts/data-pipelines.md
- Teamwork: concepts/teamwork.md
- concepts/index.md
- Terminology: concepts/terminology.md
- System Administration:
- Database Administration: sysadmin/database-admin.md
- Bulk Storage Systems: sysadmin/bulk-storage.md
- External Store: sysadmin/external-store.md
- Client Configuration:
- Install: client/install.md
- Credentials: client/credentials.md
- Settings: client/settings.md
- File Stores: client/stores.md
- Getting Started:
- Installation: client/install.md
- Connection: client/credentials.md
- Configuration: client/settings.md
- Schema Design:
- Schema Creation: design/schema.md
- Table Definition:
- Table Tiers: design/tables/tiers.md
- Declaration Syntax: design/tables/declare.md
- Primary Key: design/tables/primary.md
- Attributes: design/tables/attributes.md
- Lookup Tables: design/tables/lookup.md
- Manual Tables: design/tables/manual.md
- Blobs: design/tables/blobs.md
- Attachments: design/tables/attach.md
- Filepaths: design/tables/filepath.md
- Custom Datatypes: design/tables/customtype.md
- Dependencies: design/tables/dependencies.md
- Indexes: design/tables/indexes.md
- Master-Part Relationships: design/tables/master-part.md
- Schema Diagrams: design/diagrams.md
- Entity Normalization: design/normalization.md
- Data Integrity: design/integrity.md
- Schema Recall: design/recall.md
- Schema Drop: design/drop.md
- Schema Modification: design/alter.md
- Data Manipulations:
- manipulation/index.md
- Insert: manipulation/insert.md
- Delete: manipulation/delete.md
- Update: manipulation/update.md
- Transactions: manipulation/transactions.md
- Data Queries:
- Principles: query/principles.md
- Example Schema: query/example-schema.md
- Schemas: design/schema.md
- Table Tiers: design/tables/tiers.md
- Declaration: design/tables/declare.md
- Primary Key: design/tables/primary.md
- Attributes: design/tables/attributes.md
- Foreign Keys: design/tables/dependencies.md
- Indexes: design/tables/indexes.md
- Lookup Tables: design/tables/lookup.md
- Manual Tables: design/tables/manual.md
- Master-Part: design/tables/master-part.md
- Diagrams: design/diagrams.md
- Alter: design/alter.md
- Drop: design/drop.md
- Data Types:
- Blob: datatypes/blob.md
- Attach: datatypes/attach.md
- Filepath: datatypes/filepath.md
- Object: datatypes/object.md
- Adapted Types: datatypes/adapters.md
- Data Operations:
- operations/index.md
- Insert: operations/insert.md
- Delete: operations/delete.md
- Update: operations/update.md
- Transactions: operations/transactions.md
- Make Method: operations/make.md
- Populate: operations/populate.md
- Key Source: operations/key-source.md
- Jobs: operations/jobs.md
- Distributed: operations/distributed.md
- Queries:
- query/principles.md
- Fetch: query/fetch.md
- Iteration: query/iteration.md
- Operators: query/operators.md
- Restrict: query/restrict.md
- Projection: query/project.md
- Project: query/project.md
- Join: query/join.md
- Aggregation: query/aggregation.md
- Union: query/union.md
- Universal Sets: query/universals.md
- Query Caching: query/query-caching.md
- Computations:
- Make Method: compute/make.md
- Populate: compute/populate.md
- Key Source: compute/key-source.md
- Distributed Computing: compute/distributed.md
- Publish Data: publish-data.md
- Internals:
- SQL Transpilation: internal/transpilation.md
- Iteration: query/iteration.md
- Caching: query/query-caching.md
- Administration:
- Database: admin/database.md
- Storage Backends: admin/storage.md
- External Store: admin/external-store.md
- Tutorials:
- JSON Datatype: tutorials/json.ipynb
- FAQ: faq.md
- Developer Guide: develop.md
- Citation: citation.md
- Changelog: changelog.md
- JSON Datatype: tutorials/json.ipynb
- Reference:
- FAQ: reference/faq.md
- SQL Transpilation: reference/transpilation.md
- Publishing Data: reference/publish-data.md
- Developer Guide: reference/develop.md
- Citation: reference/citation.md
- Changelog: changelog.md
- API: api/ # defer to gen-files + literate-nav

# ---------------------------- STANDARD -----------------------------
File renamed without changes.
@@ -34,7 +34,7 @@ For example, the following table stores motion-aligned two-photon movies.
aligned_movie : blob@external # motion-aligned movie in 'external' store
```

-All [insert](../manipulation/insert.md) and [fetch](../query/fetch.md) operations work
+All [insert](../operations/insert.md) and [fetch](../query/fetch.md) operations work
identically for `external` attributes as they do for `blob` attributes, with the same
serialization protocol.
Similar to `blobs`, `external` attributes cannot be used in restriction conditions.
@@ -116,12 +116,12 @@ configured external store.
[foreign keys](../design/tables/dependencies.md) referencing the
`~external_<storename>` table (but are not shown as such to the user).

-8. The [insert](../manipulation/insert.md) operation encodes and hashes the blob data.
+8. The [insert](../operations/insert.md) operation encodes and hashes the blob data.
If an external object is not present in storage for the same hash, the object is saved
and if the save operation is successful, corresponding entities in table
`~external_<storename>` for that store are created.

-9. The [delete](../manipulation/delete.md) operation first deletes the foreign key
+9. The [delete](../operations/delete.md) operation first deletes the foreign key
reference in the target table. The external table entry and actual external object is
not actually deleted at this time (`soft-delete`).

File renamed without changes.
56 changes: 55 additions & 1 deletion docs/src/client/settings.md
@@ -152,7 +152,7 @@ dj.config.database.use_tls = None # Auto (default)

## External Storage

-Configure external stores in the `stores` section. See [External Storage](../sysadmin/external-store.md) for details.
+Configure external stores in the `stores` section. See [External Storage](../admin/external-store.md) for details.

```json
{
@@ -164,3 +164,57 @@ Configure external stores in the `stores` section. See [External Storage](../sys
}
}
```

## Object Storage

Configure object storage for the [`object` type](../design/tables/object.md) in the `object_storage` section. This provides managed file and folder storage with fsspec backend support.

### Local Filesystem

```json
{
"object_storage": {
"project_name": "my_project",
"protocol": "file",
"location": "/data/my_project"
}
}
```

### Amazon S3

```json
{
"object_storage": {
"project_name": "my_project",
"protocol": "s3",
"bucket": "my-bucket",
"location": "my_project",
"endpoint": "s3.amazonaws.com"
}
}
```
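
The two JSON fragments above can also be generated programmatically. The following is a minimal sketch using only the Python standard library. It assumes the settings live in a JSON file named `datajoint.json` (the name used in this PR's commit messages); the helper `write_object_storage_config` is hypothetical and not part of DataJoint.

```python
import json
import tempfile
from pathlib import Path


def write_object_storage_config(path, project_name, protocol, location, **extra):
    """Hypothetical helper: merge an `object_storage` section into a JSON settings file.

    Existing top-level keys in the file are preserved; only the
    `object_storage` section is replaced.
    """
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    section = {
        "project_name": project_name,
        "protocol": protocol,
        "location": location,
    }
    section.update(extra)  # e.g. bucket and endpoint for S3
    config["object_storage"] = section
    path.write_text(json.dumps(config, indent=4))
    return config


# Demo in a temporary directory so no real settings file is touched.
with tempfile.TemporaryDirectory() as tmp:
    settings_file = Path(tmp) / "datajoint.json"
    written = write_object_storage_config(
        settings_file,
        project_name="my_project",
        protocol="s3",
        location="my_project",
        bucket="my-bucket",
        endpoint="s3.amazonaws.com",
    )
    reloaded = json.loads(settings_file.read_text())
```

Because the helper merges into whatever the file already contains, other sections such as `stores` would be left untouched.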

### Object Storage Settings

| Setting | Environment Variable | Required | Description |
|---------|---------------------|----------|-------------|
| `object_storage.project_name` | `DJ_OBJECT_STORAGE_PROJECT_NAME` | Yes | Unique project identifier |
| `object_storage.protocol` | `DJ_OBJECT_STORAGE_PROTOCOL` | Yes | Backend: `file`, `s3`, `gcs`, `azure` |
| `object_storage.location` | `DJ_OBJECT_STORAGE_LOCATION` | Yes | Base path or bucket prefix |
| `object_storage.bucket` | `DJ_OBJECT_STORAGE_BUCKET` | For cloud | Bucket name |
| `object_storage.endpoint` | `DJ_OBJECT_STORAGE_ENDPOINT` | For S3 | S3 endpoint URL |
| `object_storage.partition_pattern` | `DJ_OBJECT_STORAGE_PARTITION_PATTERN` | No | Path pattern with `{attr}` placeholders |
| `object_storage.token_length` | `DJ_OBJECT_STORAGE_TOKEN_LENGTH` | No | Random suffix length (default: 8) |
| `object_storage.access_key` | — | For cloud | Access key (use secrets) |
| `object_storage.secret_key` | — | For cloud | Secret key (use secrets) |
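
To illustrate how the table maps setting names to environment variables, here is a hedged sketch that collects the settings from an environment mapping and applies the "Required" column. The functions `env_var_name` and `load_from_env` are hypothetical, not DataJoint API. The sketch also assumes that every protocol other than `file` counts as "cloud" and requires a bucket, and it does not model how DataJoint itself resolves precedence between the JSON file and the environment.

```python
import os

REQUIRED = ("project_name", "protocol", "location")
OPTIONAL = ("bucket", "endpoint", "partition_pattern", "token_length")
PROTOCOLS = ("file", "s3", "gcs", "azure")


def env_var_name(setting):
    """Map a setting such as 'protocol' to 'DJ_OBJECT_STORAGE_PROTOCOL'."""
    return "DJ_OBJECT_STORAGE_" + setting.upper()


def load_from_env(environ=None):
    """Hypothetical loader: read object storage settings from environment variables."""
    environ = os.environ if environ is None else environ
    settings = {}
    for name in REQUIRED + OPTIONAL:
        value = environ.get(env_var_name(name))
        if value is not None:
            settings[name] = value
    missing = [name for name in REQUIRED if name not in settings]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    if settings["protocol"] not in PROTOCOLS:
        raise ValueError("unsupported protocol: " + settings["protocol"])
    # Assumption: any non-local protocol needs a bucket ("For cloud" in the table).
    if settings["protocol"] != "file" and "bucket" not in settings:
        raise ValueError("bucket is required for cloud protocols")
    # The table gives 8 as the default random suffix length.
    settings["token_length"] = int(settings.get("token_length", 8))
    return settings


example_env = {
    "DJ_OBJECT_STORAGE_PROJECT_NAME": "my_project",
    "DJ_OBJECT_STORAGE_PROTOCOL": "file",
    "DJ_OBJECT_STORAGE_LOCATION": "/data/my_project",
}
example_settings = load_from_env(example_env)
```

The access and secret keys are deliberately left out here, since the table lists no environment variable for them.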

### Object Storage Secrets

Store cloud credentials in the secrets directory:

```
.secrets/
├── object_storage.access_key
└── object_storage.secret_key
```
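
The layout above suggests one file per secret, named after the setting. A minimal sketch of reading such a directory follows. It assumes each file holds only the secret value as text (surrounding whitespace is stripped), which the documentation does not state explicitly; `read_secrets` is a hypothetical helper, not DataJoint API.

```python
import tempfile
from pathlib import Path

SECRET_KEYS = ("object_storage.access_key", "object_storage.secret_key")


def read_secrets(secrets_dir=".secrets"):
    """Hypothetical helper: return {setting name: value} for each secret file present."""
    secrets = {}
    for key in SECRET_KEYS:
        secret_file = Path(secrets_dir) / key
        if secret_file.is_file():
            # Assumption: the file contains just the value, possibly with a trailing newline.
            secrets[key] = secret_file.read_text().strip()
    return secrets


# Demo against a throwaway directory with placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    secrets_dir = Path(tmp) / ".secrets"
    secrets_dir.mkdir()
    (secrets_dir / "object_storage.access_key").write_text("EXAMPLE_ACCESS_KEY\n")
    (secrets_dir / "object_storage.secret_key").write_text("EXAMPLE_SECRET_KEY\n")
    demo_secrets = read_secrets(secrets_dir)
```

Keeping credentials in separate files makes it straightforward to exclude the `.secrets/` directory from version control while still committing the rest of the configuration.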