title: isamples.github.io
subtitle: README for the isamples.github.io source

isamplesorg.github.io

This repository provides the source for SMR fork isamplesorg.github.io.

The site uses Quarto and is built using GitHub Actions.

Sources are in markdown or "quarto markdown" (.qmd files), and may include content computed at build time.

Visit the Quarto site for documentation on using the Quarto environment and features.

Tutorials

The tutorials/ directory contains interactive data analysis tutorials:

  • isamples_explorer.qmd - Interactive search and exploration of 6.7M samples
  • zenodo_isamples_analysis.qmd - Deep-dive DuckDB-WASM analysis tutorial
  • parquet_cesium_isamples_wide.qmd - Cesium-based 3D globe visualization
  • narrow_vs_wide_performance.qmd - Technical schema comparison

All tutorials use browser-based analysis with DuckDB-WASM - no server required.

Development

For simple editing tasks, the sources may be edited directly on GitHub. A local setup will be beneficial for larger or more complex changes.

To set up a development environment:

  1. Install Quarto
  2. Create a Python virtual environment, e.g. mkvirtualenv isamples-quarto
  3. git clone https://github.com/isamplesorg/isamplesorg.github.io.git
  4. cd isamplesorg.github.io

Preview the site:

quarto preview

Vocabulary documentation is generated from the vocabulary source ttl files using a Python script, scripts/vocab2md.py, and a convenience shell script wrapper, scripts/generate_vocab_docs.sh. To regenerate the vocabulary documentation, first cd to the root folder of the documentation, then run:

scripts/generate_vocab_docs.sh

The generated docs are placed under models/generated/vocabularies

After editing, push the sources to GitHub. The rendered pages are generated by the "Render using Quarto and push to GH-pages" GitHub action, which is currently triggered manually.

Update dependencies using pip install -U <package name>, then regenerate requirements.txt with pip freeze > requirements.txt.

Data Sources

All tutorials query parquet files hosted on Cloudflare R2:

// Wide format (recommended) - 280 MB, 20M rows
const WIDE_URL = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_wide.parquet";

// Narrow format (advanced) - 850 MB, 106M rows
const NARROW_URL = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202512_narrow.parquet";
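
As a rough illustration of how these URLs might be turned into queries, the sketch below builds two SQL strings around DuckDB's read_parquet function. The helper names (sqlString, countRowsSql, previewSql) are hypothetical and are not taken from the tutorials. No column names are assumed, since the parquet schema is not described here.

```javascript
// URLs copied from the Data Sources section of this README.
const WIDE_URL = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202601_wide.parquet";
const NARROW_URL = "https://pub-a18234d962364c22a50c787b7ca09fa5.r2.dev/isamples_202512_narrow.parquet";

// Hypothetical helper: wrap a value as a SQL string literal,
// doubling any single quotes so the literal stays well-formed.
function sqlString(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Hypothetical helper: SQL that counts the rows of a remote parquet file.
function countRowsSql(url) {
  return `SELECT COUNT(*) AS n FROM read_parquet(${sqlString(url)})`;
}

// Hypothetical helper: SQL that previews the first `limit` rows.
function previewSql(url, limit = 5) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  return `SELECT * FROM read_parquet(${sqlString(url)}) LIMIT ${limit}`;
}

console.log(countRowsSql(WIDE_URL));
console.log(previewSql(NARROW_URL, 3));
```

In a browser, strings like these would typically be passed to an open DuckDB-WASM connection (for example via its query method). The tutorials may structure their queries differently.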

Related Repositories

| Repo | Purpose | Start Here |
|---|---|---|
| isamplesorg-metadata | Schema definition (8 types, 14 predicates) | src/schemas/isamples_core.yaml |
| isamples-python | Jupyter examples (DuckDB + Lonboard) | examples/basic/isamples_explorer.ipynb |
| vocabularies | SKOS vocabulary terms | Material types, context categories |
