ADR-002: Package Files and OCI Layers

Status: Accepted
Date: 2026-05-16
Implementation: Directory package implemented as of 0.1.0.

Context

stacpkg needs a package format that can be inspected and moved after it is built. The package must keep:

  • selected STAC Item metadata;
  • an asset-lock table with asset locations and object facts;
  • optional included files, such as reports or licenses;
  • optional asset bytes when a package should carry the data itself.

The package should be reproducible: the same inputs should produce the same package contents. It must also avoid credentials. It records object locations and the object facts supported by the asset-lock schema; fields such as media type and checksum can be added by later schema revisions. Access tokens, passwords, signed URLs, and other secrets must stay outside the package.

Package consumers should be able to inspect an OCI artifact before pulling all content. OCI descriptors already expose layer media type, digest, size, and string annotations, so the package should use those fields instead of adding a second metadata file that repeats the same information.

Decision

Use fixed package files for the required tables:

stacpkg.pkg/
  items.parquet
  assets.lock.parquet
  <optional content>
  assets/

items.parquet is always present. It contains one row per selected STAC Item.

assets.lock.parquet is always present. It contains the package asset-lock table. If the caller does not provide an asset-lock table, stacpkg creates one from the selected Items. The table may be empty when the selected Items have no lockable assets.

The table files have fixed package-relative names. Table schema kind and version are stored in the Parquet schema metadata.
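
Because the table names are fixed, a consumer can check a local package directory without reading any manifest. The following is a minimal sketch of that check, using only the two filenames named above. The helper name `missing_required_tables` is hypothetical and not part of stacpkg, and the sketch does not read the Parquet schema metadata.

```python
from pathlib import Path

# Fixed package-relative names taken from the layout in this ADR.
REQUIRED_TABLES = ("items.parquet", "assets.lock.parquet")


def missing_required_tables(package_dir: Path) -> list:
    """Return the required table files that are absent from package_dir.

    Hypothetical helper for illustration; it only checks that the files exist.
    """
    return [name for name in REQUIRED_TABLES if not (package_dir / name).is_file()]
```

An empty result means both required tables are present; it says nothing about whether their schema kind and version are valid.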

Optional included content is not a source of package metadata. A single included file keeps its package-relative filename. An included directory is represented as a ZIP archive whose entries preserve the package-relative directory layout.
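
One way to archive an included directory while keeping its layout is sketched below. Several details are assumptions rather than documented stacpkg behavior: the function name `zip_directory`, the use of the directory's package-relative name as the entry prefix, and the fixed timestamp, sorted entry order, fixed permissions, and uncompressed storage, which are chosen here only so that the same inputs give the same archive bytes.

```python
import zipfile
from pathlib import Path

# Assumption: a constant timestamp keeps the archive bytes stable across runs.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def zip_directory(source_dir: Path, package_prefix: str, out_path: Path) -> list:
    """Write every file under source_dir into a ZIP at out_path.

    Entry names are package-relative POSIX paths under package_prefix.
    Hypothetical helper for illustration; returns the entry names written.
    """
    names = []
    with zipfile.ZipFile(out_path, "w") as archive:
        # Sort so the entry order does not depend on filesystem listing order.
        for path in sorted(p for p in source_dir.rglob("*") if p.is_file()):
            rel = path.relative_to(source_dir).as_posix()
            name = f"{package_prefix}/{rel}"
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.external_attr = 0o644 << 16  # fixed permissions, not the source file's
            # ZipInfo defaults to ZIP_STORED, so entries are written uncompressed.
            archive.writestr(info, path.read_bytes())
            names.append(name)
    return names
```

Calling `zip_directory(Path("reports"), "reports", Path("reports.zip"))` would store `reports/summary.txt` under the entry name `reports/summary.txt`, so extraction into the package root restores the original layout.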

Optional asset bytes are distinct from optional included content. Asset bytes copied into the package live under assets/ in the local package. Their identity and package-relative location stay in assets.lock.parquet, with store_type=file and key pointing at the package path.

Use typed OCI layers for registry transport. No separate metadata JSON file or custom OCI config JSON is required to rebuild the package.

OCI image manifests require a config descriptor. For now, stacpkg uses the OCI empty JSON config descriptor and does not put authoritative package information in that config. The package information lives in fixed file names, Parquet schema metadata, OCI layer media types, descriptor annotations, and ZIP entries. This avoids a custom STAC package config blob until package metadata appears that cannot be represented without meaningful duplication.

The OCI artifact uses:

artifactType: application/vnd.stacpkg.package.v1+json
config.mediaType: application/vnd.oci.empty.v1+json

Required table layers:

application/vnd.stacpkg.items.v1.parquet
  extracts as items.parquet

application/vnd.stacpkg.asset-lock.v1.parquet
  extracts as assets.lock.parquet

Every layer records its safe package-relative path with the OCI org.opencontainers.image.title annotation.
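
The manifest shape described above can be sketched as a plain dictionary. This is an illustration under stated assumptions, not stacpkg's implementation: the helper names `descriptor` and `build_manifest` are hypothetical, and only the two required table layers are included. The media types, the artifact type, and the title annotation key come from this ADR; the empty config is the two-byte JSON document `{}` defined by the OCI image specification.

```python
import hashlib
from typing import Optional

# Values taken from this ADR.
ARTIFACT_TYPE = "application/vnd.stacpkg.package.v1+json"
ITEMS_TYPE = "application/vnd.stacpkg.items.v1.parquet"
ASSET_LOCK_TYPE = "application/vnd.stacpkg.asset-lock.v1.parquet"
TITLE = "org.opencontainers.image.title"

# OCI empty config: the literal bytes "{}".
EMPTY_CONFIG_TYPE = "application/vnd.oci.empty.v1+json"
EMPTY_CONFIG_BYTES = b"{}"


def descriptor(media_type: str, blob: bytes, title: Optional[str] = None) -> dict:
    """Build an OCI descriptor for blob, with an optional title annotation."""
    desc = {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    if title is not None:
        desc["annotations"] = {TITLE: title}
    return desc


def build_manifest(items_blob: bytes, asset_lock_blob: bytes) -> dict:
    """Assemble a manifest holding only the two required table layers."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": ARTIFACT_TYPE,
        # No authoritative package information goes into the config.
        "config": descriptor(EMPTY_CONFIG_TYPE, EMPTY_CONFIG_BYTES),
        "layers": [
            descriptor(ITEMS_TYPE, items_blob, "items.parquet"),
            descriptor(ASSET_LOCK_TYPE, asset_lock_blob, "assets.lock.parquet"),
        ],
    }
```

Each layer carries its media type, digest, size, and extraction path in the descriptor itself, which is why no second metadata file is needed to rebuild the directory layout.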

Optional directory content uses ZIP:

application/vnd.stacpkg.files.v1+zip

Copied package assets use separate asset layers so they remain distinguishable from generic optional files:

application/vnd.stacpkg.asset.v1
application/vnd.stacpkg.asset.v1+zip

For single asset blobs, org.opencontainers.image.title records the safe package-relative asset path. For asset ZIP layers, the ZIP entries preserve package-relative assets/ paths. Consumers must reject unsafe paths such as absolute paths and .. parent traversal.
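
A consumer-side path check might look like the sketch below. The function name `is_safe_package_path` is hypothetical. The ADR only names absolute paths and `..` traversal; rejecting backslashes, drive-style prefixes, NUL bytes, and empty or `.` segments is an additional, stricter assumption made here.

```python
def is_safe_package_path(path: str) -> bool:
    """Return True if path is a plain package-relative POSIX path.

    Hypothetical helper for illustration. Apply it to title annotations and
    to ZIP entry names before writing anything to disk.
    """
    if not path or "\x00" in path:
        return False
    # Absolute POSIX paths are rejected, as the ADR requires.
    if path.startswith("/"):
        return False
    # Assumption: also reject backslashes and Windows drive prefixes like "C:".
    if "\\" in path or (len(path) > 1 and path[1] == ":"):
        return False
    # ".." is rejected per the ADR; empty and "." segments are rejected as well.
    return all(part not in ("", ".", "..") for part in path.split("/"))
```

For example, `is_safe_package_path("assets/scene/B04.tif")` is accepted, while `"/etc/passwd"` and `"assets/../../secret"` are rejected.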

Package files must not contain credentials. Source and target access stays in the runtime environment: cloud profiles, environment variables, workload identity, mounted secrets, or provider-specific runtime configuration.

Alternatives Considered

  • STAC Catalog directory only: Familiar to STAC users, but it does not provide an asset-lock table or a package-level table layout.
  • Manifest-only package: Small and easy to inspect, but it pushes all item and asset-lock data into JSON and loses the efficient table files.
  • Package metadata JSON as OCI source of truth: Explicit, but it duplicates data already present in fixed filenames, Parquet schema metadata, OCI descriptors, descriptor annotations, and ZIP entries.
  • Store credentials in the package: Packages could be self-contained for immediate access, but this is unsafe to share. Authentication belongs to the runtime environment.

Consequences

The package is easy to inspect, archive, hash, and move through OCI or other package stores. Consumers can identify the required tables from fixed local filenames or OCI layer media types without guessing.

OCI artifacts expose the package shape before pull: readers can see whether the artifact has item metadata, an asset lock, optional file archives, and copied asset layers from descriptor media types and annotations.
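
Reading the package shape from a manifest needs nothing beyond the layer descriptors. The sketch below assumes the manifest has already been fetched and parsed into a dictionary; the function name `package_shape` and the keys of the returned summary are hypothetical, while the media types are the ones listed in this ADR.

```python
# Layer media types listed in this ADR.
ITEMS_TYPE = "application/vnd.stacpkg.items.v1.parquet"
ASSET_LOCK_TYPE = "application/vnd.stacpkg.asset-lock.v1.parquet"
FILES_ZIP_TYPE = "application/vnd.stacpkg.files.v1+zip"
ASSET_TYPES = (
    "application/vnd.stacpkg.asset.v1",
    "application/vnd.stacpkg.asset.v1+zip",
)


def package_shape(manifest: dict) -> dict:
    """Summarize a package from layer media types, without pulling any blob.

    Hypothetical helper for illustration; manifest is a parsed OCI manifest.
    """
    media_types = [layer.get("mediaType") for layer in manifest.get("layers", [])]
    return {
        "has_items": ITEMS_TYPE in media_types,
        "has_asset_lock": ASSET_LOCK_TYPE in media_types,
        "file_archives": media_types.count(FILES_ZIP_TYPE),
        "asset_layers": sum(1 for mt in media_types if mt in ASSET_TYPES),
    }
```

A manifest with only the two required table layers would report both tables as present and zero file archives and asset layers, which tells a reader that the package references its assets rather than carrying their bytes.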

The package does not include credentials. Workflows that need source or target access must provide credentials at execution time.