
Why one bucket per tenant

Customer datasets are sensitive and must remain strictly isolated.
This design uses one private bucket per tenant to reduce cross-tenant blast radius and make access boundaries explicit for security reviews and audits.
Bucket naming convention: <app>-<env>-tenant-<tenantSlug>
Example: eomer-production-tenant-acme
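The naming convention can be expressed as a small server-side helper. This is a minimal sketch, not the project's actual code: the function name tenant_bucket_name and the slug validation rule are assumptions made for illustration.

```python
import re

# Assumed slug rule: lowercase letters, digits, and inner hyphens only.
# The project's real validation may differ.
_SLUG_RE = re.compile(r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$")


def tenant_bucket_name(app: str, env: str, tenant_slug: str) -> str:
    """Build <app>-<env>-tenant-<tenantSlug> from server-side values only."""
    for part in (app, env, tenant_slug):
        if not _SLUG_RE.match(part):
            raise ValueError(f"invalid bucket name component: {part!r}")
    name = f"{app}-{env}-tenant-{tenant_slug}"
    # S3-compatible bucket names are limited to 3-63 characters.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name length out of range: {name!r}")
    return name


print(tenant_bucket_name("eomer", "production", "acme"))
```

Rejecting unexpected characters up front keeps a malformed tenant slug from ever producing a name that points at another tenant's bucket.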

Isolation model

The API enforces tenant isolation at the control-plane level:
  1. Client authenticates with a Bearer API key.
  2. Backend resolves API key -> tenant slug from EOMER_API_KEY_TENANT_MAP_JSON.
  3. Backend computes tenant bucket name server-side.
  4. Backend computes object key server-side (clients cannot choose the bucket or the key).
  5. Backend issues short-lived presigned URL.
  6. Browser uploads directly to R2 (no long-lived credentials in frontend).
  7. Backend persists object metadata and verifies ownership on all follow-up operations.
Cross-tenant lookups are denied and return 404 to avoid leaking object existence.
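The steps above can be sketched in a few functions. This is an illustration under stated assumptions: the map is assumed to be a JSON object of the form {"<api key>": "<tenant slug>"}, the key layout <category>/<object_id>/<filename> is hypothetical, and the function names do not come from the codebase.

```python
import hmac
import json
from typing import Optional

ALLOWED_CATEGORIES = {"raw", "processed", "exports", "tmp"}


def resolve_tenant(api_key: str, tenant_map_json: str) -> Optional[str]:
    """Return the tenant slug for an API key, or None if the key is unknown."""
    tenant_map = json.loads(tenant_map_json)  # assumed shape: {"api-key": "slug"}
    match = None
    for known_key, slug in tenant_map.items():
        # Constant-time comparison avoids leaking key prefixes through timing.
        if hmac.compare_digest(known_key.encode(), api_key.encode()):
            match = slug
    return match


def build_object_key(category: str, object_id: str, filename: str) -> str:
    """Compute the object key server-side (hypothetical layout)."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    # Keep only the final path component so a client-supplied filename
    # cannot inject extra prefixes into the key.
    safe_name = filename.replace("\\", "/").rsplit("/", 1)[-1] or "upload"
    return f"{category}/{object_id}/{safe_name}"


def lookup_owned(metadata: dict, object_id: str, tenant: str) -> Optional[dict]:
    """Return the record only if it belongs to the tenant.

    Missing and foreign objects both yield None, so the caller can answer
    404 in either case without revealing that the object exists.
    """
    record = metadata.get(object_id)
    if record is None or record["tenant"] != tenant:
        return None
    return record
```

Treating "not found" and "not yours" identically is what makes the 404 response safe: a caller probing object IDs learns nothing about other tenants.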

API endpoints

POST /storage/uploads/presign

Returns a short-lived presigned PUT URL. Input includes:
  • filename
  • content_type
  • file_size (optional)
  • category (raw, processed, exports, tmp)
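A possible shape for validating this input before a URL is issued is sketched below. The PresignRequest class, the validate_presign function, and the error strings are hypothetical; the two optional limits correspond to EOMER_STORAGE_ALLOWED_CONTENT_TYPES and EOMER_STORAGE_MAX_UPLOAD_BYTES.

```python
from dataclasses import dataclass
from typing import Collection, List, Optional

CATEGORIES = ("raw", "processed", "exports", "tmp")


@dataclass(frozen=True)
class PresignRequest:
    filename: str
    content_type: str
    category: str
    file_size: Optional[int] = None  # optional in the API


def validate_presign(
    req: PresignRequest,
    allowed_content_types: Optional[Collection[str]] = None,
    max_upload_bytes: Optional[int] = None,
) -> List[str]:
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors: List[str] = []
    if not req.filename.strip():
        errors.append("filename is required")
    if req.category not in CATEGORIES:
        errors.append(f"category must be one of {', '.join(CATEGORIES)}")
    if allowed_content_types is not None and req.content_type not in allowed_content_types:
        errors.append(f"content_type {req.content_type!r} is not allowed")
    if req.file_size is not None:
        if req.file_size <= 0:
            errors.append("file_size must be positive")
        elif max_upload_bytes is not None and req.file_size > max_upload_bytes:
            errors.append("file_size exceeds the configured upload limit")
    return errors
```

Note that this check only sees the size the client declares; the bytes actually uploaded can be verified later at completion time.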

POST /storage/uploads/{object_id}/complete

Marks upload as complete and optionally verifies object existence/size via HeadObject.
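The completion check might look like the sketch below. The storage call is injected as a plain callable so the example runs without credentials; with boto3 it would typically wrap client.head_object(Bucket=..., Key=...) and read ContentLength from the response. The function name and the status strings are assumptions, not the API's actual values.

```python
from typing import Callable, Optional

# A callable that returns the stored object's size in bytes,
# or None when the object does not exist.
HeadFn = Callable[[str, str], Optional[int]]


def verify_upload(
    head_object: HeadFn,
    bucket: str,
    key: str,
    declared_size: Optional[int],
) -> str:
    """Return a hypothetical status: 'complete', 'missing', or 'size_mismatch'."""
    actual_size = head_object(bucket, key)
    if actual_size is None:
        return "missing"
    # Only compare sizes when the client declared one at presign time.
    if declared_size is not None and actual_size != declared_size:
        return "size_mismatch"
    return "complete"
```

For example, verify_upload(lambda bucket, key: 1234, "eomer-production-tenant-acme", "raw/o1/data.csv", 1234) returns "complete".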

GET /storage/objects/{object_id}/download-url

Returns a short-lived presigned GET URL for the tenant-owned object.

GET /storage/objects

Lists tenant-owned uploaded object metadata (object ID, filename, content type, size, status, timestamps).

GET /storage/objects/{object_id}/preview

Returns a bounded CSV preview payload for a tenant-owned uploaded object:
  • columns
  • sample rows
  • detected timestamp/target/item_id columns
  • detected frequency (when inferable)
Preview is restricted to text/csv objects, and read caps (EOMER_STORAGE_PREVIEW_MAX_BYTES, EOMER_STORAGE_PREVIEW_MAX_ROWS) are enforced server-side.
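A bounded preview can be sketched with the standard csv module. This is not the service's implementation: the function name and the returned dictionary keys are assumptions, column and frequency detection are omitted, and a real backend would more likely issue a ranged read than slice a full payload.

```python
import csv
import io


def csv_preview(data: bytes, max_bytes: int = 262144, max_rows: int = 100) -> dict:
    """Parse at most max_bytes of CSV data and return at most max_rows sample rows."""
    byte_capped = len(data) > max_bytes
    text = data[:max_bytes].decode("utf-8", errors="replace")
    if byte_capped:
        # The byte cap may cut a line in half; drop the trailing partial line.
        text = text.rsplit("\n", 1)[0] if "\n" in text else ""
    reader = csv.reader(io.StringIO(text))
    columns = next(reader, [])  # first line is treated as the header
    rows = []
    for row in reader:
        if len(rows) >= max_rows:
            break
        rows.append(row)
    # "truncated" reports only whether the byte cap was hit.
    return {"columns": columns, "rows": rows, "truncated": byte_capped}
```

Applying the byte cap before parsing keeps memory and CPU use bounded regardless of how large the stored object is.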

POST /forecast/storage-object

Submits a forecast job using a tenant-owned object_id instead of multipart upload.
The backend resolves bucket/key server-side, reads the object, and reuses the normal forecast execution path (local or Anyscale offload).

Required environment variables

Storage and tenant mapping:
  • EOMER_API_KEY_TENANT_MAP_JSON
  • EOMER_STORAGE_PROVIDER (r2)
  • EOMER_STORAGE_APP_NAME
  • EOMER_STORAGE_APP_ENV
  • EOMER_STORAGE_PRESIGN_TTL_SECONDS
  • EOMER_STORAGE_ALLOWED_CONTENT_TYPES (optional)
  • EOMER_STORAGE_MAX_UPLOAD_BYTES (optional)
  • EOMER_STORAGE_PREVIEW_MAX_BYTES (optional, default 262144)
  • EOMER_STORAGE_PREVIEW_MAX_ROWS (optional, default 100)
Cloudflare R2:
  • EOMER_R2_ACCOUNT_ID
  • EOMER_R2_ACCESS_KEY_ID
  • EOMER_R2_SECRET_ACCESS_KEY
  • EOMER_R2_REGION (typically auto)
  • EOMER_R2_ENDPOINT_URL (optional override)
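One way to read these variables at startup is sketched below. The StorageSettings class and load_settings function are hypothetical, and credentials are deliberately left out of the example. The preview defaults come from the list above; the "auto" fallback for the region and the derived endpoint https://<account_id>.r2.cloudflarestorage.com (the standard R2 S3 endpoint form) are assumptions about how the optional override might be resolved.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageSettings:
    app_name: str
    app_env: str
    presign_ttl_seconds: int
    max_upload_bytes: Optional[int]
    preview_max_bytes: int
    preview_max_rows: int
    r2_region: str
    r2_endpoint_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> StorageSettings:
    """Load storage settings; a missing required variable raises KeyError."""
    if env["EOMER_STORAGE_PROVIDER"] != "r2":
        raise ValueError("only the r2 provider is supported in this sketch")
    account_id = env["EOMER_R2_ACCOUNT_ID"]
    max_upload = env.get("EOMER_STORAGE_MAX_UPLOAD_BYTES")
    return StorageSettings(
        app_name=env["EOMER_STORAGE_APP_NAME"],
        app_env=env["EOMER_STORAGE_APP_ENV"],
        presign_ttl_seconds=int(env["EOMER_STORAGE_PRESIGN_TTL_SECONDS"]),
        max_upload_bytes=int(max_upload) if max_upload else None,
        preview_max_bytes=int(env.get("EOMER_STORAGE_PREVIEW_MAX_BYTES", "262144")),
        preview_max_rows=int(env.get("EOMER_STORAGE_PREVIEW_MAX_ROWS", "100")),
        r2_region=env.get("EOMER_R2_REGION", "auto"),  # assumed fallback
        # Use the override when set, otherwise derive the account endpoint.
        r2_endpoint_url=env.get("EOMER_R2_ENDPOINT_URL")
        or f"https://{account_id}.r2.cloudflarestorage.com",
    )
```

Failing fast on a missing required variable surfaces misconfiguration at deploy time rather than on the first upload.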

Cloudflare setup (manual)

  1. Create one private bucket per tenant using the naming convention above.
  2. Create an R2 access key with least privilege for required bucket operations:
    • PutObject
    • GetObject
    • HeadObject
  3. Set the API environment variables in the deployment platform.
  4. Set EOMER_API_KEY_TENANT_MAP_JSON for all keys that should access storage.

Local testing

  1. Set env vars from .env.example (including a test key -> tenant map entry).
  2. Start API:
uvicorn eomer_forecasting.api.app:app --host 0.0.0.0 --port 8000
  3. Request presign URL:
curl -X POST http://localhost:8000/storage/uploads/presign \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename":"data.csv","content_type":"text/csv","file_size":1234,"category":"raw"}'
  4. Upload file directly to returned upload_url with PUT.
  5. Call POST /storage/uploads/{object_id}/complete.
  6. Call GET /storage/objects/{object_id}/download-url.
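The same flow can be scripted with the Python standard library. This is a sketch under assumptions: the presign response is assumed to be JSON containing upload_url (named in the steps above) and an object_id field, which is a guess at the field name; the helper functions are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def _api_request(api_key: str, method: str, path: str, body=None) -> urllib.request.Request:
    """Build an authenticated request against the local API."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode()
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{BASE_URL}{path}", data=data, method=method, headers=headers)


def presign_request(api_key, filename, content_type, file_size, category="raw"):
    body = {"filename": filename, "content_type": content_type,
            "file_size": file_size, "category": category}
    return _api_request(api_key, "POST", "/storage/uploads/presign", body)


def upload_request(upload_url: str, payload: bytes, content_type: str):
    # The presigned URL carries its own authorization; no API key is sent.
    return urllib.request.Request(upload_url, data=payload, method="PUT",
                                  headers={"Content-Type": content_type})


def complete_request(api_key, object_id):
    return _api_request(api_key, "POST", f"/storage/uploads/{object_id}/complete")


def download_url_request(api_key, object_id):
    return _api_request(api_key, "GET", f"/storage/objects/{object_id}/download-url")


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode a JSON body (empty bodies become {})."""
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def run_flow(api_key: str, payload: bytes) -> dict:
    """Presign, upload, complete, then fetch a download URL."""
    presigned = send(presign_request(api_key, "data.csv", "text/csv", len(payload)))
    send(upload_request(presigned["upload_url"], payload, "text/csv"))
    object_id = presigned["object_id"]  # assumed response field name
    send(complete_request(api_key, object_id))
    return send(download_url_request(api_key, object_id))
```

With the API running locally, run_flow("YOUR_API_KEY", b"timestamp,target\n2024-01-01,1\n") exercises all four calls in order.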

Automated backend checks

Run the complete backend verification suite:
scripts/run_backend_storage_checks.sh
The script auto-loads .env from repo root (if present) before running checks.
Optional live smoke-test against deployed API:
# Set in .env (or export in shell):
# EOMER_SMOKE_API_URL="https://your-api.example.com"
# EOMER_SMOKE_API_KEY="your-bearer-key"
scripts/run_backend_storage_checks.sh
Direct smoke-test only:
python scripts/storage_smoke_test.py --url "$EOMER_SMOKE_API_URL" --key "$EOMER_SMOKE_API_KEY"

Portal integration flow (www.eomer.ai)

  1. Portal server authenticates the user and resolves the active organization.
  2. Portal maps organizationId -> forecasting API key via FORECAST_STORAGE_ORG_API_KEYS_JSON.
  3. Portal requests POST /storage/uploads/presign.
  4. Browser uploads CSV directly to R2 using presigned PUT.
  5. Portal finalizes with POST /storage/uploads/{object_id}/complete.
  6. Portal stores dataset metadata (Dataset/DatasetVersion) keyed by storage-object:{object_id}.
  7. For “Connect to database”, portal calls GET /storage/objects and GET /storage/objects/{object_id}/preview.
  8. Portal submits jobs with POST /forecast/storage-object.

Portability and migration optionality

The API uses an internal object storage interface and an S3-compatible adapter.
Cloudflare R2 is currently implemented, but migration to AWS S3 or another S3-compatible provider should only require adapter/config changes, not route contract changes.
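The separation between route code and provider could look roughly like the sketch below. The ObjectStorage protocol, its method names, and the in-memory adapter are illustrative assumptions, not the project's actual interface. An S3-compatible adapter would implement the same methods, for example by calling boto3's generate_presigned_url and head_object on a client configured with the provider's endpoint.

```python
from typing import Dict, Optional, Protocol, Tuple


class ObjectStorage(Protocol):
    """Hypothetical storage interface that route handlers depend on."""

    def presign_put(self, bucket: str, key: str, content_type: str, ttl_seconds: int) -> str: ...

    def presign_get(self, bucket: str, key: str, ttl_seconds: int) -> str: ...

    def head_size(self, bucket: str, key: str) -> Optional[int]: ...


class InMemoryStorage:
    """Test adapter: keeps objects in a dict and returns fake memory:// URLs."""

    def __init__(self) -> None:
        self.objects: Dict[Tuple[str, str], bytes] = {}

    def presign_put(self, bucket: str, key: str, content_type: str, ttl_seconds: int) -> str:
        return f"memory://{bucket}/{key}?op=put&ttl={ttl_seconds}"

    def presign_get(self, bucket: str, key: str, ttl_seconds: int) -> str:
        return f"memory://{bucket}/{key}?op=get&ttl={ttl_seconds}"

    def head_size(self, bucket: str, key: str) -> Optional[int]:
        data = self.objects.get((bucket, key))
        return None if data is None else len(data)


def issue_upload_url(storage: ObjectStorage, bucket: str, key: str, ttl_seconds: int) -> str:
    # Route-level code sees only the interface, so swapping R2 for another
    # S3-compatible provider would not change this function.
    return storage.presign_put(bucket, key, "text/csv", ttl_seconds)
```

Because the protocol is structural, any adapter with matching method signatures satisfies it without inheriting from a shared base class.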

Operational notes and current limitations

  • Runtime does not auto-create buckets; provisioning is an explicit ops step.
  • Upload size limits are checked against the declared file_size at presign request time; the API can also verify the actual object size during completion.
  • Metadata is stored in Redis (with in-memory fallback in local/test mode).
  • Current upload path categories are fixed: raw/, processed/, exports/, tmp/.