Tenant-Isolated Storage (Cloudflare R2) - eomer.ai Forecasting API

Why one bucket per tenant

Customer datasets are sensitive and must remain strictly isolated.
This design uses one private bucket per tenant to reduce cross-tenant blast radius and make access boundaries explicit for security reviews and audits.

Bucket naming convention: <app>-<env>-tenant-<tenantSlug>
Example: eomer-production-tenant-acme
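As an illustration of the naming convention, the bucket name could be derived along these lines. This is a minimal sketch, not the API's actual code: the function name `tenant_bucket_name` and the slug validation rule are assumptions made for the example.

```python
import re

# Hypothetical validation rule: lowercase letters and digits, hyphen-separated.
_PART_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def tenant_bucket_name(app: str, env: str, tenant_slug: str) -> str:
    """Build '<app>-<env>-tenant-<tenantSlug>' from trusted server-side values."""
    for part in (app, env, tenant_slug):
        if not _PART_RE.fullmatch(part):
            raise ValueError(f"invalid bucket name component: {part!r}")
    name = f"{app}-{env}-tenant-{tenant_slug}"
    # S3-compatible bucket names must be between 3 and 63 characters long.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name has invalid length: {name!r}")
    return name


print(tenant_bucket_name("eomer", "production", "acme"))
# eomer-production-tenant-acme
```

Rejecting unexpected characters up front keeps a malformed slug from ever reaching the storage layer.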

Isolation model

The API enforces tenant isolation at the control-plane level:

Client authenticates with a Bearer API key.
Backend resolves API key -> tenant slug from EOMER_API_KEY_TENANT_MAP_JSON.
Backend computes tenant bucket name server-side.
Backend computes object key server-side (clients cannot choose bucket/key authority).
Backend issues short-lived presigned URL.
Browser uploads directly to R2 (no long-lived credentials in frontend).
Backend persists object metadata and verifies ownership on all follow-up operations.

Cross-tenant lookups are denied and return 404 to avoid leaking object existence.
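The key-to-tenant resolution and the ownership check can be sketched as follows. The JSON shape of EOMER_API_KEY_TENANT_MAP_JSON (a flat object mapping API key to tenant slug) is an assumption, and `ObjectRecord`, `NotFound`, `resolve_tenant`, and `get_owned_object` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ObjectRecord:
    object_id: str
    tenant_slug: str
    key: str


class NotFound(Exception):
    """Would be translated to HTTP 404 by the route layer."""


def resolve_tenant(api_key: str, tenant_map_json: str) -> Optional[str]:
    # Assumed format: {"<api key>": "<tenant slug>", ...}
    mapping = json.loads(tenant_map_json)
    return mapping.get(api_key)


def get_owned_object(
    records: Dict[str, ObjectRecord], object_id: str, tenant_slug: str
) -> ObjectRecord:
    record = records.get(object_id)
    # Unknown IDs and objects owned by another tenant raise the same error,
    # so a caller cannot tell whether a foreign object exists.
    if record is None or record.tenant_slug != tenant_slug:
        raise NotFound(object_id)
    return record
```

Treating "missing" and "not yours" identically is what makes the 404 response safe against existence probing.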

API endpoints

`POST /storage/uploads/presign`

Returns a short-lived presigned PUT URL. Input includes:

filename
content_type
file_size (optional)
category (raw, processed, exports, tmp)
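Because the backend, not the client, decides the object key, the presign handler might turn these inputs into a key roughly like this. The key layout `<category>/<object_id>/<filename>` and the helper name `build_object_key` are assumptions for the sketch; only the four category names come from this document.

```python
import uuid
from pathlib import PurePosixPath
from typing import Tuple

ALLOWED_CATEGORIES = {"raw", "processed", "exports", "tmp"}


def build_object_key(category: str, filename: str) -> Tuple[str, str]:
    """Return (object_id, object_key) computed entirely server-side."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unsupported category: {category!r}")
    # Keep only the final path component so the client-supplied name
    # cannot move the object outside its category prefix.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if safe_name in {"", ".", ".."}:
        raise ValueError(f"invalid filename: {filename!r}")
    object_id = uuid.uuid4().hex
    return object_id, f"{category}/{object_id}/{safe_name}"
```

For example, `build_object_key("raw", "../../data.csv")` yields a key of the form `raw/<object_id>/data.csv`.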

`POST /storage/uploads/{object_id}/complete`

Marks the upload as complete and optionally verifies the object's existence and size via HeadObject.
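The optional verification step could look like the sketch below. The status strings and the injected `head_size` callable (standing in for a HeadObject call that returns the stored size, or None when the object is absent) are hypothetical.

```python
from typing import Callable, Optional


def complete_upload(
    expected_size: Optional[int],
    head_size: Callable[[], Optional[int]],
) -> str:
    """Return a hypothetical status string for the upload record."""
    actual = head_size()  # None means the object was not found in the bucket
    if actual is None:
        return "missing"
    # Only compare sizes when the client declared file_size at presign time.
    if expected_size is not None and actual != expected_size:
        return "size_mismatch"
    return "complete"
```

Injecting the lookup as a callable keeps the decision logic testable without a live bucket.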

`GET /storage/objects/{object_id}/download-url`

Returns a short-lived presigned GET URL for the tenant-owned object.

`GET /storage/objects`

Lists tenant-owned uploaded object metadata (object ID, filename, content type, size, status, timestamps).

`GET /storage/objects/{object_id}/preview`

Returns a bounded CSV preview payload for a tenant-owned uploaded object:

columns
sample rows
detected timestamp/target/item_id columns
detected frequency (when inferable)

Preview is restricted to text/csv objects and read caps are enforced server-side.
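A bounded preview of this kind can be sketched with the standard `csv` module. The function name `build_preview` and the `truncated` field are assumptions; the default caps mirror EOMER_STORAGE_PREVIEW_MAX_BYTES and EOMER_STORAGE_PREVIEW_MAX_ROWS. Column and frequency detection are left out.

```python
import csv
import io
from typing import Any, Dict, List


def build_preview(
    raw: bytes, max_bytes: int = 262144, max_rows: int = 100
) -> Dict[str, Any]:
    """Parse at most max_bytes of CSV data and return at most max_rows rows."""
    truncated = len(raw) > max_bytes
    chunk = raw[:max_bytes]
    if truncated:
        # Drop the trailing partial line left by the byte cut.
        cut = chunk.rfind(b"\n")
        chunk = chunk[: cut + 1] if cut != -1 else b""
    text = chunk.decode("utf-8", errors="replace")
    reader = csv.reader(io.StringIO(text, newline=""))
    columns: List[str] = next(reader, [])
    rows: List[List[str]] = []
    for row in reader:
        if len(rows) >= max_rows:
            truncated = True  # more rows exist than the cap allows
            break
        rows.append(row)
    return {"columns": columns, "rows": rows, "truncated": truncated}
```

In a real deployment the byte cap would more likely be applied when reading from storage (for example with a ranged read) so that large objects are never fully loaded; here the slice stands in for that.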

`POST /forecast/storage-object`

Submits a forecast job using a tenant-owned object_id instead of multipart upload.
The backend resolves bucket/key server-side, reads the object, and reuses the normal forecast execution path (local or Anyscale offload).

Required environment variables

Storage and tenant mapping:

EOMER_API_KEY_TENANT_MAP_JSON
EOMER_STORAGE_PROVIDER (r2)
EOMER_STORAGE_APP_NAME
EOMER_STORAGE_APP_ENV
EOMER_STORAGE_PRESIGN_TTL_SECONDS
EOMER_STORAGE_ALLOWED_CONTENT_TYPES (optional)
EOMER_STORAGE_MAX_UPLOAD_BYTES (optional)
EOMER_STORAGE_PREVIEW_MAX_BYTES (optional, default 262144)
EOMER_STORAGE_PREVIEW_MAX_ROWS (optional, default 100)

Cloudflare R2:

EOMER_R2_ACCOUNT_ID
EOMER_R2_ACCESS_KEY_ID
EOMER_R2_SECRET_ACCESS_KEY
EOMER_R2_REGION (typically auto)
EOMER_R2_ENDPOINT_URL (optional override)
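For orientation, these variables might be loaded into a settings object as follows. This is a sketch: `StorageSettings` and `load_storage_settings` are hypothetical names, only a subset of the variables is shown, and deriving the endpoint from the account ID when no override is set is an assumption about the application (the URL pattern itself is the standard R2 S3 endpoint).

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StorageSettings:
    app_name: str
    app_env: str
    presign_ttl_seconds: int
    preview_max_bytes: int
    preview_max_rows: int
    r2_region: str
    r2_endpoint_url: str


def load_storage_settings(env: Mapping[str, str]) -> StorageSettings:
    account_id = env["EOMER_R2_ACCOUNT_ID"]
    # Assumption: fall back to the account-scoped R2 endpoint if no override is set.
    endpoint = env.get("EOMER_R2_ENDPOINT_URL") or (
        f"https://{account_id}.r2.cloudflarestorage.com"
    )
    return StorageSettings(
        app_name=env["EOMER_STORAGE_APP_NAME"],
        app_env=env["EOMER_STORAGE_APP_ENV"],
        presign_ttl_seconds=int(env["EOMER_STORAGE_PRESIGN_TTL_SECONDS"]),
        preview_max_bytes=int(env.get("EOMER_STORAGE_PREVIEW_MAX_BYTES", "262144")),
        preview_max_rows=int(env.get("EOMER_STORAGE_PREVIEW_MAX_ROWS", "100")),
        r2_region=env.get("EOMER_R2_REGION", "auto"),
        r2_endpoint_url=endpoint,
    )
```

Calling `load_storage_settings(os.environ)` at startup makes a missing required variable fail fast with a `KeyError`.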

Cloudflare setup (manual)

Create one private bucket per tenant using the naming convention above.
Create an R2 access key with least privilege for required bucket operations:
- PutObject
- GetObject
- HeadObject
Set API environment variables in deployment platform.
Set EOMER_API_KEY_TENANT_MAP_JSON for all keys that should access storage.

Local testing

Set env vars from .env.example (including a test key -> tenant map entry).
Start API:

uvicorn eomer_forecasting.api.app:app --host 0.0.0.0 --port 8000

Request a presigned URL:

curl -X POST http://localhost:8000/storage/uploads/presign \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename":"data.csv","content_type":"text/csv","file_size":1234,"category":"raw"}'

Upload file directly to returned upload_url with PUT.
Call POST /storage/uploads/{object_id}/complete.
Call GET /storage/objects/{object_id}/download-url.
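The same four steps can be scripted with the standard library. This is a sketch under assumptions: the presign response is assumed to contain `object_id` alongside `upload_url`, the complete endpoint is assumed to accept an empty body, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def presign_request(base_url: str, api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    return urllib.request.Request(
        f"{base_url}/storage/uploads/presign",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    )


def upload_request(upload_url: str, data: bytes, content_type: str) -> urllib.request.Request:
    # The presigned URL carries its own authorization; no API key is sent to R2.
    return urllib.request.Request(
        upload_url, data=data, method="PUT", headers={"Content-Type": content_type}
    )


def api_request(base_url: str, api_key: str, method: str, path: str) -> urllib.request.Request:
    return urllib.request.Request(
        f"{base_url}{path}", method=method, headers={"Authorization": f"Bearer {api_key}"}
    )


def run_flow(base_url: str, api_key: str, csv_bytes: bytes) -> Dict[str, Any]:
    payload = {"filename": "data.csv", "content_type": "text/csv",
               "file_size": len(csv_bytes), "category": "raw"}
    with urllib.request.urlopen(presign_request(base_url, api_key, payload)) as resp:
        presign = json.load(resp)
    object_id = presign["object_id"]  # assumed response field
    with urllib.request.urlopen(upload_request(presign["upload_url"], csv_bytes, "text/csv")):
        pass
    complete = api_request(base_url, api_key, "POST", f"/storage/uploads/{object_id}/complete")
    with urllib.request.urlopen(complete):
        pass
    download = api_request(base_url, api_key, "GET", f"/storage/objects/{object_id}/download-url")
    with urllib.request.urlopen(download) as resp:
        return json.load(resp)
```

With the API running locally, `run_flow("http://localhost:8000", "YOUR_API_KEY", b"ts,y\n2024-01-01,1\n")` exercises the full upload path.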

Automated backend checks

Run the complete backend verification suite:

scripts/run_backend_storage_checks.sh

The script auto-loads .env from repo root (if present) before running checks.

Optional live smoke-test against deployed API:

# Set in .env (or export in shell):
# EOMER_SMOKE_API_URL="https://your-api.example.com"
# EOMER_SMOKE_API_KEY="your-bearer-key"
scripts/run_backend_storage_checks.sh

Direct smoke-test only:

python scripts/storage_smoke_test.py --url "$EOMER_SMOKE_API_URL" --key "$EOMER_SMOKE_API_KEY"

Portal integration flow (`www.eomer.ai`)

Portal server authenticates the user and resolves the active organization.
Portal maps organizationId -> forecasting API key via FORECAST_STORAGE_ORG_API_KEYS_JSON.
Portal requests POST /storage/uploads/presign.
Browser uploads CSV directly to R2 using presigned PUT.
Portal finalizes with POST /storage/uploads/{object_id}/complete.
Portal stores dataset metadata (Dataset/DatasetVersion) keyed by storage-object:{object_id}.
For “Connect to database”, portal calls GET /storage/objects and GET /storage/objects/{object_id}/preview.
Portal submits jobs with POST /forecast/storage-object.

Portability and migration optionality

The API uses an internal object storage interface and an S3-compatible adapter.
Cloudflare R2 is currently implemented, but migration to AWS S3 or another S3-compatible provider should only require adapter/config changes, not route contract changes.
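The interface-plus-adapter idea can be illustrated as below. This is not the API's actual interface: `ObjectStorage`, its method names, and the `InMemoryStorage` stand-in are hypothetical, chosen to show how route code can depend on a small contract rather than on a specific provider.

```python
from typing import Dict, Optional, Protocol, Tuple


class ObjectStorage(Protocol):
    """Hypothetical contract that route code would depend on."""

    def presign_put(self, bucket: str, key: str, ttl_seconds: int) -> str: ...
    def presign_get(self, bucket: str, key: str, ttl_seconds: int) -> str: ...
    def head_size(self, bucket: str, key: str) -> Optional[int]: ...


class InMemoryStorage:
    """Test adapter; an S3-compatible adapter would expose the same methods."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self._objects[(bucket, key)] = data

    def presign_put(self, bucket: str, key: str, ttl_seconds: int) -> str:
        return f"memory://{bucket}/{key}?op=put&ttl={ttl_seconds}"

    def presign_get(self, bucket: str, key: str, ttl_seconds: int) -> str:
        return f"memory://{bucket}/{key}?op=get&ttl={ttl_seconds}"

    def head_size(self, bucket: str, key: str) -> Optional[int]:
        data = self._objects.get((bucket, key))
        return None if data is None else len(data)


def download_url_for(
    storage: ObjectStorage, bucket: str, key: str, ttl_seconds: int
) -> Optional[str]:
    # Route-level logic sees only the interface, never the provider.
    if storage.head_size(bucket, key) is None:
        return None
    return storage.presign_get(bucket, key, ttl_seconds)
```

Swapping providers then means supplying a different object that satisfies `ObjectStorage`, while `download_url_for` and the route contracts stay unchanged.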

Operational notes and current limitations

Runtime does not auto-create buckets; provisioning is an explicit ops step.
Upload size limits are checked against the declared file_size when the presign request is made; the API can also verify the stored object's size during completion.
Metadata is stored in Redis (with in-memory fallback in local/test mode).
Current upload path categories are fixed: raw/, processed/, exports/, tmp/.
