The S3-compatible connector (AWS S3, Cloudflare R2, MinIO) is fully wired into the REST API: create a connection once, then test, validate, and forecast against it. Connection-management calls require a tenant identity (an API key with an organization, or an EOMER_API_KEY_TENANT_MAP_JSON mapping for local keys).

What a connector gives you

A data connection is a tenant-owned, reusable pointer to your data system, with credentials handled securely on eomer’s side. Once configured, eomer can:
  • Test connectivity and authentication (test_connection)
  • Discover objects under a prefix (list_objects) and estimate volume (estimate_size)
  • Infer schema from a bounded sample (infer_schema)
  • Validate the data as a forecasting dataset (validate_source)
  • Read the data for a forecast job (read_data)
  • Write forecast results back to a destination (write_output)

Credential model (IAM-first)

| Auth method | Secret stored? | Use when |
| --- | --- | --- |
| sts_assume_role | No | Recommended for AWS: you grant eomer’s AWS identity sts:AssumeRole on a read-only role (role_arn + external_id on the connection); eomer uses short-lived STS credentials. |
| iam_role, instance_profile, workload_identity | No | Same-account / workload-identity setups. |
| access_key | Yes, envelope-encrypted | Cloudflare R2, MinIO, or S3 without role access. |
| sas_token, service_account_json | Yes, envelope-encrypted | Reserved for Azure / GCS in a later phase. |

When a secret is stored, it is encrypted with a Fernet key from EOMER_CONNECTOR_ENCRYPTION_KEY before persistence. API responses never return secrets — only a redacted summary (auth_method, a masked hint, key version). See the package README.md for rotation and the KMS path.
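
For the sts_assume_role method, the read-only role on your side needs a trust policy that lets eomer’s AWS identity assume it, conditioned on the external ID. The sketch below builds the standard AWS trust-policy shape for that case. The principal ARN and external ID are placeholders (the real values are not given in this guide), and build_trust_policy is a hypothetical helper, not part of eomer.

```python
import json


def build_trust_policy(principal_arn: str, external_id: str) -> dict:
    """Return an IAM trust policy allowing one principal to assume the role.

    The sts:ExternalId condition is the usual AWS guard against the
    confused-deputy problem in cross-account access.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values: substitute the identity and external ID for your connection.
policy = build_trust_policy(
    principal_arn="<eomer-principal-arn>",
    external_id="<external-id>",
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Attach the resulting JSON as the role’s trust relationship, and keep the role’s permission policy limited to read access on the bucket and prefix you intend to forecast from.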

REST quickstart

BASE=https://api.eomer.ai   # or http://localhost:8000
AUTH="Authorization: Bearer $EOMER_API_KEY"

# 1. Create a connection (Cloudflare R2 — credentials are sealed at rest)
curl -s -X POST "$BASE/data-connections" -H "$AUTH" -H "Content-Type: application/json" -d '{
  "connector_type": "r2",
  "name": "sales-lake",
  "object_storage": {
    "provider": "cloudflare_r2",
    "bucket": "company-forecasting-data",
    "prefix": "sales/daily/",
    "endpoint_url": "https://<account_id>.r2.cloudflarestorage.com",
    "region": "auto",
    "auth_method": "access_key",
    "file_format": "csv",
    "credentials": {"access_key_id": "<r2-key-id>", "secret_access_key": "<r2-secret>"}
  }
}'
# → 201 {"connection_id": "conn_ab12cd34ef56", "status": "untested",
#        "credential": {"auth_method": "access_key", "hint": "***REDACTED...", ...}}

CONN=conn_ab12cd34ef56

# 2. Test connectivity (status is persisted on the connection)
curl -s -X POST "$BASE/data-connections/$CONN/test" -H "$AUTH"

# 3. Validate as a forecasting dataset
curl -s -X POST "$BASE/data-connections/$CONN/validate" -H "$AUTH" \
  -H "Content-Type: application/json" \
  -d '{"item_id_column": "store_id", "timestamp_column": "date", "target_column": "sales"}'

# 4. Submit a forecast straight from the connection
curl -s -X POST "$BASE/data-connections/$CONN/forecast" -H "$AUTH" \
  -H "Content-Type: application/json" \
  -d '{"prediction_length": 14, "presets": "small",
       "item_id_column": "store_id", "timestamp_column": "date", "target_column": "sales"}'
# → 202 {"job_id": "...", "status": "pending", ...} — poll GET /jobs/{job_id} as usual

# 5. Review the audit trail
curl -s "$BASE/data-connections/audit?connection_id=$CONN" -H "$AUTH"
Responses never contain credentials — only a redacted credential summary (auth method, masked key hint, key version).
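
Step 4 returns 202 and leaves polling GET /jobs/{job_id} to the caller. A minimal polling sketch in standard-library Python follows. It assumes the job payload carries a status field as in the 202 response; "failed" is a status this guide mentions, while "completed" is an assumed name for the success state. fetch_job and wait_for_job are hypothetical helpers, not eomer client functions.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed terminal statuses: "failed" appears in this guide, "completed" is a guess.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_job(base_url: str, api_key: str, job_id: str) -> dict:
    """GET /jobs/{job_id} with the same Bearer header as the curl examples."""
    request = urllib.request.Request(
        f"{base_url}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

A typical call would be wait_for_job(lambda: fetch_job(base_url, api_key, job_id)). Passing the fetch function in keeps the loop easy to exercise without a network connection.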

Configure an S3-compatible source (Python library)

AWS S3 (keyless)

from eomer_forecasting.connectors import get_enterprise_connector
from eomer_forecasting.connectors.types import ConnectorType, ConnectorProvider
from eomer_forecasting.connectors.credentials import AuthMethod
from eomer_forecasting.connectors.schemas import (
    DataConnectionSpec, ObjectStorageConnectionSpec, ForecastDatasetSpec,
)

spec = DataConnectionSpec(
    tenant_id="org_123",
    connector_type=ConnectorType.s3,
    name="sales-lake",
    object_storage=ObjectStorageConnectionSpec(
        provider=ConnectorProvider.aws_s3,
        bucket="company-forecasting-data",
        prefix="sales/daily/",
        region="us-east-1",
        auth_method=AuthMethod.iam_role,   # no secret stored
        file_format="parquet",
    ),
)

connector = get_enterprise_connector(ConnectorType.s3)
print(connector.test_connection(spec))

Cloudflare R2

R2 is S3-compatible, so it reuses the same connector — only the endpoint, region, and (sealed) access key differ:
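from eomer_forecasting.connectors.types import ConnectorProvider
from eomer_forecasting.connectors.credentials import SecretCipher, AuthMethod
from eomer_forecasting.connectors.schemas import ObjectStorageConnectionSpec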

cred = SecretCipher.from_env().seal(
    AuthMethod.access_key,
    {"aws_access_key_id": "<r2-key-id>", "aws_secret_access_key": "<r2-secret>"},
    hint_field="aws_access_key_id",
)

os_spec = ObjectStorageConnectionSpec(
    provider=ConnectorProvider.cloudflare_r2,
    bucket="company-forecasting-data",
    prefix="sales/daily/",
    endpoint_url="https://<account_id>.r2.cloudflarestorage.com",
    region="auto",
    auth_method=AuthMethod.access_key,
    credential=cred,
    file_format="parquet",
)

Validate before you forecast

dataset = ForecastDatasetSpec(
    item_id_column="store_id",
    timestamp_column="date",
    target_column="sales",
    known_covariates_columns=["price", "promotion_flag"],
)
report = connector.validate_source(spec, dataset)
if not report.is_valid:
    print(report.errors, report.error_codes)
print("frequency:", report.inferred_frequency, "rows:", report.row_count)
validate_source reuses eomer’s core validator (required columns, null checks, numeric target, duplicate entity-time rows, parseable timestamps, minimum history) and adds frequency inference, schema-drift detection, and a volume guardrail.
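
To make two of those checks concrete, the sketch below shows duplicate entity-time detection and a simple frequency inference in standard-library Python. It is an illustration only and not eomer’s validator: the function names are hypothetical, and the frequency is taken here as the most common gap between consecutive timestamps within each item, which may differ from how eomer infers it.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from typing import Optional


def find_duplicate_keys(rows, item_col, ts_col):
    """Return (item, timestamp) pairs that occur more than once."""
    counts = Counter((row[item_col], row[ts_col]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def infer_step(rows, item_col, ts_col) -> Optional[timedelta]:
    """Return the most common gap between consecutive timestamps per item."""
    by_item = defaultdict(list)
    for row in rows:
        by_item[row[item_col]].append(datetime.fromisoformat(row[ts_col]))
    gaps = Counter()
    for stamps in by_item.values():
        ordered = sorted(set(stamps))  # ignore duplicates when measuring gaps
        gaps.update(b - a for a, b in zip(ordered, ordered[1:]))
    return gaps.most_common(1)[0][0] if gaps else None


rows = [
    {"store_id": "s1", "date": "2026-01-01", "sales": 10},
    {"store_id": "s1", "date": "2026-01-02", "sales": 12},
    {"store_id": "s1", "date": "2026-01-02", "sales": 12},  # duplicate entity-time row
    {"store_id": "s1", "date": "2026-01-03", "sales": 9},
    {"store_id": "s2", "date": "2026-01-01", "sales": 4},
    {"store_id": "s2", "date": "2026-01-02", "sales": 5},
]

duplicates = find_duplicate_keys(rows, "store_id", "date")
step = infer_step(rows, "store_id", "date")
print(duplicates, step)
```

Running a check like this on a local sample before creating the connection can surface duplicate rows early, before the validation report does.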

Deliver results back to your system

Add an output block to the forecast request and eomer writes the completed forecast straight back to your bucket — closing the loop with no polling or manual download:
curl -s -X POST "$BASE/data-connections/$CONN/forecast" -H "$AUTH" \
  -H "Content-Type: application/json" -d '{
    "prediction_length": 14, "presets": "small",
    "item_id_column": "store_id", "timestamp_column": "date", "target_column": "sales",
    "output": {
      "key_template": "eomer-forecasts/{date}/{job_id}.csv",
      "file_format": "csv",
      "allow_overwrite": false
    }
  }'
  • output.connection_id may point at a different connection (it must belong to your tenant); omitted = write back to the source connection.
  • key_template supports {job_id} and {date} placeholders; the default is eomer-forecasts/{date}/{job_id}.csv. Formats: csv, json, jsonl, parquet.
  • Delivery never overwrites existing objects unless allow_overwrite: true.
  • Track it on the job: GET /jobs/{job_id} returns delivery_status (pending → delivering → delivered | failed | skipped), delivery_uri, and a sanitized delivery_error. A failed delivery never fails the forecast — the result stays downloadable and the attempt is recorded in the audit trail (action: "deliver").
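
The key_template placeholders can be pictured as plain string substitution. The sketch below is a hypothetical render_key helper, not eomer’s implementation; it assumes {date} renders as an ISO 8601 date (YYYY-MM-DD), which this guide does not confirm, and it rejects any placeholder other than the two documented ones.

```python
import string
from datetime import date

ALLOWED_FIELDS = {"job_id", "date"}
DEFAULT_TEMPLATE = "eomer-forecasts/{date}/{job_id}.csv"


def render_key(template: str, job_id: str, run_date: date) -> str:
    """Fill {job_id} and {date}; reject any other placeholder."""
    fields = {
        name
        for _, name, _, _ in string.Formatter().parse(template)
        if name is not None
    }
    unknown = fields - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unsupported placeholders: {sorted(unknown)}")
    # Assumption: {date} is rendered as YYYY-MM-DD; the service may format it differently.
    return template.format(job_id=job_id, date=run_date.isoformat())


example_key = render_key(DEFAULT_TEMPLATE, "job_123", date(2026, 1, 15))
print(example_key)
```

Because the job ID is part of the default template, each job writes to a distinct key, which fits the rule that delivery does not overwrite existing objects by default.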

Partitioned data, globs, and compression

  • A trailing-slash prefix reads and concatenates all objects under it; add key_pattern (e.g. "part-*.csv", "2026-*/part-*.parquet") to filter.
  • Set compression: "gzip" on the connection for gzip-compressed CSV/JSON objects (zstd is supported with the optional zstandard package); Parquet’s internal codec needs no configuration.
  • Forecast submission returns 202 immediately — the connector read happens inside the job worker, and a failed read marks the job failed with the sanitized connector error (visible on GET /jobs/{job_id} and in the audit trail).
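
To preview which objects a prefix plus key_pattern would select, a rough local approximation is shell-style glob matching. The sketch below uses Python’s fnmatch and rests on two assumptions that this guide does not state: that the pattern is matched against the part of the key after the prefix, and that * may also match "/" (as it does in fnmatch). filter_keys is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_keys(keys, prefix, key_pattern=None):
    """Keep keys under prefix whose remainder matches key_pattern (if given)."""
    selected = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        if key_pattern is None or fnmatchcase(relative, key_pattern):
            selected.append(key)
    return selected


keys = [
    "sales/daily/part-0001.csv",
    "sales/daily/part-0002.csv",
    "sales/daily/_SUCCESS",
    "sales/daily/2026-01/part-0001.parquet",
    "sales/weekly/part-0001.csv",
]

csv_parts = filter_keys(keys, "sales/daily/", "part-*.csv")
monthly_parquet = filter_keys(keys, "sales/daily/", "2026-*/part-*.parquet")
print(csv_parts, monthly_parquet)
```

Without a key_pattern, marker files such as _SUCCESS would be included along with the data files, which is one reason to set a pattern on partitioned prefixes.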

Safety guarantees

  • Reads are byte- and row-bounded (max_bytes / max_rows): oversized sources raise DATASET_TOO_LARGE instead of exhausting memory.
  • Object keys are sanitized against path traversal and control characters.
  • Format magic bytes are checked (a Parquet file declared as CSV fails fast).
  • Writes refuse to overwrite existing objects unless explicitly allowed, and can apply server-side encryption.
  • Every failure maps to a stable ConnectorErrorCode (AUTHENTICATION_FAILED, AUTHORIZATION_FAILED, SOURCE_NOT_FOUND, RATE_LIMITED, …); provider error text is never echoed back.
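
Two of these guarantees, key sanitization and the magic-byte check, can be illustrated with a short sketch. This is not eomer’s code: sanitize_key and check_declared_format are hypothetical helpers, and the exact rules the connector applies may be stricter. The one fixed fact used here is that Parquet files begin with the four bytes PAR1.

```python
import unicodedata

PARQUET_MAGIC = b"PAR1"  # Parquet files start (and end) with these bytes
TEXT_FORMATS = {"csv", "json", "jsonl"}


def sanitize_key(key: str) -> str:
    """Reject keys with control characters, a leading slash, or '..' segments."""
    if any(unicodedata.category(ch) == "Cc" for ch in key):
        raise ValueError("control character in object key")
    if key.startswith("/"):
        raise ValueError("object key must be relative")
    if any(segment == ".." for segment in key.split("/")):
        raise ValueError("path traversal segment in object key")
    return key


def check_declared_format(head: bytes, declared: str) -> None:
    """Fail fast when the first bytes contradict the declared file format."""
    looks_parquet = head.startswith(PARQUET_MAGIC)
    if declared == "parquet" and not looks_parquet:
        raise ValueError("declared parquet but PAR1 magic bytes are missing")
    if declared in TEXT_FORMATS and looks_parquet:
        raise ValueError(f"declared {declared} but content starts with PAR1")


# A well-formed key and a CSV header both pass without raising.
safe_key = sanitize_key("sales/daily/part-0001.csv")
check_declared_format(b"date,store_id,sales\n", "csv")
print(safe_key)
```

Checking only the first few bytes keeps the format test cheap: a mismatch is caught before any full object is downloaded or parsed.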