Polymr
Submitted to NeurIPS 2026

Longer-form analytic papers on what an operations layer actually sees.

Work-in-progress papers drawn from the corpus of operational data Polymr ingests across deployments, with explicit methodology, sample, and illustrative-data disclosure. The headline paper, on the ontology / reasoning / action loop, is currently submitted to NeurIPS 2026. The compile pipeline that feeds the ontology parses documents at roughly ten percent of the cost of equivalent inference on Anthropic or OpenAI APIs.

Papers in flight.

Each paper lists status, abstract, methodology, and a single figure that anchors the argument.

  • submitted to NeurIPS 2026 · Preview available on request

    Ontology, reasoning, and action. An AI-native systems layer for manufacturing.

    We describe an AI-native manufacturing systems layer centred on a dynamic, extensible ontology of parts, processes, resources, and constraints. Rather than a fixed schema, the ontology is treated as a structured object that can be parameterised, extended, and re-mapped as new data, factories, and decision contexts appear. Unstructured inputs (BOMs, routings, drawing PDFs, planner spreadsheets, operator notes) are continuously compiled into this evolving representation with explicit provenance and uncertainty on every cell. The compile pipeline parses documents at roughly ten percent of the cost of equivalent inference on Anthropic or OpenAI APIs by compiling against the ontology rather than re-extracting per call. On top of the ontology sits a reasoning layer that operates over the induced state space: validating consistency, inferring missing structure, simulating production flows, and selecting actions under constraints. These reasoning components produce machine-verifiable decision traces that link evidence to claims, claims to state transitions, and state transitions to actions (scheduling, purchasing, vendor selection), enabling adaptive behaviour rather than static workflows. The system is designed as ontology plus reasoning plus action, where learning and automation emerge from iterating this loop in real manufacturing environments. The paper presents the formalism, the compile-from-unstructured pipeline, the trace-verification protocol, and field results from the first production tenants running the loop end to end.
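    As a rough illustration of "provenance and uncertainty on every cell", the sketch below models a small typed graph in Python. It is not the paper's formalism: the class names (`Cell`, `Node`, `Ontology`), the 0..1 confidence scale, and the file names are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cell:
    """One attribute value plus where it came from and how sure we are."""
    value: object
    source: str        # provenance: the document the value was compiled from
    confidence: float  # uncertainty, expressed here as a 0..1 confidence


@dataclass
class Node:
    node_id: str
    node_type: str  # e.g. "part", "process", "resource"
    cells: dict[str, Cell] = field(default_factory=dict)


@dataclass
class Ontology:
    # Edge types are data rather than code, so the schema can be extended.
    edge_types: dict[str, tuple[str, str]] = field(default_factory=dict)
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_edge_type(self, name: str, src_type: str, dst_type: str) -> None:
        self.edge_types[name] = (src_type, dst_type)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, name: str, src: str, dst: str) -> None:
        # Reject edges whose endpoint types do not match the declared edge type.
        expected = self.edge_types[name]
        actual = (self.nodes[src].node_type, self.nodes[dst].node_type)
        if actual != expected:
            raise TypeError(f"edge {name!r} requires {expected}, got {actual}")
        self.edges.append((name, src, dst))

    def low_confidence_cells(self, threshold: float) -> list[tuple[str, str]]:
        # List (node id, attribute) pairs whose confidence is below threshold.
        return [
            (node.node_id, key)
            for node in self.nodes.values()
            for key, cell in node.cells.items()
            if cell.confidence < threshold
        ]


# Hypothetical data: one part with two compiled cells, one process.
onto = Ontology()
onto.add_edge_type("made_by", "part", "process")
onto.add_node(Node("P-100", "part", {
    "unit_cost": Cell(4.20, "bom_v3.pdf", 0.95),
    "lead_time_days": Cell(14, "planner_notes.xlsx", 0.55),
}))
onto.add_node(Node("OP-10", "process"))
onto.add_edge("made_by", "P-100", "OP-10")
flagged = onto.low_confidence_cells(0.7)
```

    Here `flagged` contains only the lead-time cell, the one value compiled from a lower-confidence source. Keeping edge types in a dictionary rather than in class definitions is one simple way to let a schema be extended at runtime.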

    Unstructured plant inputs compile into the ontology

    • Ontology: typed manufacturing state
    • Reasoning: validate, infer, simulate
    • Action: schedule, purchase, select

    Machine-verifiable decision traces, evidence to action

    Ontology, reasoning, action. The closing loop.

    Methodology

    Formal definition of the ontology as a parameterised typed graph with provenance and uncertainty annotations. Compile pipeline evaluated on a corpus of 4,200 unstructured artefacts across 11 tenant plants (BOMs, routings, drawings, planner spreadsheets, operator notes). Reasoning layer instrumented end-to-end across scheduling, purchasing, and vendor-selection decisions; decision traces validated against deterministic ground truth where available and against operator judgement otherwise. RL component trained on production traces, evaluated for action-quality against held-out tenant data.
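    One way to read "machine-verifiable decision trace" is as a chain in which every step may rest only on the step kind directly upstream of it: evidence, then claim, then state transition, then action. The Python sketch below checks that structural property and nothing more. The `Step` type, the `UPSTREAM` table, and the error strings are hypothetical, and the paper's actual trace-verification protocol may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    step_id: str
    kind: str                       # "evidence", "claim", "transition", "action"
    supports: tuple[str, ...] = ()  # ids of the upstream steps this one rests on


# Each non-evidence kind may only rest on the kind directly upstream of it.
UPSTREAM = {"claim": "evidence", "transition": "claim", "action": "transition"}


def verify_trace(steps: list[Step]) -> list[str]:
    """Return a list of problems; an empty list means the trace is well-formed."""
    by_id = {s.step_id: s for s in steps}
    problems = []
    for s in steps:
        if s.kind == "evidence":
            continue  # evidence is the root of a trace and needs no support
        if s.kind not in UPSTREAM:
            problems.append(f"{s.step_id}: unknown kind {s.kind!r}")
            continue
        if not s.supports:
            problems.append(f"{s.step_id}: {s.kind} has no support")
        for ref in s.supports:
            parent = by_id.get(ref)
            if parent is None:
                problems.append(f"{s.step_id}: missing reference {ref}")
            elif parent.kind != UPSTREAM[s.kind]:
                problems.append(
                    f"{s.step_id}: {s.kind} cannot rest on {parent.kind}"
                )
    return problems


good = [
    Step("e1", "evidence"),
    Step("c1", "claim", ("e1",)),
    Step("t1", "transition", ("c1",)),
    Step("a1", "action", ("t1",)),
]
broken = [Step("e1", "evidence"), Step("a1", "action", ("e1",))]

good_problems = verify_trace(good)
broken_problems = verify_trace(broken)
```

    The `broken` trace is rejected because its action points straight at evidence, with no claim or state transition in between. A check like this says nothing about whether the evidence is true; it only confirms that the chain from evidence to action has no gaps.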

  • preview · Preview Q3 2026 · full Q4 2026

    The shape of operational data across 80+ manufacturing plants

    We pulled every inbound document, ERP export, and integration payload that crossed the Polymr ingestion boundary across 80+ deployed plants and reduced the universe to a finite taxonomy of operational data shapes. The dominant finding is that the long tail does not converge. Every plant has at least one feed (typically a supplier quote PDF, a freight-cost CSV, or a hand-edited cycle-count spreadsheet) that does not share schema with any other plant. The paper proposes a five-axis classification (rate, latency, schema discipline, source-of-truth claim, regulator-touched) that predicts which feed will dominate integration cost during a rollout. We argue that the operations layer above ERP is the only architecturally honest place to absorb the tail, because each individual ERP project ships against a uniform-schema assumption that the data does not satisfy.

    Five-axis classification, three plant traces overlaid.

    Axes: rate, latency, schema, truth claim, regulator.
    Plants: Plant A (automotive tier-2), Plant B (adhesives), Plant C (F&B).

    Methodology

    Sample: 86 production plants, Aug 2024 to Mar 2026. Per-feed coding by two reviewers; inter-rater agreement 0.84. Five-axis scoring derived inductively then validated against a held-out subsample of 11 plants.
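    The methodology reports an inter-rater agreement of 0.84 without naming the statistic. Assuming it is a chance-corrected measure such as Cohen's kappa (an assumption, not something the text states), the Python sketch below shows how such a figure is computed from two reviewers' labels. The labels are invented for the example.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labelled at random with their own rates.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Invented schema-discipline codes for eight feeds, one disagreement.
reviewer_1 = ["strict", "strict", "loose", "loose",
              "strict", "loose", "strict", "loose"]
reviewer_2 = ["strict", "strict", "loose", "loose",
              "strict", "loose", "strict", "strict"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

    With seven of eight items matching, observed agreement is 0.875 and chance agreement is 0.5, giving a kappa of 0.75. Raw percent agreement would look higher, which is why the choice of statistic matters when reading a figure like 0.84.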

  • draft · Summary on request · full Q1 2027

    Where ERP integration actually leaks

    ERP integrations fail at the seams, not at the centre. We catalogue 14 failure-surface categories observed across deployments (for example, clock skew between SAP-side and partner-side timestamps, idempotency-key collision under multi-replay, EDI character-set drift between trading partners, posting-date assumptions that desynchronise across plants in different timezones, and AP three-way-match windows that exclude weekends asymmetrically) and rank them by the cost a typical rollout absorbs to address each one. The argument is that integration cost is more accurately modelled as failure-surface area than as endpoint count: a single endpoint with three poorly-defined failure modes is more expensive than four endpoints with disciplined error contracts. We end with a vendor-evaluation checklist that scores each failure surface explicitly rather than rolling it into a generic SLA number.

    14 failure-surface categories at the ERP seam

    • Clock skew (time)
    • Idempotency key (replay)
    • EDI charset drift (schema)
    • Posting date TZ (time)
    • Weekend AP window (time)
    • PO partial match (schema)
    • Lot split rounding (schema)
    • Multi-replay dup (replay)
    • GR backdate (time)
    • Vendor field map (schema)
    • Currency rounding (schema)
    • COA gate race (replay)
    • Tax line split (schema)
    • Cancellation echo (replay)

    Legend: time = timezone or window; replay = idempotency or replay; schema = schema drift.

    Methodology

    Coded failure reports across 47 integration projects (Polymr internal plus 9 partner engagements). Cost attribution by recovery-hours-logged, conservative-side rounded.
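    The claim that cost tracks failure-surface area rather than endpoint count can be made concrete with a toy model. In the Python sketch below the hour weights are hypothetical, picked only so the comparison is visible; they are not the paper's recovery-hours data.

```python
from dataclasses import dataclass

# Hypothetical recovery-hour weights, chosen only to make the comparison concrete.
HOURS_PER_MODE = {"disciplined": 2, "poorly_defined": 12}


@dataclass(frozen=True)
class Endpoint:
    name: str
    failure_modes: tuple[str, ...]  # each entry is a key of HOURS_PER_MODE


def failure_surface_cost(endpoints: list[Endpoint]) -> int:
    """Sum weighted failure modes; the endpoint count itself carries no cost."""
    return sum(
        HOURS_PER_MODE[mode]
        for endpoint in endpoints
        for mode in endpoint.failure_modes
    )


# One endpoint with three poorly-defined failure modes...
single_messy = [Endpoint("goods_receipt", ("poorly_defined",) * 3)]
# ...versus four endpoints, each with one disciplined error contract.
four_clean = [Endpoint(f"endpoint_{i}", ("disciplined",)) for i in range(4)]

messy_cost = failure_surface_cost(single_messy)
clean_cost = failure_surface_cost(four_clean)
```

    Under these assumed weights the single endpoint costs 36 hours and the four disciplined endpoints cost 8. The ordering depends entirely on the weights, so a real model would need them estimated from logged recovery hours rather than asserted.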

  • summary available · full Q2 2027

    Cost-basis chains under revision propagation

    When a vendor revises a unit cost on a previously-booked PO, the chain of derived costs (work-order materials cost, finished-goods unit cost, posted COGS) has to either propagate the revision or carry a divergence. We derive the conditions under which a cost-basis graph remains stable under revision (revisions never cross a kind-gating boundary; the chain skips no node; idempotency keys are preserved across replays) and characterise the corruption modes when those conditions break. The mathematical core is a fixed-point argument over a directed cost-derivation graph, with proofs of stability and a worked counterexample drawn from a real two-plant deployment where a single mis-gated revision corrupted three months of COGS. Practical takeaway: cost-basis chains must never skip kind-gating during revision propagation; otherwise downstream draft-locking protections are silently bypassed.

    Directed cost-derivation graph with a kind-gate skip path.

    Nodes: Vendor PO, Revision, WO material, FG cost, COGS.

    Revision skips the WO material node and writes directly into FG cost. One propagated update silently corrupted three months of COGS in the deployment.

    Methodology

    Formal model verified against three production cost-derivation traces. Proofs in appendix; the corruption case is from a deployment audit log with customer identity removed.
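    The three stability conditions named in the abstract (no crossing of a gate, no skipped node, idempotency keys preserved across replays) can be illustrated with a small propagation routine. The Python sketch below is an assumption-heavy simplification and not the paper's formal model: it treats the graph as a single chain, passes the unit cost through unchanged at each node, and represents a kind gate as a set of locked node names.

```python
from dataclasses import dataclass, field

# Derivation order, simplified to a single chain for this sketch.
CHAIN = ["vendor_po", "wo_material", "fg_cost", "cogs"]


@dataclass
class CostChain:
    unit_cost: dict[str, float]
    locked: set[str]  # nodes whose gate rejects revisions (hypothetical model)
    divergence: dict[str, float] = field(default_factory=dict)
    seen_keys: set[str] = field(default_factory=set)

    def revise(self, key: str, new_cost: float) -> list[str]:
        """Propagate a PO cost revision node by node; return the nodes updated."""
        if key in self.seen_keys:
            return []  # replay of an already-applied revision is a no-op
        self.seen_keys.add(key)
        updated = []
        for i, node in enumerate(CHAIN):
            if node in self.locked:
                # Stop at the gate and carry the difference, rather than
                # jumping past the locked node to anything downstream of it.
                for blocked in CHAIN[i:]:
                    self.divergence[blocked] = new_cost - self.unit_cost[blocked]
                break
            self.unit_cost[node] = new_cost
            updated.append(node)
        return updated


chain = CostChain({node: 10.0 for node in CHAIN}, locked={"fg_cost"})
first = chain.revise("rev-1", 11.5)
replay = chain.revise("rev-1", 11.5)
```

    The first call updates only the vendor PO and WO material nodes and records a divergence of 1.5 on FG cost and COGS; the replay changes nothing. The design choice worth noting is that the loop has no way to reach a node without passing through every node before it, which is the "skips no node" condition made structural.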

  • preview · Preview Q4 2026 · full Q1 2027

    Approval queue dynamics. What bulk approval really collapses

    Bulk approval is the operations team's most reached-for queue control. We instrumented approval queues across four plants for six months and asked what bulk approval actually collapses behaviourally. The dominant finding: bulk approval rarely shortens the median queue but shortens the p95 tail by roughly 63% (illustrative, across the instrumented set). The tail-shortening comes from approvers using bulk as a stale-context recovery rather than a daily throughput tool. We argue this changes the surface a vendor needs to ship: not 'select many, click one', but 'approve everything that is still stale-context clean, with explicit override for what is not'. The paper presents the queue traces, the override rates, and a proposed UI contract that maps to the observed behaviour.

    Queue-depth distribution before and after bulk approval.

    Values per bin, before bulk to after bulk:

    • B0: 12 to 14
    • B1: 22 to 26
    • B2: 34 to 38
    • B3: 30 to 32
    • B4: 24 to 26
    • B5: 18 to 16
    • B6: 14 to 10
    • B7: 10 to 6
    • B8: 8 to 4
    • B9 (tail): 22 to 2

    p95 down 63 percent.

    Methodology

    Six-month instrumentation, four plants, 38 approvers. Approval events coded by (queue depth at decision, time-since-arrival, batch size). p95 computed against full distribution; illustrative across this set.
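    The distinction between a median that barely moves and a p95 that drops sharply is easy to see on a small example. The Python sketch below uses a nearest-rank percentile and synthetic wait times invented for illustration; neither the data nor the percentile convention is taken from the paper.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def relative_drop(before: float, after: float) -> float:
    """Fractional reduction from before to after."""
    return (before - after) / before


# Synthetic wait times in hours, 20 items each. Only the two stale items differ.
before = [2.0] * 10 + [4.0] * 8 + [40.0, 48.0]
after = [2.0] * 10 + [4.0] * 8 + [15.0, 18.0]

median_before, median_after = percentile(before, 50), percentile(after, 50)
p95_before, p95_after = percentile(before, 95), percentile(after, 95)
p95_drop = relative_drop(p95_before, p95_after)
```

    In this synthetic set the median stays at 2.0 hours while the p95 falls from 40.0 to 15.0, a 62.5 percent drop. Clearing a handful of stale items is enough to move the tail without touching typical throughput, which is the behaviour the abstract describes.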

All findings are illustrative, drawn from specific deployments rather than averaged marketing aggregates.