Submitted to NeurIPS 2026

Longer-form analytic papers on what an operations layer actually sees.

Work-in-progress papers drawn from the corpus of operational data Polymr ingests across deployments, with explicit methodology, sample, and illustrative-data disclosure. The headline paper, on the ontology / reasoning / action loop, is currently submitted to NeurIPS 2026. The compile pipeline that feeds the ontology parses documents at roughly ten percent of the cost of equivalent inference on Anthropic or OpenAI APIs.

Papers in flight.

Each paper lists status, abstract, methodology, and a single figure that anchors the argument.

Status: submitted to NeurIPS 2026 · Preview available on request
Ontology, reasoning, and action. An AI-native systems layer for manufacturing.
We describe an AI-native manufacturing systems layer centred on a dynamic, extensible ontology of parts, processes, resources, and constraints. Rather than a fixed schema, the ontology is treated as a structured object that can be parameterised, extended, and re-mapped as new data, factories, and decision contexts appear. Unstructured inputs (BOMs, routings, drawing PDFs, planner spreadsheets, operator notes) are continuously compiled into this evolving representation with explicit provenance and uncertainty on every cell. The compile pipeline parses documents at roughly ten percent of the cost of equivalent inference on Anthropic or OpenAI APIs by compiling against the ontology rather than re-extracting per call. On top of the ontology sits a reasoning layer that operates over the induced state space: validating consistency, inferring missing structure, simulating production flows, and selecting actions under constraints. These reasoning components produce machine-verifiable decision traces that link evidence to claims, claims to state transitions, and state transitions to actions (scheduling, purchasing, vendor selection), enabling adaptive behaviour rather than static workflows. The system is designed as ontology plus reasoning plus action, where learning and automation emerge from iterating this loop in real manufacturing environments. The paper presents the formalism, the compile-from-unstructured pipeline, the trace-verification protocol, and field results from the first production tenants running the loop end to end.
Figure: Ontology, reasoning, action. The closing loop. Unstructured plant inputs compile into the Ontology (typed manufacturing state), which feeds Reasoning (validate, infer, simulate), which feeds Action (schedule, purchase, select). Machine-verifiable decision traces run from evidence to action.
Methodology
Formal definition of the ontology as a parameterised typed graph with provenance and uncertainty annotations. Compile pipeline evaluated on a corpus of 4,200 unstructured artefacts across 11 tenant plants (BOMs, routings, drawings, planner spreadsheets, operator notes). Reasoning layer instrumented end-to-end across scheduling, purchasing, and vendor-selection decisions; decision traces validated against deterministic ground truth where available and against operator judgement otherwise. RL component trained on production traces, evaluated for action-quality against held-out tenant data.
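One way to picture a parameterised typed graph with provenance and uncertainty annotations is the minimal sketch below. It is not the paper's formalism: the class names (`Annotated`, `Node`, `Ontology`), the `edge_schema` parameter, the `confidence` scale, and the sample part and file names are all hypothetical, chosen only to show how every cell can carry its source and how the set of allowed edge types can be extended as data rather than code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotated:
    """A single cell value with its provenance and uncertainty."""
    value: object
    source: str        # hypothetical: the artefact the value was compiled from
    confidence: float  # hypothetical scale: 0.0 (unknown) to 1.0 (certain)


@dataclass
class Node:
    node_id: str
    node_type: str  # e.g. "part", "process", "resource", "constraint"
    attrs: dict[str, Annotated] = field(default_factory=dict)


@dataclass
class Ontology:
    # Allowed edges as (source node type, relation, target node type).
    # Extending the ontology means adding entries here, not changing code.
    edge_schema: set[tuple[str, str, str]] = field(default_factory=set)
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        key = (self.nodes[src].node_type, relation, self.nodes[dst].node_type)
        if key not in self.edge_schema:
            raise TypeError(f"edge {key} is not allowed by the schema")
        self.edges.append((src, relation, dst))

    def low_confidence(self, threshold: float) -> list[tuple[str, str]]:
        """Return (node_id, attribute) pairs whose confidence is below threshold."""
        return [
            (n.node_id, name)
            for n in self.nodes.values()
            for name, cell in n.attrs.items()
            if cell.confidence < threshold
        ]


onto = Ontology(edge_schema={("part", "made_by", "process")})
onto.add_node(Node("P-100", "part", {
    "unit_cost": Annotated(4.20, source="bom_rev3.pdf", confidence=0.95),
    "lead_time_days": Annotated(14, source="operator_note_17", confidence=0.40),
}))
onto.add_node(Node("OP-10", "process"))
onto.add_edge("P-100", "made_by", "OP-10")
flagged = onto.low_confidence(0.5)  # cells a reasoning layer might re-check first
```

Keeping the annotation on the cell rather than on the node means a single part can mix a well-sourced cost with a weakly-sourced lead time, which is the situation the abstract describes for operator notes and planner spreadsheets.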
Status: preview · Preview Q3 2026 · full Q4 2026
The shape of operational data across 80+ manufacturing plants
We pulled every inbound document, ERP export, and integration payload that crossed the Polymr ingestion boundary across 80+ deployed plants and reduced the universe to a finite taxonomy of operational data shapes. The dominant finding is that the long tail does not converge. Every plant has at least one feed (typically a supplier quote PDF, a freight-cost CSV, or a hand-edited cycle-count spreadsheet) that does not share schema with any other plant. The paper proposes a five-axis classification (rate, latency, schema discipline, source-of-truth claim, regulator-touched) that predicts which feed will dominate integration cost during a rollout. We argue that the operations layer above ERP is the only architecturally honest place to absorb the tail, because each individual ERP project ships against a uniform-schema assumption that the data does not satisfy.
Figure: Five-axis classification, three plant traces overlaid. Axes: rate, latency, schema, truth claim, regulator. Traces: Plant A (automotive tier-2), Plant B (adhesives), Plant C (F&B).
Methodology
Sample: 86 production plants, Aug 2024 to Mar 2026. Per-feed coding by two reviewers; inter-rater agreement 0.84. Five-axis scoring derived inductively then validated against a held-out subsample of 11 plants.
Status: draft · Summary on request · full Q1 2027
Where ERP integration actually leaks
ERP integrations fail at the seams, not at the centre. We catalogue 14 failure-surface categories observed across deployments and rank them by the cost a typical rollout absorbs to address each one. The categories include clock skew between SAP-side and partner-side timestamps, idempotency-key collision under multi-replay, EDI character-set drift between trading partners, posting-date assumptions that desynchronise across plants in different timezones, and AP three-way-match windows that exclude weekends asymmetrically. The argument is that integration cost is more accurately modelled as failure-surface area than as endpoint count: a single endpoint with three poorly-defined failure modes is more expensive than four endpoints with disciplined error contracts. We end with a vendor-evaluation checklist that scores each failure surface explicitly rather than rolling it into a generic SLA number.
Figure: 14 failure-surface categories at the ERP seam, arranged in a ring and grouped by class.
Time (timezone or window): clock skew, posting date TZ, weekend AP window, GR backdate.
Replay (idempotency or replay): idempotency key, multi-replay dup, COA gate race, cancellation echo.
Schema (schema drift): EDI charset drift, PO partial match, lot split rounding, vendor field map, currency rounding, tax line split.
Methodology
Coded failure reports across 47 integration projects (Polymr internal plus 9 partner engagements). Cost attribution by recovery-hours-logged, conservative-side rounded.
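The ranking step described above can be sketched as a simple aggregation over coded failure reports. This is an illustration only: the report tuples and hour values are made up, the function name `rank_by_recovery_hours` is hypothetical, and reading "conservative-side rounded" as rounding each report down to whole hours is an assumption rather than something the methodology states.

```python
import math
from collections import defaultdict

# Hypothetical coded failure reports: (project, category, recovery hours logged).
reports = [
    ("proj-a", "clock skew", 6.5),
    ("proj-a", "idempotency key", 12.25),
    ("proj-b", "clock skew", 3.75),
    ("proj-b", "EDI charset drift", 8.0),
    ("proj-c", "idempotency key", 4.5),
]


def rank_by_recovery_hours(rows):
    """Sum recovery hours per category and return categories, costliest first."""
    totals = defaultdict(int)
    for _project, category, hours in rows:
        # Assumed reading of "conservative-side rounded": round each report down.
        totals[category] += math.floor(hours)
    # Sort by descending total; break ties alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_by_recovery_hours(reports)
```

Because the totals are keyed by failure category rather than by endpoint, an integration with few endpoints can still rank as expensive, which is the point of modelling cost as failure-surface area.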
Status: summary available · full Q2 2027
Cost-basis chains under revision propagation
When a vendor revises a unit cost on a previously-booked PO, the chain of derived costs (work-order materials cost, finished-goods unit cost, posted COGS) has to either propagate the revision or carry a divergence. We derive the conditions under which a cost-basis graph remains stable under revision (revisions never cross a kind-gating boundary; the chain skips no node; idempotency keys are preserved across replays) and characterise the corruption modes when those conditions break. The mathematical core is a fixed-point argument over a directed cost-derivation graph, with proofs of stability and a worked counterexample drawn from a real two-plant deployment where a single mis-gated revision corrupted three months of COGS. Practical takeaway: cost-basis chains must not skip kind-gating during revision propagation, or downstream draft-locking protections are silently bypassed.
Figure: Directed cost-derivation graph with a kind-gate skip path. Nodes: Vendor PO, Revision, WO material, FG cost, COGS. On the skip path, the revision skips the WO material node and writes directly into FG cost. One propagated update silently corrupted three months of COGS in the deployment.
Methodology
Formal model verified against three production cost-derivation traces. Proofs in appendix; the corruption case is from a deployment audit log with customer identity removed.
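The three stability conditions can be illustrated with a toy propagation routine. This is a sketch under stated assumptions, not the paper's formal model: the node order is taken from the figure labels, the `CostChain` class and its fields are hypothetical, kind-gating is assumed to be a per-node check that refuses writes into protected nodes (the source does not define it), and the rule that every node carries the same delta is a simplification.

```python
from dataclasses import dataclass, field

# Derivation order taken from the figure labels; the arithmetic is a toy rule.
CHAIN = ["vendor_po", "wo_material", "fg_cost", "cogs"]


@dataclass
class CostChain:
    values: dict[str, float]
    locked: set[str] = field(default_factory=set)        # assumed: nodes a gate protects
    applied_keys: set[str] = field(default_factory=set)  # idempotency keys already seen

    def propagate(self, key: str, new_po_cost: float, path: list[str]) -> bool:
        """Apply a vendor revision along path; return False for a replayed key."""
        if key in self.applied_keys:
            return False  # replay: the revision was already applied, do nothing
        if path != CHAIN:
            raise ValueError(f"path {path} skips or reorders a node in the chain")
        blocked = self.locked.intersection(path)
        if blocked:
            raise PermissionError(f"gate refuses revision into {sorted(blocked)}")
        # All checks passed before any write, so a refusal never leaves a partial update.
        delta = new_po_cost - self.values["vendor_po"]
        for node in path:
            self.values[node] += delta
        self.applied_keys.add(key)
        return True


chain = CostChain({"vendor_po": 10.0, "wo_material": 10.0, "fg_cost": 25.0, "cogs": 25.0})
first = chain.propagate("rev-1", 12.0, CHAIN)   # applied along the full chain
replay = chain.propagate("rev-1", 12.0, CHAIN)  # same key again: no second write
```

A path such as `["vendor_po", "fg_cost", "cogs"]` is rejected outright here, which corresponds to the skip path in the figure that bypasses the WO material node.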
Status: preview · Preview Q4 2026 · full Q1 2027
Approval queue dynamics. What bulk approval really collapses
Bulk approval is the operations team's most reached-for queue control. We instrumented approval queues across a small set of plants for six months and asked what bulk approval actually collapses behaviourally. The dominant finding: bulk approval rarely shortens the median queue but dramatically shortens the p95 tail by ~63% (illustrative, across the instrumented set). The tail-shortening comes from approvers using bulk as a stale-context recovery rather than a daily throughput tool. We argue this changes the surface a vendor needs to ship: not 'select many, click one', but 'approve everything that is still stale-context clean, with explicit override for what is not'. The paper presents the queue traces, the override rates, and a proposed UI contract that maps to the observed behaviour.
Figure: Queue-depth distribution before and after bulk approval (p95 down 63 percent).
Bin        Before  After
B0         12      14
B1         22      26
B2         34      38
B3         30      32
B4         24      26
B5         18      16
B6         14      10
B7         10      6
B8         8       4
B9 (tail)  22      2
Methodology
Six-month instrumentation, four plants, 38 approvers. Approval events coded by (queue depth at decision, time-since-arrival, batch size). p95 computed against full distribution; illustrative across this set.
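The distinction between a median that barely moves and a tail that shrinks can be made concrete with a nearest-rank percentile over coded events. The sketch below uses made-up event data, not the paper's traces, and it assumes the percentile is taken over time-since-arrival; the helper names `percentile` and `summarise` are hypothetical.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer form of ceil(p * n / 100)
    return ordered[rank - 1]


# Hypothetical waits in hours; only the two longest waits differ between the sets.
WAITS_BEFORE = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 40, 48]
WAITS_AFTER = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 12, 14]

# Events coded as (queue depth at decision, time since arrival, batch size).
before = [(10, hours, 1) for hours in WAITS_BEFORE]
after = [(10, hours, 5) for hours in WAITS_AFTER]


def summarise(events):
    """Return (median, p95) of time-since-arrival for a list of coded events."""
    waits = [hours for _depth, hours, _batch in events]
    return percentile(waits, 50), percentile(waits, 95)


median_before, p95_before = summarise(before)
median_after, p95_after = summarise(after)
tail_cut = 1 - p95_after / p95_before  # fractional reduction in the p95 tail
```

In this toy data the bulk of the distribution is untouched and only the stale items at the end change, so the median is identical across the two sets while the p95 drops sharply, the same qualitative pattern the abstract reports.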

All findings are illustrative, drawn from specific deployments rather than averaged marketing aggregates.
