Parts knowledge platform v1
Recommended first architecture for Circuit Genome's part browser, part-number decipher, BOM decipher, and reusable part-backed FMEA knowledge. The system is intentionally split into discovery, canonical library, and user overlay layers so the UI stays fast while every displayed answer remains footnoted to source material.
Status
| Item | Detail |
|---|---|
| Audience | App implementers defining the first serious parts platform layer for Circuit Genome. |
| V1 Goal | Resolve BOM lines and part-number queries into source-backed canonical part records with fast library lookup and per-project deviation notes. |
| Canonical Identity | Prefer manufacturer_id + mpn_normalized. Fall back to provisional unresolved records only when exact identity is still unknown. |
| Trust Model | The UI shows canonical values first, but every displayed field must expose supporting footnotes back to the underlying source document or provider record. |
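The canonical identity key depends on a stable normalization of the manufacturer part number. The following is a minimal sketch only, assuming normalization means uppercasing and dropping every character that is not a letter or digit. The real rules are not specified in this document, and `normalize_mpn` and `identity_key` are hypothetical names.

```python
import re
from typing import Optional, Tuple

def normalize_mpn(mpn: str) -> str:
    """Assumed rule: uppercase, then keep only A-Z and 0-9."""
    return re.sub(r"[^A-Z0-9]", "", mpn.upper())

def identity_key(manufacturer_id: Optional[int], mpn: Optional[str]) -> Optional[Tuple[int, str]]:
    """Return (manufacturer_id, mpn_normalized), or None when the record must stay provisional."""
    if manufacturer_id is None or not mpn:
        return None
    normalized = normalize_mpn(mpn)
    # A string made only of separators normalizes to "" and cannot identify a part.
    return (manufacturer_id, normalized) if normalized else None
```

Stripping all punctuation is a deliberately blunt choice for illustration. Some manufacturers may use suffix characters meaningfully, so the rule would need review before it becomes part of the uniqueness constraint.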
Non-Negotiable Rules
- Discovery jobs must not write raw scraped values directly into canonical fields without a normalization pass.
- Canonical part records, source evidence, and user or project overlays must be stored separately.
- Every displayed field that originated from a provider, listing, or datasheet must be traceable to at least one source document row.
- Part-level risk priors belong in the parts platform, but final risk statements for a board still depend on circuit role and project context.
- User deviation notes and approved substitutions must never silently overwrite the canonical library.
System Layers
| Layer | What Lives There | What Does Not |
|---|---|---|
| Discovery | Unknown-part lookup jobs, provider adapters, raw source documents, extracted assertions, candidate ranking. | User-facing truth without review context. |
| Canonical Library | Reviewed normalized part records, aliases, parameters, offers, package and lifecycle metadata, part-level risk priors. | Per-board interpretation notes and one-off project deviations. |
| User / Project Overlay | Deviation notes, approved substitutes, local risk-statement overrides, upload-scoped interpretations. | Global edits that affect all users and all future projects. |
First-Version Pages
| Page | Purpose | Access |
|---|---|---|
| /app/parts/ | Entry page for the parts workspace. | Authenticated |
| /app/parts/browser/ | Search and filter known parts from the maintained library. | Authenticated |
| /app/parts/lookup/ | Resolve unknown part numbers, rough BOM entries, or markings into candidates. | Authenticated |
| /app/parts/:id | Show one canonical part record with parameters, footnotes, offers, and risk priors. | Authenticated |
| /app/part-number-decipher/ | Parse series tokens, package hints, and probable family details from a part string. | Authenticated |
| /app/bom-decipher/ | Upload or paste BOM rows and return structured part matches plus unresolved lines. | Authenticated |
End-to-End Flow
| Step | System Action | Stored Output |
|---|---|---|
| 1 | User searches for a part number or submits BOM rows. | part_lookup_jobs or bom_decode_jobs row. |
| 2 | Resolver checks aliases and canonical library before any outbound provider call. | Immediate hit or discovery job continuation. |
| 3 | Provider adapters fetch candidate product records and source documents. | source_documents rows plus raw provider identifiers. |
| 4 | Normalizer extracts structured fields and stores them as assertions. | source_assertions rows keyed by field path. |
| 5 | Matcher ranks candidates and either links to an existing canonical part or creates a provisional one. | part_lookup_candidates rows and optional part_records row. |
| 6 | Reviewed canonical fields become visible in the browser, always with footnotes. | part_records, part_param_values, part_offers, part_risk_priors. |
| 7 | User adds project-specific notes or overrides without mutating canonical truth. | part_overlays row scoped to user and upload. |
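Steps 1 and 2 of the flow (library and alias check first, discovery job only on a miss) can be sketched as follows. This is an illustration under stated assumptions: the in-memory `Library` class stands in for part_records, part_aliases, and part_lookup_jobs; matching is by normalized string alone, without a manufacturer; and the `"queued"` status value is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Library:
    # Hypothetical in-memory stand-ins for the real tables.
    parts_by_mpn: Dict[str, str] = field(default_factory=dict)  # mpn_normalized -> part id
    aliases: Dict[str, str] = field(default_factory=dict)       # alias_normalized -> part id
    lookup_jobs: List[dict] = field(default_factory=list)       # part_lookup_jobs rows

def normalize(text: str) -> str:
    """Assumed normalization: uppercase and keep only letters and digits."""
    return "".join(ch for ch in text.upper() if ch.isalnum())

def resolve(library: Library, query_text: str) -> dict:
    """Check the canonical library and aliases before creating a discovery job."""
    key = normalize(query_text)
    part_id = library.parts_by_mpn.get(key) or library.aliases.get(key)
    if part_id is not None:
        return {"status": "hit", "part_record_id": part_id}
    # Miss: record a job so provider adapters can continue asynchronously.
    job = {"id": len(library.lookup_jobs) + 1, "query_text": query_text, "status": "queued"}
    library.lookup_jobs.append(job)
    return {"status": "discovery", "lookup_job_id": job["id"]}
```

No outbound provider call appears in `resolve` itself, which is the point of step 2: the immediate path touches only local data.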
Core API Surface
| Route | Purpose | Notes |
|---|---|---|
| GET /api/parts/search?q= | Search maintained canonical parts. | Fast library-first query. Supports filters like family, manufacturer, lifecycle, and review_status. |
| POST /api/parts/resolve | Resolve one unknown part string synchronously when the library hit rate is high. | Use for single-line lookup from the browser or a hover action. |
| POST /api/parts/lookups | Create an asynchronous discovery job for ambiguous or web-backed lookup. | Returns a job ID immediately. |
| GET /api/parts/lookups/:id | Return discovery job status plus ranked candidates. | Used by the part-number lookup UI. |
| GET /api/parts/:id | Return one canonical part record. | Includes summary fields, primary footnotes, and overlay summary for the current user scope. |
| GET /api/parts/:id/sources | Return source documents and assertions supporting the part record. | Backs the footnote drawer. |
| GET /api/parts/:id/offers | Return latest provider and distributor offer records. | Price and stock are volatile; cache aggressively but stamp observed_at. |
| GET /api/parts/:id/risk | Return part-level risk priors and reusable selection notes. | Feeds FMEA defaults, not final project conclusions. |
| PATCH /api/parts/:id/overlay | Create or update user or upload-scoped notes and overrides. | Body should carry upload_id when the note is project-scoped. |
| POST /api/part-number-decoder/decode | Parse a part string into probable manufacturer, family, package, and series tokens. | Library-first, then heuristic fallback. |
| POST /api/bom-decodes | Create a BOM decipher job from pasted text or an uploaded file. | Returns job ID immediately and stores row-level matching state. |
| GET /api/bom-decodes/:id | Return BOM job status, summary metrics, and unresolved counts. | Top-level polling endpoint. |
| GET /api/bom-decodes/:id/rows | Return row-level decipher results and candidate links. | Paginates large BOMs. |
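Both job-style routes (lookups and BOM decodes) return an ID immediately and are then polled. A client-side polling loop might look like the sketch below. It is transport-agnostic on purpose: `fetch_status` is a hypothetical callable that would wrap a GET on the job route, and the terminal status names are assumptions because this document does not enumerate job states.

```python
import time
from typing import Callable, Dict

# Assumed terminal status values; the document does not enumerate job states.
TERMINAL_STATUSES = {"completed", "failed"}

def poll_job(
    fetch_status: Callable[[], Dict],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status until the job reaches a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        job = fetch_status()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll, but not after the last one
    raise TimeoutError(f"job still not finished after {max_attempts} polls")
```

Injecting `sleep` keeps the helper testable without real delays; a production client would likely add backoff and cancellation.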
Canonical Library Tables
| Table | Key Fields | Notes |
|---|---|---|
| manufacturers | id, name, name_normalized, website_url, created_at | Normalize names once so aliases do not pollute canonical identity. |
| part_records | id, manufacturer_id, mpn, mpn_normalized, part_number, family_key, description, package_code, lifecycle_status, review_status, primary_source_document_id, timestamps | Primary canonical part row. Enforce uniqueness on (manufacturer_id, mpn_normalized) when both are present. |
| part_aliases | id, part_record_id, alias_type, alias_value, alias_normalized, source_id, confidence, created_at | Supports distributor SKUs, legacy BOM strings, and formatting variants. |
| part_param_values | id, part_record_id, param_key, display_name, normalized_value_text, normalized_value_num, unit, value_json, primary_assertion_id, updated_at | Stores normalized electrical and mechanical parameters such as capacitance, tolerance, ESR, voltage, package, and temperature range. |
| part_offers | id, part_record_id, source_id, seller_name, seller_part_number, stock_qty, moq, price_breaks_json, currency, observed_at | Volatile commercial data should stay separate from the stable canonical part row. |
| part_risk_priors | id, part_record_id, family_key, technology_key, failure_mode_key, occurrence_bias, detection_bias, selection_guidance, statement_template, primary_assertion_id, updated_at | Reusable part-backed FMEA priors. This is where known short-fail tendencies, polymer self-healing notes, or derating guidance belong. |
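The rule "enforce uniqueness on (manufacturer_id, mpn_normalized) when both are present" maps naturally onto a partial unique index. The sketch below uses SQLite through Python's standard `sqlite3` module purely for illustration; the document does not name a database, only a subset of the listed columns is shown, and the `'provisional'` default for review_status is an assumption.

```python
import sqlite3

SCHEMA = """
CREATE TABLE manufacturers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    name_normalized TEXT NOT NULL
);
CREATE TABLE part_records (
    id INTEGER PRIMARY KEY,
    manufacturer_id INTEGER REFERENCES manufacturers(id),
    mpn TEXT,
    mpn_normalized TEXT,
    review_status TEXT NOT NULL DEFAULT 'provisional'  -- assumed default value
);
-- Partial unique index: only rows with both identity columns are constrained,
-- so provisional rows with unknown identity can coexist.
CREATE UNIQUE INDEX part_records_identity
    ON part_records (manufacturer_id, mpn_normalized)
    WHERE manufacturer_id IS NOT NULL AND mpn_normalized IS NOT NULL;
"""

def open_library() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

With this index, a second insert of the same (manufacturer_id, mpn_normalized) pair raises `sqlite3.IntegrityError`, while any number of provisional rows with a NULL manufacturer_id are accepted.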
Discovery and Provenance Tables
| Table | Key Fields | Notes |
|---|---|---|
| part_sources | id, provider_key, provider_name, source_type, trust_tier, base_url, created_at | Examples: manufacturer datasheet, distributor API, aggregator API, user import. |
| source_documents | id, source_id, document_type, provider_part_key, url, title, sha256_hex, retrieved_at, published_at | One row per datasheet, provider record, or listing snapshot used as evidence. |
| source_assertions | id, source_document_id, part_record_id, field_path, raw_value, normalized_value_json, confidence, extraction_method, created_at | Field-level provenance ledger. This is what powers footnotes and future diff reviews. |
| part_lookup_jobs | id, user_id, query_text, query_type, status, top_candidate_part_record_id, requested_at, completed_at | Tracks unknown-part discovery and resolver work. |
| part_lookup_candidates | id, lookup_job_id, part_record_id, manufacturer_guess, mpn_guess, score, match_reason, source_summary_json, created_at | Stores the candidate list returned to the user for ambiguous matches. |
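Because source_assertions is keyed by field path, footnotes for a part can be derived by grouping assertion rows. The sketch below assumes rows arrive as plain dicts carrying the `source_document_id` and `field_path` columns listed above; the function names and the example field paths are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

def footnotes_by_field(assertions: Iterable[dict]) -> Dict[str, List[int]]:
    """Map each field_path to the distinct source_document_id values supporting it."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for row in assertions:
        doc_ids = grouped[row["field_path"]]
        if row["source_document_id"] not in doc_ids:
            doc_ids.append(row["source_document_id"])  # keep first-seen order, no duplicates
    return dict(grouped)

def untraceable_fields(displayed_fields: Iterable[str], assertions: Iterable[dict]) -> List[str]:
    """Return displayed fields that have no supporting source document."""
    notes = footnotes_by_field(assertions)
    return [name for name in displayed_fields if not notes.get(name)]
```

A check like `untraceable_fields` could run before a record is marked reviewed, which would make the traceability rule enforceable rather than advisory.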
Overlay and BOM Tables
| Table | Key Fields | Notes |
|---|---|---|
| part_overlays | id, user_id, upload_id, part_record_id, role_key, function_key, failure_mode_key, deviation_notes, risk_statement_override, selection_guidance_override, approved_substitute_part_id, timestamps | Per-user or per-upload interpretation layer. This is where Circuit Genome should store deviation notes and board-specific risk wording. |
| bom_decode_jobs | id, user_id, upload_id, filename, status, row_count, resolved_count, requested_at, completed_at | Top-level job for BOM decipher work. |
| bom_decode_rows | id, bom_decode_job_id, row_index, raw_line_json, manufacturer_guess, mpn_guess, part_record_id, resolution_status, candidate_count, created_at | Row-level result set for BOM browsers and export back to CSV. |
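One way to honor the rule that overlays never overwrite the canonical library is to merge them only at read time. The sketch below builds a display view from a canonical risk prior and an optional overlay row. The pairing of each override column with a canonical field, and the treatment of an empty string as "no override", are assumptions made for the example.

```python
from copy import deepcopy
from typing import Optional

# Overlay column -> canonical field it shadows. Column names come from
# part_overlays and part_risk_priors; the pairing itself is an assumption.
OVERRIDES = {
    "risk_statement_override": "statement_template",
    "selection_guidance_override": "selection_guidance",
}

def effective_view(canonical_prior: dict, overlay: Optional[dict]) -> dict:
    """Return a merged copy for display; the canonical dict is never modified."""
    view = deepcopy(canonical_prior)
    view["overridden_fields"] = []
    view["deviation_notes"] = ""
    if not overlay:
        return view
    for overlay_key, canonical_key in OVERRIDES.items():
        value = overlay.get(overlay_key)
        if value:  # empty string or None is treated as "no override"
            view[canonical_key] = value
            view["overridden_fields"].append(canonical_key)
    view["deviation_notes"] = overlay.get("deviation_notes") or ""
    return view
```

Returning `overridden_fields` lets the UI mark which values are project-specific, so a reader can still tell canonical truth from local wording.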
Resolver Precedence
- User or upload-scoped overlay for the current part and failure mode.
- Exact canonical part match by manufacturer_id + mpn_normalized.
- Exact canonical alias match from part_aliases.
- Part-specific risk priors from part_risk_priors.
- Family or technology priors from the canonical library.
- Existing family, role, and function-level FMEA knowledge files.
This precedence keeps the library fast and deterministic while still allowing source-backed part knowledge to sharpen the default FMEA behavior.
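The precedence list amounts to an ordered fall-through: try each level in turn and stop at the first one that answers. The sketch below shows that pattern for three of the six levels only (overlay, part-specific priors, family priors), with hypothetical in-memory dictionaries in place of the real tables and invented example strings.

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, str]
Lookup = Callable[[Context], Optional[str]]

def first_match(levels: List[Tuple[str, Lookup]], ctx: Context) -> Optional[Tuple[str, str]]:
    """Walk the levels in order and return (level_name, value) for the first hit."""
    for level_name, lookup in levels:
        value = lookup(ctx)
        if value is not None:
            return level_name, value
    return None

# Illustrative data only; real lookups would query the tables named above.
overlay_notes = {("part_01", "open"): "Board-specific wording."}
part_priors = {
    ("part_01", "open"): "Part-level prior.",
    ("part_02", "open"): "Part-level prior for part_02.",
}
family_priors = {("capacitor", "open"): "Family-level prior."}

LEVELS: List[Tuple[str, Lookup]] = [
    ("overlay", lambda c: overlay_notes.get((c["part_record_id"], c["failure_mode_key"]))),
    ("part_risk_priors", lambda c: part_priors.get((c["part_record_id"], c["failure_mode_key"]))),
    ("family_priors", lambda c: family_priors.get((c["family_key"], c["failure_mode_key"]))),
]
```

Returning the level name alongside the value keeps the result deterministic and explainable: the caller can show which layer supplied the statement.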
Part Detail Response Shape
The browser should not receive anonymous flattened strings. A part detail response should carry canonical values plus visible footnotes and overlay state.
{
  "part": {
    "id": "part_01",
    "manufacturer": "Panasonic",
    "mpn": "EEE-FK1V152XP",
    "family_key": "capacitor",
    "review_status": "reviewed"
  },
  "summary": {
    "description": {
      "value": "Aluminum electrolytic capacitor",
      "footnotes": ["src_12"]
    },
    "voltage_rating": {
      "value": "35 V",
      "footnotes": ["src_12", "src_21"]
    }
  },
  "risk_priors": [
    {
      "failure_mode_key": "open",
      "selection_guidance": "Use ripple and lifetime margin review for bulk-storage roles.",
      "footnotes": ["src_12"]
    }
  ],
  "overlay": {
    "upload_id": "upl_01",
    "risk_statement_override": "",
    "deviation_notes": ""
  }
}
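A response in this shape can be checked mechanically for the footnote requirement. The sketch below is one possible server-side or test-time guard; `fields_missing_footnotes` is a hypothetical helper, and it assumes the `summary` and `risk_priors` keys follow the example exactly.

```python
from typing import List

def fields_missing_footnotes(response: dict) -> List[str]:
    """List summary fields and risk priors that carry no footnote references."""
    missing = [
        f"summary.{name}"
        for name, entry in response.get("summary", {}).items()
        if not entry.get("footnotes")
    ]
    missing += [
        f"risk_priors[{index}]"
        for index, prior in enumerate(response.get("risk_priors", []))
        if not prior.get("footnotes")
    ]
    return missing
```

An empty result means every displayed value in the response points at one or more sources; any returned path identifies a field that would render as an anonymous string.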
Suggested Build Sequence
| Phase | Deliverable | Outcome |
|---|---|---|
| 1 | Canonical tables, aliases, search endpoint, and part browser shell. | Circuit Genome can serve known parts quickly from maintained records. |
| 2 | Discovery job pipeline with source documents and assertions. | Unknown parts can be resolved without contaminating canonical truth. |
| 3 | Footnote drawer, part detail page, and overlay notes. | The user sees source-backed answers plus project-level deviation notes. |
| 4 | BOM decipher jobs and row-level resolver UI. | Parts knowledge becomes useful at BOM scale, not just one part at a time. |
| 5 | Part-backed FMEA priors and risk-statement composition. | Known part behavior starts affecting analysis quality across the platform. |