Methodology

AU-Codex data pipeline and methodology

AU-Codex is built from official source files, which are normalized into JSON and rendered as static pages. The goal is to keep the published pages faithful to the source data while making the hierarchy easier to browse and compare.

Current pipeline stages

  • Source manifest and verified direct-download URLs
  • Raw-source downloader with report output
  • ABS detailed-classification seed URL inventory
  • Parser orchestration and generated JSON exports
  • Static Astro page generation from normalized data
  • XML sitemap generation and noindex handling for utility pages
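
As a rough illustration of the last stage, the sketch below builds a sitemap from a list of pages and leaves out any page flagged noindex. It is not the project's actual generator: the `Page` type, the `build_sitemap` function, the example paths and the `example.org` base URL are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass
class Page:
    """Hypothetical page record; field names are assumptions."""
    path: str      # site-relative path, e.g. "/codes/example/"
    noindex: bool  # True for utility pages that should stay out of the sitemap


def build_sitemap(base_url: str, pages: list[Page]) -> str:
    """Return sitemap XML listing only the indexable pages, sorted by path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in sorted(pages, key=lambda p: p.path):
        if page.noindex:
            continue  # utility pages are not advertised to crawlers
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + page.path
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    demo_pages = [
        Page("/search/", noindex=True),          # hypothetical utility page
        Page("/codes/example/", noindex=False),  # hypothetical entity page
    ]
    print(build_sitemap("https://example.org/", demo_pages))
```

Keeping the noindex flag on the page record means the sitemap and the page's robots directive can be derived from the same value, so the two are less likely to drift apart.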

Validation rules

Data is cross-checked across multiple official or authoritative sources where possible. The project prefers source-preserving normalization over rewriting the source facts into marketing copy.
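
One way such a cross-check could work is sketched below, assuming each source supplies a label for the same code. The function name `cross_check`, the source names and the sample labels are hypothetical. Labels are compared after whitespace and case folding, but the original text from each source is returned unchanged.

```python
def cross_check(code: str, labels_by_source: dict[str, str]) -> dict:
    """Compare the label each source gives for one code.

    Comparison ignores case and runs of whitespace; the returned labels
    are the untouched source strings.
    """
    normalized = {
        source: " ".join(label.split()).casefold()
        for source, label in labels_by_source.items()
    }
    return {
        "code": code,
        "sources": sorted(labels_by_source),
        "agrees": len(set(normalized.values())) <= 1,
        "labels": dict(labels_by_source),  # source text preserved verbatim
    }


if __name__ == "__main__":
    # Placeholder code and labels, not real classification data.
    report = cross_check(
        "0000",
        {"source_a": "Example  Label", "source_b": "example label"},
    )
    print(report["agrees"])
```

A disagreement in this sketch only sets `agrees` to `False`; what happens next (manual review, a note on the page) is left open here.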

If a mapping is approximate or inferred, the page should say so. If a page is a utility page, it should stay noindex rather than compete with core entity pages.
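
These two rules could be expressed as small helpers like the ones below. This is a sketch under assumptions: the quality values (`"exact"`, `"approximate"`, `"inferred"`), the page kind `"utility"`, the notice wording and the function names are all hypothetical, not taken from the project.

```python
from typing import Optional


def mapping_notice(quality: str) -> Optional[str]:
    """Return the notice a page should show for a mapping, or None if exact."""
    if quality == "exact":
        return None
    if quality in ("approximate", "inferred"):
        return f"This mapping is {quality}; check the official source."
    raise ValueError(f"unknown mapping quality: {quality!r}")


def robots_meta(page_kind: str) -> str:
    """Return the robots meta content: utility pages stay out of the index."""
    return "noindex, follow" if page_kind == "utility" else "index, follow"


if __name__ == "__main__":
    print(mapping_notice("inferred"))
    print(robots_meta("utility"))
```

Raising on an unknown quality value, rather than silently treating it as exact, keeps an unlabelled approximation from being published without a notice.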

Source and trust

Primary sources: ABS and ATO publications
Last reviewed: 2026-04-17

This site is an independent reference resource. It is not affiliated with, endorsed by, or connected to the ABS, the ATO, or any other Australian Government agency.

Please verify critical classification decisions with the official authority before using them for tax, payroll, licensing, immigration or compliance work.
