30 April 2026 · 11 min read
PII Tokenisation: How Zero-Knowledge AI Actually Works in Production
Australian firms cannot send client PII to overseas AI models in plaintext, and yet the AI needs context to be useful. The privacy router pattern resolves the tension. Here's how it works in production, what it costs, and what it cannot protect against.
Roman Silantev — Founder, AI Lab Australia
The contradiction at the heart of business AI
Every Australian professional services firm using AI is operating against a contradiction. To be useful, the AI needs to read the firm's actual data — the client's name in the email, the participant's NDIS number on the invoice, the patient's medication chart, the borrower's TFN on the loan application. To be safe under the Privacy Act 2026 ADM transparency provisions, the firm has to be able to explain what the AI did with that data. To be safe under common sense, the firm shouldn't be sending the data to a third-party model provider in plaintext, where it can be retained, logged, fine-tuned on, or breached.
The industry's answer to this contradiction has typically been one of three things. Some firms accept the risk and send the data anyway, banking on the model provider's data-handling promises. Some firms refuse to use AI at all, accepting the operational cost. Some firms attempt to redact PII before each call — manually, or with regex tools — accepting the loss of context. None of these is satisfactory for a firm that wants both usefulness and defensibility.
What tokenisation does
Tokenisation is the structural answer. Before any client data reaches an AI model, it passes through a privacy router that detects personal information and replaces it with reversible tokens. The model receives TKN_a8f2c1 instead of John Smith, TKN_b3d7e9 instead of 0412 345 678, TKN_c4f1a2 instead of TFN 123 456 789. The model still has the structure of the input — it knows that this is a name, a phone number, a tax file number — and it can reason about what to do with that structure. What it does not have is the actual personal data.
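To make the boundary concrete, here is a minimal TypeScript sketch. The detector is a single toy regex for Australian mobile numbers, the TKN_ format simply mirrors the examples above, and none of the names are SydClaw's actual API; the point is the shape of the pass: plaintext in, tokens out, mapping kept private.

```typescript
// Sketch of a privacy-router tokenisation pass. The detector here is a
// single toy regex; the real detector chain is described under
// "The seventeen categories" below.

type Detection = { start: number; end: number; category: string; value: string };

const MOBILE = /\b04\d{2} ?\d{3} ?\d{3}\b/g;

function detectPII(text: string): Detection[] {
  const out: Detection[] = [];
  for (const m of text.matchAll(MOBILE)) {
    out.push({ start: m.index!, end: m.index! + m[0].length, category: "phone", value: m[0] });
  }
  return out;
}

function tokenise(text: string, mapping: Map<string, string>): string {
  // Replace right-to-left so earlier offsets stay valid while splicing.
  const detections = detectPII(text).sort((a, b) => b.start - a.start);
  let result = text;
  for (const d of detections) {
    // Reuse the token for a value already seen in this matter, so the model
    // can still tell that two mentions refer to the same entity.
    let token = mapping.get(d.value);
    if (!token) {
      token = `TKN_${crypto.randomUUID().replaceAll("-", "").slice(0, 6)}`;
      mapping.set(d.value, token);
    }
    result = result.slice(0, d.start) + token + result.slice(d.end);
  }
  return result; // this string is what the model provider sees
}

// tokenise("Call the client on 0412 345 678", new Map())
//   -> "Call the client on TKN_<6 hex chars>"
```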
The reverse-mapping table that converts tokens back to plaintext is held only inside the firm's own infrastructure, encrypted with the firm's per-organisation key. When the AI's output comes back through the router, the tokens are detokenised to plaintext, but only inside the user's authenticated browser session. At no point does the model provider see the actual personal data; the data is structurally meaningless to them.
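The reverse step, sketched under the same caveats. fetchMatterMapping is a stand-in for an authenticated, matter-scoped call into the firm's own infrastructure; the canned return value just keeps the sketch runnable.

```typescript
// Client-side detokenisation sketch: tokens become plaintext only inside the
// user's authenticated browser session.

async function fetchMatterMapping(matterId: string): Promise<Map<string, string>> {
  // Placeholder: in production this would be an authenticated, matter-scoped
  // request into the firm's own infrastructure.
  return new Map([["TKN_a8f2c1", "John Smith"]]);
}

async function detokenise(modelOutput: string, matterId: string): Promise<string> {
  const mapping = await fetchMatterMapping(matterId);
  // Unresolved tokens are left in place rather than guessed at.
  return modelOutput.replace(/TKN_[0-9a-f]{6}/g, (token) => mapping.get(token) ?? token);
}

// detokenise("Dear TKN_a8f2c1, ...", "matter-123") -> "Dear John Smith, ..."
```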
In architectural terms, this is zero-knowledge AI. The platform operates on a zero-knowledge basis with respect to its model providers — they process tokens, and the only place the tokens are meaningful is inside the firm's own dedicated infrastructure.
The seventeen categories
SydClaw's privacy router detects seventeen categories of personal information: full names, given names, surnames, email addresses, phone numbers, postal addresses, dates of birth, ABNs, ACNs, TFNs, Medicare numbers, NDIS participant numbers, bank account numbers, BSB numbers, driver licence numbers, passport numbers, and IP addresses. The detection runs as a chain of detectors — high-confidence regex for structured identifiers (TFNs, NDIS numbers, ABNs all have well-defined formats), context-aware named-entity recognition for less structured ones (names, addresses), and a final pass that resolves overlapping detections to the highest-confidence interpretation.
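As an illustration of one link in that chain: the TFN check-digit weighting is public (the weighted digit sum must be divisible by 11), so a structured detector can validate candidates rather than merely pattern-match. The confidence values and the overlap-resolution pass below are illustrative, not SydClaw's implementation.

```typescript
// Sketch of the detector-chain idea: one high-confidence structured detector
// (9-digit TFNs, validated with the published check-digit weighting), plus a
// final pass that resolves overlapping detections to the best interpretation.

type Span = { start: number; end: number; category: string; confidence: number };

const TFN_WEIGHTS = [1, 4, 3, 7, 5, 8, 6, 9, 10];

function isValidTfn(digits: string): boolean {
  if (!/^\d{9}$/.test(digits)) return false;
  const sum = [...digits].reduce((acc, d, i) => acc + Number(d) * TFN_WEIGHTS[i], 0);
  return sum % 11 === 0; // weighted sum divisible by 11
}

function detectTfns(text: string): Span[] {
  const out: Span[] = [];
  // Nine digits, optionally spaced 3-3-3 as on most forms.
  for (const m of text.matchAll(/\b(\d{3}) ?(\d{3}) ?(\d{3})\b/g)) {
    if (isValidTfn(m[1] + m[2] + m[3])) {
      out.push({ start: m.index!, end: m.index! + m[0].length, category: "tfn", confidence: 0.99 });
    }
  }
  return out;
}

// Where detections overlap (say, an NER "number-like" span over a
// checksum-valid TFN), keep only the highest-confidence interpretation.
function resolveOverlaps(spans: Span[]): Span[] {
  const kept: Span[] = [];
  for (const s of [...spans].sort((a, b) => b.confidence - a.confidence)) {
    if (!kept.some((k) => s.start < k.end && k.start < s.end)) kept.push(s);
  }
  return kept;
}
```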
The categories were not chosen academically. Each one is a category of data that, if leaked, would constitute a notifiable data breach under the OAIC's threshold tests. The list is calibrated to the regulatory reality, not to a generic privacy framework.
Why reversible, not destructive
The tokens are reversible by design. An alternative architecture would simply destroy the personal information at the privacy boundary: replace it with [REDACTED] and never look back. That approach is more secure, but it is useless for any workflow where the output needs to reach a human in plaintext. Drafting an email to John Smith is operationally meaningless if the AI's output reads "Dear [REDACTED]".
The reversibility is bounded carefully. The reverse-mapping table is held only inside the firm's own Supabase project, with column-level encryption using a separate key from the bulk database key. RESTRICTIVE row-level security policies prevent UPDATE and DELETE on the mapping table, even by service-role authentication. The detokenisation happens inside the user's authenticated browser session, never on a server that an external party could compromise. And the mapping table is segregated by client matter, so a token resolved in the context of Client A cannot be resolved in the context of Client B.
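One way to picture the matter segregation, as a sketch with an invented schema rather than SydClaw's actual tables: every mapping row carries the matter it was minted under, and resolution treats a cross-matter hit as a plain miss.

```typescript
// Illustrative matter-scoped mapping: a token minted under one client matter
// is unresolvable under any other.

type MappingRow = { token: string; matterId: string; ciphertext: string };

function decryptWithOrgKey(ciphertext: string): string {
  // Placeholder for column-level decryption with the firm's per-organisation
  // key; returning the input keeps the sketch runnable.
  return ciphertext;
}

function resolveToken(rows: MappingRow[], token: string, currentMatterId: string): string | null {
  const row = rows.find((r) => r.token === token);
  // A hit under a different matter is reported as a plain miss, so the
  // caller cannot even learn that the token exists elsewhere.
  if (!row || row.matterId !== currentMatterId) return null;
  return decryptWithOrgKey(row.ciphertext);
}
```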
Where it doesn't help
Tokenisation is a layer; it is not the whole defence. There are categories of leakage it does not protect against, and being honest about those is part of what makes the architecture defensible.
It does not protect against the AI echoing tokenised references, and the structure around them, in its output. If the AI receives "the client TKN_a8f2c1 lives at TKN_e7d2b9" and writes back "the client lives at the address mentioned in TKN_e7d2b9", the structural information is still there, even if the literal value is not. We mitigate this with output-rail screening that flags any AI response containing token references and routes it for review, but the mitigation is probabilistic, not absolute.
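A simplified sketch of that output rail, assuming the trigger is the presence of raw token references in a response; the real screening and review routing are more involved.

```typescript
// Output-rail sketch: any model response still carrying token references is
// held for human review instead of being auto-released.

const TOKEN_REF = /TKN_[0-9a-f]{6}/g;

function screenOutput(response: string): { autoRelease: boolean; flaggedTokens: string[] } {
  const flaggedTokens = [...new Set(response.match(TOKEN_REF) ?? [])];
  return { autoRelease: flaggedTokens.length === 0, flaggedTokens };
}

// screenOutput("the client lives at the address mentioned in TKN_e7d2b9")
//   -> { autoRelease: false, flaggedTokens: ["TKN_e7d2b9"] }
```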
It does not protect against side-channel inference. A skilled adversary with access to the model provider's logs could, in theory, infer information from the patterns of token usage — same token appearing in many requests, statistical correlations between tokens and observed outcomes. The risk is small but not zero. The Anthropic, OpenAI, and Microsoft data-handling agreements all prohibit this kind of analysis, and we monitor model-provider behaviour for any indication it is happening, but the technical defence stops at tokenisation.
It does not protect against an authenticated user inside the firm exfiltrating the mapping table. That risk is a different category and is addressed through different controls — RBAC, MFA, audit logging, access reviews. The privacy router protects against external leakage, not against internal misuse.
What it costs
Tokenisation costs latency and accuracy. Each request adds approximately 80 to 250 milliseconds for the privacy-router pass: detection, tokenisation, lookup against the existing mapping table, and persistence of any new tokens. The cost is dominated by the named-entity recognition for unstructured fields; the structured detectors are fast.
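A hedged instrumentation sketch of that pass, with the stage bodies stubbed out; the useful part is attributing the latency per stage.

```typescript
// Wrap each router stage with a timer so the 80-250 ms can be attributed.
// Stage bodies here are stubs; in practice NER dominates "detect".

type Timings = Record<string, number>;

async function timed<T>(timings: Timings, name: string, fn: () => Promise<T>): Promise<T> {
  const t0 = performance.now();
  try {
    return await fn();
  } finally {
    timings[name] = performance.now() - t0;
  }
}

async function routerPass(text: string): Promise<{ tokenised: string; timings: Timings }> {
  const timings: Timings = {};
  await timed(timings, "detect", async () => {});                              // NER + regex chain
  const tokenised = await timed(timings, "tokenise+lookup", async () => text); // mapping-table lookups
  await timed(timings, "persist", async () => {});                             // write newly minted tokens
  return { tokenised, timings };
}
```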
The accuracy cost is harder to quantify. Replacing John Smith with TKN_a8f2c1 removes the model's ability to draw on contextual knowledge about the name (which is rarely useful) and weakens its ability to recognise that differently worded mentions of the same person refer to the same entity (which is sometimes useful). For most professional services workflows (email triage, invoice processing, BAS preparation, claim drafting) the impact is negligible, because the workflows are about structure and process, not about who the person is. For workflows that genuinely require named-entity reasoning (relationship mapping, sentiment analysis tied to specific individuals), tokenisation degrades the AI's quality, and we make that explicit when scoping those workflows with a client.
What it enables
Tokenisation is what makes Australian-residency PII handling under the Privacy Act 2026 actually work for an AI workforce platform. Without it, a firm using SydClaw would be sending client personal data to overseas model providers and trusting their data-handling practices. With it, the firm can demonstrate to a regulator that personal data structurally never leaves the firm's own infrastructure in identifiable form.
The Privacy Act 2026 ADM transparency requirement asks: explain how this automated decision was made. SydClaw's audit trail captures the prompt template, the tokens that were passed in, the model's response, and the reviewer who approved the result. The reviewer can, in their authenticated session, see the detokenised version. The regulator, if they request the audit log, can see the tokenised version. The chain of custody is unambiguous, and the explanation is reproducible.
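Sketched as a data shape (field names are illustrative, not SydClaw's schema), such a record might look like this:

```typescript
// Illustrative shape of an ADM audit record: enough to reproduce the
// explanation without holding plaintext PII in the log itself.

interface AdmAuditRecord {
  requestId: string;
  promptTemplateId: string;  // which template produced the prompt
  tokensPassed: string[];    // e.g. ["TKN_a8f2c1", "TKN_e7d2b9"]
  model: string;             // provider and model version used
  tokenisedResponse: string; // the version a regulator sees in the audit log
  reviewerId: string;        // who approved the detokenised result
  approvedAt: string;        // ISO 8601 timestamp of the approval
}
```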
For NDIS providers operating under the Quality and Safeguards Commission, tokenisation is what makes participant-scoped AI safe. A worker can draft a service delivery record from a voice memo in 30 seconds; the AI sees TKN_p4d9c2 instead of the participant's name; the SDR is detokenised back to plaintext only when the worker reviews and signs. The participant's identity does not transit any external service in plaintext.
What we'd build differently
If we were building the privacy router from scratch today, we would invest more heavily in detection accuracy for low-incidence categories. The high-incidence ones (names, emails, phone numbers, TFNs) have detection rates above 99%; the lower-incidence ones (driver licence numbers in unusual formats, foreign passport numbers, edge-case addresses) sit closer to 95%. The 95% is acceptable for the categories where the consequence of a miss is contained — a name that slips through is often not actually personally identifying in context. It is less acceptable for categories where any miss is significant.
We would also expose the mapping-table operations more clearly to the firm's audit log. The current design captures every tokenisation event, but the export is more verbose than it needs to be; a firm asking "what personal data has been tokenised for this matter" should see a one-page summary, not a 200-row CSV. That is on the roadmap and not yet shipped.
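A sketch of what that summary could collapse to, assuming the audit log records one row per tokenisation event:

```typescript
// Collapse per-event tokenisation logs into the one-page answer a firm
// actually wants: categories and distinct-token counts for a matter.

type TokenEvent = { matterId: string; category: string; token: string };

function summariseMatter(events: TokenEvent[], matterId: string): Record<string, number> {
  const summary: Record<string, number> = {};
  const seen = new Set<string>();
  for (const e of events) {
    if (e.matterId !== matterId || seen.has(e.token)) continue;
    seen.add(e.token); // count distinct tokens, not repeat uses
    summary[e.category] = (summary[e.category] ?? 0) + 1;
  }
  return summary; // e.g. { name: 3, tfn: 1, address: 2 }
}
```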
The architecture is correct. The implementation keeps accumulating the refinements that only real client workflows force, fixes for issues that don't surface until production. We document those in the security review pack we share with prospects under NDA, and we update the pack as we learn.
About the author
Roman Silantev — Founder, AI Lab Australia. Roman is the founder of AI Lab Australia Pty Ltd, the company that builds and operates SydClaw. He has spent the last decade building enterprise software for Australian professional services firms, and writes regularly on AI compliance and Privacy Act obligations.