Nobody tells you about the questionnaires.
When you decide to build an integration with NHS systems, you imagine the technical challenges: APIs, authentication, data formats, security. What you don't imagine is a 73-question technical conformance document that asks you to describe, in detail, how your application handles every conceivable edge case, error condition, and security scenario. And that's just one of several forms.
I built an NHS MESH integration for a healthcare referrals platform, and Claude was involved in virtually every step - from the initial API exploration through to filling out the compliance paperwork. This is the story of using AI to navigate government-grade technical requirements, and it's probably the most unexpected use case in this entire series.
What MESH Actually Is
MESH is the NHS's Message Exchange for Social Care and Health. It's essentially a secure mailbox system that NHS organisations use to exchange messages and files. If you're building a system that needs to exchange data with NHS services - referrals, results, notifications - MESH is likely part of your integration path.
The concept is simple. The implementation is not. MESH involves TLS mutual authentication with NHS-issued certificates, specific message formats, chunked message handling for large payloads, and a workflow that goes through sandbox testing, conformance testing, and a witness test before you're allowed anywhere near a live environment.
I went into this knowing it would be complex. I underestimated by roughly an order of magnitude.
Certificates and the Docker Problem
The first real hurdle was TLS certificates. NHS MESH requires mutual TLS - your application presents a client certificate to the MESH server, and the server presents its certificate to you. Both sides verify each other. This is standard for government APIs, and it's fundamentally different from normal HTTPS where only the server presents a certificate.
Getting the certificates configured correctly was its own project. The NHS provides certificates in specific formats that need to be imported, trusted, and configured in your application. On a development machine, this is fiddly but manageable. Inside a Docker container, which is how the application was going to run in production, it's a different beast entirely.
Claude was genuinely helpful here. Certificate management in Docker involves configuring the container's trust store, mounting certificate files correctly, and ensuring the .NET HTTP client picks up the right client certificate. There are about fifteen different ways to get this wrong, and the error messages are spectacularly unhelpful. "The remote certificate was rejected by the provided RemoteCertificateValidationCallback" tells you something went wrong but gives you no clue which of the fifteen things it might be.
We worked through this iteratively. Claude would suggest a configuration, I'd try it, it would fail with a cryptic error, I'd paste the error back, and we'd narrow down the issue. It took longer than I'd like to admit, but every iteration got closer. The final Docker configuration for certificate handling was something I'm confident I'd have spent days figuring out alone.
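The application itself was .NET, but the shape of mutual TLS is the same in any stack: load the NHS-issued client certificate alongside the CA chain you trust for the server. Here's a minimal Python sketch of that shape - the function name and certificate paths are my own illustration, not the actual configuration we shipped:

```python
import ssl

def make_mesh_ssl_context(client_cert: str, client_key: str, ca_bundle: str) -> ssl.SSLContext:
    """Build an SSL context for mutual TLS.

    All paths are hypothetical: the NHS issues the client certificate and
    key, and the CA bundle anchors trust in the MESH server's certificate.
    """
    # Verify the *server's* certificate against the NHS CA chain.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_bundle)
    # Present *our* client certificate - this is the "mutual" half of mutual TLS.
    context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context
```

In the Docker case, the extra step is making sure those certificate files are mounted into the container and, for the server-side chain, installed into the container's trust store - which is exactly where most of our cryptic errors came from.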
SNOMED Codes and Healthcare Data
MESH messages in healthcare contexts use SNOMED CT codes - a standardised clinical terminology system. When you're building a referrals platform, you need to map your internal concepts to SNOMED codes so that receiving systems understand what you're sending.
This is the kind of task where AI shines. SNOMED CT is enormous - hundreds of thousands of concepts. Finding the right code for a specific clinical concept manually means searching through the NHS terminology browser, understanding the hierarchy, and choosing between codes that seem similar but have specific meanings in specific contexts.
Claude knew the SNOMED hierarchy. When I described a clinical concept we needed to encode, it could suggest the appropriate code and explain where it sat in the hierarchy and why it was the right choice over similar-looking alternatives. I verified every suggestion against the official browser, and it was right more often than I expected. Not always - there were cases where the specific usage context in NHS Digital's implementation guide overrode the generic SNOMED meaning, and Claude couldn't know that without seeing the guide. But as a starting point, it saved hours of manual searching.
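In practice the output of that work is a small, heavily verified mapping table. A sketch of the pattern, with a deliberately tiny illustrative mapping - the two codes shown are real SNOMED CT concepts, but any production mapping needs every entry checked against the NHS terminology browser:

```python
# Illustrative mapping from internal referral concepts to SNOMED CT codes.
# Codes that look similar can carry very different clinical meanings, so
# each entry here must be verified against the official terminology browser.
SNOMED_CODES = {
    "diabetes": ("73211009", "Diabetes mellitus (disorder)"),
    "asthma": ("195967001", "Asthma (disorder)"),
}

def to_snomed(internal_concept: str) -> tuple[str, str]:
    """Resolve an internal concept to a (SNOMED code, preferred term) pair."""
    try:
        return SNOMED_CODES[internal_concept]
    except KeyError:
        # Fail loudly rather than send an unmapped concept to a receiving system.
        raise ValueError(f"No SNOMED mapping for {internal_concept!r}")
```

Failing loudly on an unmapped concept matters more here than in most systems: a silently wrong or missing code ends up in someone's clinical record downstream.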
The Path to Live
NHS integrations have a formal path to live. You don't just build something and deploy it. You go through stages: development against sandbox APIs, conformance testing where you prove your implementation handles all the required scenarios, and a witness test where someone from NHS Digital watches you run through the test cases.
Claude helped me navigate this entire process. Not just the technical implementation, but understanding what each stage required, what documentation needed to be produced, and what the common failure points were. It was like having a colleague who'd been through the process before - not a guarantee of getting it right, but a significant reduction in the number of surprises.
The sandbox testing phase involved sending and receiving test messages through MESH's sandbox environment. The messages had to be correctly formatted, properly authenticated, and handled according to the specification. Chunked message handling - where large messages are split into multiple parts and reassembled by the receiver - was particularly tricky. The specification describes the chunking algorithm, but the edge cases around reassembly order and error handling during partial delivery required careful implementation.
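The core of the reassembly problem looks something like this - a sketch modelled loosely on MESH-style "chunk n of m" numbering, with the details simplified for illustration:

```python
class ChunkReassembler:
    """Reassemble a message delivered as numbered parts, possibly out of order.

    Loosely modelled on MESH-style chunking, where each part carries a
    "chunk n of m" marker; the specifics here are illustrative.
    """

    def __init__(self) -> None:
        self._parts: dict[int, bytes] = {}
        self._total: int | None = None

    def add_chunk(self, index: int, total: int, payload: bytes) -> None:
        if self._total is None:
            self._total = total
        elif total != self._total:
            # A different chunk count mid-transfer means a mixed-up delivery.
            raise ValueError(f"chunk count changed: {total} != {self._total}")
        if not 1 <= index <= total:
            raise ValueError(f"chunk index {index} out of range 1..{total}")
        self._parts[index] = payload

    def is_complete(self) -> bool:
        return self._total is not None and len(self._parts) == self._total

    def assemble(self) -> bytes:
        if not self.is_complete():
            missing = sorted(set(range(1, (self._total or 0) + 1)) - set(self._parts))
            raise ValueError(f"message incomplete, missing chunks: {missing}")
        # Reassemble by chunk index, not arrival order.
        return b"".join(self._parts[i] for i in range(1, self._total + 1))
```

The edge cases the specification leaves to you are exactly the ones this sketch has to handle explicitly: chunks arriving out of order, a transfer that never completes, and detecting which parts are missing so the error is actionable.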
73 Questions
Then came the PDS FHIR API technical conformance questionnaire. Seventy-three questions about how your application handles the Personal Demographics Service FHIR API. Questions about error handling, data validation, consent management, audit logging, rate limiting, timeout handling, and about forty other topics.
Each question required a specific, technical answer about your implementation. Not generic boilerplate - the assessors want to know exactly how your application handles each scenario. "What happens when the API returns a 429 rate limit response?" needs an answer like "The application implements exponential backoff with a base delay of 1 second, maximum 3 retries, logging each retry attempt with the Retry-After header value" - not "We handle rate limiting appropriately."
I gave Claude the questionnaire and our codebase context. It drafted answers for all 73 questions based on the actual implementation. Were they all perfect? No. Some needed adjustment because they described what the code should do rather than what it actually did, and there's always a gap between those two things. But as a starting point, having 73 technically detailed draft answers that I could review and correct was enormously more efficient than writing each one from scratch.
The alternative was spending two or three full days writing detailed technical answers to questions about my own code. Claude got me to a reviewable first draft in about an hour. The review and corrections took another couple of hours. That's a genuine time saving of days, on a task that adds no features and no value to users but is an absolute gate to going live.
CIS2 Authentication: The Other Headache
As if MESH wasn't enough, the platform also needed CIS2 authentication - the NHS's identity service that allows healthcare professionals to log in using their smartcards. This is an OAuth-based flow, but it's NHS-flavoured OAuth, which means specific token endpoints, specific claim types, and specific identity verification requirements.
The OAuth flow itself was manageable. The debugging was not. We hit an "identity not found in token" error that was particularly maddening because the token clearly contained an identity - we could decode it and see the claims. The issue turned out to be a mismatch between the claim type our code was looking for and the claim type CIS2 was actually providing. Different naming conventions between what the documentation described and what the live system returned.
Claude helped narrow this down by analysing the decoded token and comparing it against the CIS2 specification. It suggested checking the specific claim URIs rather than the short names, which led us to the discrepancy. A human developer would have got there eventually, but having an AI that could hold the entire token structure, the specification, and our code in context simultaneously made the debugging significantly faster.
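The fix amounted to looking the claim up under every name it might arrive as. A sketch of that lookup - the claim names below are illustrative, not the actual CIS2 claim types, which must come from the current CIS2 documentation:

```python
def find_identity_claim(claims: dict) -> str:
    """Look up the user identity in a decoded token, trying both short
    claim names and full claim URIs. All names here are illustrative."""
    candidates = [
        "sub",                                    # short OIDC-style name
        "nhsid_useruid",                          # hypothetical NHS-flavoured short name
        "https://fs.nhs.uk/identity/claims/uid",  # hypothetical full claim URI
    ]
    for name in candidates:
        if name in claims:
            return claims[name]
    # Name the candidates in the error so the next debugger isn't guessing.
    raise KeyError(f"identity not found in token; looked for {candidates}")
```

The general lesson: when a token "clearly contains" the value you need, the bug is usually in which key you're reading it under, not in the token.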
What I'd Tell Someone Building an NHS Integration
First: budget more time than you think. The technical implementation is maybe 40% of the work. The compliance, documentation, testing, and approval process is the other 60%. Nobody warns you about this ratio because the people who know it are too tired from going through it to write blog posts.
Second: AI is remarkably good at government compliance work. Not because compliance is easy - it's not. But because compliance involves processing large amounts of specification text, mapping requirements to implementation details, and producing structured documentation. That's exactly what AI does well.
Third: the domain-specific knowledge still matters. Claude knew OAuth, knew TLS, knew FHIR, knew SNOMED. But it didn't always know the NHS-specific quirks, the undocumented behaviours, or the gaps between what the specification says and what the live system does. You still need someone who understands the domain to validate the AI's work.
Fourth: keep the AI conversation going across the whole process. Don't use it just for coding and then switch to manual work for compliance. The conformance questionnaire, the test documentation, the deployment procedures - AI adds value across all of it.
Compliance as a Use Case Nobody Talks About
Most of the AI conversation is about building things faster. Writing code, generating content, automating workflows. The compliance use case is different. It's not about building faster - it's about navigating bureaucracy more efficiently.
Government APIs come with government-grade documentation requirements. Healthcare systems come with healthcare-grade security requirements. Financial services come with regulatory-grade audit requirements. All of these involve processing large volumes of specification text, answering detailed questionnaires, and producing documentation that proves your implementation meets the requirements.
This is tedious, time-consuming, and absolutely essential. It's also exactly the kind of work where AI can save you days of effort without sacrificing quality. The 73-question conformance questionnaire is the most dramatic example, but the entire path-to-live process was smoother with AI assistance than it would have been without.
If you're building integrations with government or regulated systems, don't just use AI for the code. Use it for the compliance. That might be where it saves you the most time.