Most financial services firms are buying AI the same way they bought SaaS in 2018. They send a security questionnaire, check a SOC 2 report, ask about data residency, and call it procurement. That was fine for a document storage tool. It is nowhere near enough for a system that reads client data, makes decisions, and acts inside a regulated workflow.
The questions that actually matter are not in the standard vendor pack.
This is the questionnaire we send on behalf of clients. It is opinionated. It is written for Australian firms operating under ASIC, APRA, AUSTRAC, OAIC and Privacy Act obligations. Steal it. Adapt it. Send it before the contract, not after the pilot.
Why the standard security questionnaire is not enough
A traditional vendor review asks whether the supplier can keep your data safe. That is still a real question. It is not the question that separates good AI vendors from bad ones.
AI vendors fail in different ways.
They fail because their model was trained on data you cannot inspect. They fail because the "AI" is a thin wrapper over someone else's API and the real policy lives four layers down the stack. They fail because the product worked in the demo on clean test data and collapses on the messy reality of a real CRM. They fail because nobody wrote down what happens when the model returns an unexpected output and the workflow carries on regardless.
None of those failure modes show up in a SOC 2 report.
The questionnaire
Seven sections. Forty-one questions. If a vendor cannot answer most of them clearly, that is the answer.
1. Model and provider transparency
- Which underlying foundation models does your product use, and who operates them?
- For each model, where is it hosted, which region, and which entity holds the contract with the model provider?
- Do you use the same model version for every customer, or does each customer get a pinned version?
- When a model is retired or updated, how are customers notified and what is the migration path?
- Is our data used, in any form, to train, fine-tune, or improve any model you or your upstream providers operate?
- If fine-tuning is offered, is the resulting model isolated to our tenant, or shared?
2. Data handling
- What data does the product collect, derive, or infer beyond what the user explicitly provides?
- Where is our data stored at rest, and where does it flow in transit?
- What is your data retention policy, broken down by data type (prompts, outputs, logs, embeddings, telemetry)?
- Can you provide a data flow diagram that shows every third party our data touches during a typical workflow?
- Is our data segregated from other customers' data logically, physically, or both?
- Can we request deletion of specific records, specific conversations, or all of our data, and how quickly can you action it?
3. Identity, permissions, and containment
- How does your product authenticate users, and does it support SSO with our identity provider?
- Can permissions be scoped to specific document types, client segments, or workflow steps?
- Can an administrator run the product in read-only mode for an individual user or team?
- What credentials does the product use to access our internal systems, and can those credentials be rotated or revoked without a support ticket?
- If a user is disabled in our identity provider, how quickly does access revoke, and is there a test we can run to verify?
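The last question above has a concrete shape. A minimal sketch, assuming you can export a user list from both your identity provider and the vendor's admin console (the record shapes below are hypothetical): compare disabled IdP accounts against the vendor's active accounts and flag anyone whose access has not actually revoked.

```python
# Sketch: cross-check IdP deprovisioning against vendor access.
# The record shapes are hypothetical; substitute real exports from
# your identity provider and the vendor's user-admin API.

def stale_vendor_access(idp_users, vendor_users):
    """Return emails still active in the vendor after IdP disablement."""
    disabled = {u["email"] for u in idp_users if not u["enabled"]}
    return sorted(
        v["email"] for v in vendor_users
        if v["active"] and v["email"] in disabled
    )

idp = [
    {"email": "a.lee@firm.example", "enabled": False},  # left last month
    {"email": "b.ng@firm.example", "enabled": True},
]
vendor = [
    {"email": "a.lee@firm.example", "active": True},    # should be revoked
    {"email": "b.ng@firm.example", "active": True},
]

print(stale_vendor_access(idp, vendor))  # → ['a.lee@firm.example']
```

Run this on a schedule and a non-empty result is your answer to "is there a test we can run to verify".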
4. Audit and traceability
- For every AI-generated output, can we retrieve the source data that was used, the prompt that was issued, the model that responded, and the user or agent that reviewed it?
- Are audit logs immutable and timestamped, and how long are they retained?
- Can audit logs be exported on demand in a structured format we can feed into our own systems?
- If we receive a regulator request or a client complaint, what is the process and timeline for producing a complete audit trail for a single client interaction?
- Does the product record prompt-level telemetry, and if so, is that telemetry part of our tenant or yours?
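As a benchmark for what "complete audit trail" means in practice, here is the minimum record we would expect per AI-generated output, sketched as a structured log entry. The field names are our suggestion, not any vendor's actual schema.

```python
import json
from datetime import datetime, timezone

# Illustrative minimum audit record for one AI-generated output.
# Field names are a suggested baseline, not any vendor's real schema.
audit_record = {
    "event_id": "evt-000123",
    "timestamp": datetime(2026, 1, 5, 3, 14, 7, tzinfo=timezone.utc).isoformat(),
    "model": {"provider": "example-provider", "name": "example-model",
              "version": "2026-01-01"},
    "prompt": "Summarise the client's insurance cover from the attached SOA.",
    "source_documents": ["crm://client/4821/soa-2025.pdf"],
    "output_sha256": "…",                # hash of the output, stored separately
    "reviewed_by": "adviser.j.smith",    # the user or agent that reviewed it
    "action_taken": "inserted_into_file_note",
}

# Exported as JSON lines, records like this feed straight into your own log store.
print(json.dumps(audit_record, indent=2))
```

If a vendor cannot map their telemetry onto something like this per output, they cannot answer a regulator request per client interaction either.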
5. Agent behaviour and boundaries
- Can the product take actions in our systems, or only produce text?
- If it can take actions, what actions, in which systems, under what conditions?
- How is the set of allowed actions configured, and can it be tightened without a release cycle?
- What happens when the model returns an output the workflow cannot parse, a contradictory output, or an unexpected output?
- What is the default behaviour when a downstream system is unavailable?
- Is there an explicit "low confidence" threshold, and how is it surfaced to the user?
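The questions about allowed actions, unparseable outputs, and confidence thresholds all point at one pattern. A minimal sketch of what a good answer looks like, assuming the vendor exposes something like an action allowlist and a confidence score (both hypothetical here): anything not explicitly allowed, below threshold, or unparseable falls through to human review rather than carrying on.

```python
# Sketch of the fail-safe dispatch pattern we want vendors to describe.
# The allowlist, threshold, and action names are all hypothetical.

ALLOWED_ACTIONS = {"draft_file_note", "create_task"}  # tighten without a release
CONFIDENCE_THRESHOLD = 0.8

def dispatch(model_output):
    """Route a model output: execute only known, high-confidence actions."""
    action = model_output.get("action")          # missing/unparseable → None
    confidence = model_output.get("confidence", 0.0)
    if action not in ALLOWED_ACTIONS:
        return ("escalate", "action not on allowlist")
    if confidence < CONFIDENCE_THRESHOLD:
        return ("escalate", "below confidence threshold")
    return ("execute", action)

print(dispatch({"action": "draft_file_note", "confidence": 0.93}))
print(dispatch({"action": "send_client_email", "confidence": 0.99}))  # escalates
print(dispatch({"action": "create_task", "confidence": 0.41}))        # escalates
```

The key design choice is the default: an unknown action escalates, it does not execute. A vendor whose default is the reverse has answered section 5 for you.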
6. Governance alignment
- How does your product support our Corporations Act section 912A obligation to provide financial services efficiently, honestly and fairly, where it sits inside a licensed workflow?
- For APRA-regulated customers, how does your product align with CPS 230 (operational resilience) and CPS 234 (information security)?
- For AUSTRAC-regulated workflows, how does your product support AML/CTF controls, including the 31 March 2026 reform deadlines?
- How does your product handle personal information under the Privacy Act and OAIC guidance on generative AI?
- If you process data in a jurisdiction outside Australia, what is the legal basis, and how are cross-border data flows disclosed to our clients?
- Are you prepared to be named in our CPS 230 service provider register if applicable, and will you accept contractual obligations accordingly?
7. Continuity, incident, and change
- What is your incident response process for a model failure, a data breach, or a significant output error, and what notification timelines apply to us?
- How is model drift detected, and what is the threshold that triggers a customer notification?
- What is your uptime commitment, and how is "uptime" defined when the model is degraded but technically responsive?
- If your company is acquired, wound down, or changes model providers, what is our contractual exit path and data portability guarantee?
- Are contracts with your model providers passed through to us, or held by you, and do they align with our licensee or prudential obligations?
- What is the most critical thing a customer has asked you to change in the last twelve months, and did you change it?
- Who at your organisation is accountable if the product produces an output that causes client harm, and what does accountability mean in practice?
How to use it
Do not send all 41 questions on the first email. You will not get useful answers.
Send the questionnaire in two passes.
First pass. Sections 1, 2, and 6. Model transparency, data handling, and governance alignment. These three sections will tell you, within a week, whether the vendor is a serious fit for a regulated financial services environment. If the answers are evasive, generic, or routed through a sales channel that cannot answer technical questions, that is the answer.
Second pass. Sections 3, 4, 5, and 7. Send these only to vendors who passed the first cut. Identity, audit, agent behaviour, and continuity. These are the questions that separate a vendor who built for enterprise from one who bolted enterprise language onto a startup product.
At least half of the vendors you approach will not be able to answer all 41 questions clearly. That is useful information. Score them, rank them, and keep the record.
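Scoring and ranking can live in a spreadsheet, but a sketch makes the idea concrete (the weights and scores below are made up): score each section's answers 0 to 2, weight the sections you care most about, rank, and keep the record.

```python
# Sketch: score vendor responses per section (0 = evasive, 1 = partial,
# 2 = clear), weight the sections, rank. Weights and scores are illustrative.

SECTION_WEIGHTS = {"model": 3, "data": 3, "governance": 3,
                   "identity": 2, "audit": 2, "agents": 2, "continuity": 1}

def rank_vendors(responses):
    """responses: {vendor: {section: score 0-2}} → ranked (vendor, total) pairs."""
    totals = {
        vendor: sum(SECTION_WEIGHTS[s] * score for s, score in sections.items())
        for vendor, sections in responses.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

responses = {
    "Vendor A": {"model": 2, "data": 2, "governance": 1},
    "Vendor B": {"model": 1, "data": 0, "governance": 1},
}
print(rank_vendors(responses))  # → [('Vendor A', 15), ('Vendor B', 6)]
```

The numbers matter less than the artefact: a dated, ranked record of who could answer and who could not.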
The three answers that should end the conversation
Some answers are disqualifying.
"We do not store customer data" when you can see the product builds memory across sessions.
"We are not currently trained on customer data" without a written, contractually enforceable guarantee that it will not be in future.
"Your audit trail requirement is not something we have heard before" from a vendor selling into financial services.
If you get any of those three, thank them for their time and move on.
What to do on Monday morning
If you are a principal, COO, or CTO of an Australian financial services firm, licensee, or wealth platform, do this before your next AI procurement.
- Take this questionnaire.
- Remove any questions that do not apply to your regulatory context.
- Add three questions that are specific to your firm.
- Send the first pass to every vendor on your shortlist.
- Score the responses.
You will learn more about the AI market in two weeks than in the last six months of demos.
Because the quality of the answers is the product.
A vendor who can answer 41 hard questions clearly is a vendor who has thought about this. A vendor who cannot is selling you a demo.
In Australian financial services, a demo is not a product. It is a liability.