
The Life of a Data Error and your Enterprise AI Agent

  • Writer: Jill Singleton
  • 2 days ago
  • 12 min read

Why data quality isn’t just a technical problem and why it matters even more in an age of AI-powered analytics.


Welcome to the Iamdata Solutions Asset Management Newsletter – July 2026

 

 

Councils, along with other organisations, are increasingly exploring how AI language models can help staff access and interpret the data they work with every day. The idea is that rather than waiting for a report to be built or an analyst to run a query, staff can type a plain-English question, for example, ‘how many open customer requests are older than 30 days?’ or ‘what is our total number of outstanding development applications by suburb?’, and receive an answer directly from their data in seconds.

 

This is made possible by connecting an LLM to the council's data systems through an intermediary layer, most commonly an MCP server, which handles the communication between the AI and the underlying databases. The LLM does not store the council's data or learn from it; it simply reads what it is given at the time of the query, interprets the question, and formulates a response.

 

In that sense it works much like a very capable analyst who has been handed a dataset and asked to answer a question, except it does so in moments rather than hours. Councils are finding this particularly valuable for ad hoc queries that don’t warrant the effort of a full Power BI report, for quickly surfacing information during meetings, and for giving non-technical staff the ability to explore data without needing to understand how it is structured underneath.

 

This all sounds great, but there is a question quietly sitting underneath every conversation about AI-powered business intelligence, and it rarely gets asked directly. ‘What if the data the AI is reading is wrong?’

 

The hype around AI analytics tends to focus on capability. Can it answer natural language questions? Can it generate insights on demand? Can it replace the need for an analyst to write SQL at eleven o’clock at night? The answer to all of these is increasingly yes, with the right setup. But there is a prior question that matters just as much, and it has nothing to do with AI at all.

 

It is the same question that has always mattered in data management. Is the information in your systems actually correct?

 

This month’s blog post brings together two threads that are usually treated separately. The first is the unglamorous reality of how data errors originate, spread through an organisation, and quietly corrupt decision-making for years before anyone notices. The second is the practical challenge of making AI-powered analytics reliable and why that challenge is impossible to solve without addressing the first.

 

I believe that these two things are not separate problems. They are the same problem, seen from different angles.

 

How a Single Wrong Number Travels Through an Organisation

 

When most people think about infrastructure risks, they picture something physical. A collapsed stormwater pipe. A bridge requiring emergency repairs. A failing water main. Rarely does anyone picture the risks hidden inside the asset register itself.

 

Consider a council that has just completed a stormwater project. The contractor submits as-constructed documentation, the assets are handed over, and a new pipe network is added to the organisation’s systems. Let’s say that one of the pipes, a 600-millimetre reinforced concrete pipe, is incorrectly recorded as 375 millimetres. Maybe someone mistyped. Maybe the information was copied from an older drawing. Maybe the final construction differed from the plans and the update was missed. The important thing is that nobody notices.

 

At this point, the mistake appears harmless. It is one incorrect field amongst thousands of records.

 

But what makes data errors dangerous is not the size of the mistake. It is the number of places that mistake eventually travels.

 

The asset information is built into council’s GIS. It now appears on every map as a 375-millimetre pipe. Maintenance crews reference it. Consultants extract it. Engineers use it to model network capacity. Each person trusts the information because it comes from an official source. The GIS is doing exactly what it is supposed to do, faithfully displaying what it has been given. The system is not wrong. The attribute information is.

 

The same information flows into the asset management system. Inspection histories accumulate against that record. Work orders reference it. Over time, the record gains credibility simply because it has existed in the system for years. A new employee reviewing the register naturally assumes the information is accurate. After all, it has maintenance history attached to it. Surely somebody would have noticed if it was wrong.

 

That assumption is precisely how data errors survive for decades.

 

When the organisation undertakes an asset valuation, the incorrect diameter flows into the replacement cost calculations. A 600-millimetre reinforced concrete pipe is considerably more expensive to replace than a 375-millimetre one. The valuation is understated. That understated valuation feeds the long-term renewal model. The renewal forecasts are lower than they should be. Those forecasts inform budget discussions. Senior management and elected members make funding decisions based on numbers that trace back, invisibly, to a single mistyped digit during a contractor handover years earlier.
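To make the arithmetic concrete, here is a minimal sketch of how the wrong diameter flows into a replacement cost. The unit rates and the pipe length are hypothetical figures chosen purely for illustration; they do not come from any real rate table or from the project described above.

```python
# Hypothetical unit rates in dollars per metre, keyed by pipe diameter in mm.
# Illustrative only: real rates would come from the council's valuation data.
UNIT_RATE_PER_METRE = {375: 900.0, 600: 1600.0}


def replacement_cost(diameter_mm: int, length_m: float) -> float:
    """Replacement cost = unit rate for the diameter x pipe length."""
    return UNIT_RATE_PER_METRE[diameter_mm] * length_m


length_m = 120.0  # assumed pipe length in metres

recorded = replacement_cost(375, length_m)  # what the register says
actual = replacement_cost(600, length_m)    # what is actually in the ground
understatement = actual - recorded

print(f"Recorded valuation: ${recorded:,.0f}")
print(f"Actual valuation:   ${actual:,.0f}")
print(f"Understated by:     ${understatement:,.0f}")
```

With these assumed figures, one pipe is understated by $84,000. The calculation itself is correct in both cases; only the input differs.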

 

No one in that chain is doing anything wrong. They are simply working with the information available to them. That is precisely what makes data quality failures so insidious. The models are working. The calculations are correct. The forecasting methodology is sound. The problem is the data feeding those calculations, and it is a problem that is almost invisible until something forces it into the open.

 

AI Makes This Problem More Consequential, Not Less


Here is where the AI conversation becomes relevant and, I must admit, sobering.

 

For years, the audience for a data error was limited by the people who knew how to access it. An incorrect pipe diameter lived in the asset register, surfaced occasionally in reports, and influenced decisions made by people who at least had some familiarity with the underlying data. There was friction between the error and its consequences.

 

AI-powered analytics removes that friction.

 

When you connect an LLM to your data and give staff the ability to ask plain-English questions and receive instant answers, you are dramatically expanding the audience for whatever is in your systems (correct or not). An executive who would never have opened the asset management system can now ask ‘what is our projected renewal spend over the next ten years?’ and receive a confident, well-presented answer in seconds. That answer inherits every error in the underlying data, and there is nothing in the presentation of the answer that indicates this.

 

This is one of the risks of AI-powered BI that I believe does not get discussed enough. LLMs, whether ChatGPT, Claude, or any other frontier model, produce answers that look right even when they are wrong. They are fluent, confident, and coherent regardless of the quality of the data they are drawing from. A human analyst reviewing the data might notice that a renewal estimate looks surprisingly low and go back to check the source data. An AI chat interface will not flag that concern unless it has been specifically configured to do so.
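As one possible shape for that kind of configuration, the sketch below attaches a data-quality caveat to an answer before it reaches the user. Everything here is an assumption for illustration: the `QualitySummary` class, the idea of an "unverified" flag on records, and the 5% threshold are all hypothetical, and a real setup would depend on what quality metadata your systems actually hold.

```python
from dataclasses import dataclass


@dataclass
class QualitySummary:
    """Hypothetical quality metadata for the table an answer was drawn from."""
    total_records: int
    unverified_records: int  # records never checked against a trusted source


def with_caveat(answer: str, quality: QualitySummary, threshold: float = 0.05) -> str:
    """Append a caveat when the share of unverified records exceeds the threshold."""
    if quality.total_records == 0:
        share = 1.0  # no records at all: treat as fully unverified
    else:
        share = quality.unverified_records / quality.total_records
    if share > threshold:
        return f"{answer}\n\nCaveat: {share:.0%} of the underlying records are unverified."
    return answer


flagged = with_caveat("Projected renewal spend: $4.2m", QualitySummary(1000, 200))
clean = with_caveat("Projected renewal spend: $4.2m", QualitySummary(1000, 10))
print(flagged)
```

This does not make the answer correct. It only stops the polished presentation from hiding what is known about the quality of the source.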

 

Garbage in, garbage out has always been true. With AI, the garbage comes out looking polished.

 

The Foundation That Makes AI Analytics Trustworthy


None of this is an argument against AI-powered analytics. The capability is real and the value is genuine. But it does clarify what needs to be in place before AI can be trusted to support meaningful decisions.

 

The answer is not a more sophisticated AI model. It is a more reliable data foundation.

 

Organisations that are getting genuinely impressive results from AI-powered BI are, without exception, the ones that have already done the hard work of data governance. They have consolidated siloed systems. They have resolved the mismatches: the IDs that don’t align across systems, the asset records that exist in the GIS but not the asset management system, the fields that have been used differently by different teams over time. They have built what is commonly described as a medallion architecture, in which raw data is ingested at the bronze layer, cleaned and validated at silver, and made business-ready at gold.
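The three layers can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production pipeline: the sample records, the field names, and the single validation rule (diameter must be a whole number) are all hypothetical.

```python
# Bronze: raw records exactly as received (hypothetical sample data).
bronze = [
    {"asset_id": "SW-001", "diameter_mm": "600", "material": "RCP"},
    {"asset_id": "SW-002", "diameter_mm": "", "material": "RCP"},      # missing diameter
    {"asset_id": "SW-003", "diameter_mm": "375", "material": "rcp "},  # untidy text
]


def to_silver(rows):
    """Clean and validate: keep rows that pass, set aside rows that fail."""
    valid, rejected = [], []
    for row in rows:
        diameter = row["diameter_mm"].strip()
        if not diameter.isdigit():
            rejected.append(row)  # kept for follow-up, never silently dropped
            continue
        valid.append({
            "asset_id": row["asset_id"],
            "diameter_mm": int(diameter),
            "material": row["material"].strip().upper(),
        })
    return valid, rejected


def to_gold(rows):
    """Business-ready: a simple count of pipes by diameter."""
    summary = {}
    for row in rows:
        summary[row["diameter_mm"]] = summary.get(row["diameter_mm"], 0) + 1
    return summary


silver, rejected = to_silver(bronze)
gold = to_gold(silver)
print(gold)
print("Needs follow-up:", [row["asset_id"] for row in rejected])
```

Note that validation of this kind catches missing or malformed values, not a plausible but wrong one. A 375 recorded where 600 was built would pass straight through, which is why field verification and governance still matter.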

 

When an AI is pointed at a well-governed gold layer, the results can be remarkable. The AI is not compensating for ambiguity or guessing at entity relationships. It is working with data that has already been validated, reconciled, and made consistent. The answers it produces are reliable not because the AI is especially clever, but because the foundation it is drawing from is trustworthy.

 

When an AI is pointed at raw transactional data, or at a poorly maintained asset register, or at systems that have never been properly integrated, it produces answers that feel useful but cannot be verified. And crucially, it does not tell you that they cannot be verified.

 

Preparing Your Data Specifically for AI

 

Even with a solid data foundation, there are additional steps that make a meaningful difference when AI is going to be part of the picture.

 

Resolve ambiguity before it reaches the AI.

 

This pattern appears everywhere in real data: two columns that represent similar things, with no clear signal about which one applies in a given context. An LLM will pick one, often inconsistently.

 

Consider a council's property dataset where rateable value has been recorded in two separate fields: one reflecting the current valuation, one carrying the value from the previous revaluation cycle that hasn't yet been cleared out. Ask the AI for total rateable value across a suburb and it may use either column depending on how the question is phrased, with no indication it made a choice at all. The solution is to make the decision at the data layer and consolidate to a single column for each concept, with the business logic baked in, and document the choice clearly. The AI should never have to guess.
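A minimal sketch of that consolidation follows. The field names (`rv_current`, `rv_previous`), the sample properties, and the business rule itself are assumptions for illustration: here the current valuation is treated as the only authoritative value, and any property without one is listed for review rather than quietly filled from the old column.

```python
# Hypothetical property records carrying two competing valuation fields.
properties = [
    {"property_id": "P1", "suburb": "Northside", "rv_current": 520000.0, "rv_previous": 480000.0},
    {"property_id": "P2", "suburb": "Northside", "rv_current": 310000.0, "rv_previous": 310000.0},
    {"property_id": "P3", "suburb": "Northside", "rv_current": None, "rv_previous": 275000.0},
]


def consolidate(rows):
    """Expose a single 'rateable_value' column; flag rows with no current valuation."""
    clean, needs_review = [], []
    for row in rows:
        if row["rv_current"] is None:
            needs_review.append(row["property_id"])
            continue
        clean.append({
            "property_id": row["property_id"],
            "suburb": row["suburb"],
            "rateable_value": row["rv_current"],  # assumed rule: current valuation only
        })
    return clean, needs_review


clean, needs_review = consolidate(properties)
total = sum(row["rateable_value"] for row in clean)
print(f"Total rateable value: ${total:,.0f}")
print("Needs review:", needs_review)
```

Whatever rule your organisation settles on, the point is that it is applied once, in the data layer, and the AI only ever sees the one column.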

 

Denormalise for readability.

 

A star schema with fact tables and dimension tables is ideal for Power BI. For an LLM, that complexity introduces opportunities for misinterpretation. Creating a wider, largely denormalised view with only the columns the AI needs, named in plain language that reflects their actual meaning, significantly improves answer consistency. Column names like ‘SLS_AMT_EXTX’ mean nothing without context. Column names like ‘Total Sales Excluding Tax’ are self-documenting.
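As a rough illustration, the sketch below uses Python's built-in `sqlite3` module to flatten a tiny star schema into one plainly named view. The table names, the `prod_nm` column, and the sample rows are hypothetical; only `SLS_AMT_EXTX` and its plain-language name come from the example above. In practice the same view would be written in whatever database your gold layer lives in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A tiny star schema with cryptic column names (hypothetical).
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, prod_nm TEXT);
    CREATE TABLE fact_sales (product_key INTEGER, SLS_AMT_EXTX REAL);
    INSERT INTO dim_product VALUES (1, 'Waste Collection'), (2, 'Pool Entry');
    INSERT INTO fact_sales VALUES (1, 120.0), (1, 80.0), (2, 15.5);

    -- One wide view for the AI: joins already done, names in plain language.
    CREATE VIEW sales_for_ai AS
    SELECT p.prod_nm      AS "Product Name",
           f.SLS_AMT_EXTX AS "Total Sales Excluding Tax"
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key;
""")

# The query the AI has to write is now short and hard to misread.
rows = conn.execute(
    'SELECT "Product Name", SUM("Total Sales Excluding Tax") '
    'FROM sales_for_ai GROUP BY "Product Name" ORDER BY "Product Name"'
).fetchall()
print(rows)
```

The star schema stays in place for Power BI. The view is simply a second, friendlier doorway into the same data.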

 

Materialise for performance.

 

In traditional BI, views-on-views are perfectly acceptable. A nightly Power BI refresh may be able to afford to wait five minutes for a complex query to resolve. In an AI chat interface, a five-minute wait feels like a broken system. Users give up. The solution is to materialise the data into a physical table, sometimes called a Platinum Layer, so the LLM is reading pre-computed results rather than triggering a chain of views against raw data.

 

In practice, this means building an ETL process that runs on a schedule and executes a stored procedure to rebuild or incrementally refresh the table with current data. The stored procedure does the heavy lifting, for example, applying joins, resolving business rules, calculating derived fields, and writing the results to a clean, flat table. When the AI queries it, that work has already been done. There is no chain of logic to unwind, just a straightforward read against a table that is already in exactly the shape the AI needs it.
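The refresh step might look something like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for a scheduled job calling a stored procedure; the table names, columns, and sample rows are all hypothetical, and a full rebuild is shown rather than an incremental refresh to keep the example short.

```python
import sqlite3
from datetime import datetime, timezone


def refresh_platinum(conn: sqlite3.Connection) -> int:
    """Rebuild the flat table from source tables and return its row count."""
    refreshed_at = datetime.now(timezone.utc).isoformat()
    conn.execute("""
        CREATE TABLE IF NOT EXISTS platinum_open_requests (
            request_id INTEGER, suburb TEXT, days_open INTEGER, refreshed_at TEXT
        )
    """)
    with conn:  # delete + insert commit together, or roll back together
        conn.execute("DELETE FROM platinum_open_requests")
        conn.execute(
            """
            INSERT INTO platinum_open_requests
            SELECT r.request_id, s.suburb_name, r.days_open, ?
            FROM requests AS r
            JOIN suburbs AS s ON s.suburb_id = r.suburb_id
            WHERE r.status = 'open'
            """,
            (refreshed_at,),
        )
    return conn.execute("SELECT COUNT(*) FROM platinum_open_requests").fetchone()[0]


# Hypothetical source tables standing in for the transactional system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suburbs (suburb_id INTEGER PRIMARY KEY, suburb_name TEXT);
    CREATE TABLE requests (request_id INTEGER PRIMARY KEY, suburb_id INTEGER,
                           status TEXT, days_open INTEGER);
    INSERT INTO suburbs VALUES (1, 'Northside'), (2, 'Riverbend');
    INSERT INTO requests VALUES (101, 1, 'open', 45), (102, 2, 'closed', 3),
                                (103, 2, 'open', 12);
""")

row_count = refresh_platinum(conn)

# What the AI runs afterwards: a plain read, no joins, no business rules.
older_than_30 = conn.execute(
    "SELECT COUNT(*) FROM platinum_open_requests WHERE days_open > 30"
).fetchone()[0]
print(row_count, older_than_30)
```

Storing a `refreshed_at` value on each row is a small design choice worth keeping: it lets the AI, or the person reading the answer, see how current the data is.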

 

The result is that response times drop from minutes to seconds, and the AI experience becomes genuinely usable. The trade-off is that the data reflects the last time the ETL ran rather than the live state of the source system, but for most analytics use cases, data that is current to the last scheduled refresh is more than sufficient.

 

Give the AI the context a good analyst would have.

 

Even perfect data will not tell an AI what your organisation means by ‘active customer’, or how your fiscal year is defined, or which revenue figure the executive team actually uses for reporting purposes. The best implementations pair the data model with a set of markdown documents that encode the institutional knowledge an experienced analyst would carry: common metric definitions, known quirks in the data, business rules, and glossary terms. These documents are loaded via a custom MCP server each time the AI connects. This is the difference between an AI that produces technically correct but contextually wrong answers and one that produces answers the business can actually use.
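The document-loading part of that idea is simple enough to sketch. The example below only shows gathering markdown files into one block of context text; how that text is then served to the AI depends on your MCP server and is not shown. The file names and their contents are hypothetical examples.

```python
import tempfile
from pathlib import Path


def load_context(folder: Path) -> str:
    """Combine every markdown file in a folder into one labelled context string."""
    sections = []
    for path in sorted(folder.glob("*.md")):  # sorted so the order is stable
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## Source: {path.name}\n\n{body}")
    return "\n\n".join(sections)


# Demonstration with two hypothetical documents in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "glossary.md").write_text(
        "Active customer: a customer with at least one request in the last 12 months.",
        encoding="utf-8",
    )
    (folder / "business_rules.md").write_text(
        "Fiscal year: 1 July to 30 June.",
        encoding="utf-8",
    )
    context = load_context(folder)

print(context)
```

Keeping these definitions in plain markdown files means the people who own the business rules can review and update them without touching the data model.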

 

Two Different Modes: Operations and Exploration

 

One of the most useful ways to think about AI and traditional BI together is to distinguish clearly between two different modes of working with data.

 

Traditional dashboards and reports are your operational layer. They are how the business runs day to day. The field supervisor checking maintenance schedules. The finance team reviewing expenditure against budget. The asset manager tracking renewal progress. These reports need to be deterministic. Users need to be able to open the report and be confident that the data is being pulled together using the same methodology each time, and that the numbers reflect reality.

 

For this kind of work, a well-governed Power BI report connected to a validated data model is still the right tool. The cost is predictable, the output is consistent, and the numbers can be audited.

 

AI-powered natural language query is your exploration mode. It is where analysts investigate a hypothesis without waiting for a developer to write a query. It is where an executive in a meeting can ask a question that does not have a pre-built report and get a directionally accurate answer in sixty seconds. It is where someone can ask ‘which asset classes are most likely to drive renewal expenditure in the next five years given current condition data?’ and get a useful starting point for a conversation rather than a two-week wait for an analyst to come back with a polished report.

 

The two modes complement each other naturally. AI surfaces an insight, while the Power BI report validates and monitors it over time. The practical rule is straightforward: if you find yourself asking the same question every week, it belongs in a Power BI report. If you are exploring something for the first time, AI is the right starting point.

 

But in both cases, the answer is only as good as the data it comes from. The operational report built on an asset register full of incorrect diameters will produce wrong renewal forecasts just as surely as the AI that queries the same register will. The tool is not the problem. The foundation is.

 

Data Quality Is a Business Issue, Not a Technical One

 

There is a tendency to treat data quality as something that lives in the IT department, a technical problem for technical people to solve, somewhere in the background, while the business gets on with making decisions.

 

That framing has always been a mistake, but it becomes a more expensive mistake as AI takes a larger role in how organisations access and interpret information.

 

When a pipe diameter is incorrectly recorded in an asset register, the consequences eventually reach the budget table. When ambiguous valuation columns cause an AI to report the wrong revenue figure to an executive in a meeting, the consequences can move even faster. The difference is speed and scale. A data error that once took years to work its way from the source system to a boardroom decision can now do so in seconds, dressed in the confident language of an AI-generated response.

 

This means that the organisations best positioned to benefit from AI-powered analytics are not necessarily the ones with the most sophisticated AI implementations. They are the ones that have treated data quality as a genuine organisational priority and have invested in understanding what is actually in their systems, resolving conflicts, establishing governance practices, and building the kind of reliable foundation that makes any analytics tool trustworthy.

 

The good news is that this investment compounds. Every hour spent improving data quality, resolving entity mismatches, and establishing clear business rules makes traditional BI more reliable and AI analytics more accurate at the same time. You are not choosing between two approaches. You are building a foundation that serves both.

 

The Question Worth Asking Before the AI Question


Before asking whether your organisation is ready for AI-powered analytics, it is worth asking a more fundamental question:

 

Do you trust the data in your systems?

 

Not in a general sense. Specifically. Do you trust that your asset attributes have been correctly captured and maintained? Do you trust that your customer records are consistent across systems? Do you trust that the metrics in your Power BI reports reflect the business rules your organisation actually applies?

 

If the honest answer is ‘partly’ or ‘it depends’ or ‘we’re not really sure’, that is the right place to start. Not because AI is too risky to use until the data is perfect. The data will never be 100% perfect, and AI can still provide value in exploratory contexts even with imperfect data, provided those limitations are understood. But because the decisions that matter most, the budget submissions, the renewal forecasts, the infrastructure investment strategies, deserve to be built on a foundation that has been deliberately designed to be reliable.

 

AI will make your analytics faster, more accessible, and in many ways more powerful. What it will not do is fix the data error that was entered by mistake into the system five years ago.







I have worked on many different projects with my Local Government clients, from designing and developing Power BI reports, to building SQL Server databases for spatial data, to managing and maintaining GIS and Asset Management systems. If you'd like to discuss how we might work together, then please email me at ➡️ jill.singleton@iamdata.solutions

 

If you would like to receive the latest Newsletter Blog straight to your inbox, please subscribe here: ➡️ https://www.iamdata.solutions/subscribe

 

You can read all our Newsletters and Blogs here: ➡️ https://www.iamdata.solutions/blog

 

You may also be interested in our Projects Page: ➡️ https://www.iamdata.solutions/past-projects

 

Check out what our clients say about us here: ➡️ https://www.iamdata.solutions/reviews

 

If you would like to see a particular topic covered in these newsletters, then please let me know about it. The chances are other people will be interested and would like to hear about it too! Please email me at: ➡️ jill.singleton@iamdata.solutions with your suggestions.  





IAMDATA SOLUTIONS PTY LTD

If you’ve enjoyed reading our newsletters and blogs, how about subscribing to our email list and get the latest notifications straight to your inbox.

You won’t get spammed by hundreds of advertising emails – just notifications about my latest blog or newsletter.  


Contact us:

PO Box 58, Clifton Beach, Queensland 4879

jill.singleton@iamdata.solutions

0423 240 439


©2019 by IAMDATA SOLUTIONS PTY LTD.
