The question operators need to answer about AI in underwriting isn’t “is it accurate?” It’s “which outputs can I trust, and which ones need analyst eyes before they go into a model?” Without a confidence signal, the answer is “review everything,” which means you haven’t saved any time. Confidence scoring is what turns AI extraction into a real workflow rather than an experiment.
The trust problem with unscored extraction
AI extraction from financial documents is good but not perfect. Real estate documents are messy: scanned PDFs, inconsistent formatting, numbers that appear in multiple places and sometimes conflict with each other. A seller might present NOI in three different places in the same OM—the executive summary, the financial table, and the pro forma—and those numbers won’t always match.
When an AI extracts a field from a document with those characteristics, it makes a judgment call about which value to use. Sometimes that judgment is obviously correct. Sometimes it isn’t. Without a signal telling you which is which, you have to treat every extracted field as potentially wrong—which means manually verifying everything, which puts you roughly back where you started.
Teams that try to use AI extraction without confidence scoring often report the same experience: they don’t trust it, so they end up checking everything anyway, and after a few weeks they stop using the tool. The extraction was accurate most of the time, but “most of the time” isn’t enough when you don’t know which times.
What confidence scoring actually does
Confidence scoring attaches a reliability signal to each extracted field. The signal reflects how clearly the value was found, how many times it appeared consistently across the document, and whether it contradicts related fields.
A high-confidence extraction looks like: the NOI figure appears in the financial table, the executive summary, and the broker pro forma, and all three are within rounding of each other. The value is unambiguous.
A low-confidence extraction looks like: the T12 expense ratio is mentioned once in a footnote as a percentage, but the absolute dollar figure used to derive it doesn’t match the line items in the expense table. Something is inconsistent. That field needs analyst attention before it goes into a model.
Medium confidence is the hardest case: the value was found, it seems plausible, but it only appeared once and there’s no corroborating data. Treat it as a spot-check—verify the source quickly, confirm it makes sense in context, and move on.
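To make the three cases concrete, here is a minimal sketch of how a tier could be derived from the signals described above: how often the value appears, whether the occurrences agree, and whether the value contradicts related fields. It is an illustration, not how any particular extraction product scores fields. The names `Occurrence` and `score_field`, the 0.5% "within rounding" tolerance, and the two-occurrence threshold for high confidence are all assumptions.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Occurrence:
    value: float   # numeric value as read from the document
    location: str  # where it was found, e.g. "financial table"


def score_field(occurrences, contradicts_related=False, tolerance=0.005):
    """Return "high", "medium", or "low" for one extracted field.

    tolerance is the relative spread treated as "within rounding"
    (0.5% here, an assumed value for illustration).
    """
    # Nothing found, or the value conflicts with related fields: full review.
    if not occurrences or contradicts_related:
        return "low"

    values = [o.value for o in occurrences]
    center = median(values)
    spread = max(abs(v - center) for v in values)

    # Occurrences disagree by more than rounding: something is inconsistent.
    if spread > tolerance * abs(center):
        return "low"

    # Consistent and corroborated in at least two places.
    if len(values) >= 2:
        return "high"

    # Found once, plausible, but no corroborating data.
    return "medium"


# Illustrative values only: NOI stated in three places, all within rounding.
noi = [
    Occurrence(1_250_000, "executive summary"),
    Occurrence(1_250_400, "financial table"),
    Occurrence(1_249_900, "broker pro forma"),
]
print(score_field(noi))
```

A real scorer would also weigh extraction quality (for example, OCR legibility on scanned pages), which this sketch leaves out.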
Designing a review queue around confidence tiers
The practical application is a tiered review workflow:
- High confidence: no routine review. The extraction is reliable enough to use as-is. Fields that typically land in this category, such as property address, unit count, asking price, and tax assessments, should move directly into the model without analyst time.
- Medium confidence: spot-check. The analyst verifies the source document, confirms the value looks correct in context, and marks it reviewed. This takes 30–60 seconds per field, not 10 minutes.
- Low confidence: full review. The analyst goes back to the document, finds the correct value, and records the source. These are the fields where errors are most likely and where the cost of an error is highest—usually income and expense line items that drive NOI.
The result: analyst time concentrates on low-confidence fields, which are typically a minority of the total extracted data. The high-volume, high-confidence extractions—which on a clean document might be 60–70% of all fields—move through without consuming any review time.
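The tiered workflow above can be sketched as a small routing step that sorts extracted fields into review buckets. The function name `build_review_queue`, the action labels, and the field names are hypothetical. Sending any unrecognized tier to full review is a design assumption, chosen so that an unscored field is never accepted silently.

```python
from collections import defaultdict

# Maps each confidence tier to the review action described in the list above.
REVIEW_ACTION = {
    "high": "accept",       # moves straight into the model
    "medium": "spot_check", # quick source verification
    "low": "full_review",   # analyst finds the value and records the source
}


def build_review_queue(fields):
    """Group field names by review action.

    fields: dict mapping field name -> confidence tier.
    Returns a dict mapping action -> list of field names.
    """
    queue = defaultdict(list)
    for name, tier in fields.items():
        # Unknown or missing tiers default to full review.
        action = REVIEW_ACTION.get(tier, "full_review")
        queue[action].append(name)
    return dict(queue)


# Illustrative field names and tiers only.
queue = build_review_queue({
    "property_address": "high",
    "unit_count": "high",
    "t12_expense_ratio": "low",
    "noi": "medium",
})
print(queue["full_review"])
```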
Confidence enables defensible documentation
The secondary benefit of confidence scoring is documentation quality. When a field is extracted with a high-confidence signal and a citation pointing to its source in the document, you have a clear record of where that number came from.
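One way to picture that record is a field that carries its citation with it. The sketch below is an assumed shape, not a description of any specific tool: the class `ExtractedField`, its attribute names, and the `audit_line` helper are hypothetical, and the sample values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: str  # "high", "medium", or "low"
    source_doc: str  # file name of the source document (hypothetical)
    page: int        # page where the value was found
    excerpt: str     # text snippet the value was read from


def audit_line(field):
    """Render one field as a single line suitable for a deal file."""
    return (
        f"{field.name} = {field.value} [{field.confidence}] "
        f'from {field.source_doc}, p. {field.page}: "{field.excerpt}"'
    )


# Placeholder values for illustration.
vacancy = ExtractedField(
    name="vacancy_rate",
    value="7.4%",
    confidence="high",
    source_doc="t12.pdf",
    page=3,
    excerpt="Average vacancy 7.4%",
)
print(audit_line(vacancy))
```

Making the record immutable (`frozen=True`) is a deliberate choice here: an analyst correction would create a new record rather than overwrite the original extraction, which keeps the trail intact.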
In an LP meeting where someone asks “how did you underwrite the vacancy?”, the answer “we used 8% based on the T12 actuals, which showed average vacancy of 7.4% over the trailing 12 months” is a different answer than “that’s what we modeled.” The former is auditable. The latter is a guess dressed as analysis.
This matters especially on deals that don’t perform as projected. When you need to explain to investors why a pro forma didn’t hold, being able to show exactly what assumptions you made, where they came from, and what the documents actually said is the difference between a difficult conversation and a damaging one.
What changes operationally when confidence is built in
For a lean underwriting operation—one or two analysts covering a high volume of deals—the math changes meaningfully. If a deal has 40 extracted fields and 28 of them are high confidence, the analyst is reviewing 12 fields instead of 40. Multiply that across 30 deals a quarter and the time savings are material.
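The arithmetic behind that example is simple enough to write down. This sketch uses only the figures from the paragraph above (40 fields, 28 high confidence, 30 deals a quarter); the function name `review_load` is hypothetical, and it counts fields rather than minutes because per-field review times vary by tier.

```python
def review_load(total_fields, high_confidence_fields, deals_per_quarter):
    """Count fields an analyst reviews versus fields that pass through."""
    reviewed_per_deal = total_fields - high_confidence_fields
    return {
        "reviewed_per_deal": reviewed_per_deal,
        "reviewed_per_quarter": reviewed_per_deal * deals_per_quarter,
        "skipped_per_quarter": high_confidence_fields * deals_per_quarter,
        "share_skipped": high_confidence_fields / total_fields,
    }


summary = review_load(total_fields=40, high_confidence_fields=28, deals_per_quarter=30)
print(summary["reviewed_per_deal"])     # 12 fields per deal
print(summary["reviewed_per_quarter"])  # 360 field reviews instead of 1,200
print(summary["skipped_per_quarter"])   # 840 field reviews avoided
```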
More importantly, the saved time goes to the right place: the uncertain extractions that actually need judgment. Analysts aren’t spending time verifying that the property address is correct. They’re spending time on the expense normalization question that will actually affect the model output.
Throughput increases not because analysts are working faster, but because they’re spending their time on the work that requires them specifically—judgment, context, and skepticism—rather than manual transcription.
The honest limitation
Confidence scoring reduces review burden. It doesn’t eliminate the need for review. Even high-confidence extractions can be wrong when the underlying document is wrong—sellers do misstate financials, intentionally or otherwise. The discipline is using confidence as a triage tool, not an approval mechanism. The analyst who treats a high-confidence score as permission to skip verification entirely is making the same category of error as the analyst who trusts a broker pro forma without checking the T12.
Confidence scoring makes AI underwriting trustworthy enough to use in a real workflow. What makes it consistently accurate is the analyst who knows which questions to ask when something doesn’t look right.