AI isn’t just about automating individual tasks anymore. It’s orchestrating intelligence across the enterprise. At IBM Think 2025, one big theme was: enterprise AI is shifting from siloed tools and isolated models to intelligent, governed agent ecosystems that can reason, act, and adapt. I hope to share a bit here about what this means and why it’s important.
In this article, we’ll go through some of the themes I noticed at IBM Think:
How to better use unstructured data for AI
Starting with an internal use case
The move from automation to orchestration
Extending data governance
Frictionless AI
The Missing Ingredient: Unstructured Data with Built-in Trust
As organizations accelerate AI adoption, one of the most overlooked and richest assets they possess is unstructured data. Think about the documents, customer emails, PDFs, meeting transcripts, chat logs, and intranet files that live across platforms like SharePoint or Google Drive. These aren’t just digital clutter; they’re information waiting to be utilized.
Traditionally, unstructured data was hard to work with at scale. Techniques like retrieval-augmented generation (RAG) helped by pulling relevant chunks of text into LLM prompts, but these methods often hit a ceiling when it came to:
Data lineage – Where did this insight come from? Can we trace it back to its original source?
Governance – Is the data being accessed and used in compliance with policies (think: PII, GDPR, internal controls)?
Compliance at scale – Can we automate data classification and policy enforcement across terabytes of messy, unlabeled content?
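One way to picture closing the lineage gap: carry source metadata along with every retrieved chunk so an answer can always be traced back to where it came from. The sketch below is a minimal, hypothetical illustration (keyword retrieval standing in for vector search; the file names and tags are invented), not any vendor's implementation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: lineage metadata travels with each chunk in a
# RAG pipeline, so the prompt (and the answer) stays traceable.

@dataclass
class Chunk:
    text: str
    source: str        # original file or system of record
    page: int          # location within the source
    tags: list = field(default_factory=list)   # e.g. ["PII"]

def retrieve(chunks, query):
    """Naive keyword retrieval; real systems use vector similarity."""
    return [c for c in chunks if query.lower() in c.text.lower()]

def build_prompt(query, results):
    """Cite each chunk's source inline so outputs can be audited."""
    context = "\n".join(f"[{c.source} p.{c.page}] {c.text}" for c in results)
    return f"Answer using only the context below.\n{context}\n\nQ: {query}"

chunks = [
    Chunk("Refund policy allows returns within 30 days.", "policy.pdf", 4),
    Chunk("Customer SSN on file: redacted.", "crm_export.csv", 1, ["PII"]),
]

results = retrieve(chunks, "refund")
prompt = build_prompt("What is the refund window?", results)
```

Because every chunk keeps its `source` and `page`, the ceiling on lineage becomes a design choice rather than a blind spot: you can always answer "where did this insight come from?"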
What’s changing now is how organizations are reframing unstructured data as a governed asset.
The emerging best practice is a unified architecture that brings together:
Batch and streaming integration (to keep data fresh)
Replication and synchronization (to align data across systems)
Built-in governance and quality controls (to ensure safe, trusted access)
Instead of manually wrangling files or guessing at what's compliant, modern platforms now support automated classification, lineage tracking, and policy enforcement across unstructured inputs. That means you can ingest raw files, identify sensitive data (like PII or financial details), and prepare it for AI use in minutes.
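The automated classification step can be pictured as a simple scan-and-tag pass before ingestion. This is a toy sketch under stated assumptions: the regex patterns, tag names, and policy set are illustrative stand-ins for a real classification and policy-enforcement engine.

```python
import re

# Hypothetical sketch of automated classification: scan raw text for
# sensitive patterns and emit policy tags before AI ingestion.

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD":  re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def classify(text):
    """Return the set of sensitivity tags found in a document."""
    return {tag for tag, rx in PATTERNS.items() if rx.search(text)}

def enforce(text, blocked_tags=frozenset({"SSN", "CARD"})):
    """Block ingestion when a document carries disallowed tags."""
    tags = classify(text)
    status = "blocked" if tags & blocked_tags else "allowed"
    return status, tags

status, tags = enforce("Contact jane@example.com, SSN 123-45-6789")
```

In a production platform this pass would run automatically across every file landed for AI use, with the resulting tags driving access policy rather than a simple allow/block flag.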
In one demo at Think 2025, an AI agent ingested enterprise documentation, applied compliance tagging, and produced an actionable recommendation, all executed locally, with traceability and control. A year ago, this would’ve been a complex, multi-tool workflow. Today, it’s nearly frictionless.
Stephanie Valarezo, Program Director, Product, for IBM Data Integration, will be diving deeper into this topic at the Databricks Data + AI Summit. Her talk will focus on why unstructured data is now foundational for building accurate, trustworthy AI agents, and more importantly, why handling it properly requires orchestration, not just indexing or scraping. Stephanie will also be joining the Mavens of Data live show on June 17th to discuss this in detail.
If you’re building AI systems and haven’t yet rethought how you treat your unstructured data, now is the time. Because the more trust and governance you build in at the start, the more powerful and safe your AI can be.
The Internal Use Case Advantage: Start Where You Have Control
One of the most grounded and practical insights from IBM Think 2025 came from Arvind Krishna, IBM’s Chairman and CEO: start with internal use cases.
It’s deceptively simple advice, but also incredibly impactful. When you build AI systems for internal users first, a few things work heavily in your favor:
You control the environment. There’s less variability, and you’re not dealing with the unpredictability of customer-facing systems.
You get real-time feedback. Employees become active participants in shaping and stress-testing the solution. It’s like having a built-in human-in-the-loop system.
You’re working with domain experts. These are the people who know the work, recognize when something’s off, and can spot hallucinations or incorrect logic long before they become a problem.
This setup creates a safer, faster feedback loop. Instead of experimenting in front of customers, where the cost of a mistake could be significant, you’re refining your agents with people who are more forgiving, more knowledgeable, and more invested in the outcome.
There’s another upside: internal use cases tend to focus on productivity. They’re not about replacing employees; they’re about freeing them up. Whether it’s automating routine documentation, surfacing key insights from emails, or reducing back-and-forth across departments, these tools help clear the mental clutter and unlock time for deeper, more strategic work.
As Alon Cohen put it so perfectly:
“Everybody in a proactive role at work has a long hopper of things they wish they could get to. That’s the AI unlock.”

Starting internally also gives teams the chance to practice and prepare for broader, external deployments. You get to experiment, refine, and prove value in a way that’s low-risk but high-reward. And when the time comes to launch externally, you’ve already worked out the kinks and built organizational confidence.
AI doesn’t need to start big. It just needs to start wisely.
From Automation to Orchestration
One of the most exciting shifts I saw highlighted at IBM Think was this: we’re moving beyond task automation and stepping into the era of intelligent orchestration.
Whereas traditional automation handles isolated, repetitive tasks (often in a vacuum), today’s AI agents are built to be much more collaborative and dynamic. They’re not just reacting to prompts. They’re operating across systems, synthesizing information, and driving outcomes.
These next-gen agents are capable of:
Coordinating across tools and platforms, breaking down silos in the tech stack
Sharing context across actions, which makes their responses smarter and more relevant
Acting on governed data, which builds trust into every decision they make
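The "sharing context across actions" idea can be made concrete with a tiny orchestration sketch: each agent reads and extends a shared context, so later agents build on what earlier ones learned. The agent functions and pipeline here are hypothetical illustrations, not a real framework.

```python
# Hypothetical sketch of lightweight agent orchestration: agents are
# steps that read and extend a shared context dict in sequence.

def extract_agent(ctx):
    """Pull capitalized entities out of the document text."""
    ctx["entities"] = [w for w in ctx["doc"].split() if w.istitle()]
    return ctx

def summarize_agent(ctx):
    """Use the context left by the previous agent, not a fresh prompt."""
    ctx["summary"] = f"Doc mentions: {', '.join(ctx['entities'])}"
    return ctx

def orchestrate(ctx, pipeline):
    """Run agents in order, threading shared context through each."""
    for agent in pipeline:
        ctx = agent(ctx)
    return ctx

result = orchestrate({"doc": "Acme filed a claim with Globex"},
                     [extract_agent, summarize_agent])
```

The point of the pattern is that the second agent never re-derives what the first already established; shared context is what makes the pipeline feel like one system rather than a chain of isolated prompts.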
What stood out to me the most is the growing momentum around prebuilt, interoperable agents. These tools dramatically reduce setup time, eliminate the need to reinvent the wheel, and support consistent outcomes across even the most complex environments.
As someone who’s spent time building these systems, I love this approach. There’s something empowering, and honestly relieving, about not having to start from zero every time. It’s about being able to stand on the shoulders of what's already working and build upward from there.
Even more exciting is how lightweight, modular frameworks are making AI more accessible. Whether you're running in the cloud, on-prem, or even at the edge, these systems are designed to be flexible, scalable, and efficient. That means you can start where you are and grow into what you need.
Extending Governance Beyond Data: A Mindset Shift
One of the most meaningful insights I took away from IBM Think was this: governance can’t stop at the data layer. It has to include the AI systems themselves. That means taking a closer look not just at what information we feed into our models, but how those models operate, make decisions, and evolve over time.
It’s about asking harder questions:
Who’s accountable when an agent makes a decision?
Can we explain how a model reached a conclusion?
Are we confident it operates within ethical and regulatory boundaries?
Transparent oversight becomes key - not just for compliance, but for building trust. AI can’t be a black box. People need to understand and feel confident in the systems they’re using, especially when those systems are making decisions that affect customers, employees, or business strategy.
That’s where lineage comes in. It’s not enough to know what data was used. We need to track how that data was processed, how models were trained, and what influenced a specific output from an AI agent. Think of it as an audit trail for decision-making.
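An audit trail for decision-making can be sketched as an append-only log where each entry records the inputs, model, and output of a decision, hash-chained to the previous entry so tampering is detectable. This is an illustrative toy (the field names and chaining scheme are assumptions), not a description of any specific governance product.

```python
import json, hashlib, datetime

# Hypothetical sketch of a tamper-evident decision audit trail:
# each entry hashes over its content plus the previous entry's hash.

def record_decision(log, *, inputs, model, output):
    prev = log[-1]["hash"] if log else ""
    entry = {
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "inputs": inputs,    # which data the agent consulted
        "model": model,      # which model/version produced the output
        "output": output,    # what the agent decided
        "prev": prev,        # link to the previous entry
    }
    entry["hash"] = hashlib.sha256(
        (prev + json.dumps(entry, sort_keys=True)).encode()
    ).hexdigest()
    log.append(entry)
    return entry

log = []
record_decision(log, inputs=["policy.pdf"], model="agent-v1",
                output="approve refund")
```

With a record like this, "what influenced a specific output" stops being a reconstruction exercise and becomes a lookup.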
And of course, compliance and ethics have to be woven in, not layered on. It’s far easier to design for trust from the beginning than to retrofit it later.
This broader view of governance may feel like a shift, but it’s a necessary one. Because as AI grows more powerful, so does the responsibility we have to get it right.
What really stood out, too, was the emphasis on openness and collaboration. Innovation doesn’t happen in silos. The most exciting ideas are emerging in ecosystems where data scientists, developers, domain experts, and even regulators can contribute, stress-test, and refine systems together.
As one leader put it, “No single platform is going to rule the world. But openness, that’s how we move the whole industry forward.”
The Real Shift: Toward Frictionless AI
One of the most resonant ideas from IBM Think 2025 came from Alon Cohen, who said something that really stuck with me:
“When technology is at its highest and best use, it becomes invisible.”
That’s the north star for AI - not to take over workflows or draw constant attention to itself, but to quietly and effectively remove the barriers that slow us down. The goal isn’t just automation for automation’s sake. It’s about clearing the backlog, surfacing insights at the right moment, and giving people back the mental space to focus on what actually matters.
In other words: the best AI doesn’t feel like AI. It feels like flow.
AI is there to give us more time for strategy, creativity, and solving problems that don’t have playbooks.
Another quote that captured this spirit came from Ash Jhaveri at Meta:
“If you give the community good models, they’ll make them great.”
This speaks to a bigger cultural shift happening in enterprise AI: one of openness, collaboration, and co-creation. It’s not just about pushing out new tools, it’s about empowering communities of builders, thinkers, and doers to adapt, improve, and evolve those tools in ways no single company could accomplish on its own.
And that’s where real innovation lives. Not in silos or proprietary stacks, but in ecosystems where ideas are shared, models are tested, and improvements come from the many - not just the few.
The future of AI isn’t loud or flashy. It’s frictionless. It’s built with trust, shaped by its users, and designed to get out of the way, so people can get to the work that really matters.
Want to Go Deeper?
Stephanie Valarezo will be joining us on the Mavens of Data live show on Tuesday, June 17th, at 11 am EDT to talk about “leveraging unstructured data to build more accurate, trustworthy AI agents” and answer community questions.
If you’re curious how to think about data for your AI systems, accelerate data intelligence through built-in observability and governance, or simplify your tech stack while improving trust and traceability in AI outputs, be sure to tune in live!
Weekly free LIVE show!
Each week, you'll have the chance to join live, hear some great stories, amazing perspectives, and actionable advice, and participate in LIVE Q&A with some of your favorite data experts!

Kristen Kehrer
Live Event Producer & Mavens of Data Host
I love building coding demos and educating others around topics in AI and machine learning. This past year I've leveraged computer vision to build things like a school bus detector that I use during the school year to get my kids on the bus. I've most recently been playing with semantic video search, vector databases, and building simple chatbots using OpenAI and LangChain.