Opportunities And Challenges When Using LLMs In The Data Space

Product
September 2, 2025
5 min read
Martin Fišer
Field CTO for Americas

Large Language Models (LLMs) are transforming how organizations interact with their data infrastructure, offering unprecedented capabilities for both technical and business users. However, this transformation brings unique opportunities and challenges that vary significantly based on user personas, security requirements, and implementation approaches. This writeup explores these dimensions through the lens of practical implementation using tools like Keboola MCP and various client interfaces.

The Persona Divide: Technical vs. Business Users

Technical Users: The Builders and Maintainers

Technical users—data engineers, analysts, and developers—approach LLMs with a fundamentally different mindset than their business counterparts. Their primary intent revolves around:

Intent and Use Cases:

  • Creating and debugging complex data pipelines
  • Writing and optimizing SQL queries and transformations
  • Automating repetitive configuration tasks
  • Optimizing data processing

For these users, LLMs serve as intelligent coding assistants that understand context, suggest optimizations, and accelerate development cycles. They seek precision, control, and the ability to review and modify generated code before execution.

However, data engineering operates largely on a binary principle: configurations either function flawlessly and reliably, or they fail. Engineers are unlikely to accept 90% functionality, such as an incomplete SQL query or a partially defined data extraction.

Preferred Interfaces: Technical users gravitate toward IDE-integrated solutions like VSCode or Cursor, where LLMs enhance their existing workflows. These tools allow them to maintain version control, leverage syntax highlighting, and access debugging capabilities while benefiting from AI assistance. The integration feels natural—it's an enhancement of their familiar environment rather than a replacement.

Suggested scopes of work:

  • (To understand)
    • Project documentation & descriptions
    • Data validation
    • Debugging
  • (To build)
    • Component configurations
    • Custom Python
    • Transformations
    • Data Apps
    • Whole Pipeline (flows)

Business Users: The Consumers and Decision Makers

Business users approach data with questions rather than queries. Their relationship with LLMs is fundamentally different:

Intent and Use Cases:

  • Extracting insights from data without writing code
  • Creating reports and visualizations
  • Understanding data relationships and trends
  • Making data-driven decisions quickly
  • Asking natural language questions about their data
  • Understanding data processing setups (who, what, how)
  • Adding business/semantic context to the data

These users need abstraction from technical complexity. They want to ask "What were our top-performing products last quarter?" rather than write JOIN statements and GROUP BY clauses.

However, relying on non-deterministic systems like LLMs raises challenges around the reliability and reproducibility of results. Overcoming these challenges often requires assistance from more technically savvy users.

Preferred Interfaces: Business users prefer conversational interfaces like Claude or custom chat applications. These tools provide a familiar, accessible entry point to data exploration without requiring technical knowledge. The conversation history becomes their audit trail, and the natural language interaction removes barriers to data access.

Suggested scopes of work:

  • (To understand)
    • New user onboarding
    • Explain complicated logic
    • Code comprehension
    • Explore data
  • (To improve)
    • Project documentation
    • Descriptions
    • (Data validation)
    • (Debugging)

Security Considerations: The Critical Foundation

The integration of LLMs into data infrastructure introduces novel security challenges that organizations must address comprehensively:

Data Privacy and Exposure

When LLMs interact with sensitive business data, several risks emerge:

Context Leakage: LLMs require context to provide accurate responses, but this context often includes sensitive information. Every query and every piece of data shared with the model potentially exposes confidential information. Organizations must implement data policies, handling protocols, and rules permitting only approved LLM technologies.

Audit and Compliance: Every interaction between users and LLMs touching production data must be logged and auditable. This includes not just the queries but also the responses, the data accessed, and the transformations applied. Compliance with regulations like GDPR, HIPAA, or SOC 2 requires careful consideration of how LLM interactions are recorded and retained.
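As a minimal sketch of what such an audit trail might capture, the snippet below appends one structured record per LLM interaction to a JSONL log. The field names and log path are illustrative assumptions, not part of any specific compliance standard:

```python
import json
import time
from dataclasses import dataclass, asdict, field


@dataclass
class LLMAuditRecord:
    """One auditable LLM interaction: who asked, what was asked,
    what came back, and which datasets were touched."""
    user_id: str
    prompt: str
    response_summary: str
    datasets_accessed: list = field(default_factory=list)
    timestamp: float = field(default_factory=time.time)


def log_interaction(record: LLMAuditRecord, path: str) -> str:
    """Append the record as a single JSON line; return the serialized entry."""
    entry = json.dumps(asdict(record))
    with open(path, "a") as f:
        f.write(entry + "\n")
    return entry


record = LLMAuditRecord(
    user_id="analyst-42",
    prompt="Top products last quarter?",
    response_summary="Returned 10 rows from sales_q3",
    datasets_accessed=["sales_q3"],
)
entry = log_interaction(record, "/tmp/llm_audit_demo.jsonl")
```

An append-only JSONL file keeps each interaction self-contained and easy to ship to a log aggregator; a production system would add retention policies and tamper protection on top.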

Access Control and Authorization

The democratization of data access through LLMs must not compromise security:

Role-Based Access: The LLM interface must respect existing data governance policies. A business user asking about data outside their scope should receive different responses based on their authorization level. This requires sophisticated integration between the LLM layer and existing identity and access management systems, or access management based on explicit dataset permissions, such as specific data catalogs.
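One simple way to enforce this is to filter the datasets exposed to the LLM as context before any prompt is built. The role names and permission map below are hypothetical, a sketch of the idea rather than any particular IAM integration:

```python
# Illustrative role-to-dataset permission map (names are hypothetical).
PERMISSIONS = {
    "finance_analyst": {"revenue", "invoices"},
    "marketing_analyst": {"campaigns", "web_traffic"},
}


def context_datasets(role: str, requested: list) -> list:
    """Return only the datasets this role is allowed to expose as LLM context.

    Unknown roles get an empty allowlist, so they see nothing by default.
    """
    allowed = PERMISSIONS.get(role, set())
    return [d for d in requested if d in allowed]
```

Defaulting unknown roles to an empty set means a misconfigured integration fails closed rather than leaking data.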

Query Validation: For technical users, LLMs might generate complex queries that could potentially access unauthorized data or perform unintended operations. Implementing query validation and sandboxing mechanisms becomes crucial to prevent accidental or malicious data exposure.
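A sandboxing layer can be sketched as a gate that rejects anything other than read-only SELECTs over an allowlist of tables. The regex-based check below is deliberately simplistic (a real system should use a proper SQL parser); the table names are assumptions for illustration:

```python
import re

# Tables this user's sandbox may read (illustrative).
ALLOWED_TABLES = {"sales", "products"}

# Statements that mutate data or permissions are always rejected.
FORBIDDEN = re.compile(
    r"\b(drop|delete|update|insert|alter|grant|truncate)\b", re.IGNORECASE
)


def validate_query(sql: str) -> bool:
    """Accept only read-only SELECT statements over permitted tables."""
    if not sql.strip().lower().startswith("select"):
        return False
    if FORBIDDEN.search(sql):
        return False
    # Naive table extraction: words following FROM or JOIN.
    tables = re.findall(r"\b(?:from|join)\s+(\w+)", sql, re.IGNORECASE)
    return all(t.lower() in ALLOWED_TABLES for t in tables)
```

Even this crude gate catches the most damaging classes of LLM-generated mistakes before they reach the warehouse.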

Client Diversity: Matching Tools to Users

The choice of client interface significantly impacts adoption and effectiveness:

VSCode/Cursor for Technical Teams

These IDE-integrated solutions excel for technical users because they:

  • Maintain familiar workflows and keyboard shortcuts
  • Provide immediate code validation and syntax highlighting
  • Enable seamless collaboration through version control integration
  • Support debugging and testing within the same environment
  • Allow gradual adoption—users can choose when to engage AI assistance

The integration feels like a natural evolution of existing tools rather than a disruptive change, leading to higher adoption rates among technical teams.

Claude and Custom Chat Interfaces for Business Teams

Conversational interfaces succeed with business users because they:

  • Remove technical barriers to data access
  • Provide explanations alongside results
  • Build confidence through natural language interaction
  • Create a self-documenting trail of analysis through conversation history
  • Enable iterative exploration through follow-up questions

Custom chat interfaces can further enhance this experience by incorporating organization-specific terminology, branded experiences, and integrated visualization capabilities.

Challenges and Mitigation Strategies

The Hallucination Problem

LLMs may generate plausible-looking but incorrect queries or analyses. Mitigation strategies include:

  • Implementing validation layers that check generated SQL against schema
  • Requiring human review for critical operations
  • Providing clear confidence indicators for generated responses
  • Training users to verify results against known data points
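The first mitigation above, checking generated SQL against the schema, can be sketched as a lookup of selected columns against known table metadata. The schema dictionary and naive SELECT-list parsing below are illustrative assumptions (lowercase, unqualified column names):

```python
import re

# Known schema metadata (illustrative); in practice this comes from the catalog.
SCHEMA = {"orders": {"id", "product_id", "amount", "created_at"}}


def missing_columns(sql: str, table: str) -> list:
    """Return columns in the SELECT list that do not exist in the known schema.

    A non-empty result is a strong signal the LLM hallucinated a column.
    """
    m = re.search(r"select\s+(.*?)\s+from", sql, re.IGNORECASE | re.DOTALL)
    if not m:
        return []
    cols = [c.strip().lower() for c in m.group(1).split(",")]
    known = SCHEMA.get(table, set())
    return [c for c in cols if c != "*" and c not in known]
```

Surfacing the offending column names, rather than just failing, lets the validation layer prompt the LLM (or the user) to correct the query.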

Performance at Scale

As usage grows, performance challenges emerge:

  • Query optimization becomes crucial when LLM-generated queries hit large datasets
  • Caching strategies must balance freshness with response time
  • Resource allocation needs to account for varying query complexity
  • Token limits may restrict complex analytical tasks
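The freshness-versus-latency trade-off in the caching point above is often handled with a time-to-live cache keyed on the query text. A minimal sketch, with the TTL value as a tunable assumption:

```python
import time


class TTLCache:
    """Cache query results; entries expire after `ttl` seconds,
    trading staleness for response time on repeated questions."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        value, stored_at = hit
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: evict and force a fresh run
            return None
        return value
```

Because business users often ask near-identical questions in a session, even a short TTL can absorb a large share of repeated LLM-generated queries.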

Change Management

Successful adoption requires thoughtful change management:

  • Training programs tailored to different user personas
  • Clear governance policies for LLM usage
  • Gradual rollout with pilot groups
  • Continuous feedback loops for improvement
  • Documentation of best practices and limitations

Future Directions

The convergence of LLMs and data infrastructure is still evolving. Future developments likely include:

Advanced Reasoning: Next-generation models will better understand complex business logic and data relationships, reducing errors and improving insight quality.

Multimodal Capabilities: Integration of visual data analysis, voice interfaces, and even video explanations will further lower barriers to data access.

Autonomous Agents: LLMs will evolve from assistants to autonomous agents capable of monitoring data, identifying anomalies, and proactively suggesting optimizations.

Federated Learning: Organizations will be able to benefit from collective learning while maintaining data privacy through federated approaches.


Implementing with Keboola MCP

Keboola's Model Context Protocol (MCP) server provides a practical framework for addressing the challenges discussed above while maximizing the opportunities:

Architecture and Integration

Keboola MCP acts as an intelligent middleware layer between LLMs and data infrastructure. It provides:

Standardized Communication: The MCP server establishes a consistent protocol for LLM interactions with data platforms, regardless of the client interface. This standardization simplifies security auditing and ensures consistent behavior across different user personas.

Context Management: Rather than exposing raw database connections to LLMs, Keboola MCP manages context intelligently. It understands the data model, relationships, and business logic, providing rich context to the LLM while maintaining security boundaries.
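The idea of providing rich context without raw database access can be sketched as assembling a metadata-only prompt: table names, columns, and descriptions, but no rows. This is an illustrative sketch of the pattern, not Keboola MCP's actual internals:

```python
def build_llm_context(tables: dict, question: str) -> str:
    """Assemble a prompt context from catalog metadata only.

    `tables` maps table name -> {"columns": [...], "description": str};
    no row-level data ever enters the prompt.
    """
    lines = [f"User question: {question}", "Available tables:"]
    for name, meta in sorted(tables.items()):
        cols = ", ".join(meta["columns"])
        lines.append(f"- {name} ({cols}): {meta['description']}")
    return "\n".join(lines)
```

Keeping the context at the metadata level preserves the security boundary while still giving the model enough structure to generate correct queries.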

Check our GitHub.

Persona-Specific Benefits

For Technical Users:

  • Direct integration with development environments through MCP-compatible clients
  • Access to component configurations and transformation logic
  • Ability to generate, test, and deploy code within existing workflows
  • Preservation of version control and collaboration features

For Business Users:

  • Simplified natural language interface to complex data operations
  • Automatic translation of business questions into technical implementations
  • Guided data exploration without technical knowledge requirements
  • Consistent results regardless of technical proficiency

Security Implementation

Keboola MCP addresses security concerns through:

Abstraction Layer: By sitting between the LLM and data infrastructure, it can enforce security policies consistently, validate queries, and ensure appropriate data access.

Data Access: Using the concepts of a Keboola project and Workspace allows strict isolation of data access to selected “certified” data catalogs.

Audit Trail: All interactions flow through the MCP server, creating a comprehensive audit log for compliance and security monitoring.

Controlled Execution: Rather than allowing direct database access, the MCP server can validate and sanitize all operations before execution.

Conclusion

The integration of LLMs into data infrastructure represents a paradigm shift in how organizations interact with their data. Success requires careful consideration of user personas, robust security implementations, and thoughtful tool selection. Solutions like Keboola MCP provide a framework for navigating these challenges while maximizing opportunities.

By acknowledging the different needs of technical and business users, implementing comprehensive security measures, and choosing appropriate client interfaces, organizations can harness the transformative power of LLMs while maintaining data integrity and security. The key lies not in wholesale adoption but in strategic implementation that enhances existing workflows while opening new possibilities for data interaction.

As this space continues to evolve, organizations that thoughtfully balance innovation with security, accessibility with control, and automation with human oversight will be best positioned to realize the full potential of LLMs in their data operations.
