Large Language Models (LLMs) are transforming how organizations interact with their data infrastructure, offering unprecedented capabilities for both technical and business users. However, this transformation brings unique opportunities and challenges that vary significantly based on user personas, security requirements, and implementation approaches. This writeup explores these dimensions through the lens of practical implementation using tools like Keboola MCP and various client interfaces.
Technical users—data engineers, analysts, and developers—approach LLMs with a fundamentally different mindset than their business counterparts. Their primary intent revolves around:
Intent and Use Cases:
For these users, LLMs serve as intelligent coding assistants that understand context, suggest optimizations, and accelerate development cycles. They seek precision, control, and the ability to review and modify generated code before execution.
However, data engineering operates largely on a binary principle: configurations either function flawlessly and reliably, or they fail. Engineers are unlikely to accept 90% functionality, such as an incomplete SQL query or a partially defined data extraction.
Preferred Interfaces: Technical users gravitate toward IDE-integrated solutions like VSCode or Cursor, where LLMs enhance their existing workflows. These tools allow them to maintain version control, leverage syntax highlighting, and access debugging capabilities while benefiting from AI assistance. The integration feels natural—it's an enhancement of their familiar environment rather than a replacement.
Suggested scopes of work:
Business users approach data with questions rather than queries. Their relationship with LLMs is fundamentally different:
Intent and Use Cases:
These users need abstraction from technical complexity. They want to ask "What were our top-performing products last quarter?" rather than write JOIN statements and GROUP BY clauses.
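To make the contrast concrete, the sketch below shows the kind of SQL an LLM might generate behind the scenes for the business question above. The table, columns, and sample rows are hypothetical, built on an in-memory SQLite database purely for illustration.

```python
import sqlite3

# Hypothetical schema and sample data -- not part of any real project.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (product TEXT, quarter TEXT, revenue REAL);
    INSERT INTO orders VALUES
        ('Widget', '2024-Q4', 1200.0),
        ('Gadget', '2024-Q4',  900.0),
        ('Widget', '2024-Q4',  300.0),
        ('Doodad', '2024-Q3', 5000.0);
""")

# SQL an LLM might generate for:
# "What were our top-performing products last quarter?"
sql = """
    SELECT product, SUM(revenue) AS total_revenue
    FROM orders
    WHERE quarter = '2024-Q4'
    GROUP BY product
    ORDER BY total_revenue DESC
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('Widget', 1500.0), ('Gadget', 900.0)]
```

The business user sees only the question and the answer; the JOINs, filters, and aggregations stay hidden.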
However, LLMs are non-deterministic systems, which makes the reliability and reproducibility of results an ongoing challenge. Overcoming these technical challenges typically requires assistance from more technically savvy users.
Preferred Interfaces: Business users prefer conversational interfaces like Claude or custom chat applications. These tools provide a familiar, accessible entry point to data exploration without requiring technical knowledge. The conversation history becomes their audit trail, and the natural language interaction removes barriers to data access.
Suggested scopes of work:
The integration of LLMs into data infrastructure introduces novel security challenges that organizations must address comprehensively:
When LLMs interact with sensitive business data, several risks emerge:
Context Leakage: LLMs require context to provide accurate responses, but this context often includes sensitive information. Every query and every piece of data shared with the model potentially exposes confidential information. Organizations must implement data-handling policies and protocols, and restrict usage to approved LLM technologies only.
Audit and Compliance: Every interaction between users and LLMs touching production data must be logged and auditable. This includes not just the queries but also the responses, the data accessed, and the transformations applied. Compliance with regulations like GDPR, HIPAA, or SOC 2 requires careful consideration of how LLM interactions are recorded and retained.
The democratization of data access through LLMs must not compromise security:
Role-Based Access: The LLM interface must respect existing data governance policies. A business user asking about data outside their scope should receive responses constrained by their authorization level. This requires sophisticated integration between the LLM layer and existing identity and access management systems, or access control based on explicit dataset permissions, such as specific data catalogs.
Query Validation: For technical users, LLMs might generate complex queries that could potentially access unauthorized data or perform unintended operations. Implementing query validation and sandboxing mechanisms becomes crucial to prevent accidental or malicious data exposure.
The choice of client interface significantly impacts adoption and effectiveness:
These IDE-integrated solutions excel for technical users because they:
The integration feels like a natural evolution of existing tools rather than a disruptive change, leading to higher adoption rates among technical teams.
Conversational interfaces succeed with business users because they:
Custom chat interfaces can further enhance this experience by incorporating organization-specific terminology, branded experiences, and integrated visualization capabilities.
LLMs may generate plausible-looking but incorrect queries or analyses; mitigation strategies are therefore essential.
As usage grows, performance challenges emerge.
Successful adoption requires thoughtful change management.
The convergence of LLMs and data infrastructure is still evolving. Future developments likely include:
Advanced Reasoning: Next-generation models will better understand complex business logic and data relationships, reducing errors and improving insight quality.
Multimodal Capabilities: Integration of visual data analysis, voice interfaces, and even video explanations will further lower barriers to data access.
Autonomous Agents: LLMs will evolve from assistants to autonomous agents capable of monitoring data, identifying anomalies, and proactively suggesting optimizations.
Federated Learning: Organizations will be able to benefit from collective learning while maintaining data privacy through federated approaches.
Keboola's Model Context Protocol (MCP) server provides a practical framework for addressing the challenges outlined above while maximizing opportunities:
Keboola MCP acts as an intelligent middleware layer between LLMs and data infrastructure. It provides:
Standardized Communication: The MCP server establishes a consistent protocol for LLM interactions with data platforms, regardless of the client interface. This standardization simplifies security auditing and ensures consistent behavior across different user personas.
Context Management: Rather than exposing raw database connections to LLMs, Keboola MCP manages context intelligently. It understands the data model, relationships, and business logic, providing rich context to the LLM while maintaining security boundaries.
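For a sense of what this standardization looks like on the wire, the sketch below shows the general shape of a tool invocation in the Model Context Protocol, which is built on JSON-RPC 2.0. The tool name and arguments here are assumptions for illustration, not Keboola MCP's actual schema.

```python
import json

# Illustrative MCP tool invocation (JSON-RPC 2.0 envelope). The tool
# name "query_table" and its arguments are hypothetical examples.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_table",
        "arguments": {"table_id": "in.c-sales.orders", "limit": 10},
    },
}
print(json.dumps(request, indent=2))
```

Because every client, whether an IDE plugin or a chat interface, emits the same envelope, the server can apply one set of validation, logging, and authorization rules to all of them.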
For Technical Users:
For Business Users:
Keboola MCP addresses security concerns through:
Abstraction Layer: By sitting between the LLM and data infrastructure, it can enforce security policies consistently, validate queries, and ensure appropriate data access.
Data Access: Keboola's Project and Workspace concepts allow data access to be strictly isolated to selected “certified” data catalogs.
Audit Trail: All interactions flow through the MCP server, creating a comprehensive audit log for compliance and security monitoring.
Controlled Execution: Rather than allowing direct database access, the MCP server can validate and sanitize all operations before execution.
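The controlled-execution idea can be sketched as a pipeline of checks that every operation must pass before it reaches the database. The function and validator names below are illustrative, not Keboola MCP's API.

```python
def controlled_execute(sql, run_query, validators):
    """Sketch of an MCP-style gate: every operation must pass all
    validators before it reaches the database (names are illustrative).
    """
    for check in validators:
        ok, reason = check(sql)
        if not ok:
            raise PermissionError(f"rejected: {reason}")
    return run_query(sql)

def read_only(sql):
    """Allow only SELECT statements."""
    return sql.lstrip().upper().startswith("SELECT"), "only SELECT statements are allowed"

def bounded(sql):
    """Require an explicit row limit to protect the warehouse."""
    return "LIMIT" in sql.upper(), "queries must be row-limited"

fake_db = lambda sql: [("row",)]  # stand-in for a real query executor
print(controlled_execute("SELECT * FROM t LIMIT 5", fake_db, [read_only, bounded]))
# [('row',)]
```

New policies (cost caps, PII masking, catalog allowlists) slot in as additional validators without touching the execution path.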
The integration of LLMs into data infrastructure represents a paradigm shift in how organizations interact with their data. Success requires careful consideration of user personas, robust security implementations, and thoughtful tool selection. Solutions like Keboola MCP provide a framework for navigating these challenges while maximizing opportunities.
By acknowledging the different needs of technical and business users, implementing comprehensive security measures, and choosing appropriate client interfaces, organizations can harness the transformative power of LLMs while maintaining data integrity and security. The key lies not in wholesale adoption but in strategic implementation that enhances existing workflows while opening new possibilities for data interaction.
As this space continues to evolve, organizations that thoughtfully balance innovation with security, accessibility with control, and automation with human oversight will be best positioned to realize the full potential of LLMs in their data operations.