Experience Summary
Data Engineer with 5–10 years of experience, specializing in building document- and knowledge-oriented data pipelines for regulatory/compliance domains, with strong capabilities in structured transformations, knowledge graphs, and containerized platform integration.
Core Responsibilities / Focus
Build and operate data ingestion and transformation pipelines for legal/regulatory content
Normalize and transform heterogeneous source formats (e.g., XML/HTML/structured exports) using tools such as XSLT (see the first sketch after this list)
Implement pipelines for embedding generation, indexing, and enrichment for downstream AI/RAG systems (see the embedding sketch after this list)
Design and manage RDF-based knowledge representations and SPARQL-accessible datasets (see the RDF/SPARQL sketch after this list)
Integrate storage and processing components across containerized/cloud environments
Support event-driven or integration-heavy workflows (e.g., via Apache Camel, message brokers)
Ensure reproducibility, maintainability, and operational handover of data pipelines
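Illustrative sketch for the XSLT item above: a minimal normalization step in Python using lxml. The file names (source.xml, normalize.xslt) are hypothetical placeholders, and the role does not mandate this particular library.

```python
from lxml import etree

def apply_xslt(source_path: str, stylesheet_path: str) -> str:
    # Parse the source document and the XSLT stylesheet
    source = etree.parse(source_path)
    transform = etree.XSLT(etree.parse(stylesheet_path))
    # Run the transformation and serialize the result to text
    return str(transform(source))

print(apply_xslt("source.xml", "normalize.xslt"))  # hypothetical file names
```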
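Illustrative sketch for the embedding-pipeline item above: chunking a document and generating vectors for downstream indexing. The library choice (sentence-transformers), model name, and input file are assumptions, not requirements of the role.

```python
from sentence_transformers import SentenceTransformer  # assumed library choice

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; real pipelines usually split on document structure
    return [text[i:i + size] for i in range(0, len(text), size)]

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model name
with open("regulation.txt", encoding="utf-8") as f:  # hypothetical input file
    chunks = chunk(f.read())
vectors = model.encode(chunks)  # numpy array of shape (n_chunks, embedding_dim)
# vectors would then be written to a vector index/database for retrieval
```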
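Illustrative sketch for the RDF/SPARQL item above: building a small in-memory graph and querying it with rdflib. The namespace and triples are made-up examples.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/reg/")  # made-up namespace

g = Graph()
# Two example triples describing a (made-up) regulation resource
g.add((EX.gdpr, RDF.type, EX.Regulation))
g.add((EX.gdpr, RDFS.label, Literal("General Data Protection Regulation")))

# SPARQL query over the in-memory graph; the rdfs: prefix is bound by default
results = g.query("""
    SELECT ?reg ?label WHERE {
        ?reg a <http://example.org/reg/Regulation> ;
             rdfs:label ?label .
    }
""")
for reg, label in results:
    print(reg, label)
```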
Core Skills (Must-Have)
Python / Java
Docker / Docker Compose
Kubernetes
Knowledge Graphs (RDF)
SPARQL
XSLT
Embeddings pipelines / vector preparation
Azure Storage (or equivalent cloud storage services)
Apache Camel
Git
Preferred / Nice-to-Have
Docling (or similar document-conversion tooling)
CloudEvents
Kafka (or other message brokers)
Event-based systems / event-driven architecture (see the CloudEvents sketch after this list)
Dev Containers
GitOps
Documentation practices
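Illustrative sketch for the CloudEvents / event-driven items above: constructing and serializing an event with the official CloudEvents Python SDK. The event type, source, and payload are invented placeholders; the broker producer (e.g., Kafka) is omitted.

```python
from cloudevents.http import CloudEvent, to_structured  # official Python SDK

# Event attributes: type and source are invented placeholders
attributes = {
    "type": "org.example.document.processed",
    "source": "pipeline/ingest",
}
data = {"document_id": "doc-123", "status": "indexed"}  # hypothetical payload

event = CloudEvent(attributes, data)
headers, body = to_structured(event)  # structured-mode serialization
# headers and body would be handed to a broker producer (e.g., Kafka) here
print(headers, body)
```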
Domain Advantage
Experience processing legal/regulatory source documents while preserving semantic structure and provenance
Familiarity with content domains such as EU regulation, privacy, ESG, and compliance frameworks