---
title: "ETL Pipeline for Real Estate Listing Syndication (alias Ligneurs)"
description: "ETL pipeline from PIM Akeneo to real estate portals - multi-format delivery (XML, CSV, JSON) over 4 years of continuous operation."
locale: "en"
canonical: "https://portfolio.josedacosta.net/en/achievements/pipeline-etl-syndication-immobiliere"
source: "https://portfolio.josedacosta.net/en/achievements/pipeline-etl-syndication-immobiliere.md"
html_source: "https://portfolio.josedacosta.net/en/achievements/pipeline-etl-syndication-immobiliere"
author: "José DA COSTA"
date: "2019"
type: "achievement"
slug: "pipeline-etl-syndication-immobiliere"
tags: ["PHP", "Symfony", "Akeneo PIM v2", "REST API", "XML", "CSV", "JSON", "FTP/SFTP", "GitLab CI", "Docker", "Kubernetes", "MySQL"]
generated_at: "2026-06-02T15:37:53.145Z"
---

# ETL Pipeline for Real Estate Listing Syndication (alias Ligneurs)

ETL pipeline from Akeneo PIM to real estate portals, with multi-format delivery (XML, CSV, JSON) over 4 years of continuous operation.

**Date:** January 2019 - 2023  
**Duration:** ~4 years  
**Role:** Technical Lead then Project Manager  
**Technologies:** PHP, Symfony, Akeneo PIM v2, REST API, XML, CSV, JSON, FTP/SFTP, GitLab CI, Docker, Kubernetes, MySQL

### Key Metrics

- Partner Portals: **Several dozen** - Migrated, integrated and maintained
- Export Formats: **3** - XML, CSV, JSON
- Project Duration: **~4 years** - Continuous evolution
- GitLab Branches: **-** - Features and hotfixes documented
- Daily Volume: **Several thousand** - Listings processed across all portals
- Availability: **99.5%+** - Over 4 years of continuous operation

## Presentation

_Project definition and scope_

### Nature

Automated ETL pipeline (Extract-Transform-Load) for multi-channel real estate ad distribution

### Domain

Real Estate / PropTech - B2B (internal teams, partner portals) and B2C (indirect, end buyers)

### Functional Scope

- Automated data extraction from the Akeneo PIM v2 REST API
- Per-partner format transformation (XML, CSV, JSON)
- FTP/SFTP automated delivery to several dozen partner platforms
- Multi-format image adaptation (4/3, 16/9, panoramic, square)
- Property typology mapping (apartment, house, duplex, triplex, studio, T1-T5+)
- Execution monitoring with email alerts and centralized monitoring system
- Individual partner activation/deactivation capability
- SKU matching algorithm for real vs. manually-created PIM programs
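
The per-partner format transformation and typology mapping listed above can be sketched as follows. This is a minimal illustration in Python rather than the project's PHP/Symfony code, and every name in it (`TYPOLOGY_CODES`, `to_partner_record`, `render`, the field names and the partner codes) is a hypothetical assumption, not the actual feed specification of any portal.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical mapping from internal typology labels to one partner's codes.
TYPOLOGY_CODES = {"apartment": "APT", "house": "HSE", "studio": "STD"}

FIELDS = ["reference", "type", "price"]


def to_partner_record(lot: dict) -> dict:
    """Map an internal lot to the flat record this (fictional) partner expects."""
    return {
        "reference": lot["sku"],
        "type": TYPOLOGY_CODES[lot["typology"]],
        "price": str(lot["price"]),
    }


def render(records: list, fmt: str) -> str:
    """Serialize the same records as JSON, CSV or XML depending on the partner."""
    if fmt == "json":
        return json.dumps(records)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    if fmt == "xml":
        root = ET.Element("listings")
        for record in records:
            node = ET.SubElement(root, "listing")
            for key, value in record.items():
                ET.SubElement(node, key).text = value
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")
```

Keeping the mapping step (`to_partner_record`) separate from the serialization step (`render`) is one way to let a partner change its output format without touching its field mapping.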

### Technology Choices & Rationale

- State of the art in 2019 - Stack aligned with the B2B integration standard at the time: batch ETL and FTP/SFTP were the norm before webhooks and event-driven architectures became mainstream.
- PHP / Symfony - Consistent with the existing backend ecosystem. Symfony Console provided a solid framework for scheduled batch command execution.
- Akeneo PIM v2 - Strategic company choice for product catalog management. Its REST API provided structured access to all program and lot data with versioned endpoints.
- Docker / Kubernetes - Each export job isolated in its own container, preventing resource conflicts between partner modules. K8s on AWS EKS handled scheduling and auto-recovery of failed jobs.
- GitLab CI - Automated the build-test-deploy cycle for each partner module independently, allowing targeted deployments without impacting other active feeds.

### System Architecture

The **"Export Ligneurs"** system is the **automated real estate listing distribution engine** of Groupe Pichet. It extracts program and lot data from Akeneo PIM, transforms it into the specific format required by each partner (XML, CSV, or JSON), and automatically exports it to real estate distribution platforms.

The system serves as the **critical link between the company's product data and its commercial visibility**: every property listing published on major French real estate portals (SeLoger, LeBonCoin, BienIci, LogicImmo...) passes through this pipeline. Any interruption or data inconsistency directly translates into **lost leads and missed sales opportunities**.

As the **sole technical owner** of this system, I was responsible for all architecture decisions, development, deployment, monitoring, and incident response, with full accountability for a pipeline feeding an estimated **\*\*\*K euros/month in lead acquisition**.

_Figure: Export Ligneurs - System Architecture Overview_

## Objectives, Context, Stakes & Risks

_Strategic vision and constraints_

### Objectives

- Migrate all export feeds from the legacy PIM v1.4 to the new Akeneo PIM v2
- Execute migration partner by partner with business validation at each step
- Verify data consistency between source PIM and feeds sent to portals
- Handle each portal's specificities (image formats, typologies, required fields)
- Automate feed supervision (error alerts, execution reports)
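
The consistency objective above (source PIM versus feeds sent to portals) can be illustrated with a small reconciliation check. This is a hedged Python sketch, not the project's actual code: the function name `reconcile`, the SKU-keyed dictionaries and the compared `price` field are all assumptions made for the example.

```python
def reconcile(pim_lots: dict, feed_lots: dict, fields: tuple = ("price",)) -> dict:
    """Compare lots keyed by SKU between the PIM and a generated feed.

    Returns the SKUs missing from the feed, the SKUs present only in the
    feed, and the SKUs whose compared fields differ.
    """
    pim_skus = set(pim_lots)
    feed_skus = set(feed_lots)
    mismatched = sorted(
        sku
        for sku in pim_skus & feed_skus
        if any(pim_lots[sku].get(f) != feed_lots[sku].get(f) for f in fields)
    )
    return {
        "missing": sorted(pim_skus - feed_skus),
        "unexpected": sorted(feed_skus - pim_skus),
        "mismatched": mismatched,
    }
```

A non-empty result for any of the three keys would be a natural trigger for the error alerts mentioned in the objectives.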

### Context

The project was initiated during the **knowledge transfer from Andoni L.** in January 2019. The existing system ran on the legacy PIM v1.4 and needed to be fully migrated to Akeneo PIM v2 while maintaining continuous service to all partner portals.

The migration had to be performed **portal by portal** - each with its own format specifications, required fields, image constraints, and property typology mappings - making it impossible to execute as a single "big bang" migration. Each partner required individual validation by the business teams before going live.

The system was embedded in a larger data ecosystem: upstream data came from the accounting software and in-house ERPs feeding the PIM, while downstream the feeds connected to around a hundred lead suppliers generating an estimated **1 lead every 2 seconds** across all portals.

### Stakes

The partner portals (SeLoger, LeBonCoin, BienIci...) are **major lead acquisition channels** in the real estate market. Any interruption or error in the feeds directly translates into **lost leads and reduced commercial pipeline**. With several dozen partners to migrate individually, the project required sustained attention over multiple years while maintaining zero downtime on active feeds.

### Risks

- Data Inconsistency - Risk of publishing incorrect prices, wrong images, or missing properties on partner portals - directly impacting buyer trust and commercial results.
- Service Interruption - Any feed failure means properties disappear from partner portals, causing immediate lead loss for the commercial teams.
- Format Divergence - Each portal has unique requirements (image ratios, typology codes, required fields) - a generic approach was impossible.
- API Instability - Akeneo PIM API connection issues could block all exports simultaneously, requiring solid error handling and retry logic.
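
To make the data inconsistency risk concrete, here is a minimal Python sketch of a pre-publication guardrail. It is an illustration under assumptions, not the validation rules actually used: the function name `validate_lot`, the `price` and `images` fields and the 50% price-change threshold are all hypothetical.

```python
from typing import Optional


def validate_lot(lot: dict, previous_price: Optional[float] = None,
                 max_change: float = 0.5) -> list:
    """Return the list of reasons a lot should be held back from publication."""
    errors = []
    price = lot.get("price")
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price <= 0:
        errors.append("price must be a positive number")
    elif previous_price and abs(price - previous_price) / previous_price > max_change:
        # Assumed threshold: a large jump versus the last published price is suspicious.
        errors.append("price changed by more than the allowed ratio")
    if not lot.get("images"):
        errors.append("at least one image is required")
    return errors
```

A lot with a non-empty error list would be excluded from the feed and reported, instead of being published with a wrong price or no image.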

### Key Architecture Decisions

- Modular per-partner architecture - Decision: one isolated module per portal instead of a generic engine - Rationale: fault isolation, since a bug in one module cannot affect other partners, plus independent deployment and testing per feed.
- Progressive migration over big-bang - Decision: portal-by-portal migration with business validation at each step - Rationale: blast radius limited to one partner at a time, with immediate rollback capability if issues arise.
- ETL batch processing over real-time streaming - Decision: scheduled batch exports via cron jobs rather than event-driven publishing - Rationale: partners consumed data via FTP/SFTP drops, not webhooks, so real-time would have added complexity without benefit.
- Multi-format image pre-generation - Decision: pre-generate all image variants centrally rather than on demand per partner - Rationale: avoids redundant processing of the same image across portals and ensures compliance upstream.
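
The image pre-generation decision relies on computing, for each source image, the crop that matches each target ratio. The Python sketch below shows one common way to do this (a centered crop); it is an assumption for illustration, not the project's image code. The names `center_crop_box`, `VARIANTS` and `crop_boxes` are hypothetical, and the panoramic variant is left out because its exact ratio is not stated here.

```python
def center_crop_box(width: int, height: int, ratio_w: int, ratio_h: int) -> tuple:
    """Return (left, top, right, bottom) of the largest centered crop
    with aspect ratio ratio_w:ratio_h inside a width x height image."""
    if width * ratio_h > height * ratio_w:
        # Source is wider than the target ratio: keep full height, trim the sides.
        new_width = height * ratio_w // ratio_h
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Source is taller than (or equal to) the target ratio: keep full width.
    new_height = width * ratio_h // ratio_w
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)


# Assumed variant set, based on the ratios named in the functional scope.
VARIANTS = {"4/3": (4, 3), "16/9": (16, 9), "square": (1, 1)}


def crop_boxes(width: int, height: int) -> dict:
    """Compute every variant's crop box once, so each partner reuses the result."""
    return {name: center_crop_box(width, height, *ratio) for name, ratio in VARIANTS.items()}
```

For a 1920x1080 source, the 4/3 variant keeps the full height and trims 240 pixels on each side, which is the 16/9 to 4/3 conversion case.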

### ETL Data Pipeline

_Figure: Extract-Transform-Load pipeline for partner feed generation_

## The Steps - What I Did

_Chronological progression of the project_

- Phase 1 - Knowledge Transfer & Initial Migration - January 2019
  - I became the sole technical owner within 2 weeks after the handover from Andoni L.
  - On the migration side, I shipped the first batch: SeLoger Neuf, LogicImmo, TULN, Paru Vendu
  - On the project management side, I framed the migration roadmap with the business teams and defined partner-by-partner validation milestones
  - To secure the subsequent migrations, I established an acceptance checklist that I reused throughout the project
- Phase 2 - Feature Development & New Integrations - June - September 2019
  - As Technical Lead, I prioritized the integration backlog by arbitrating between business requests, partner constraints and technical capacity
  - I integrated BienIci with a dedicated image adaptation layer
  - On ImmoNeuf, I adapted the feed with a 16/9 to 4/3 image conversion
  - On the reliability side, I stabilized the SeLoger and Knock feeds
- Phase 3 - Stabilization & Critical Fixes - January 2020
  - I added pricing validation guardrails before publication
  - To absorb API error spikes, I introduced a circuit breaker and exponential backoff on PIM API calls
  - On the observability side, I added structured logging to reduce incident diagnosis time
  - On the project management side, I ran the incident post-mortems and reported the corrective actions and timelines to the steering committee
- Phase 4 - New Partners & Continuous Evolution - June 2020 - 2023
  - On the new-partner side, I built the Investimeo and BienIci integrations from scratch
  - As Technical Lead / Project Manager, I framed the multi-year roadmap with management and negotiated the technical scope and integration SLA with each new partner
  - For Marketshot, I drove the clean removal of the partner without side effects on the other feeds
  - On the incident side, I resolved the NEEDOCS, BienIci and Green Valley anomalies
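
The circuit breaker and exponential backoff introduced in Phase 3 can be sketched as below. This is a simplified Python illustration of the two patterns, not the production PHP implementation: the class and function names, the thresholds, the delays and the injectable `clock` and `sleep` parameters are all assumptions chosen to keep the example small and testable.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, call skipped")
            # Cool-down elapsed: allow one trial call; a failure reopens the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_backoff(func, breaker, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry func through the breaker, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise  # Do not keep hammering an API the breaker has given up on.
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The backoff absorbs short error spikes, while the breaker stops all calls once failures accumulate, so that a prolonged API outage does not tie up every export job in retries.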

## Actors & Interactions

_Who I interacted with directly and how I collaborated_

### Coordination and Collaboration

### People Involved

- Andoni L. - Predecessor - I completed the full knowledge transfer with him. Within 2 weeks, I reached autonomous operation of the full export system and became the sole technical reference for the entire scope.
- Gaetan B. - Business referent - Together, we co-defined and enforced the acceptance criteria for each migration: data accuracy, image compliance, typology mapping. I then formalized a reusable validation checklist with him.
- Leslie A. - Business referent - I drove the functional acceptance process with her, coordinating technical fixes and business priorities to keep migration velocity up.
- Franck C. - Manager (N+1) - I reported migration progress, risk assessment and resource needs to him, and brought him my technical recommendations for vendor coordination decisions.
- Sebastien B. - Vendor team - I coordinated the production deployment schedule with his team. We agreed on a shared protocol: preprod validation, business sign-off, prod deployment, 24h monitoring window.

As the **sole technical owner**, I coordinated directly with business stakeholders, external vendors and partner portals. Each migration led me to define acceptance criteria, pilot validation cycles and make the go/no-go call for production deployment. I learned to translate technical constraints into business terms and vice versa to keep everyone aligned.

## Results

_Impact for me and for the company_

### For Me

- I took full technical ownership of a business-critical system directly impacting revenue
- I made autonomous architecture decisions, with full accountability for reliability and data accuracy
- Over 4 years, I piloted this project with business teams, external vendors and several dozen partner portals
- I held the end-to-end lifecycle: architecture, development, deployment, monitoring and incident response
- This project changed the way I work: it placed me in cross-functional leadership on validation processes and partner onboarding

### For the Company

- Several dozen partner portals migrated from PIM v1.4 to Akeneo PIM v2 with zero service interruption
- 2 new partner integrations built from scratch (BienIci, Investimeo)
- Several thousand listings processed daily across all partner portals
- 99.5%+ availability over 4 years, average incident resolution under 4 hours
- Standardized property typology across all feeds, reducing data inconsistency reports

## Project Aftermath

_What happened after delivery_

### System Evolution

**Immediate aftermath**: After the 2019 migration wave, the system entered a **continuous maintenance phase** with new partner additions and anomaly resolution as needed.

**Medium term**: Resilience proven over 4 years, handling partner format changes and internal data model evolutions without disruption.

**Long-term perspective**: Became a **foundational piece of infrastructure** feeding the commercial pipeline. Modular architecture allowed scaling across several dozen portals without fundamental redesign, and any developer could add a new partner by following the established patterns.

## Critical Reflection

_With hindsight, how I judge this project_

### What worked well

- With hindsight, the portal-by-portal migration proved its worth: minimal risk, business validation at each step, immediate rollback
- I stand behind the modular architecture I put in place: I could add, modify or deactivate feeds without side effects
- Thanks to the onboarding I standardized, new-partner integration dropped from weeks to days
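
The ability to add, modify or deactivate feeds without side effects can be illustrated with a small registry of partner modules. The Python sketch below is an assumption about the general shape of such a design, not the actual Symfony code: `PartnerModule`, `run_exports` and the report strings are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PartnerModule:
    name: str
    export: Callable[[list], str]  # builds and delivers one partner's feed
    enabled: bool = True           # per-partner activation switch


def run_exports(modules: list, lots: list) -> dict:
    """Run every enabled module; a failure in one never stops the others."""
    report = {}
    for module in modules:
        if not module.enabled:
            report[module.name] = "skipped"
            continue
        try:
            module.export(lots)
            report[module.name] = "ok"
        except Exception as exc:
            # Fault isolation: record the error and move on to the next partner.
            report[module.name] = f"failed: {exc}"
    return report
```

In this shape, onboarding a partner means registering one more `PartnerModule`, and removing one means deleting or disabling its entry, with no change to the other modules.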

### What could have been better

- With hindsight, I would have built a centralized monitoring dashboard rather than watching individual email alerts
- I would also have set up per-partner automated integration tests earlier to catch regressions upstream

### If I had to redo it today

- In 2026, I would pick an event-driven approach (Kafka/RabbitMQ) instead of CRON batch, with observability-first monitoring (OpenTelemetry, Grafana Tempo)
- I would set up a partner specification registry from day one to halve onboarding time
- I would add automated integration tests against each partner schema before deployment
- I would build a centralized real-time monitoring dashboard instead of relying on email alerts

### The lasting lessons this project brought me

- I take away that in multi-partner systems, there is no "one size fits all" - each integration has its own constraints
- I learned that long-running projects require a maintenance mindset from day one
- For revenue-critical systems, I learned that observability matters more than preventing every single failure

### Additional context

- Cumulative Partner Migration Timeline
- Partner Type Distribution
- Export Format Distribution
- Migration Status Breakdown
- Technical Effort Distribution
- Property Types Handled

## Skills applied

_Technical and soft skills applied_

- **[Project Piloting & Agile Methodologies](https://portfolio.josedacosta.net/en/skills/project-piloting-agile.md)** - Piloted the portal-by-portal migration over 4 years with a standardized onboarding process, acceptance checklists and zero downtime. Coordinated business referents, external vendors and partner portals, including go/no-go decisions and translation between technical and business constraints
- **[Problem Solving & Adaptability](https://portfolio.josedacosta.net/en/skills/problem-solving-adaptability.md)** - Same-day pricing fixes, defensive patterns (circuit breaker, exponential backoff retry), unique per-partner constraints, structured logging for fast diagnosis
- **[Software & System Architecture](https://portfolio.josedacosta.net/en/skills/system-architecture-design.md)** - Designed the complete ETL architecture: modular per-partner pipeline, batch over real-time, centralized multi-format image pre-generation. Integrated the Akeneo PIM v2 REST API for data extraction and several dozen partner endpoints (REST APIs and FTP/SFTP file drops)
- **[Fullstack Development](https://portfolio.josedacosta.net/en/skills/fullstack-development.md)** - Sole technical owner over 4 years: PHP/Symfony, Akeneo PIM v2 integration, image processing, format generators, monitoring tooling
- **[DevOps, Cloud & Production Industrialization](https://portfolio.josedacosta.net/en/skills/devops-cloud-production.md)** - Docker/Kubernetes deployment with GitLab CI per partner module, enabling zero-downtime migration from PIM v1.4 to v2
- **[Data, AI & Machine Learning](https://portfolio.josedacosta.net/en/skills/data-ai-machine-learning.md)** - End-to-end ETL pipeline from Akeneo PIM to several dozen portals: extraction, multi-format transformation (XML/CSV/JSON), FTP/SFTP delivery, monitoring
- **[Tech & Field Versatility](https://portfolio.josedacosta.net/en/skills/tech-field-versatility.md)** - Sole owner across PHP/Symfony, Akeneo APIs, image processing, FTP/SFTP, Docker/Kubernetes, GitLab CI, covering data, code, infra and partner integrations end-to-end

## Related journey

_Professional experience linked to this achievement_

- **Technical Lead · Flows and Products: content and enterprise integration**

## Need an ETL syndication pipeline designed?

I delivered a multi-portal ETL syndication pipeline: PIM extraction, multi-format transformation (XML/CSV/JSON), FTP/SFTP delivery and monitoring over 4 years of continuous operation. Let's talk about your context.

**Contact me**
