From Scraper to Inbox: Building an AI-Powered Real Estate Sourcing Tool
Finding good investment property in Germany is a slow, manual process. You're cross-referencing three or four listing platforms, filtering by your own criteria, calculating yields, and doing all of it again the next morning because new listings appeared overnight. I recently built a system to automate most of this for a client, and I am genuinely proud of the result.
The Brief
The client needed a way to monitor the real estate market here in Berlin across multiple platforms simultaneously, filter listings against their specific investment criteria, and surface only the most relevant opportunities daily, without spending hours doing it manually. The output needed to be both a persistent dashboard for browsing and a daily email digest so nothing good slipped past.
The Stack
I built the system by connecting four tools that each do one thing well:
- Apify: scrapes ImmoScout24, Immowelt, and Kleinanzeigen on a schedule
- n8n: orchestrates the entire processing pipeline, covering deduplication, filtering, enrichment, AI analysis, and email dispatch
- Airtable: serves as the data store and structured backend
- Next.js: powers the client-facing dashboard, pulling data from Airtable
No single piece of this is novel. The value is in how they fit together, and in how quickly they get you to a working system.
The Pipeline

Scraping and Ingestion
Apify handles the scraping. Each of the three platforms gets its own actor, which n8n triggers on a schedule, and the results come back to n8n via an API call. At this stage the data is raw and messy. Duplicate listings are common because the same property often appears on multiple platforms. The field schemas also differ drastically between sites, so they have to be normalized before anything useful can happen.
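To make the schema problem concrete, here is a minimal Python sketch of mapping raw records into one shared shape. The `Listing` fields, the `FIELD_MAPS` table, and every raw field name in it are hypothetical placeholders, not the platforms' real output, and the number parser assumes German-formatted strings such as "249.000 €".

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Listing:
    """Common schema shared by all sources (an assumed minimal subset)."""
    source: str
    source_id: str
    address: str
    price_eur: Optional[float]
    size_sqm: Optional[float]


# Hypothetical raw field names per platform; real scraper output will differ.
FIELD_MAPS = {
    "immoscout24": {"id": "id", "address": "address", "price": "price", "size": "livingSpace"},
    "immowelt": {"id": "listing_id", "address": "location", "price": "price_text", "size": "area_text"},
    "kleinanzeigen": {"id": "ad_id", "address": "place", "price": "price_label", "size": "sqm"},
}


def to_number(value: Any) -> Optional[float]:
    """Parse 249000, '249.000 €' or '68,5 m²' into a float, else None."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # Keep only ASCII digits and separators (this drops '€', 'm²', spaces).
    cleaned = "".join(ch for ch in str(value) if ch in "0123456789.,")
    # Assumed German formatting: '.' groups thousands, ',' marks decimals.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def normalize(raw: dict, source: str) -> Listing:
    """Map one raw record from `source` onto the common schema."""
    fields = FIELD_MAPS[source]
    return Listing(
        source=source,
        source_id=str(raw[fields["id"]]),
        # Collapse repeated whitespace in the address.
        address=" ".join(str(raw.get(fields["address"]) or "").split()),
        price_eur=to_number(raw.get(fields["price"])),
        size_sqm=to_number(raw.get(fields["size"])),
    )
```

Keeping the per-platform differences in a lookup table means a new source only needs a new table entry, provided its values can be parsed the same way.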
The first n8n steps normalize fields into a common schema and deduplicate across sources. Deduplication by listing ID alone doesn't work across platforms, so the logic uses a combination of address, price, and size to catch cross-site duplicates.
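A composite key of this kind could look roughly like the Python sketch below. The field names (`address`, `price_eur`, `size_sqm`) and the specific cleanup rules are assumptions for illustration; the production logic may weigh or fuzz these fields differently.

```python
import re


def dedup_key(listing: dict) -> tuple:
    """Build a cross-platform key from address, price, and size."""
    address = listing["address"].lower()
    # Treat common spellings of "Straße" as the same token.
    address = address.replace("straße", "str").replace("strasse", "str")
    # Collapse punctuation and whitespace so "str. 12," equals "str 12".
    address = re.sub(r"[^a-z0-9äöüß]+", " ", address).strip()
    # Naive rounding: values right at a .5 boundary can still land in
    # different buckets, so this is a heuristic, not an exact match.
    return (address, round(listing["price_eur"]), round(listing["size_sqm"]))


def deduplicate(listings: list) -> list:
    """Keep the first listing seen for each key, preserving input order."""
    seen = {}
    for listing in listings:
        seen.setdefault(dedup_key(listing), listing)
    return list(seen.values())


listings = [
    {"source": "immoscout24", "address": "Musterstraße 12, 10115 Berlin",
     "price_eur": 249000, "size_sqm": 68.4},
    {"source": "immowelt", "address": "Musterstr. 12 10115 berlin",
     "price_eur": 249000.0, "size_sqm": 68.0},
    {"source": "kleinanzeigen", "address": "Beispielweg 3, 10245 Berlin",
     "price_eur": 310000, "size_sqm": 82.0},
]
unique = deduplicate(listings)  # the immowelt copy is dropped
```

Exact matching on a cleaned key is cheap and predictable, but it misses duplicates where one platform lists a slightly different price or a less precise address.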
Filtering and Enrichment
Once deduplicated, listings pass through a filter stage based on the client's criteria: price range, size, location, property type, heating, rental status, and energy class. Anything that doesn't meet the criteria is logged to Airtable along with the reason it was rejected.
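As a rough illustration of filtering with a recorded reason, the Python sketch below checks four of the criteria. The `Criteria` fields, the listing keys, the reason strings, and the choice to reject listings with missing data are all assumptions; location, heating, and rental status would follow the same pattern.

```python
from dataclasses import dataclass
from typing import Optional

# German energy efficiency classes, best to worst.
ENERGY_ORDER = ["A+", "A", "B", "C", "D", "E", "F", "G", "H"]


@dataclass
class Criteria:
    """Hypothetical subset of the client's investment criteria."""
    min_price_eur: float
    max_price_eur: float
    min_size_sqm: float
    allowed_types: frozenset
    worst_energy_class: str


def rejection_reason(listing: dict, criteria: Criteria) -> Optional[str]:
    """Return the first failed check as text, or None if the listing passes."""
    price = listing.get("price_eur")
    if price is None or not criteria.min_price_eur <= price <= criteria.max_price_eur:
        return "price outside range"
    size = listing.get("size_sqm")
    if size is None or size < criteria.min_size_sqm:
        return "size below minimum"
    if listing.get("property_type") not in criteria.allowed_types:
        return "property type not wanted"
    energy = listing.get("energy_class")
    limit = ENERGY_ORDER.index(criteria.worst_energy_class)
    if energy not in ENERGY_ORDER or ENERGY_ORDER.index(energy) > limit:
        return "energy class too low or unknown"
    return None


def split_listings(listings: list, criteria: Criteria) -> tuple:
    """Separate passing listings from rejected ones, tagging each reject."""
    accepted, rejected = [], []
    for listing in listings:
        reason = rejection_reason(listing, criteria)
        if reason is None:
            accepted.append(listing)
        else:
            rejected.append({**listing, "rejection_reason": reason})
    return accepted, rejected
```

Returning the reason as data, instead of silently dropping the listing, is what makes it possible to store rejects alongside accepted listings and review the filter later.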
Listings that survive filtering get enriched with yield estimates calculated from the asking price and local average rental data. Every listing with enough input data gets a rough return figure, so the client doesn't have to do that math manually.
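One common way to express such an estimate is gross rental yield: expected annual rent divided by the asking price. The Python sketch below assumes that definition and assumes the rental data arrives as an average monthly rent per square metre; the function and parameter names are hypothetical, and purchase costs, vacancy, and running costs are ignored.

```python
from typing import Optional


def estimate_gross_yield(
    price_eur: Optional[float],
    size_sqm: Optional[float],
    avg_rent_eur_per_sqm_month: Optional[float],
) -> Optional[float]:
    """Return gross yield in percent, or None if an input is missing or invalid."""
    inputs = (price_eur, size_sqm, avg_rent_eur_per_sqm_month)
    if any(value is None or value <= 0 for value in inputs):
        return None
    # Annual rent = monthly rent per sqm * living area * 12 months.
    annual_rent = avg_rent_eur_per_sqm_month * size_sqm * 12
    return round(annual_rent / price_eur * 100, 2)


# 60 sqm at 12.50 EUR per sqm is 750 EUR a month, 9,000 EUR a year.
print(estimate_gross_yield(250_000, 60, 12.5))  # 3.6
print(estimate_gross_yield(250_000, None, 12.5))  # None
```

Returning `None` instead of a guessed value keeps listings with incomplete data visibly unscored, which matches the idea that only listings with enough input data receive a figure.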

