Scaling Lead Intelligence with Leadreaper - AWS Lambda, Serverless, & dbt
SUMMARY
Facet developed Leadreaper, a scalable lead intelligence platform that identifies qualified lead opportunities for technology-driven companies. Leadreaper is built to power large-scale sales teams that need near real-time, high-quality data about their clients’ detectable technology and operations.
- Hyper-scalable via Serverless framework deployed on AWS Lambda.
- Crawls scale up to 100,000 domains per day.
- Performs dynamic segmentation and account scoring.
- Layers data enrichment based on qualifying factors to better allocate costs.
- Intelligently enriches, segments, and qualifies sales opportunities.
Challenge
- Facet, like many web development studios, faced inconsistent quality in the lead intelligence and lead signals used to qualify customers.
- Many lead databases qualified prospective accounts based on incorrect technologies, which led to time wasted on outreach to accounts that poorly matched the target criteria.
- Lead signals must be highly trusted, as timing is everything in determining the appropriate approach for a given client opportunity. Technology detection is important, and a transition from one technology to another is itself a key signal.
Insight
Even when a lead database reports that a company uses a technology, there is no convention for identifying which of the company’s websites uses it. Facet needed a way to target a specific department if it used Drupal alongside advertising or marketing technologies, which would qualify it as spending on customer acquisition.
Solution
Facet engineered a scalable, serverless application with elastic infrastructure on AWS Lambda, enabling progressive profiling of target enterprise accounts and the collection of lead intelligence on prospective accounts.
Facet developed a range of web scraping plugins, including:
Technology Detection
- Passive technology detection with Wappalyzer
- IP WHOIS lookup for hosting infrastructure or WAF detection
- Drupal Version Detection
- WordPress Version Detection
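As an illustration of the version detection items above, the following is a minimal sketch of one common technique: reading the `generator` meta tag that WordPress and Drupal emit by default. It is not Leadreaper's actual plugin code. The names `GeneratorMetaParser` and `detect_cms_version` are hypothetical, and many sites strip or alter this tag, so a production detector would likely combine several signals.

```python
import re
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class GeneratorMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="generator"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.generators: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # Drupal writes name="Generator", WordPress name="generator".
        if attr_map.get("name", "").lower() == "generator":
            self.generators.append(attr_map.get("content", ""))


# Matches strings such as "WordPress 6.4.2" or "Drupal 10 (https://www.drupal.org)".
CMS_PATTERN = re.compile(r"^(WordPress|Drupal)\s+(\d+(?:\.\d+)*)", re.IGNORECASE)


def detect_cms_version(html: str) -> Optional[Tuple[str, str]]:
    """Return (cms_name, version) if a known generator tag is present, else None."""
    parser = GeneratorMetaParser()
    parser.feed(html)
    for content in parser.generators:
        match = CMS_PATTERN.match(content.strip())
        if match:
            return match.group(1), match.group(2)
    return None
```

A crawler could call `detect_cms_version(page_html)` on each fetched homepage and store the result next to the other detection signals. Comparing results across crawls is one way to surface the technology transitions described earlier.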
Metadata Capture
- HTML Metatags Scraping
- JSON-LD / Schema Metadata Scraping
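JSON-LD scraping can be sketched with the Python standard library alone. The example below collects every `<script type="application/ld+json">` block from a page and parses it as JSON. The names `JsonLdParser` and `extract_jsonld` are hypothetical and the code is an assumption about how such a plugin might work, not a description of Leadreaper's implementation.

```python
import json
from html.parser import HTMLParser
from typing import Any, List


class JsonLdParser(HTMLParser):
    """Collects and parses the contents of JSON-LD script blocks."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: List[str] = []
        self.documents: List[Any] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").strip().lower()
            if script_type == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_data(self, data):
        # Script bodies arrive as raw text, possibly in several chunks.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than failing the whole crawl.


def extract_jsonld(html: str) -> List[Any]:
    """Return all successfully parsed JSON-LD documents found in the HTML."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.documents
```

Skipping malformed blocks is a deliberate choice here: on a large crawl, a single broken page should not stop metadata capture for the rest of the batch.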
User Experience Capture
- Google Lighthouse score detection including PageSpeed, accessibility, security, and progressive web app scores.
Querying Third-Party APIs for Account Enrichment
- Apollo Organizations API
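The idea of layering enrichment on qualifying factors can be sketched as a simple gate: score each account from the signals the crawl already produced at low cost, and call the paid enrichment API only when the score clears a threshold. Everything in this example is an assumption made for illustration, including the weights, the threshold, the technology names, and the `enrich` callable, which stands in for a third-party API client and is passed in rather than implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Account:
    domain: str
    technologies: Set[str] = field(default_factory=set)
    enrichment: Dict[str, object] = field(default_factory=dict)


# Hypothetical weights: a CMS match plus advertising or marketing technology.
SIGNAL_WEIGHTS = {"Drupal": 50, "Google Ads": 25, "HubSpot": 25}
# Hypothetical cutoff below which paid enrichment is not worth the cost.
ENRICHMENT_THRESHOLD = 60


def score_account(account: Account) -> int:
    """Sum the weights of every known signal detected on the account."""
    return sum(
        weight
        for tech, weight in SIGNAL_WEIGHTS.items()
        if tech in account.technologies
    )


def enrich_if_qualified(
    account: Account, enrich: Callable[[str], Dict[str, object]]
) -> bool:
    """Call the (paid) enrichment function only for qualifying accounts."""
    if score_account(account) < ENRICHMENT_THRESHOLD:
        return False
    account.enrichment = enrich(account.domain)
    return True
```

Passing the enrichment client in as a callable keeps the scoring logic testable without network access and makes it easy to swap providers.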
Developing data transformation pipelines with dbt
- By leveraging dbt (“data build tool”), Leadreaper can incrementally update data marts and fact tables through dbt’s own directed acyclic graphs (DAGs).
- dbt ships with tools for testing data sources and transformations, and it allows Leadreaper’s data models to be continually updated and shipped with documentation as models change.
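To show why a DAG matters for this kind of pipeline, the sketch below derives a valid build order from a dependency graph using Python's standard `graphlib` module (Python 3.9+). This is not dbt code and not how dbt is implemented: in dbt, dependencies are declared with `ref()` inside SQL models and the graph is built for you. The model names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical model names; each maps to the set of models it depends on.
MODEL_DEPENDENCIES: Dict[str, Set[str]] = {
    "stg_crawl_results": set(),
    "stg_enrichment": set(),
    "fct_account_signals": {"stg_crawl_results", "stg_enrichment"},
    "mart_qualified_accounts": {"fct_account_signals"},
}


def build_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every model comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())
```

In this example, both staging models are built before `fct_account_signals`, and `mart_qualified_accounts` is built last. The same ordering guarantee is what lets downstream marts and fact tables be refreshed safely after upstream data changes.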