Getting started
There are two ways to use DataHelm Crawler: install the package into an existing Laravel app, or run the full Docker stack from the reference environment. Pick whichever fits.
Install the package
composer require datahelm/crawler
Publish the config (optional; only needed if you want to tweak detectors, transports, etc.):
php artisan vendor:publish --tag=crawler-config
The package auto-registers DataHelm\Crawler\CrawlerServiceProvider. Requirements:
- PHP ^8.3
- Laravel ^11.0 || ^12.0 || ^13.0
- guzzlehttp/guzzle ^7.0
- symfony/dom-crawler ^7.0 || ^8.0
The default transport is plain HTTP (guzzle) and needs no extra infrastructure.
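To check the PHP requirement before installing, a minimal sketch along these lines may help. The function name satisfies_php_constraint is hypothetical and not part of the package; it only approximates Composer's ^8.3 rule (at least 8.3, below 9.0) for version strings of the form major.minor or major.minor.patch.

```shell
# Hypothetical helper (not shipped with the package): returns 0 when the
# given version string satisfies ^8.3, i.e. major is 8 and minor is >= 3.
satisfies_php_constraint() {
  version="$1"
  major="${version%%.*}"   # text before the first dot
  rest="${version#*.}"     # text after the first dot
  minor="${rest%%.*}"      # text before the next dot, if any
  [ "$major" -eq 8 ] && [ "$minor" -ge 3 ]
}

# Usage (requires the php CLI on PATH):
# satisfies_php_constraint "$(php -r 'echo PHP_VERSION;')" && echo "PHP OK"
```

Composer performs the authoritative check during composer require; this only gives an earlier hint.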
Quick start
# auto-detect a listing and scaffold a site robot
php artisan datahelm:scrap:generate "https://example.com/listing" --get-detail=true --robot
# run the scaffolded robot, capped at 10 items
php artisan datahelm:robot:example --limit=10
A complete, public example you can run immediately:
php artisan datahelm:scrap:generate https://books.toscrape.com/ --get-detail=true --save
php artisan datahelm:scrap:run books.toscrape.com --limit=20
Docker invocation
In the reference Docker stack, Artisan runs through an on-demand artisan service, so in every command above php artisan is replaced with docker compose run --rm artisan:
docker compose run --rm artisan datahelm:scrap:generate https://books.toscrape.com/ --get-detail=true --save
This documentation uses the bare php artisan … form; when running in Docker, substitute docker compose run --rm artisan for php artisan.
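If you switch between the two setups often, a small shell wrapper can hide the difference. The sketch below is an assumption, not something the package provides: the DATAHELM_DOCKER variable and the function names artisan_prefix and artisan are hypothetical, and the unquoted expansion relies on sh/bash word splitting.

```shell
# Hypothetical wrapper: set DATAHELM_DOCKER=1 to route commands through the
# on-demand artisan service; otherwise the local PHP binary is used.
artisan_prefix() {
  if [ "${DATAHELM_DOCKER:-0}" = "1" ]; then
    printf '%s\n' "docker compose run --rm artisan"
  else
    printf '%s\n' "php artisan"
  fi
}

artisan() {
  # The prefix is left unquoted on purpose so it splits into separate words.
  $(artisan_prefix) "$@"
}

# Usage:
# DATAHELM_DOCKER=1 artisan datahelm:scrap:run books.toscrape.com --limit=20
```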
Run the full Docker stack
The reference environment bundles everything (nginx, PHP, PostgreSQL, Redis, Supervisor, browserless, FlareSolverr). Use the public datahelm/environment repository:
git clone https://github.com/datahelm/environment.git
cd environment
cp .env.example .env
export UID=$(id -u) GID=$(id -g) # containers own files as your user
docker compose up -d
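Note that bash treats UID as a read-only variable, so the export line can be rejected there. One possible workaround, sketched below, is to store the IDs in .env, which Docker Compose reads for variable substitution. This assumes the compose file picks up UID and GID that way; the function name write_ids is hypothetical.

```shell
# Hypothetical helper: record the host user and group IDs in an env file
# (default .env) instead of exporting them from the shell.
write_ids() {
  env_file="${1:-.env}"
  if [ -f "$env_file" ]; then
    # Drop earlier UID=/GID= lines so repeated runs do not duplicate them.
    # grep exits non-zero when nothing is left, hence the "|| true".
    grep -v -E '^(UID|GID)=' "$env_file" > "$env_file.tmp" || true
    mv "$env_file.tmp" "$env_file"
  fi
  printf 'UID=%s\nGID=%s\n' "$(id -u)" "$(id -g)" >> "$env_file"
}

# Usage, after cp .env.example .env and before docker compose up -d:
# write_ids .env
```

Values exported in the shell environment take precedence over .env, so this is only needed where the export does not work.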
Common stack commands:
docker compose up -d # start the stack
docker compose build # rebuild images after Dockerfile changes
docker compose ps # status
docker compose down # stop
# on-demand tools (profile: tools)
docker compose run --rm artisan migrate
docker compose run --rm artisan tinker
docker compose run --rm composer install
docker compose run --rm npm install
See the Docker stack reference for the full service/port table.
Optional: anti-bot services only
If you only need headless Chrome and Cloudflare solving (not the whole stack), start just those two services from the package:
docker compose -f vendor/datahelm/crawler/docker/compose.services.yml up -d
Stop them when done, since each runs a full Chromium instance and uses RAM/CPU:
docker compose -f vendor/datahelm/crawler/docker/compose.services.yml stop
You only need these for the browser, flaresolverr or auto transports; see HTTP transports & bot protection.
Next: Core concepts →

