MIT License · Open Source

The spider orchestration platform built in the open.

Estela gives you a production-grade web scraping platform built for teams that need full control over their web data infrastructure.

Free to deploy · Yours to modify · Built to scale
📦
bitmakerla / estela
Elastic web scraping cluster running on Kubernetes
⭐ 196 Stars · 🍴 18 Forks · 📝 454 Commits
Stack
Built in Python: Django · Celery · Scrapy
Scrapy compatible: native integration
MIT License · ✓ Build passing · v0.1 · Scrapy · Kubernetes · Django
5+ years in active development

Open source isn't a strategy.
It's how we work.

"We built Estela around a single principle: open exchange of information, technology, and collaborative development."

— Bitmaker team

Bitmaker has been building web data infrastructure for over 12 years. Making Estela open source was a natural extension of that work — a way to give back to the developer community we've been part of, and to build something better through transparency and collaboration than we ever could behind closed doors.

🔓

Transparency by default

Every line of Estela's code is public. Audit it, fork it, and understand exactly how your data pipeline works.

🤝

Collaborative development

The best ideas for Estela have come from the people using it. We build in public so the community shapes what comes next.

🧱

No lock-in, ever

MIT licensed. Deploy it anywhere, modify it freely, move on without friction. That's the deal.

From your code
to your data.

Estela is built as a small set of independent pieces that work together to take your spiders from deployment to running data extraction at scale.

1
🧑‍💻

You deploy your spider

Push your Scrapy project with the CLI or the Web UI. Estela packages it and stores it as a versioned image, ready to run on demand.

CLI · Web UI
2

The API orchestrates

The brain of Estela. It manages your projects, schedules jobs, and tells the cluster when and how to run each spider.

Django REST · Celery
3
🕷

Spiders run in the cluster

Each spider runs in isolation as a Kubernetes Job with its own resources. Scale to many parallel runs without spiders stepping on each other.

Kubernetes Jobs
4
📦

Data flows to storage

As spiders scrape, items, requests, and logs stream through a high-throughput pipeline and land in storage — ready to query or export.

Queue pipeline · Storage
deploy → orchestrate → run → collect
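
The last step, where items, requests, and logs stream through a queue and land in storage, can be pictured with a small in-memory sketch. This is not Estela's pipeline code: the standard-library queue stands in for the real-time queue, a plain dictionary stands in for storage, and the names `publish`, `consume`, and `STREAMS` are hypothetical.

```python
import queue
from collections import defaultdict

# The three kinds of data the page says spiders emit while they run.
STREAMS = ("items", "requests", "logs")


def publish(q, stream, job_id, payload):
    """Spider side: put one message on the queue (hypothetical helper)."""
    if stream not in STREAMS:
        raise ValueError(f"unknown stream: {stream!r}")
    q.put({"stream": stream, "job_id": job_id, "payload": payload})


def consume(q, storage):
    """Consumer side: drain the queue into storage, keyed by (job_id, stream).

    Returns the number of messages written.
    """
    written = 0
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return written
        key = (message["job_id"], message["stream"])
        storage[key].append(message["payload"])
        written += 1


# Usage: one job emits two items and a log line, then the consumer drains them.
pipeline = queue.Queue()
storage = defaultdict(list)
publish(pipeline, "items", 7, {"title": "first"})
publish(pipeline, "logs", 7, "spider opened")
publish(pipeline, "items", 7, {"title": "second"})
consume(pipeline, storage)
```

Decoupling the spider from storage this way means a slow write never blocks a crawl; the queue absorbs the burst and the consumer catches up.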
Built on Kubernetes
Every spider runs as a containerized job — fully isolated, automatically scheduled, and scaled across the cluster. No noisy neighbors, no manual VM management, and no scaling limits beyond the cluster you give it.
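
To make "one spider, one containerized job" concrete, here is a rough sketch of a Kubernetes `batch/v1` Job manifest built as a Python dictionary. It is an illustration only, not the manifest Estela generates: the function name, the image reference, and the default resource values are assumptions, and the container simply runs Scrapy's standard `scrapy crawl` command.

```python
import json


def build_spider_job(job_name, image, spider, cpu="500m", memory="512Mi"):
    """Return a batch/v1 Job manifest that runs one spider in its own pod.

    All arguments are illustrative; Estela's real job spec may differ.
    """
    resources = {"cpu": cpu, "memory": memory}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {
            # Do not retry a failed crawl automatically in this sketch.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "spider",
                            "image": image,
                            "command": ["scrapy", "crawl", spider],
                            # Per-job requests and limits keep runs isolated.
                            "resources": {
                                "requests": resources,
                                "limits": resources,
                            },
                        }
                    ],
                }
            },
        },
    }


# Usage: render the manifest as JSON, which kubectl accepts as well as YAML.
manifest = build_spider_job("quotes-run-1", "registry.example/quotes:1", "quotes")
manifest_json = json.dumps(manifest, indent=2)
```

Setting requests equal to limits gives each run a fixed resource budget, which is one common way to get the "no noisy neighbors" behavior described above.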

Deploy Estela on your own cluster.

Estela runs on any Kubernetes cluster: on-premises, AWS, GCP, or Azure. For on-premises and local environments, the CLI handles configuration and deployment on its own. Deploying on AWS, GCP, or Azure requires Estela Commercial Support.

Estela Commercial Support
Estela can provide assistance for custom configuration on your preferred cloud provider.
Contact Estela Commercial Support →
Prerequisites
Docker + Compose >= 20.10
minikube >= 1.25 or k3s >= v1.28
Python >= 3.9
kubectl >= 1.23
Helm >= 3.9
Node.js >= 18.0
Yarn >= 1.22
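
As a rough aid for reading the list above, the following sketch compares a tool's reported version string against the listed minimum. It does not replace the repository's own requirements check: the helper names are hypothetical, the "Docker + Compose" entry is mapped to a single `docker` key, and minikube and k3s are omitted.

```python
import re

# Minimum (major, minor) versions copied from the prerequisites list.
MINIMUM_VERSIONS = {
    "docker": (20, 10),
    "python": (3, 9),
    "kubectl": (1, 23),
    "helm": (3, 9),
    "node": (18, 0),
    "yarn": (1, 22),
}


def parse_version(text):
    """Extract the first dotted version number from a tool's version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(tool, version_output):
    """Return True if the reported version is at least the listed minimum."""
    return parse_version(version_output)[:2] >= MINIMUM_VERSIONS[tool]


# Usage: pass the text printed by, for example, `python --version`.
python_ok = meets_minimum("python", "Python 3.11.4")
kubectl_ok = meets_minimum("kubectl", "Client Version: v1.22.0")
```

Comparing integer tuples rather than strings avoids the classic mistake of treating "3.10" as older than "3.9".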
Required services
MySQL 8.3: API metadata
MongoDB 6.0: Spider data
Kafka + Zookeeper: Real-time queue
MinIO: Object storage
Docker Registry v2: Spider images
bash · estela self-hosted setup
# Install the Estela CLI
pip install estela

# Clone the repository
git clone https://github.com/bitmakerla/estela.git
cd estela/installation

# Check requirements
./check_requirements.sh

# Install Estela
make resources

# Configure your variables:
# https://estela.bitmaker.la/estela/installation/helm-variables.html

make images
make install

# Start Minikube tunnel
minikube tunnel

# Open a new terminal and continue with:
make setup
make setup-minio-bucket
make createsuperuser
make build-web
make run-web

# Deploy your spider
estela login
estela init <project-id>
estela deploy

Built with the community.
Improved by it too.

Estela is MIT-licensed and open to contributions of all kinds — from bug fixes to new features to documentation improvements.

🐛

Report a bug

Found something broken? Open a GitHub issue with steps to reproduce, expected behavior, and a minimal sample when possible.

Open an issue →
🔀

Submit a pull request

Fork the repo, create a branch from main, and follow the PR template. All submissions go through review.

Read CONTRIBUTING.md →
💡

Propose a feature

Have an idea for Estela? Open an issue tagged enhancement and describe the use case. We read every proposal.

Start a discussion →
Contributor workflow
1. Find an issue: look for `good first issue` or `help wanted`
2. Discuss the approach: comment on the issue before coding
3. Open a pull request: follow the PR template
4. Get reviewed & merged: a maintainer responds within ~3 business days
What we look for in a PR
Scope: One PR = one change. Smaller is better.
Tests: New code needs tests. We won't merge without them.
Style: Clean, readable code consistent with the rest of Estela.
Docs: If behavior changes, update the docs in the same PR.
Estela Enterprise

Need Estela running on your own infrastructure?

Bitmaker deploys, configures, and maintains Estela on your servers. SLA-backed support, dedicated engineering access, and custom development — so your team ships data pipelines, not DevOps.

Installation & configuration on your cluster
SLA-backed response — P1 in 30 min (24/7)
Security patches & ongoing maintenance
Dedicated Technical Account Manager
Custom feature development available