MIT License · Open Source

The spider orchestration platform built in the open.

Estela gives you a production-grade web scraping platform built for teams that need full control over their web data infrastructure.

Free to deploy · Yours to modify · Built to scale

Estela Commercial Support → · View on GitHub · Read the Docs →

📦

bitmakerla / estela

Elastic web scraping cluster running on Kubernetes

196

⭐ Stars

🍴 Forks

454

📝 Commits

Stack

Built in Python

Django · Celery · Scrapy

Scrapy compatible

native integration

MIT License · ✓ Build passing · v0.1 · Scrapy · Kubernetes · Django

// Why open source

Open source isn't a strategy.
It's how we work.

"We built Estela around a single principle: open exchange of information, technology, and collaborative development."

— Bitmaker team

Bitmaker has been building web data infrastructure for over 12 years. Making Estela open source was a natural extension of that work — a way to give back to the developer community we've been part of, and to build something better through transparency and collaboration than we ever could behind closed doors.

🔓

Transparency by default

Every line of Estela's code is public. Audit it, fork it, and understand exactly how your data pipeline works.

🤝

Collaborative development

The best ideas for Estela have come from the people using it. We build in public so the community shapes what comes next.

🧱

No lock-in, ever

MIT licensed. Deploy it anywhere, modify it freely, move on without friction. That's the deal.

// Architecture

From your code
to your data.

Estela is built as a small set of independent pieces that work together to take your spiders from deployment to running data extraction at scale.

🧑‍💻

You deploy your spider

Push your Scrapy project with the CLI or the Web UI. Estela packages it and stores it as a versioned image, ready to run on demand.

CLI · Web UI

⚡

The API orchestrates

The brain of Estela. It manages your projects, schedules jobs, and tells the cluster when and how to run each spider.

Django REST · Celery

🕷

Spiders run in the cluster

Each spider runs isolated as a Kubernetes job, with its own resources. Scale to many parallel runs without spiders stepping on each other.

Kubernetes Jobs
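To make "one spider, one Kubernetes job" concrete, here is a minimal sketch of what such a Job manifest could look like, built as a plain Python dictionary. It is an illustration, not Estela's actual manifest: the function name `build_spider_job`, the job name, the image reference, the `scrapy crawl` entrypoint, and the resource values are all assumptions. Only the `batch/v1` Job fields themselves are standard Kubernetes.

```python
import json


def build_spider_job(job_name: str, image: str, spider: str,
                     cpu: str = "250m", memory: str = "512Mi") -> dict:
    """Return a batch/v1 Job manifest that runs one spider in one container.

    Hypothetical helper for illustration; Estela's real manifests may differ.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "labels": {"app": "spider-job"}},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed crawl automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",  # a Job pod runs to completion
                    "containers": [{
                        "name": "spider",
                        "image": image,  # assumed image reference
                        "command": ["scrapy", "crawl", spider],
                        # Each run gets its own CPU and memory budget.
                        "resources": {
                            "requests": {"cpu": cpu, "memory": memory},
                            "limits": {"cpu": cpu, "memory": memory},
                        },
                    }],
                }
            },
        },
    }


manifest = build_spider_job("quotes-job-1", "registry.local/quotes:1", "quotes")
print(json.dumps(manifest, indent=2))
```

Because every run is its own Job with its own requests and limits, the Kubernetes scheduler can place parallel runs on different nodes without one spider consuming another's resources.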

📦

Data flows to storage

As spiders scrape, items, requests, and logs stream through a high-throughput pipeline and land in storage — ready to query or export.

Queue pipeline · Storage
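The streaming step above can be pictured as a producer and a consumer joined by a queue. The sketch below uses Python's standard `queue.Queue` and an in-memory list as stand-ins for the real queue and storage services, so it shows only the shape of the flow. The names `consume` and `run_pipeline` and the batch size are assumptions for illustration, not Estela's code.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream


def consume(stream: queue.Queue, storage: list, batch_size: int = 3) -> None:
    """Drain the queue and write records to storage in batches."""
    batch = []
    while True:
        record = stream.get()
        if record is SENTINEL:
            break
        batch.append(record)
        if len(batch) >= batch_size:
            storage.extend(batch)  # stand-in for a bulk write to storage
            batch = []
    storage.extend(batch)  # flush whatever is left


def run_pipeline(records: list, batch_size: int = 3) -> list:
    """Push records through the queue while a worker thread stores them."""
    stream = queue.Queue()
    storage = []
    worker = threading.Thread(target=consume, args=(stream, storage, batch_size))
    worker.start()
    for record in records:  # the "spider" side: emit as you scrape
        stream.put(record)
    stream.put(SENTINEL)
    worker.join()
    return storage


stored = run_pipeline([{"kind": "item", "n": i} for i in range(7)])
print(len(stored))
```

Decoupling the spider from storage this way means a slow write does not stall the crawl, and batching keeps the number of storage calls low.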

deploy → orchestrate → run → collect

☸

Built on Kubernetes

Every spider runs as a containerized job — fully isolated, automatically scheduled, and scaled across the cluster. No noisy neighbors, no manual VM management, and no scaling limits beyond the cluster you give it.

Want the full architecture in detail? See the technical breakdown at estela.bitmaker.la →

// Self-hosted installation

Deploy Estela on your own cluster.

Estela runs on any Kubernetes cluster — on-premise, AWS, GCP, or Azure. For on-premise and local environments, the CLI handles configuration and deployment on its own. Deploying on AWS, GCP, or Azure requires Estela Commercial Support.

Prerequisites

Docker + Compose >= 20.10

minikube >= 1.25 or k3s >= 1.28

Python >= 3.9

kubectl >= 1.23

Helm >= 3.9

Node.js >= 18.0

Yarn >= 1.22
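When checking these minimums by hand, compare versions as numbers rather than as text: as strings, "3.10" sorts before "3.9". The sketch below is a small illustration of that comparison. It is not the repository's `check_requirements.sh`, and the helper names `parse_version` and `meets_minimum` and the "installed" versions are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn '3.9' or 'v1.28.3+k3s1' into a tuple of integers, e.g. (1, 28, 3)."""
    cleaned = text.strip().lstrip("v")
    parts = []
    for piece in cleaned.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# Minimums from the prerequisites list; installed versions are made up.
minimums = {"Python": "3.9", "kubectl": "1.23", "Helm": "3.9", "Node.js": "18.0"}
installed = {"Python": "3.10.4", "kubectl": "1.22.0", "Helm": "3.12.1", "Node.js": "18"}

for tool, minimum in minimums.items():
    status = "ok" if meets_minimum(installed[tool], minimum) else "too old"
    print(f"{tool}: {installed[tool]} (needs >= {minimum}) {status}")
```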

Required services

MySQL 8.3 · API metadata

MongoDB 6.0 · Spider data

Kafka + Zookeeper · Real-time queue

MinIO · Object storage

Docker Registry v2 · Spider images

            
bash · estela self-hosted setup

# Install the Estela CLI
pip install estela

# Clone the repository
git clone https://github.com/bitmakerla/estela.git
cd estela/installation

# Check requirements
./check_requirements.sh

# Install Estela
make resources

# Configure your variables:
# https://estela.bitmaker.la/estela/installation/helm-variables.html
make images
make install

# Start Minikube tunnel
minikube tunnel

# Open a new terminal and continue with:
make setup
make setup-minio-bucket
make createsuperuser
make build-web
make run-web

# Deploy your spider
estela login
estela init <project-id>
estela deploy

// Contribute

Built with the community.
Improved by it too.

Estela is MIT-licensed and open to contributions of all kinds — from bug fixes to new features to documentation improvements.

🐛

Report a bug

Found something broken? Open a GitHub issue with steps to reproduce, expected behavior, and a minimal sample when possible.

Open an issue →

🔀

Submit a pull request

Fork the repo, create a branch from main, and follow the PR template. All submissions go through review.

Read CONTRIBUTING.md →

💡

Propose a feature

Have an idea for Estela? Open an issue tagged enhancement and describe the use case. We read every proposal.

Start a discussion →

Contributor workflow

Find an issue

look for `good first issue` or `help wanted`

Discuss the approach

comment on the issue before coding

Open a pull request

follow the PR template

Get reviewed & merged

a maintainer responds within ~3 business days

What we look for in a PR

Scope

One PR = one change. Smaller is better.

Tests

New code needs tests. We won't merge without them.

Style

Clean, readable code consistent with the rest of Estela.

Docs

If behavior changes, update the docs in the same PR.

Estela Enterprise

Need Estela running on your own infrastructure?

Bitmaker deploys, configures, and maintains Estela on your servers. SLA-backed support, dedicated engineering access, and custom development — so your team ships data pipelines, not DevOps.

Contact for Estela Enterprise → See support tiers →

✓ Installation & configuration on your cluster

✓ SLA-backed response: P1 in 30 min (24/7)

✓ Security patches & ongoing maintenance

✓ Dedicated Technical Account Manager

✓ Custom feature development available