
API Refactor / Browser Profiles Support (#15)

* api refactor:
- '/crawls' create API accepts all options as POST body params; a 'start' flag indicates whether the crawl should also be started (see the sketch after this list)
- '/crawl/<id>/start' API simply starts the crawl
- add 'cache' enum (always, default, never) to select aggressive caching (cache every request), default caching, or no caching
- browsertrix_cli: move crawl commands to a 'crawls' subcommand in crawls.py, to make room for other subcommands
- tests: update tests for the new API, fix name in test compose file
- update /crawl/<id>/urls to return 'scopes' and 'queue' as lists of dicts instead of lists of dicts serialized as strings
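
For example, the new two-step flow can be exercised like this (a minimal sketch using Python's requests library; the host/port and the 'id' response field are assumptions, while 'start', 'cache', and 'behavior_time' are the params described above):

    import requests

    API = 'http://localhost:8000'  # assumed local browsertrix instance

    # create a crawl with all options in the POST body; start=False defers starting
    created = requests.post(f'{API}/crawls', json={
        'start': False,        # whether to also start the crawl
        'cache': 'always',     # 'cache' enum: always / default / never
        'behavior_time': 30,   # consistent 'behavior_time' param
    }).json()

    # start it later via the dedicated start endpoint
    requests.post(f"{API}/crawl/{created['id']}/start")  # 'id' field name assumed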

* support for browser profiles as per #14
update configs:
- enable auto-fetch for pywb
- add fixed-pool to pools
- enable default fixed pool, and a no-proxy/live mode for creating the profile browser
- proxy is set via the PROXY_HOST env var, defaulting to 'pywb'
- profile: add list, create, remove subcommands
- crawl create: can specify a --profile flag, plus --coll and --mode overrides (see the usage sketch after this list)
- consistently use the 'behavior_time' param for behavior time
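
Taken together, CLI usage looks roughly like this (a sketch assuming the entry point is installed as 'browsertrix'; the subcommand and flag names come from this change, but the exact positional arguments are assumptions):

    # manage browser profiles
    browsertrix profile list
    browsertrix profile create
    browsertrix profile remove <profile-name>

    # crawl commands now live under the 'crawls' subcommand
    browsertrix crawls create --profile <profile-name> --coll my-coll --mode record crawl.yaml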

* tests: update test-docker-compose
- add install-browsers.sh and use it for the travis install (with --headless)
- remove unused decoding functions
ikreymer committed Apr 23, 2019
1 parent db0367e commit 83ebf10d1a8efdaa481909718921a5c6387ff9b6
@@ -237,3 +237,4 @@ README.md
 mypy.ini
 .flake8
 frontend
+webarchive
.travis.yml
@@ -27,9 +27,7 @@ jobs:
   - DOCKER_COMPOSE_VERSION=1.23.2

 before_install:
-  - docker pull oldwebtoday/base-browser
-  - docker pull oldwebtoday/chrome:73
-  - docker pull webrecorder/autobrowser
+  - ./install-browsers.sh --headless
   - sudo rm /usr/local/bin/docker-compose
   - curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-`uname -s`-`uname -m` > docker-compose
   - chmod +x docker-compose
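
The removed docker pull lines above give a sense of what install-browsers.sh replaces; a plausible sketch of the script (its actual contents are not shown in this commit view, and any headless-specific image tags would be assumptions):

    #!/bin/bash
    # pull the browser images previously pulled inline in .travis.yml;
    # --headless presumably selects headless variants of these images
    docker pull oldwebtoday/base-browser
    docker pull oldwebtoday/chrome:73
    docker pull webrecorder/autobrowser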
Dockerfile
@@ -6,7 +6,8 @@ COPY requirements.txt ./

 RUN pip install --no-cache-dir -r requirements.txt

-COPY . ./
+COPY browsertrix ./browsertrix
+COPY static ./static

 CMD uvicorn --reload --host 0.0.0.0 --port 8000 browsertrix.api:app

browsertrix/api.py
@@ -4,27 +4,15 @@
 from starlette.staticfiles import StaticFiles

 from .crawl import CrawlManager
-from .schema import (
-    CrawlDoneResponse,
-    CrawlInfoResponse,
-    CrawlInfoUrlsResponse,
-    CrawlInfosResponse,
-    CreateCrawlRequest,
-    CreateNewCrawlResponse,
-    FullCrawlInfoResponse,
-    OperationSuccessResponse,
-    QueueUrlsRequest,
-    StartCrawlRequest,
-    StartCrawlResponse,
-)
+from .schema import *

 app = FastAPI(debug=True)
 crawl_man = CrawlManager()
 crawl_router = APIRouter()


 # ============================================================================
-@app.post('/crawls', response_model=CreateNewCrawlResponse, content_type=UJSONResponse)
+@app.post('/crawls', response_model=CreateStartResponse, content_type=UJSONResponse)
 async def create_crawl(new_crawl: CreateCrawlRequest):
     return await crawl_man.create_new(new_crawl)

@@ -68,11 +56,11 @@


 @crawl_router.post(
-    '/{crawl_id}/start', response_model=StartCrawlResponse, content_type=UJSONResponse
+    '/{crawl_id}/start', response_model=CreateStartResponse, content_type=UJSONResponse
 )
-async def start_crawl(crawl_id: str, start_request: StartCrawlRequest):
+async def start_crawl(crawl_id: str):
     crawl = await crawl_man.load_crawl(crawl_id)
-    return await crawl.start(start_request)
+    return await crawl.start()


 @crawl_router.post(
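
Replacing the explicit import list with 'from .schema import *' usually goes hand in hand with the schema module defining __all__, so that only the request/response models are re-exported; a sketch of that convention (whether browsertrix/schema.py actually defines __all__ is an assumption):

    # browsertrix/schema.py (sketch)
    __all__ = [
        'CrawlInfoResponse',
        'CreateCrawlRequest',
        'CreateStartResponse',
        # ... all other public request/response models
    ]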