Webrecorder | Tools

All Tools

In addition to the key tools above, we maintain numerous smaller tools as part of the web archiving ecosystem. Take a look at these tools if you are interested in deploying web archiving tools on your own or integrating them into other projects.

All currently maintained Webrecorder tools are listed below. Select one of the categories to further filter this list.

archiveweb.page

A Chrome extension and desktop app for capturing and replaying pages directly using a browser

browsertrix-behaviors

A set of behaviors for automating interactions with the browser, including generic behaviors (playing video, scrolling) and site-specific behaviors, such as for social media

browsertrix-crawler

A self-contained crawling system that runs a high-fidelity crawl in a single Docker container

oldweb.today

An integrated browser emulation system for running in-browser emulators connected to web archives

pywb

The core web archive toolkit, including web archive replay, access, and collection management

pywb-remote-browsers

A Docker Compose-based system for running remote browsers (including Flash and Java support) connected to web archives

remote-desktop-server

A set of Docker containers for VNC and WebRTC streaming. A component of pywb-remote-browsers.

replayweb.page

A serverless web and desktop app for viewing web archives directly in the browser

shepherd

A Docker container orchestration system for launching 'flocks' of Docker containers on demand. Part of the Remote Browser system.

shepherd-client

A JS frontend for embedding remote browsers in Conifer. Part of the Remote Browser system

wabac.js

A service-worker based web archive replay system. Backend for ReplayWeb.page

wacz-format

A new specification for a portable Web Archive Collection Zip (WACZ) format, with an accompanying Python library

warcio

A fast, standalone way to read and write the WARC format commonly used in web archives
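
For readers unfamiliar with WARC, the following is a minimal sketch of what reading and writing records involves. It deliberately uses only the Python standard library and does not show the warcio API; the helper names `build_record` and `read_records` are hypothetical, and the sketch assumes uncompressed WARC/1.0 'resource' records, whereas real archives are usually gzip-compressed and contain other record types as well.

```python
import io
import uuid
from datetime import datetime, timezone


def build_record(uri: str, payload: bytes, content_type: str = "text/plain") -> bytes:
    """Serialize one uncompressed WARC/1.0 'resource' record (illustrative helper)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # A record is: version line, headers, blank line, content block, two CRLFs.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def read_records(stream):
    """Yield (headers, payload) pairs from a stream of uncompressed records."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of stream
        if not version.startswith(b"WARC/"):
            raise ValueError(f"unexpected record start: {version!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b""):
                break  # blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The content block is read by length, so it may contain any bytes.
        payload = stream.read(int(headers["Content-Length"]))
        stream.read(4)  # skip the trailing CRLF CRLF
        yield headers, payload


# Round trip: write two records into memory, then read them back.
buf = io.BytesIO(
    build_record("http://example.com/a.txt", b"hello")
    + build_record("http://example.com/b.txt", b"web archive")
)
records = list(read_records(buf))
for headers, payload in records:
    print(headers["WARC-Target-URI"], len(payload))
```

A dedicated library is still the better choice in practice, since it also has to deal with compression, HTTP headers inside the content block, and malformed input, none of which this sketch attempts.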

warcio.js

A port of Python warcio to JavaScript. Supports reading/writing WARC files in the browser and in Node.

warcit

A command-line tool to convert on-disk directories of web documents (commonly HTML, web assets and any other data files) into ISO standard web archive (WARC) files.

wombat.js

The client-side JavaScript rewriting system used in pywb and wabac.js