From ff212b45613e62b7e806e1caf761855a483b2c3b Mon Sep 17 00:00:00 2001
From: Henry Wilkinson <henry@wilkinson.graphics>
Date: Fri, 29 Nov 2024 12:50:38 -0500
Subject: [PATCH 1/2] Updates Webrecorder tools

- Expands WARC section to include WACZ files
- Adds ArchiveWeb.page
- Adds Browsertrix
- Adds other WARC-related command-line utilities
---
 README.md | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 767dc12..9fcab4b 100644
--- a/README.md
+++ b/README.md
@@ -1760,16 +1760,19 @@ Google lens is not too user friendly for investigations. But this tool will help
 
 
 
-### [](#warc)Tools for working with WARC (WebARChive) files
+### [](#warc)Tools for working with WARC (WebARChive) and WACZ (Web Archive Collection Zipped) files
 
 | Link | Description |
 | --- | --- |
-| [Warcat](https://github.com/chfoo/warcat) | My favorite (because it's the easiest) tool for working with Warc files. It allows you to see the list of files in the archive (command "list") and unpack it (command "extract").  |
-| [Replayweb](https://github.com/webrecorder/replayweb.page) | If the warc file is small, you can view its contents with this extreme simple online tool. Also it's possible to deploy ReplayWeb on your own server  |
-| [Metawarc](https://github.com/datacoon/metawarc) | Allows you to quickly analyze the structure of the warc file and collect metadata from all the files in the archive |
-| [Webrecorder tools](https://webrecorder.net/tools) | Archiving various interesting sites is a noble and useful activity for society. To make it easier for posterity to analyze your web archives, save them in Warc format with an online tool|
-| [GRAB SITE](https://github.com/ArchiveTeam/grab-site) | Af you need to make a Warc archive out of a huge site with a lot of different content, then it is better to use this #python script with dozens of different settings that will optimize the process as much as possible.|
-| [har2warc](https://github.com/webrecorder/har2warc) | Convert HTTP Archive (HAR) -> Web Archive (WARC) format|
+| [Warcat](https://github.com/chfoo/warcat) | My favorite (because it's the easiest) tool for working with WARC files. It allows you to see the list of files in the archive (command "list") and unpack it (command "extract").  |
+| [Browsertrix](https://webrecorder.net/browsertrix/) | Browser-based crawling service that saves websites as WACZ files (containing WARCs). Hosted as SaaS by Webrecorder, but can alternatively be self-deployed on your own infrastructure |
+| [ArchiveWeb.page](https://webrecorder.net/archivewebpage/) | Create WARC and WACZ files interactively as you navigate sites in your web browser. Good for high-fidelity captures |
+| [ReplayWeb.page](https://webrecorder.net/replaywebpage/) | If the WARC file is small, you can view its contents with this extremely simple online tool / desktop app. WACZ files of any size will load much faster due to their built-in index. It is also possible to deploy ReplayWeb.page on your own server |
+| [Metawarc](https://github.com/datacoon/metawarc) | Allows you to quickly analyze the structure of the WARC file and collect metadata from all the files in the archive |
+| [warcit](https://github.com/webrecorder/warcit) | Command line utility to convert a local directory containing website files into a WARC file |
+| [unwarcit](https://github.com/emmadickson/unwarcit) | Command line utility to convert a WARC or WACZ file to a local directory containing website files |
+| [GRAB SITE](https://github.com/ArchiveTeam/grab-site) | If you need to make a WARC archive out of a huge site with a lot of different content, then it is better to use this #python script with dozens of different settings that will optimize the process as much as possible.|
+| [har2warc](https://github.com/webrecorder/har2warc) | Convert HTTP Archive (HAR) → Web Archive (WARC) format|
 
 
 [](#archives-of-documentsnewspapers)Archives of documents/newspapers

From 497aaaa96cd73de00980c8b8df02cdaabdd9ff35 Mon Sep 17 00:00:00 2001
From: Henry Wilkinson <henry@wilkinson.graphics>
Date: Fri, 29 Nov 2024 12:51:33 -0500
Subject: [PATCH 2/2] Adds Ghostarchive

- Adds description to Arquivo.pt
---
 README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9fcab4b..efcda75 100644
--- a/README.md
+++ b/README.md
@@ -1726,11 +1726,11 @@ Google lens is not too user friendly for investigations. But this tool will help
 | Link | Description |
 | --- | --- |
 | [Quick Cache and Archive search](https://quickcacheandarchivesearch.onrender.com/) | quick search website old versions in different search engines and archives (21 source) |
-| [Trove](http://trove.nla.gov.au/search/category/websites) | australian web archive |
+| [Trove](http://trove.nla.gov.au/search/category/websites) | Australian web archive |
 | [Vandal](https://chrome.google.com/webstore/detail/vandal/knoccgahmcfhngbjhdbcodajdioedgdo/related) | extension that makes working with [http://archive.org](http://archive.org) faster, more comfortable, and more efficient. |
 | [TheOldNet.com](https://theoldnet.com/) |     |
 | [Carbon Dating The Web](http://carbondate.cs.odu.edu/) |     |
-| [Arquivo.pt](https://arquivo.pt/) |     |
+| [Arquivo.pt](https://arquivo.pt/) | Portuguese web archive |
 | [Archive.md](https://archive.md/) |     |
 | [Webarchive.loc.gov](http://webarchive.loc.gov/) |     |
 | [Swap.stanford.edu](https://swap.stanford.edu/) |     |
@@ -1739,6 +1739,7 @@ Google lens is not too user friendly for investigations. But this tool will help
 | [web.archive.bibalex.org](http://web.archive.bibalex.org/) |     |
 | [Archive.vn](https://archive.vn/) |     |
 | [UKWA](https://www.webarchive.org.uk/) | archive of more than half a billion saved English-language web pages (data from 2013) |
+| [Ghostarchive](https://ghostarchive.org/) | Free web archive that uses ReplayWeb.page for viewing archived sites |
 
 ### [](#tools-for-working-with-web-archives)Tools for working with web archives