I can't code to save my life, but that doesn't stop me from trying. My latest creation is a case in point. Since stuff tends to disappear unceremoniously from the Web, I usually save local copies of interesting articles. Up until recently, I used the SingleFile Firefox add-on for that, but the process involved too many manual steps for my liking. After several failed attempts to make ArchiveBox work, I decided to roll my own tool based on monolith, a simple command-line utility that saves complete web pages as single HTML files. It took me a few hours to cobble together a crude but usable tool that I named Hako (it means box in Japanese, and it sounds a bit like hacky, which I find somewhat appropriate).
Here's how Hako works. To archive the currently opened web page, select the title and click on the Hako bookmarklet. This sends the URL and the title of the page to the Hako PHP page, which passes the received values to monolith. The latter then saves the page using the title as its file name. The very same PHP page also shows a list of all archived pages, so it doubles as a no-frills read-it-later tool. That's all there is to it, really.
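To make the flow a bit more concrete, here is a minimal Python sketch of the step where the received title and URL are turned into a monolith call. Hako itself does this in PHP, so none of this is Hako's actual code: the function names, the file name cleanup rule, and the archive directory are assumptions made up for illustration. The only thing borrowed from monolith is its -o option for naming the output file.

```python
import re
import subprocess
from pathlib import Path


def title_to_filename(title: str) -> str:
    """Turn a page title into a file name (hypothetical rule, not Hako's)."""
    # Collapse every run of characters other than letters, digits,
    # underscores, dots, and dashes into a single dash
    cleaned = re.sub(r"[^\w.-]+", "-", title.strip())
    # Drop leading and trailing dashes and dots
    cleaned = cleaned.strip("-.")
    return (cleaned or "untitled") + ".html"


def monolith_command(url: str, title: str, archive_dir: str = "archive") -> list[str]:
    """Build the argument list for: monolith URL -o archive_dir/NAME.html"""
    output = Path(archive_dir) / title_to_filename(title)
    return ["monolith", url, "-o", str(output)]


def archive(url: str, title: str, archive_dir: str = "archive") -> bool:
    """Run monolith and report whether it exited successfully."""
    Path(archive_dir).mkdir(parents=True, exist_ok=True)
    result = subprocess.run(monolith_command(url, title, archive_dir))
    return result.returncode == 0
```

With this rule, a title like "Hello, World!" ends up as Hello-World.html. Passing the arguments as a list rather than a single shell string means the title and URL are never interpreted by a shell.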
To deploy Hako on your machine, you need to install monolith first. This can be done using the following commands (note that this installs the x86_64 version of monolith and requires the curl, jq, and wget tools):

curl -s https://api.github.com/repos/Y2Z/monolith/releases/latest | jq -r ".assets[] | select(.name | contains(\"x86_64\")) | .browser_download_url" | wget -i -
sudo mv monolith-gnu-linux-x86_64 /usr/local/bin/monolith
sudo chown root:root /usr/local/bin/monolith
sudo chmod 755 /usr/local/bin/monolith
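If the jq part of the first command looks cryptic: it goes through the assets listed in the release metadata and prints the download URL of every asset whose name contains x86_64. The same selection can be sketched in Python. The sample data below is made up for illustration (the second asset name and both URLs are placeholders), and only the assets, name, and browser_download_url fields are taken from the command above.

```python
import json

# Made-up stand-in for the JSON returned by the GitHub releases API
SAMPLE_RELEASE = json.loads("""
{
  "assets": [
    {"name": "monolith-gnu-linux-x86_64",
     "browser_download_url": "https://example.invalid/monolith-gnu-linux-x86_64"},
    {"name": "monolith-gnu-linux-aarch64",
     "browser_download_url": "https://example.invalid/monolith-gnu-linux-aarch64"}
  ]
}
""")


def x86_64_download_urls(release: dict) -> list[str]:
    """Return the download URLs of all assets with x86_64 in their name."""
    return [
        asset["browser_download_url"]
        for asset in release.get("assets", [])
        if "x86_64" in asset.get("name", "")
    ]


for url in x86_64_download_urls(SAMPLE_RELEASE):
    print(url)
```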
Install PHP as well as the php-xml and php-mbstring packages on your system. To do this on Debian and Ubuntu-based systems, run the sudo apt install php php-xml php-mbstring command.
Then clone the project's Git repository using the git clone https://github.com/dmpop/hako.git command. Switch to the resulting hako directory, open the index.php file for editing, and replace the default value of the $KEY variable with the desired password. Save the changes and start the PHP server using the php -S 0.0.0.0:3000 command.
Next, add the following bookmarklet to the Bookmarks toolbar of your browser (replace 127.0.0.1 with the actual IP address of the machine running Hako and secret with the string that matches the value of the $KEY variable in the hako/index.php file):
Now navigate to the page you want to archive, select the title, and click on the Hako bookmarklet. If the page has been archived successfully, you should see it in the list of saved pages.
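Since the bookmarklet boils down to a plain HTTP request, archiving can also be triggered from a script. Here is a rough Python sketch of how such a request URL might be put together. Treat the url and title parameter names as my guesses rather than Hako's documented interface (check index.php for the names it actually expects); the host and port come from the setup above.

```python
from urllib.parse import urlencode


def hako_request_url(page_url: str, title: str, key: str,
                     host: str = "127.0.0.1", port: int = 3000) -> str:
    """Build a request URL for a Hako instance.

    The parameter names "url" and "title" are assumptions.
    """
    # urlencode percent-encodes the values, so special characters
    # in the page URL or the title don't break the query string
    query = urlencode({"url": page_url, "title": title, "key": key})
    return f"http://{host}:{port}/?{query}"


print(hako_request_url("https://example.com/a b", "My Title", "secret"))
```

Feeding the resulting URL to urllib.request.urlopen (or to curl) would send the actual request.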
If everything works properly, you might want to create a system service to start Hako automatically. Run the sudo nano /etc/systemd/system/hako.service command and add the following definition (replace /path/to/hako with the actual path to the hako directory):
[Unit]
Description=Hako
Wants=syslog.service

[Service]
Restart=always
ExecStart=/usr/bin/php -S 0.0.0.0:3000 -t /path/to/hako
ExecStop=/usr/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl enable hako.service
sudo systemctl start hako.service
Keep in mind that Hako is a very simple tool with its fair share of shortcomings. It doesn't provide any feedback, so the only indication that an archival action has been completed successfully is a created HTML file. Anyone with the Hako bookmarklet (or basic knowledge of creating HTTP requests) and the IP address of your Hako instance can archive pages on your server. The web UI just lists the saved files, and that's all. And since there is no password protection, all the saved web pages are publicly accessible. I run Hako on a local server that is not exposed to the outside world, and I manage saved pages using standard Linux tools. I recommend you do the same.
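Because a new HTML file is the only sign of success, a quick way to check on Hako is to look at the most recently saved files. Here is a small Python sketch that does just that. The directory you pass in is an assumption: point it at wherever your Hako instance actually stores its pages.

```python
from pathlib import Path


def newest_archives(directory: str, limit: int = 5) -> list[str]:
    """Return the names of the most recently modified .html files, newest first."""
    pages = sorted(
        Path(directory).glob("*.html"),
        key=lambda p: p.stat().st_mtime,  # sort by modification time
        reverse=True,
    )
    return [p.name for p in pages[:limit]]
```

Calling newest_archives("/path/to/hako") right after clicking the bookmarklet should list the freshly archived page first, provided the files live directly in that directory.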
Make sure that the key value in the bookmarklet matches the value of the $KEY variable in the index.php file.
© Dmitri Popov