DevConMru – Backup in the cloud for the Paranoid

At Hackers Mauritius we work on several projects and code for fun. One of the interesting projects we have looked at is an application called Tarsnap, which is used to perform secure backups in the cloud. Codarren (@Devildron) and I (@TheTunnelix) recently sent code to Tarsnap, and it was approved. It is really cool when your code is approved and used worldwide by thousands of companies. Thanks to Selven (@eldergod) and Loganaden (@loganaden_42), the creators of Hackers Mauritius, who inspired us. Today I had the privilege to speak about Tarsnap at DevConMru 2016, which was held at Voila Hotel, Bagatelle. On arriving, I was impressed by the number of people already waiting inside the conference room, curious about Tarsnap. Some were entrepreneurs while others were students. I would say around 30 people attended the session. Since it was a Sunday at 11:30 am, Selven did not hesitate to bring some beer for the little crowd present while I was busy setting up my laptop for the presentation.

As usual, I like to get the attention of my audience before the presentation. My first slide showed the Tarsnap logo upside down.

Screenshot from 2016-05-22 19-05-41

Everyone was turning their head and making an effort to read the content. And there we go: I could see that they were all ready and curious about it.

Check out the slides here.

The basics of Tarsnap were explained first. Tarsnap takes streams of archive data and splits them into variable-length blocks. Those blocks are compared and any duplicate blocks are removed. Data deduplication happens before anything is uploaded to the Tarsnap server. Tarsnap does not create temporary files; instead it keeps a cache on the client that records which blocks have already been backed up to the Tarsnap server. After deduplication, the data is compressed, encrypted, signed and sent to the Tarsnap server. I also explained that the archives are stored on Amazon S3, with an EC2 server handling them. Another interesting point raised was that Tarsnap uses smart rsync-like, block-oriented snapshot operations that upload only the data that has changed, to minimise transmission costs. You do not need to trust any vendor's cryptographic claims: you have full access to the source code, which uses open-source libraries and industry-vetted protocols such as RSA, AES and SHA.
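
To make the deduplication step more concrete, here is a minimal Python sketch of the idea. It is not Tarsnap's actual algorithm: the toy rolling checksum, the block-size limits and the names `split_blocks` and `backup` are assumptions made for illustration, and the encryption and signing steps are left out.

```python
import hashlib
import zlib


def split_blocks(data, mask=0x3F, min_size=16, max_size=256):
    """Split data into variable-length blocks at content-defined boundaries."""
    blocks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        # Toy rolling checksum; real tools use far stronger functions.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        at_boundary = size >= min_size and (rolling & mask) == 0
        if at_boundary or size >= max_size:
            blocks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        blocks.append(data[start:])
    return blocks


def backup(data, seen):
    """Return compressed payloads only for blocks not already in `seen`."""
    uploads = []
    for block in split_blocks(data):
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            continue  # duplicate block: nothing to upload
        seen.add(digest)
        uploads.append(zlib.compress(block))
    return uploads


seen_blocks = set()  # plays the role of the client-side cache
payload = bytes(range(256)) * 8
first_run = backup(payload, seen_blocks)
second_run = backup(payload, seen_blocks)
print(len(first_run), "blocks uploaded on the first run")
print(len(second_run), "blocks uploaded on the second run")
```

Running the same backup twice shows the point of the cache: the second run finds every block already recorded and uploads nothing.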

Moving on to Tarsnap and bandwidth, emphasis was laid on how Tarsnap synchronises blocks of data using a very intelligent algorithm. Nowadays, there are companies that still use tapes for backups. Imagine having so many tapes: when restoration time arrives, it takes a tremendous amount of time. Tarsnap compresses, encrypts and cryptographically signs every byte you send to it. No knowledge of cryptographic protocols is required. At this point, I asked whether there were volunteers thinking of looking at the Tarsnap code, and three people raised their hands. The importance of the key file was also raised, as some companies keep their private key in a safe. Tarsnap also supports division of responsibilities: I explained how a particular key can be restricted so that it can only be used to create archives and not delete them.
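
As a rough illustration of that division of responsibilities, the sketch below builds a `tarsnap-keymgmt` command line that derives a restricted key file from a full one, using its `-r` (read), `-w` (write) and `-d` (delete) options. The helper name `keymgmt_command` and the file paths are hypothetical, and the command is only printed, not executed; check the man page on your system before relying on it.

```python
def keymgmt_command(master_key, out_key, read=False, write=False, delete=False):
    """Build a tarsnap-keymgmt argument list for a restricted key file."""
    args = ["tarsnap-keymgmt", "--outkeyfile", out_key]
    if read:
        args.append("-r")  # key may read archives
    if write:
        args.append("-w")  # key may create archives
    if delete:
        args.append("-d")  # key may delete archives
    args.append(master_key)
    return args


# Hypothetical paths: a key for a server that may only create archives.
cmd = keymgmt_command("/root/tarsnap.key", "/root/tarsnap-write.key", write=True)
print(" ".join(cmd))
```

The full key can then stay somewhere safe, while the machine being backed up only holds the write-only key.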

An analogy between Google Drive and Tarsnap was given, and many already understood the importance of Tarsnap compared to Google Drive. The concept of deduplication was explained using examples. For the network enthusiasts, I laid emphasis on port 9279, which should not be blocked on the firewall, as Tarsnap communicates over that port. Coming to confidentiality, it was made clear to the audience how well the data is secured. If someone loses the key, there is no way of getting the data back.
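
A quick way to check that a firewall is not blocking outbound traffic on port 9279 is to attempt a plain TCP connection. The sketch below does only that; the helper name `can_connect` is my own and the host is passed in by the caller, so no server address is assumed here.

```python
import socket

TARSNAP_PORT = 9279


def can_connect(host, port=TARSNAP_PORT, timeout=5.0):
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

Call it as `can_connect(server_host)`, with `server_host` set to the Tarsnap server address your client uses. A `False` result only says the connection failed; it does not tell you whether the firewall or something else is to blame.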

Tarsnap is not an open source product. However, its client code is open to learn from, break and study. I laid emphasis on the reusable open source components that come with Tarsnap, for example the scrypt KDF (key derivation function). A KDF derives one or more secret keys from a secret value such as a master key, a password or a passphrase, using a pseudorandom function. The Kivaloo data store was briefly explained. It is a collection of utilities which together form a data store associating keys of up to 255 bytes with values of up to 255 bytes. Writes are not acknowledged until the data has been synced. If operation A completes before operation B starts, B will see the results of A. The spiped secure pipe daemon was also covered: it is a utility for creating symmetrically encrypted and authenticated pipes between socket addresses, so that one may connect to one address and transparently have a connection established to another address.
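
To show what a key derivation function does in practice, here is a small sketch using the scrypt implementation in Python's standard library (`hashlib.scrypt`, which needs Python built against OpenSSL 1.1 or later). The cost parameters, the passphrase and the helper name `derive_key` are illustrative assumptions, not the settings Tarsnap itself uses.

```python
import hashlib
import os


def derive_key(passphrase, salt, length=32):
    """Derive a `length`-byte key from a passphrase with scrypt."""
    # n, r and p are illustrative cost parameters, not Tarsnap's settings.
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        dklen=length,
    )


salt = os.urandom(16)  # random salt, stored alongside the derived data
key = derive_key("correct horse battery staple", salt)
print(len(key), "byte key derived")
```

The same passphrase and salt always give the same key, while a different salt gives an unrelated key, which is what makes precomputed attacks on passphrases expensive.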

I also explained the pricing mechanism, which the audience perceived as rather cheap given its security and data deduplication mechanisms. Tarsnap pricing works like a prepaid utility-metered model. A deposit of $5 is needed. Many were amazed when I told them that the balance is tracked to 18 decimal places. You pay exactly for what you consume.
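
A balance tracked to 18 decimal places cannot be held in a floating-point number without rounding errors, so a sketch of the metering has to use exact decimal arithmetic. The example below assumes a per-byte rate purely for illustration (check the Tarsnap site for real prices); the names `ATTO` and `charge` are my own.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # plenty of digits for exact intermediate results
ATTO = Decimal("0.000000000000000001")  # 10**-18 dollars


def charge(balance, nbytes, rate_per_byte):
    """Deduct the cost of `nbytes` from a balance kept to 18 decimal places."""
    cost = (Decimal(nbytes) * rate_per_byte).quantize(ATTO)
    return (balance - cost).quantize(ATTO)


balance = Decimal("5")  # the initial $5 deposit
rate = Decimal("0.00000000025")  # illustrative rate in dollars per byte
balance = charge(balance, 1_000_000, rate)
print(balance)
```

With this illustrative rate, one million bytes cost $0.00025, leaving $4.99975 of the deposit.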

Other interesting features, such as regular expression support and the dry-run feature of Tarsnap, were presented. The tar command was also compared to Tarsnap. Commands, hints and tricks were explained.

Some members of hackers Mu

To conclude, I consider it really important to credit Colin, the author of Tarsnap, and I have been strongly inspired by the work of Michael Lucas on Tarsnap. Indeed, another great achievement of Hackers Mauritius at DevConMru 2016.