A few notes on Camlistore / Perkeep

edit 2020-11-30: A few years ago, Camlistore was renamed to Perkeep. I won’t go out of my way to update the article below in hopes that most of it applies. Please comment if this is not the case.

(Useful as of April 7 2014, git commit 6842063f7c1ff6ae47e302b50b6a5030867cc2cd)

I was playing a bit with Camlistore (now Perkeep) and I like what I see. The presentation of Camlistore at FOSDEM 2014 by Brad Fitzpatrick is quite insightful.

Here are just a few notes so I don’t forget the caveats etc. I didn’t use a completely clean machine, so these instructions are not fully vetted for public consumption. I presume a Debian Wheezy machine. Leave comments below, or mail me at blogのvucica.net.

Installing Go

Camlistore depends on the Go language, at least Go 1.1. To install it on Debian, as of early 2014 you need to use the testing or unstable distribution. Here’s my /etc/apt/sources.list:

deb http://ftp.nl.debian.org/debian wheezy main
deb http://security.debian.org/ wheezy/updates main

deb http://ftp.nl.debian.org/debian testing main
deb http://ftp.nl.debian.org/debian unstable main

Here’s my /etc/apt/preferences.d/prioritize-stable (I prefer to use stable):

Package: *
Pin: release a=stable
Pin-Priority: 950

Package: *
Pin: release a=testing
Pin-Priority: 900

Package: *
Pin: release a=unstable
Pin-Priority: 800

Then use aptitude to choose the golang version from testing. Hit e to use the automated conflict resolver. With r, mark unsatisfactory solutions, then use , and . to browse the different resolutions. At some point you’ll be offered a correct installation of the dependencies for golang. Hit ! to apply, then hit g twice to install.

Installing Camlistore

Nearly straight from the website:

git clone https://camlistore.googlesource.com/camlistore
cd camlistore/
go run make.go

If instructed to install any packages, do so. I was asked to install the sqlite3 libraries.

Start camlistored:

./bin/camlistored

You’ll automatically get a default config placed at ${HOME}/.config/camlistore/server-config.json and a secret keyring at ${HOME}/.config/camlistore/identity-secring.gpg.

Interesting locations

  • ${HOME}/camlistore – you installed Camlistore here
  • ${HOME}/.config/camlistore – configuration files
  • ${HOME}/var/camlistore – blobstore

Preparing clients

Again taken from the GettingStarted document and modified.

Upon startup, camlistored should have output the GPG key to use in the command below. For me, it was the third line:

2014/03/04 23:26:24 Generated new identity with keyId "F300546B" in file /home/ivucica/.config/camlistore/identity-secring.gpg

Use this here:

./bin/camput init --gpgkey ${REPLACE_WITH_GPG_KEY}
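
If you saved camlistored's startup output somewhere, the key ID can be pulled out of the log line with sed instead of being copied by hand. This is only a sketch: the function name extract_keyid is made up for illustration, and the pattern assumes the log line looks exactly like the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Camlistore): read log lines on stdin and
# print the key ID from a 'Generated new identity with keyId "..."' line.
extract_keyid() {
    sed -n 's/.*Generated new identity with keyId "\([0-9A-Fa-f]*\)".*/\1/p'
}

# Example using the log line quoted above rather than a live server log:
LOGLINE='2014/03/04 23:26:24 Generated new identity with keyId "F300546B" in file /home/ivucica/.config/camlistore/identity-secring.gpg'
GPGKEY=$(printf '%s\n' "$LOGLINE" | extract_keyid)
echo "${GPGKEY}"
```

The resulting ${GPGKEY} could then be passed to ./bin/camput init --gpgkey in place of the placeholder.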

Your new client configuration file should be at ${HOME}/.config/camlistore/client-config.json.

You’ll get output like this:

2014/03/04 23:29:29 Your Camlistore identity (your GPG public key's blobref) is: sha1-0b40618f0b2f6ff90ede6dc4ca7c5231eedf508b
2014/03/04 23:29:29 Wrote "/home/ivucica/.config/camlistore/client-config.json"; modify as necessary.

Web UI

The web UI will be available at http://${hostname}:3179. If accessing from the local machine, use http://localhost:3179.

If you are visiting from a remote server, edit ${HOME}/.config/camlistore/server-config.json and under auth set this: userpass:alice:secret:+localhost (replace ‘alice’ and ‘secret’). This uses the completely insecure PLAIN authentication; use this only as a last resort. If you absolutely need this, you should consider turning on HTTPS, too, or reverse-proxying using nginx with HTTPS, so that you’re at least not leaking the password over the wire. Your password is also, obviously, stored locally in plaintext. This is all a BAD idea, but okay for testing.

Uploading blobs, files and directories

./bin/camput blob bin/README  # uploads a raw blob with no metadata; no filename, no nothing
sha1-4a8b6d44e3030bf36f39f0e0415209f8319ce019

./bin/camput blob bin/README  # note -- same output
sha1-4a8b6d44e3030bf36f39f0e0415209f8319ce019

BLOBHASH=$(./bin/camput blob bin/README)  # store hash in a variable...
./bin/camget ${BLOBHASH}  # ... then get this blob and print it out:
This is where Camlistore binaries go after running "go run make.go" in
the Camlistore root directory.

./bin/camtool list  # observe: just one blob, size 102
sha1-4a8b6d44e3030bf36f39f0e0415209f8319ce019 102

./bin/camput file bin/README  # uploads file data + creates metadata blob
sha1-129583b2870e357b17468508c4b8345faf228695

./bin/camtool list  # Note, no new content blob (content is the same) -- but there is the metadata blob
sha1-129583b2870e357b17468508c4b8345faf228695 356
sha1-4a8b6d44e3030bf36f39f0e0415209f8319ce019 102
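
The identical hashes above follow from content addressing: a blob's name is derived from its bytes. A minimal sketch of that idea, assuming a blobref is simply "sha1-" followed by the hex SHA-1 digest of the raw blob; blobref_of is a hypothetical helper, not a Camlistore tool.

```shell
#!/bin/sh
# Hypothetical helper (not a Camlistore tool): compute the blobref a raw blob
# would get, assuming blobrefs are "sha1-" plus the hex SHA-1 digest of the
# blob's bytes. Reads the blob from stdin.
blobref_of() {
    printf 'sha1-%s\n' "$(sha1sum | cut -d' ' -f1)"
}

# Same bytes in, same blobref out, which is why a repeated upload adds nothing.
printf 'hello' | blobref_of
printf 'hello' | blobref_of
```

If the assumption holds, running blobref_of < bin/README should print the same hash that camput blob reported.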

./bin/camget sha1-129583b2870e357b17468508c4b8345faf228695
{"camliVersion": 1,
  "camliType": "file",
  "fileName": "README",
  "parts": [
    {
      "blobRef": "sha1-4a8b6d44e3030bf36f39f0e0415209f8319ce019",
      "size": 102
    }
  ],
  "unixGroup": "ivucica",
  "unixGroupId": 1000,
  "unixMtime": "2014-03-04T23:21:29.152341812Z",
  "unixOwner": "ivucica",
  "unixOwnerId": 1000,
  "unixPermission": "0644"
}

This file is not a permanode so it didn’t appear in the web UI.

./bin/camput file --permanode bin/README
sha1-129583b2870e357b17468508c4b8345faf228695
sha1-55e987dc3d81da6e2188b45320d7ee9e29c1366e
sha1-60a25fc96f7ee1830a9e95ecfc4c2fc80da65371

Now if you visit the web UI, you’ll see that the file has appeared there. (You may also see it appear live if you had the web UI open.)

First hash is the file metadata, second is a claim, and the third one is the permanode itself, which you can use in the web UI to reach the content.
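
Since the permanode is the hash you will want later, it can be handy to capture it in a variable. A small sketch, assuming the permanode is always the last line camput prints (true for the outputs shown in this post, but not something I have verified in general); last_blobref is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: given camput's output on stdin (one blobref per line),
# print the last line, assumed here to be the permanode.
last_blobref() {
    tail -n 1
}

# Example using the three hashes printed above instead of a live camput run:
PERMANODE=$(printf '%s\n' \
    sha1-129583b2870e357b17468508c4b8345faf228695 \
    sha1-55e987dc3d81da6e2188b45320d7ee9e29c1366e \
    sha1-60a25fc96f7ee1830a9e95ecfc4c2fc80da65371 | last_blobref)
echo "${PERMANODE}"
```

With a running server, the same thing would be PERMANODE=$(./bin/camput file --permanode bin/README | last_blobref).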

What is a claim? Quoting:

If you sign a schema blob, it’s now a “signed schema blob” or “claim”. The terms are used pretty interchangeably but generally it’s called a claim when the target of the schema blob is an object’s permanode (see below).

Signing involves claiming interest by a user over a blob. Signing is done with the previously created GPG key. It is my probably incorrect understanding that blobs without owners could be garbage collected. I can’t find the documentation to verify this; please leave a comment or email me whether this is correct or incorrect.

To upload a directory and simultaneously provide a title and some tags, as well as create a permanode:

./bin/camput file --permanode --title="Camlistore Documentation" --tag=documentation,camlistore ./doc/
sha1-ff227e6af6647d2c3958824a036d7a362a89d675
sha1-36483f956312bc82bfbca307739e1d2e225ccc08
sha1-6a57222b2a1a5338a11a02ba3a34fd3b43bb92ee
sha1-8cd8a87d82af1dd12fb7dce1c21510420ec8b419
sha1-3f22e0468104c908292d07271d744cd95f925d52
sha1-75d3db43c8b07bf82d601d65375ec64f7769a4bf

Search system’s camtool describe

Ask the search system about an object:

./bin/camtool describe sha1-75d3db43c8b07bf82d601d65375ec64f7769a4bf

{
  "meta": {
    "sha1-75d3db43c8b07bf82d601d65375ec64f7769a4bf": {
      "blobRef": "sha1-75d3db43c8b07bf82d601d65375ec64f7769a4bf",
      "camliType": "permanode",
      "size": 562,
      "permanode": {
        "attr": {
          "camliContent": [
            "sha1-ff227e6af6647d2c3958824a036d7a362a89d675"
          ],
          "tag": [
            "documentation",
            "camlistore"
          ],
          "title": [
            "Camlistore Documentation"
          ]
        },
        "modtime": "2014-03-04T23:55:05.941276174Z"
      }
    }
  }
}

Mounting Camlistore filesystem

You won’t be mounting the Camlistore filesystem; you’ll be mounting a Camlistore filesystem. But first, we need to prepare you a bit.

cammount uses FUSE. Mounting (and FUSE) needs appropriate root-level permissions to, well, mount a filesystem. So, let’s use sudo -E (-E to maintain ${HOME} et al so Camlistore can find configuration)?

Not so fast. The localhost auth that is still used here is doing checks on several levels. First, you need to be on the local host (check!). Second, the socket connecting to camlistored needs to come from the same UID as the one that is running camlistored. Your sudo’ed cammount would be running with UID=0; your camlistored runs under 1000 or more. (Presuming Debian Wheezy here.) So, camlistored rejects the auth. Don’t bother sudo’ing camlistored; it doesn’t want to be run as root.
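
To see the UID mismatch for yourself, you can compare your current UID with the owner of a running process. The sketch below is Linux-only and is not how camlistored performs its check; it merely compares owners by looking at /proc. The helper name same_uid_as_pid is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current user owns the process with the
# given PID. Linux-only: reads the owner of the /proc/<pid> directory.
same_uid_as_pid() {
    [ "$(id -u)" = "$(stat -c %u "/proc/$1")" ]
}

# Example: check against the current shell, which the current user owns.
if same_uid_as_pid "$$"; then
    echo "same UID"
else
    echo "different UID"
fi
```

Called with the PID of camlistored from a sudo’ed shell, the check would be expected to fail, since id -u is then 0.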

Solution?

sudo -E adduser ${USER} fuse

Yes, you just add the user to the fuse group. You now need to create a new login session (e.g. log out and log in is fine) so the appropriate subsystems notice you are a member of a new group. (“Hey, you never need to reboot Linux.”)
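
To check whether the new group membership is active in the session you are in, a sketch like the following may help. It relies on id -nG without a user name, which reports the groups of the running session rather than what /etc/group says; session_in_group is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current session's groups include $1.
# `id -nG` without a user argument reflects the running session, not
# the contents of /etc/group.
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qx "$1"
}

if session_in_group fuse; then
    echo "fuse group active in this session"
else
    echo "fuse group not active yet; log out and back in"
fi
```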

Alright, now we’re good to go. Find the hash of a permanode you want to mount. You can use the web UI to do so. It’ll even have a handy instruction on what to do! Quoting from the web UI:

./bin/cammount /some/mountpoint sha1-ff227e6af6647d2c3958824a036d7a362a89d675

Or more sanely:

mkdir -p mountpoint/
./bin/cammount mountpoint/ sha1-ff227e6af6647d2c3958824a036d7a362a89d675

Open a new session, and observe how there is actual content in the mountpoint/ directory!

After shutting down cammount with ctrl+c, you might need to unmount using:

fusermount -u mountpoint/

Use mount to see if you need to do that.
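
The check and the unmount can be combined so that fusermount only runs when something is still mounted. A minimal sketch, assuming the mountpoint utility from util-linux is available; unmount_if_mounted is a made-up name.

```shell
#!/bin/sh
# Sketch: unmount a FUSE mountpoint only if something is still mounted there.
# `mountpoint -q` (util-linux) exits 0 when the path is a mountpoint.
unmount_if_mounted() {
    if mountpoint -q "$1"; then
        fusermount -u "$1"
    else
        echo "$1 is not mounted; nothing to do"
    fi
}

unmount_if_mounted mountpoint/
```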

If you skip the permanode, you’ll get a special filesystem with virtual directories such as at, date, recent, roots, tag… and even though you can’t see that directory, you can cd sha1-ff227e6af6647d2c3958824a036d7a362a89d675 and see the same directory as before.

Closing notes

  • I haven’t looked at alternative indexing engines or alternative blobstores (for example, Amazon S3, Google Drive, Google Cloud Storage, …).
  • I haven’t looked at blobstore sync.
  • I haven’t managed to correctly use ‘publishing’, especially for /pics/ (which is an example given in the documentation).
  • I haven’t shown you camput attr which sets custom attributes on permanodes.
  • I find the idea of importers, especially IMAP importers, very appealing. The Flickr one is already there. An IMAP importer (at 51:40) and an IMAP server (at 51:57) were mentioned in the FOSDEM 2014 talk.
  • I also find the idea of a built-in SMTP server quite appealing (mentioned at 32:12).

The Camlistore filesystem is mountable and splits files into chunks à la GFS (although the chunks are far smaller than the 64 MB quoted in the GFS whitepaper), the talk was given as recently as February 2014, and there is clear enthusiasm on the author’s side. Given all that, I’m really looking forward to something that could truly grow into a personalized distributed filesystem.