Blogging, The UNIX Way

“I like WordPress.”

Ok, now let me try this again:

“WordPress sucks!”

Phew, feels like I just threw away a heavy burden. What a relief. Now if you’ll allow me to explain:

Current blogging software is limiting, in the sense that most of it is just a data entry front-end to some RDBMS. This makes it difficult to edit content without a client (or a front-end), makes it harder to migrate from one package to another, and leaves every package with its own schema.

Add to that the jungle that is WordPress internals, the slowly-creeping bloat, the endless amounts of identical plugins, the lack of unit tests, this annoying rant, and what do you get? /bin/blog.

Ok, maybe not. But that’s the best name I could come up with for my new brainfart. Anyhow, for now I’ll call it /bin/blog, say it “Binblog”, and name my child after it.

The Big Idea

Remember how UNIX is supposed to be about integrating tons of tools together? Every tool is supposed to do one thing and do it well. You can use pipes for communicating and redirecting data, maybe signals every now and then, but that’s about it. UNIX is just like Lego, in an operating system kind of way. And that’s precisely what I love about it.

Now, I want my blogging software to be as simple as that, but not simpler. I want to follow a free-spirit approach, where my data isn’t bound to the software that’s running it and I can easily manipulate it without restraint.

Plain text’s the keyword here.

Regardless of how I implement /bin/blog, it’s going to be very minimalistic and completely extensible, not in a WordPress kind of way, but in a UNIX kind of way. It might even be a command-line utility to construct blogs, in a similar fashion to Movable Type’s rebuild. Yes, everybody hated it, but you have to admit, MT’s performance, after generating the blog, runs circles around WordPress.
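
To make the rebuild idea a bit more concrete, here is a rough Python sketch of what such a command-line build step could look like. Nothing here is decided: the function name rebuild, the .txt extension, and the "first line is the title, blank lines separate paragraphs" layout are all assumptions made up for illustration.

```python
import html
from pathlib import Path


def rebuild(source_dir, output_dir):
    """Render every .txt post in source_dir to a static .html page.

    Assumed (hypothetical) post layout: the first line is the title,
    the rest is the body, with blank lines between paragraphs.
    """
    src, out = Path(source_dir), Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for post in sorted(src.glob("*.txt")):
        # Split off the first line as the title; everything else is body.
        title, _, body = post.read_text(encoding="utf-8").partition("\n")
        paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
        page = "<h1>{}</h1>\n".format(html.escape(title.strip()))
        page += "\n".join("<p>{}</p>".format(html.escape(p)) for p in paragraphs)
        target = out / (post.stem + ".html")
        target.write_text(page + "\n", encoding="utf-8")
        written.append(target.name)
    return written
```

Once the pages are generated, the web server only has to hand out static files, which is where the MT-style speed would come from.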

The Pros and Cons

Here are my reasons for why I mostly want plain text:

  • rsyncability. I’ll be able to save my drafts and published posts, then simply rsync them to my server. Compare that to logging in to administration, choosing “Write”, digging my way through all the fields to fill them, fighting with the tab button, and cursing and swearing.
  • Most FTP clients can also sync directories, in case rsync isn’t available, so that leaves me with another option.
  • With plain text, I can grep, find, sort, count words and paragraphs, and simply compress my data to move it to another server.
  • No more mysqldump and reload process.
  • Easier to figure out the structure than learning a database schema.
  • Cheap updates (I haven’t benchmarked yet, but I think fopen() is faster than mysql_connect(); I’ll have to check).
  • Self-contained posts. When every detail of each post (including comments) resides in a single text file, it’s much easier to email.
  • Creation and modification time are controlled by the system.
  • Use whatever editor you like. You can even use Vim over SSH to fix that typo.
  • Can be implemented as part of a minimalistic plain-text wiki.
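
As a small illustration of the grep, word-count, and modification-time points, here is a Python sketch that walks a directory of posts. The directory layout, the .txt extension, and the post_stats name are assumptions for the example, not a decided format.

```python
from pathlib import Path


def post_stats(posts_dir, needle):
    """Collect simple facts about every .txt post in posts_dir.

    For each post: its file name, a whitespace-based word count,
    whether it contains needle (case-insensitive), and the
    modification time as reported by the file system.
    """
    stats = []
    for path in sorted(Path(posts_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        stats.append({
            "name": path.name,
            "words": len(text.split()),
            "matches": needle.lower() in text.lower(),
            "modified": path.stat().st_mtime,  # set by the system, not by me
        })
    return stats
```

The same answers are a one-liner away with grep and wc, which is sort of the point: the data doesn’t care which tool reads it.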

On the other hand, plain text introduces a few issues, but I don’t find them big compared to working with a database:

  • Performance. Searching plain text is much slower than searching an indexed database. However, using a database to index posts is trivial.
  • System-dependent permissions. Of course that’s not completely different with databases either, but a database server does use an extra level of authentication.
  • A bit steeper learning curve, especially for those who aren’t familiar with the system.
  • Can be implemented as part of a minimalistic plain-text wiki.
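
To back up the claim that indexing posts with a database is trivial, here is a minimal Python sketch using SQLite. The table name, the word-splitting rule, and the build_index and search helpers are assumptions for illustration only. The text files stay the source of truth, and the index is thrown away and rebuilt from them.

```python
import sqlite3
from pathlib import Path


def build_index(posts_dir, db_path=":memory:"):
    """Build a disposable word -> post index from the .txt files in posts_dir."""
    conn = sqlite3.connect(db_path)
    # The index is only a cache of the files, so start from scratch each time.
    conn.execute("DROP TABLE IF EXISTS words")
    # The primary key on (word, post) gives SQLite an index for lookups by word.
    conn.execute("CREATE TABLE words (word TEXT, post TEXT, PRIMARY KEY (word, post))")
    for path in sorted(Path(posts_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Crude tokenizer: split on whitespace, trim punctuation, lowercase.
        words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
        conn.executemany(
            "INSERT INTO words VALUES (?, ?)",
            [(w, path.name) for w in words if w],
        )
    conn.commit()
    return conn


def search(conn, word):
    """Return the names of the posts that contain word, sorted by name."""
    rows = conn.execute(
        "SELECT post FROM words WHERE word = ? ORDER BY post", (word.lower(),)
    )
    return [row[0] for row in rows]
```

If the database file gets lost or corrupted, nothing of value is gone; running build_index again recreates it from the posts.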

Just An Idea

/bin/blog is just an idea. I can start working on it just after my Reboot. I don’t want to split my energy between the two, since /bin/blog is going to take some time to implement.

What do you think can go wrong with a plain text-based blog?



8 Responses

  1. While I understand where you’re coming from, talking about
    the load and mysql_connect vs. fopen, hmm… while files could
    be cached all the time, your connection could be a persistent
    connection and db query results could be cached if necessary.
    I agree with text being a better alternative to a db for small
    tasks and little data, but think about when data grows and you
    need linking and locking: there you go, a big mess. Add to
    that all the stuff that has been solved by dbs but not solved
    with files, such as sizes, binary data, and encoding, and add
    to that the file handling extensions that come with scripting
    languages such as PHP or Perl.

    A huge catch is, while you have tables in one db connection,
    you’ll have to make many fopen calls to link different sets
    of files :D suppose you have a tree data type with different
    parents lying in different files….. oh my…

    Or did you say you’re thinking of having a command-line blogging
    tool? Man, that’s crazy, but nice to have a command-line login
    and remotely do the work. Now that’s crazy…. that’s ill!

  2. The thing is, I don’t want to be using files to solve what
    databases have already solved. In fact, I might be using a
    database for caching and searching, but make that optional.

    My problem is very specific: a blogging software. That alone
    should give me enough information about the data structure,
    possible shortcomings, bottlenecks, etc.

    The “real” backend doesn’t have to be files at all, since in
    this case, files are acting as another front-end. Make a file
    and voila, /bin/blog knows what to do with it: it processes it,
    stores it in a database, and uses the cached version for
    display. This is sort of like MT’s build process, except in
    /bin/blog only the first request is slow; the rest would be
    cached.

    The whole idea is that /bin/blog should be very simple, require
    very little, and be completely self-contained. You should be
    able to back up your whole blog by just tarring a directory,
    and restore it on another server with a simple untar. Heck,
    you’d even be able to pipe through tar and gzip over an SSH
    connection to multiple servers at a time, something you can’t
    do with WordPress.

    The command-line idea came out of necessity: you don’t always
    have a chance to use a web browser, and sometimes I’m even too
    lazy. Plus, WordPress (like many others) isn’t scriptable;
    think WP-Cron vs. crond and you’ll know what I mean. With
    /bin/blog being available as a command-line tool, you’ll be
    able to write scripts to do many interesting things; you’d
    even be able to hook SpamAssassin to filter your comments.

    Damn, I think I got a little ahead of myself. We’ll see how it
    goes.

  3. Sounds interesting, waiting for the cookout.

    BTW, I’m on Textpattern, n I’m lovin’ it.

  4. Thanks, I’ll be sure to let you know when I release the prototype.

  5. You might want to check out http://www.blosxom.com/

    I found your page while searching for plain-text blogging software. Ended up finding Blosxom, so thought I’d pass it along.

  6. I’m also really interested in hugely minimalistic, filesystem-driven blogs/CMSes. This is primarily because I am extremely lazy when designing my personal website (just look at it! http://nil.cjb.net) but want there to be some sort of dynamic functionality so it’s even EASIER for me to be lazy about working on my website.

    I am willing to help you with this if you want. I don’t really care what language you choose, because I’m always willing to learn new ones (though I am decent at Python and Perl, and know some PHP). I think it would be neat to allow users to use their OpenID to leave comments, too, and completely avoid internal logins. What I’m looking for is blog software that fits all of your requirements above that’s oriented towards a single blogger with no time to fiddle around with things: only enough time to type a blog entry.

  7. Wanted to compliment you on your site, it looks really good.

    Hank

  8. Look at a Ruby program called nanoc!
