Rebooting: Content Structure

A blog post is like a micro chapter in a mega book. I think that’s why I can’t stand email-style posts: they’re a little too informal. Email or IM style has nothing to do with friendliness or the tone of the writing. As far as I know, John Gruber, Paul Graham and Joel Spolsky haven’t had any trouble making friends. Their tones might vary, but they all love their readers.

So no email-style posting for me.

Post Elements

Taking apart a post’s structure, I ended up with these elements that I need to take care of:

  • Titles: As I discussed before, post titles are wrapped in <H1> and all other titles follow. The site title itself isn’t going to be wrapped in an <H> anymore; I’ll stick to <p>. In the current design, a post title followed by a subtitle just looks horrible, so I’ll have to distinguish the post title some more.
  • Paragraphs: Well, I just want them to be “smart”. WordPress’s Texturize already does that: it replaces the quotes, dashes, hyphens, etc. I’m going to switch to Markdown anyway, so it’ll be a good combination. A little more line spacing and a slightly bigger font would be fine.
  • Lists: Well, the only thing that I have to worry about is multi-paragraph lists, but I think that issue can be easily resolved with CSS selectors. Paragraph font rules and spacing will also apply to lists.
  • Citations: Inline citations shouldn’t be a problem because we already have <cite>, but blockquotes are a place for long discussions. I always use the cite attribute for blockquotes, and I’ll probably add some JavaScript code that extracts the URLs and displays them underneath the quote. I might add real quotes around citations and hide them using CSS. I do care about how the post looks in plain text.

    There are some interesting references on the Microformats citation page.

  • Links: Both local and external links will look the same, with a few exceptions. I’ll have to distinguish my internal wiki links, interwiki links and links to definitions. Definitions will be looked up in my wiki first, then Wikipedia, then a list of abbreviations. I know it’s a little complicated, but I’ll see how it works.

  • Code: The problem is that Markdown doesn’t support anything except plain blocks, but I want to add syntax highlighting on the fly, so here’s my trick:

    All code will begin with the shebang and the path to the interpreter! So if I’m posting Perl code, the first line will be #!/usr/bin/perl. Now, Markdown already generates <pre><code>; I’ll patch it a little so it adds lang and xml:lang attributes, and the tag will look like this: <code lang="x-perl" xml:lang="x-perl">. The x- prefix is used because programming languages aren’t registered, so I’ll mark them as experimental.

  • Images: Markdown’s image support is ok, but it looks ugly when the image is supposed to be a link, especially with thumbnails. I’ll have to patch Markdown so that if it finds an image prefixed with tn_, it automatically surrounds it with a link to the bigger image and adds width and height attributes. Other than that, I don’t think I can do much with images.

  • Footnotes: Markdown doesn’t support footnotes, but I’ve read about adding footnote syntax here and here. Footnotes are essential for longer articles, especially those that imitate Paul Graham’s style.
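
The shebang trick from the Code item above might be sketched roughly like this. It is written in Python to show the logic only, not as a patch to Markdown itself. The function names language_from_shebang and render_code_block are made up for illustration, and the special handling of #!/usr/bin/env is an assumption that the post doesn’t cover.

```python
import html
import posixpath
import re

# Matches a shebang line such as "#!/usr/bin/perl" or "#!/usr/bin/env python".
# Group 1 is the interpreter path, group 2 the first argument (if any).
SHEBANG = re.compile(r"^#!\s*(\S+)(?:\s+(\S+))?")


def language_from_shebang(code):
    """Return a tag such as "x-perl" for the code's shebang, or None."""
    first_line = code.split("\n", 1)[0]
    match = SHEBANG.match(first_line)
    if match is None:
        return None
    interpreter = posixpath.basename(match.group(1))
    # Assumption: with "#!/usr/bin/env perl" the real interpreter
    # is the first argument rather than "env" itself.
    if interpreter == "env" and match.group(2):
        interpreter = posixpath.basename(match.group(2))
    return "x-" + interpreter.lower()


def render_code_block(code):
    """Wrap code in <pre><code>, adding lang attributes when a shebang exists."""
    lang = language_from_shebang(code)
    attrs = ""
    if lang:
        safe = html.escape(lang, quote=True)
        attrs = ' lang="{0}" xml:lang="{0}"'.format(safe)
    body = html.escape(code, quote=False)
    return "<pre><code{}>{}</code></pre>".format(attrs, body)
```

A block without a shebang falls back to a bare <pre><code>, so existing posts would render exactly as they do now.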

What’s Up With Markdown

One of the reasons I consider posting a hassle is that it involves too much unnecessary work. I’m a detail freak: I can’t stand a post without proper XHTML, and I can’t stand unstructured content. On the other hand, XHTML tags force me to stop and think when I’m writing, they clog my thoughts, and they look dead ugly.

I’ll leave reviewing all the minimalistic mark-ups to another day, but judging by the few I’ve seen, Markdown seems to be the most suitable one. What I like most about it is that I can wrap paragraphs at column 80 and it still looks fine.

The problem is, my older posts are all filled with tags, and I don’t intend to leave them that way, so please forgive me if you see some really old posts being updated for no apparent reason.

I didn’t talk about my recent switch to Markdown because it’s already been praised more than enough, so there you have it: I’m using Markdown now for all my posting.

Underlying Code

It might not be important to many readers, but I’d rather see my code (and text) properly formatted, whether in Markdown, HTML or the final output. I want my readers to be able to get the exact same article in as many formats as possible, including at least plain text, HTML, PDF and paper prints. I want my fellow geeks to be able to take a post and apply XSL stylesheets as simply as possible. I want to be able to collapse the whole blog into a single book if I want to.

I know that’s a little difficult to accomplish, but it’s just an idea. I can pass Markdown’s output to Tidy and have pretty prints, or I can patch Markdown to generate proper markup. We’ll see about that later on.

JavaScript will be kept to a minimum, and only when absolutely necessary. I probably want to highlight keywords for visitors referred by a search engine, and have the ability to turn it off. DokuWiki already does that, WordPress has a plugin, so I think I’m ok here.
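
As a rough illustration of the keyword highlighting idea, here is the logic sketched in Python rather than JavaScript. It assumes the search engine passes the query in a q parameter of the referrer URL, which is not true of every engine. The names search_terms and highlight and the hl class are hypothetical, and the highlighting works on plain text only; it makes no attempt to skip HTML tags.

```python
import re
from urllib.parse import parse_qs, urlparse


def search_terms(referrer, param="q"):
    """Extract the individual search words from a referrer URL."""
    query = parse_qs(urlparse(referrer).query)
    terms = []
    for value in query.get(param, []):
        # parse_qs has already decoded "+" and "%20" into spaces.
        terms.extend(value.split())
    return terms


def highlight(text, terms):
    """Wrap every case-insensitive occurrence of a term in a span."""
    if not terms:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return pattern.sub(
        lambda m: '<span class="hl">' + m.group(0) + "</span>", text
    )
```

Turning the highlighting off would then just mean not calling highlight, or removing the hl class on the client side.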

It should be trivial to paginate long posts. I don’t know how many people prefer pagination with a table of contents, so I’ll have to look into that some more, though I’m not a big fan. If I’m going to paginate posts, I’m going to do it with JavaScript rather than a WordPress plugin, since I’m assuming my readers would rather get the whole post and be able to read it offline.

Next: Behind The Scenes

In my next Reboot post, I’m going to discuss the URL scheme of the site, what I need in WordPress’s administration interface, and generally, how I’m going to implement the technical stuff.

Markdown has given me an idea for writing new blogging software, something that’s more “UNIXy” than WordPress and Movable Type. I’ll leave that for another time though; it’s still very rough around the edges.