Blogging and Controlled Vocabulary
02/07/2014
So I am considering a blog redesign over on Comparative Geeks. I did a redesign on this blog, and changed the home page to be more like a website than a blog front-page. In the end, I think that makes this more like my portfolio or personal site than just a personal blog, or a re-purposed blog from school. Which to be fair, it was. So I went through each individual post, added the “more” function to most, moved my categories into tags, and added new categories.
Which was a lot of work, but it did what I wanted, which was to change the look and feel of this site. To be fair, I don’t think anyone actually uses it as I have it set up – I think mainly people read individual posts as shared from the WordPress reader or social media, but hey. Not like I’m putting up a ton of content here.
…Unlike Comparative Geeks. Where there is now over a year’s worth of 6-posts-a-week. And I’m thinking about similar sorts of changes, at least on the individual post level: moving some categories to tags, and cleaning up the categories. We currently have 32 categories, and a number of them are things that more likely function as tags on other blogs – and that’s how people browsing through blogs might find us! What you define as these tags also carries over to other social media sites, especially, I have found, Tumblr.
So let me explore a few of the challenges I see in store for Comparative Geeks, dealing with the tags and categories in particular – more to come on some of my other plans for Comparative Geeks! Read on for some good blogging food for thought.
Controlled Vocabulary
So, here I am using a Library Science term to talk about blogging. And mentioning it in the title and waiting this long to explain it. What the heck…
A Controlled Vocabulary is, to quote the vast knowledge of Wikipedia,
A way to organize knowledge for subsequent retrieval… Controlled vocabulary schemes mandate the use of predefined, authorised terms that have been preselected by the designer of the vocabulary, in contrast to natural language vocabularies, where there is no restriction on the vocabulary.
So while I had hoped to keep to a more controlled vocabulary on Comparative Geeks, and have tried to and done better at it here on this blog, it’s nowhere near perfect. Instead, I think that tags tend towards “natural language” instead, and also get bogged down in proper nouns – people and things – which is also problematic. So let me take this apart further.
“Categories” and Controlled Vocabulary
In many ways, the Categories function as a Controlled Vocabulary for the blog. They are specific terms that you use constantly on the blog to organize your information. Their main function is to have at least one term that covers any given post. Otherwise, the post is “Uncategorized,” and as I look at it, if that’s the case, does it belong on the blog?
However, tags are what WordPress uses to compare between different blogs. It might pull from Categories as well, but I’m not sure that it does. The main page to check this out is here: http://wordpress.com/tags/ As you can see, these are important sorts of terms to be involved in, because they can help you be found by people browsing content. The link to “Explore” is right there on the WordPress reader, as is a search of tags.
So really, the actual, and important, and larger-than-your-own-blog Controlled Vocabulary is this list. Check it out. Are you using these terms? We are, a number of them. However, as Categories. So, part of my plan is to move these sorts of Categories – the largest on the blog, such as “movies” and “books” – into the tags. Which is a painstaking process of changing posts one-by-one, or at least quick-editing them from the “All Posts” listing.
“Tags” and Natural Language
Meanwhile, our tags, as I imagine is true of the tags on many blogs, have ended up as a listing that more resembles “natural language.” Which is okay, really, because these sorts of things still help with Search Engine Optimization, or people searching for more specific things, or to link up posts on your own blog on a specific topic.
Comparative Geeks now has over 1,400 tags, so there’s really no getting in the way of this or stopping it. Instead, I propose a small, Controlled Vocabulary of words that are major tags used on other WordPress blogs, and then a natural language, whatever-seems-relevant, nothing-actually-different-from-now sort of tagging otherwise.
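To make that split a little more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the CONTROLLED list is hypothetical (only “movies” and “books” come from our actual categories), and it has nothing to do with how WordPress stores tags internally.

```python
# Hypothetical controlled list: the handful of "major" tags shared with other blogs.
# Only "movies" and "books" come from the post; the rest are assumed for illustration.
CONTROLLED = {"movies", "books", "television", "comics"}

def split_tags(tags):
    """Separate a post's tags into controlled-vocabulary terms and free-form ones."""
    controlled = sorted(t for t in tags if t.lower() in CONTROLLED)
    free = sorted(t for t in tags if t.lower() not in CONTROLLED)
    return controlled, free

controlled, free = split_tags(["Movies", "George R.R. Martin", "books"])
print(controlled)
print(free)
```

The idea is just that the short list gets checked deliberately, while everything else stays whatever-seems-relevant.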
Once these major tags are moved from categories, they should still be easily assigned. The handy “Most Used Tags” button can help make this happen, for the most common ones. I use this constantly here on DBCII and it has kept me down to just over 200 tags – most of which are ones I have used for Sunday Re-Blog posts. Otherwise, there are some major, re-used tags about writing, blogging, social media, and library science.
This quickly got away from us on Comparative Geeks because, as a review sort of site, the number of proper nouns got away from us…
Those Pesky Proper Nouns
So this reminds me of another idea from library science, the “Authority Control.” Basically, the large, overarching sort of thing that can tell that “George RR Martin” and “George R.R. Martin” and “George Martin” are, when appropriate, all the same person. However, with our tags, there’s no such luck.
We do indeed have at least those first two examples of tags – it happens. But if one is used more than another, then the one using the ugly duckling term gets lost in the shuffle. Then later, when you’re going to add a tag, you see both options. So you add both, or maybe just one, randomly. And overall, it loses the usefulness of the tags. Then again, maybe having more is better – lots of iterations of a name, especially, to increase your searchability.
So while I don’t think I can go through and get rid of tags, there are some I could clean up and combine – to fix these subtle variances. If the spelling and syntax don’t match exactly, WordPress makes it a whole new tag. The “Authority Control” is solely in that exact spelling match.
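If you had your tag list exported as plain text, a small script could at least flag the likely duplicates for you. This is only a sketch under my own assumptions: the normalize rule (lowercase, drop punctuation, collapse spaces) is one simple choice among many, and the function names are invented for this example, not part of WordPress.

```python
import re
from collections import defaultdict

def normalize(tag):
    """Lowercase, strip punctuation, and collapse whitespace (an assumed rule)."""
    cleaned = re.sub(r"[^\w\s]", "", tag.lower())
    return " ".join(cleaned.split())

def find_variants(tags):
    """Group tags whose normalized forms match; keep only groups with 2+ spellings."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return {key: names for key, names in groups.items() if len(names) > 1}

tags = ["George RR Martin", "George R.R. Martin", "George Martin", "movies"]
variants = find_variants(tags)
print(variants)
```

Note that this catches “George RR Martin” and “George R.R. Martin” but not “George Martin.” Deciding whether that last one is the same person still takes a human, which is really the whole point of Authority Control.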
I’ve actually done this cleanup once before on Comparative Geeks – at about the six month mark, with probably about half as many tags. And it’s tough – you want to look at all the posts with both tags, make sure that they all have your surviving tag, then delete the one you’re getting rid of. I can only imagine if I did decide to keep more, rather than less – going back and adding tags to posts, or even potentially creating more, exponentially?
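The merge itself, written out as a Python sketch, looks something like this. The posts dictionary (post title mapped to a set of tags) is a stand-in I made up for illustration; on a real blog you would be doing these steps through the dashboard, not through code like this.

```python
def merge_tag(posts, old, new):
    """Give every post carrying the old tag the surviving tag, then drop the old one.

    posts is a hypothetical mapping of post title -> set of tag names.
    Returns the number of posts that were changed.
    """
    changed = 0
    for tags in posts.values():
        if old in tags:
            tags.discard(old)  # remove the variant being retired
            tags.add(new)      # make sure the surviving tag is present
            changed += 1
    return changed

posts = {
    "A Dance with Dragons review": {"George RR Martin", "books"},
    "Game of Thrones, season 3": {"George R.R. Martin", "television"},
    "Blog redesign": {"blogging"},
}
count = merge_tag(posts, old="George RR Martin", new="George R.R. Martin")
print(count, posts)
```

Going the other way, keeping every variant and adding them all to every relevant post, would mean touching far more posts, which is why I worry about that version of the project.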
A Whole Lot Of Work… For What?
So the question is, I suppose, is this sort of work worth it once a blog is up and running? Sure, this would all be great to know before you start a blog – and take note, if you are starting up a blog, or getting ready to start a new one! – but once you’re at this point? And while it might seem like something you could change a little bit at a time, when you are adding more content on a daily basis like we do, the potential for creating problems in sections you’ve already cleaned up is high. As long as a tag still exists in your list of tags, it can pop up as a recommended tag as you type in your tags – the good ones and the bad ones.
There’s also some question as to whether user-selected Meta Data matters for searching and search engines anymore. Do they consider it? Do they ignore it? Obviously, it matters in some places – the hashtags on Twitter are an obvious place, as is the list of tags on WordPress that I linked to above. However, does Google take them into account? Or do other factors – titles, headings, keyword usage early on – matter more? The jury is out.
So really, it’s a question of what you want to accomplish. I want to mesh with the WordPress tags, so the project of changing over the categories into tags seems worthwhile. And, if I am going through the posts anyway… nope, doesn’t help a tag cleanup project. You have to start that from the Tags listing in your dashboard, really. Find the duplicates.
So I may go with the one project for now – going through each post and making changes – but not the other – correcting and weeding the tags. If I did go through the tags, I would probably just look for errors – typos and misspellings and the like. Rather than trying to enforce a Controlled Vocabulary on 1,400 terms. Meanwhile, I have some reasons to go through all the posts anyway… But that’s a tale for another day.
I just found a tool in WordPress that allows you to convert Categories into Tags and Tags into Categories. You can find it under Dashboard -> Tools -> Import -> Categories and Tags Converter. That is amazing. I will probably be using this.