Art Kavanagh

Eluding the algorithm

Web discovery in the aftermath of social media

I’ve always had a preference for straightforward search over other methods of organizing and finding information. Even in the days before Google, I was more likely to try my luck with AltaVista than to use a directory site like Yahoo or Netscape’s own “portal” to find new things to read on the web. When I first read about Google (in Salon, more than 20 years ago now) with its algorithmically ranked search results, I found it easy to overcome my scepticism because I wanted search to be the answer to the problem of navigating the web’s vastness.

I had some residual suspicion that whatever Google’s wholly automated process was turning up was not always the best result, not necessarily the perfect match for my query. How could I be sure that the algorithm wasn’t missing a better answer than the millions it offered in page after page of search results (or hiding some good ones on page 8)? For years, this niggle was little more than a theoretical qualm.

Then, maybe eight or nine years ago, it began to feel as if getting Google to divulge the results I was actually looking for was becoming more of a struggle than it had been. However much I refined my search using quote marks to find a particular phrase or the minus sign to exclude certain terms, I didn’t seem to be able to find my way any deeper into the web’s presumably teeming ocean of information. It’s as if I’m saying “Haven’t you got anything else?” and the search engine replies “No, no, you’ll like these results, I promise. Trust me, they’re very popular.”

It’s my impression that the situation hasn’t really improved in the intervening years; it’s just that I’ve got used to it. For example, I found out just a year ago that I’ve got aphantasia and I’ve been Googling that term a lot in conjunction with others which may or may not be related. What I get is lots of hits for pages that explain what aphantasia is. (I know what it is now, thanks. I’m ready to move on to the next stage.)

The problem isn’t just with search engines. The thing that led me to delete my Facebook account wasn’t so much their assault on privacy, their manipulation of mood or their carelessness with security, as my growing conviction that they were never going to show me anything that wasn’t similar to something else that I’d previously liked or “engaged” with. Their algorithms seem relentlessly to penalize and erase dissimilarity, novelty and surprise. This may also be true of Pinterest, which positions itself as a “visual search engine”. I can’t tell whether it’s true of Amazon because I never paid much attention to the recommendations they were pushing at me. (I recently discovered that I’m not the only one who doesn’t find Amazon’s recommendations helpful.)

Indeed, maybe that’s the problem: I don’t see the point of recommendation engines. I used to listen to Pandora in the days before they started to check your IP address to make sure you were really in the US. Some of the music was good but I always had the sense that I could do a better job on my own of finding music I liked. I’d do that by reading John Fordham in The Guardian, John Kelman and others on All About Jazz and even by buying the odd magazine (like Jazz News when I lived in France).

I’ve never had any problem finding books to read, so a book recommendation engine would need to be utterly stupendous before I’d begin to find it useful. And selection algorithms, impressively efficient as they are, tend not to reach the level of utter stupendicity.

So is that it? Is Google as much a recommendation engine as a search tool? In a sense, the answer’s obviously “yes”, if you’re using Google to find new things to read on the web in the same way you might use Amazon recommendations to choose a new book or author. And this perhaps explains the unease I used to feel about Google not turning up the “perfect” search result. The problem with Pandora and Amazon wasn’t that there was some elusive, “ideal” result that they were missing, just that the results they were producing weren’t any better than those I’d have found for myself, the old-fashioned way.

It would be absurd to expect to find, in the vastness of the world-wide web, the unique site or page that best met my requirements. But on the other hand, as the web continues to expand exponentially, the bits of it that we find, or see, increasingly all seem to look the same, all tailored to attract eyeballs that have already read and liked the same kind of “content” many times before. That, I think I’m beginning to see, is the fundamental paradox of the web.

When I search for a new author to read or a new band to listen to, I don’t formulate search criteria in my head, nor do I expect to find somebody who exactly matches a set of carefully chosen terms. It’s a looser, imprecise, heuristic process. I’m becoming increasingly convinced that web discovery should work in a similar way.

Over a year ago, I signed up for Mix, Garrett Camp’s successor to StumbleUpon. Mix is a web discovery engine that seeks to combine elements of algorithmic selection and conscious, deliberate curation. Reviewing my use of Mix, I find I’ve been tending to use it more as a place to post links I’d like to recommend than as a source of recommendations for my own reading. Having said that, I do flick through the “For You” feed every few days. I usually find something of interest but I’m nearly always reminded that there are two ways in which I find Mix less than satisfactory:

First, a lot of the recommended material comes from rather obvious sources, such as Slate, The Guardian or the New York Times. Since I visit the first two of these (I haven’t got a subscription to the NYT) at least daily, a lot of the recommendations are for stories I’ve already seen, or would expect to turn up in the normal course of my web browsing. Mix describes itself as a “discovery engine”, so ideally I’d expect it to suggest sites that are more esoteric, out of the ordinary, less likely to be found by one’s own unaided efforts. (Having said that, I must admit that I often post links to Slate or Guardian stories there myself!)
Second, there’s still rather too much “snackable content”, or junk food for the mind. The day before I started to write this post, I made a note of the listicles that turned up in my “For You” feed. They included:

19 Incredibly petty tweets about incredibly petty people;
25 Tips to becoming a writer;
Top 10 Pilot carrier takeoffs and landings;
Top blockchain art projects (OK, this one doesn’t have a number in the title — but blockchain!);
10 Obviously queer things people did before they realized they were queer.

I found these while scrolling through about 20 items, so let’s say 25% are listicles. That’s twice as good as Pinterest, where it seemed to me that at least every second recommendation was a listicle. The tentative conclusion I drew from this is that, even when mixed with active, conscious curation, the selection algorithms tend inexorably to push lightweight, insubstantial confections to the top. If that’s true, perhaps we need to start placing more emphasis on curation and much less on algorithms.

(By the way, I dislike the term “curation” and have been trying to think of an alternative as I’ve been writing this. It comes from the same root as “curate” and has to do with care: as a curate has the care of souls, a curator is someone who is expected to take care of, and carefully select for display, the exhibits in a gallery or museum. It seems to me that a lot of the supposed curation on the web is really too indiscriminate to deserve this description.)

The prominence of well-known news sites like Slate and The Guardian, on the one hand, and listicles on the other means that Mix is coming to resemble a typical news feed on social media. So maybe “curated discovery” is not the effective alternative to algorithmic recommendations that I’d hoped. In any case, I think we need to exercise more discrimination in the selection of websites (etc.) to recommend. I suspect we need to concentrate less on search and more on exploration. With that in mind, I’ve added a page to my website, listing blogs and personal sites that I like, and that I recommend to you. As yet, the page doesn’t have a large number of links. It’s admittedly a modest start and my aim is to grow it slowly, not to deluge potential readers with a link dump.

The first link on that page is to a directory maintained by someone I follow on Micro.blog, Brad Enslen. Brad argues that we need to revive some of the old tools we used for web discovery before Google rose to dominance: tools like manually assembled directories, webrings, blogrolls and RSS feeds. Such small-scale tools aren’t going to put the whole of the web at our disposal but (as it’s time we realized) neither are Google or Facebook.

First, algorithmic search and, later, social media have made us passive. We expect the web to come to us. Garrett Camp set up Mix partly in an attempt to address that problem, but (as I noted above) it’s still too easy just to treat it as another news feed. There used to be a slogan on the web: “You are the browser”. It’s time we remembered that, and got (actively) browsing.