Tagged: web

  • feedwordpress 17:00:00 on 2018/03/22 Permalink
    Tags: html, web

    The Missing Building Blocks of the Web 



    At a time when millions are losing trust in the web’s biggest sites, it’s worth revisiting the idea that the web was supposed to be made out of countless little sites. Here’s a look at the neglected technologies that were supposed to make it possible.

    Though the world wide web has been around for more than a quarter century, people have been theorizing about hypertext and linked documents and a global network of apps for at least 75 years, and perhaps longer. And while some of those ideas are now obsolete, or were hopelessly academic as concepts, or seem incredibly obvious in a world where we’re all on the web every day, the time is perfect to revisit a few of the overlooked gems from past eras. Perhaps modern versions of these concepts could be what helps us rebuild the web into something that has the potential, excitement, and openness that got so many of us excited about it in the first place.

    [An aside: Our team at Glitch has been hard at work on delivering many of the core ideas discussed in this piece, including new approaches to View Source, Authoring, Embedding, and more. If these ideas resonate with you, we hope you’ll check out Glitch and see how we can bring these abilities back to the web.]

    View Source

    For the first few years of the web, the fundamental way that people learned to build web pages was by using the “View Source” feature in their web browser. You would point your mouse at a menu that said something like “View Source” (nobody was browsing the web on a touchscreen back then) and suddenly you’d see the HTML code that made up the page you were looking at. If you squinted, you could see the text you’d been reading, and wrapped around it was a fairly comprehensible set of tags — you know, that <p>paragraph</p> kind of stuff.

    It was one of the most effective technology teaching tools ever created. And no surprise, since the web was invented for the purpose of sharing knowledge.

    These days, View Source is in bad shape. Most mobile devices don’t support the feature at all. And even on the desktop, the feature gets buried away, or hidden unless you enable special developer settings. It’s especially egregious because the tools for working with HTML in a browser are better than ever. Developers have basically given ordinary desktop web browsers the potential to be smart, powerful tools for creating web pages.

    But that leads to the other problem. Most complicated web pages these days aren’t actually written by anyone. They’re assembled, by little programs that take the instructions made by a coder, and then translate those instructions into the actual HTML (and CSS, and JavaScript, and images, and everything else) that goes to your browser. If you’re an expert, maybe you can figure out what tools were being used to assemble the page, and go to GitHub and find some version of those tools to try out. But it’s the difference between learning to cook by looking over someone’s shoulder or being told where a restaurant bought its ingredients.
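
    To make the "assembled, not written" point concrete, here is a minimal sketch of the kind of little program described above. It assumes a made-up page layout and a hypothetical render_post helper; real site generators are far more elaborate, but the shape is the same: structured instructions go in, HTML comes out, and View Source only ever shows the output.

```python
from html import escape
from string import Template

# Hypothetical page layout; the tag structure here is an assumption for illustration.
PAGE_TEMPLATE = Template(
    "<article>\n"
    "  <h1>$title</h1>\n"
    "$paragraphs\n"
    "</article>"
)


def render_post(title, paragraphs):
    """Turn a title and a list of plain-text paragraphs into an HTML fragment."""
    # Escape the text so characters like < and & show up as text, not markup.
    body = "\n".join(f"  <p>{escape(p)}</p>" for p in paragraphs)
    return PAGE_TEMPLATE.substitute(title=escape(title), paragraphs=body)


if __name__ == "__main__":
    print(render_post("Hi & bye", ["One", "Two <3"]))
```

    Someone reading the generated page sees only the article markup, never the template or the function that filled it in.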

    Bringing View Source back could empower a new generation of creators to see the web as something they make, not just a place where big companies put up sites that we all dump our personal data into.

    Authoring

    When Tim Berners-Lee invented the world wide web, he assumed that, just like in earlier hypertext systems, every web browser would be able to write web pages just as easily as it read them. In fact, that early belief led many who pioneered the web to assume that the format of HTML itself didn’t matter that much, as many different browsing tools would be able to create it.

    In some ways, that’s true — billions of people make things on the web all the time. Only they don’t know they’re making HTML, because Facebook (or Instagram, or whatever other app they’re using) generates it for them.

    Interestingly, it’s one of Facebook’s board members who helped cause this schism between reading and writing on the web. Marc Andreessen pioneered the early Mosaic web browser, and then famously went on to spearhead Netscape, the first broadly-available commercial web browser. But Netscape wasn’t made as a publicly-funded research project at a state university; it was a hot startup company backed by a lot of venture capital investment.

    It’s no surprise, then, that the ability to create web pages was reserved for Netscape Gold, the paid version of that first broadly consumer-oriented web browser. Reading things on the web would be free, sure. But creating things on the web? We’d pay venture-backed startup tech companies for the ability to do that, and they’d mediate it for us.

    Notwithstanding Facebook’s current dominance, there are still a lot of ways to publish actual websites instead of just dumping little bits of content into the giant social network. There are all kinds of “site building” tools that let you pick a template and publish. Professionals have authoring tools or content management systems for maintaining big, serious websites. But these days, there are very few tools you could just use on your computer (or your tablet, or your phone) to create a web page or web site from scratch.

    All that could change quickly, though—the barriers are lower than ever to reclaiming the creative capability that the web was supposed to have right from its birth.

    Embedding (Transclusion!)

    Okay, this one’s nerdy. But I’m just gonna put it out there: You’re supposed to be able to include other websites (or parts of other websites) in your web pages. Sure, we can do some of that — you’ve seen plenty of YouTube videos embedded inside articles that you’ve read, and as media sites pivot to video, that’s only gotten more commonplace.

    But you almost never see a little functional part of one website embedded in another. Old-timers might remember when Flash ruled the web, and people made simple games or interactive art pieces that would then get shared on blogs or other media sites. Except for the occasional SoundCloud song on someone’s Tumblr, it’s a grim landscape for anyone that can imagine a web where bits and pieces of different sites are combined together like Legos.

    Most of the time, we talk about this functionality as “embedding” a widget from one site into another. There was even a brief fad during the heyday of blogs more than a decade ago where people started entire companies around the idea of making “widgets” that would get shared on blogs or even on company websites. These days that capability is mostly used to put a Google Map onto a company’s site so you can find their nearest location.

    Those old hypertext theory people had broader ambitions, though. They thought we might someday be able to pull live, updated pieces of other sites into our own websites, mixing and matching data or even whole apps as needed. This ability to include part of one web page into another was called “transclusion”, and it’s remained a bit of a holy grail for decades.

    There’s no reason that this can’t be done today, especially since the way we build web pages in the modern era often involves generating just partial pages or only sending along the data that’s updated on a particular site. If we can address the security and performance concerns of sharing data this way, we could address one of the biggest unfulfilled promises of the web.
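
    As a rough illustration of what transclusion could look like, the sketch below pulls one element out of a source page by its id and drops it into a host page at a placeholder. Everything here is an assumption made for the example: the FragmentExtractor class, the extract_fragment and transclude helpers, and the placeholder convention are all hypothetical, the pages are plain strings rather than live fetches, and nothing is sanitized, so it sidesteps exactly the security and performance concerns mentioned above.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input", "link", "meta", "source", "wbr"}


class FragmentExtractor(HTMLParser):
    """Collect the markup of the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # how deep we are inside the matched element
        self.parts = []     # pieces of markup collected so far
        self.done = False   # True once the matched element has closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0 and dict(attrs).get("id") != self.target_id:
            return  # still looking for the target element
        self.parts.append(self.get_starttag_text())
        if tag in VOID_TAGS:
            # A void element has no children; if it is the target, we are finished.
            self.done = self.depth == 0
        else:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0 and not self.done:
            # The parser hands us unescaped text, so escape it again.
            self.parts.append(escape(data, quote=False))


def extract_fragment(page_html, element_id):
    """Return the markup of the element with the given id, or None if absent."""
    parser = FragmentExtractor(element_id)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.parts) if parser.done else None


def transclude(host_html, placeholder, source_html, element_id):
    """Replace placeholder in host_html with a fragment taken from source_html."""
    fragment = extract_fragment(source_html, element_id)
    return host_html if fragment is None else host_html.replace(placeholder, fragment)


if __name__ == "__main__":
    source = '<html><body><div id="score"><b>Home 3</b> &amp; Away 1</div><p>other</p></body></html>'
    host = "<main><h1>My page</h1>[[score]]</main>"
    print(transclude(host, "[[score]]", source, "score"))
```

    A live version would re-fetch the source page on each view so that the included piece stays current, which is where the hard questions about trust and speed come in.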

    Your own website at your own address

    This one is so obvious, but we seem to have forgotten all about it: The web was designed so that everybody was supposed to have their own website, at its own address. Of course, things got complicated early on — it was too hard to run your own website (let alone your own web server!) and the relative scarcity of domain names made them expensive and a pain for everybody to buy.
    If you just wanted to share some ideas, or talk to your friends, or do your work, managing all that hassle became too much trouble, and pretty soon a big, expensive industry of web consultants sprang up to handle the needs of anybody who still actually wanted their own website and had the money to pay for it.

    But things have gotten much easier. There are plenty of tools for easily building a website now, and many of them are free. And while companies still usually have a website of their own, an individual having a substantial website (not just a one-page placeholder) is pretty unusual these days unless they’re a Social Media Expert or somebody with a book to sell.

    There’s no reason it has to be that way, though. There’s no technical reason why we couldn’t share our photos to our own sites instead of to Instagram, or why we couldn’t post stupid memes to our own web address instead of on Facebook or Reddit. There are social barriers, of course: if we stubbornly used our own websites right now, none of our family or friends would see our stuff. Yet there’s been a dogged community of web nerds working on that problem for a decade or two, trying to see if they can get the ease or convenience of sharing on Facebook or Twitter or Instagram to work across a distributed network where everyone has their own websites.

    Now, none of that stuff is simple enough yet. It’s for nerds, or sometimes, it’s for nobody at all. But the same was true of the web itself, for years, when it was young. This time, we know the stakes, and we can imagine the value of having a little piece of the internet that we own ourselves, and have some control over.

    It’s not impossible that we could still complete the unfinished business that’s left over from the web’s earliest days. And I have to imagine it’ll be kind of fun and well worth the effort to at least give it a try.

    In a similar vein, you may also enjoy this look at the lost infrastructure of the early era of social media.

     
  • feedwordpress 16:29:00 on 2017/11/29 Permalink
    Tags: google, search, web   

    Underscores, Optimization & Arms Races 



    A dozen years ago, the web started to reshape itself around major companies like Google. We can understand the genesis of today’s algorithmic arms race against the tech titans just by looking at a single character.

    This is all ancient tech history now, but content management systems used to be one of those competitive markets that tech people watched avidly. (CMSes are the tools people use to publish stuff on the Internet — Medium, where you’re reading right now, is one, and some of the big ones people use today are WordPress or Drupal.)

    Back in the early 2000s, I helped create two then-popular CMS tools, Movable Type and TypePad; pretty soon, WordPress and Drupal and other tools came onto the scene solving a similar set of problems. All of these apps basically did the same thing they do today: You type in a box, and hit publish, and it makes a nice-looking web page with whatever you wrote. At first they were used by individual bloggers to keep personal sites, but they quickly took over publishing for almost every media outlet on the web. It was a booming market, and the people working on these tools were some of the first wave of high-profile social media startup founders.

    Friendster was around then, and MySpace was growing in prominence. (Facebook didn’t come around until a little later, and was still just for Ivy League kids for a long time.) But the biggest player on the rise in that era was Google. They’d bought Blogger, one of the earliest popular social media tools, in early 2003 and then launched their AdSense advertising platform a few months later. All of a sudden, Google was massively influencing content and monetization in the new world of social media.


    Drawing a line

    Just as we see Instagram and Snapchat going back and forth today one-upping each other’s features, in the early 2000s, people were constantly making new features for publishing in the then-new format of blogging. Today’s social apps might distinguish themselves based on who has the best photo filters, but the technological distinctions between content management systems were a lot nerdier, like really esoteric and detailed technical controls over the design of your website.

    The early era of the social web was a time of incredible advancement in web design. There was a revolution in aesthetics, focused around simplicity and white space and advances in typography and styling, and this was matched by huge leaps in accessibility and conformance with the open technical standards that defined the web itself. Basically, the web got a lot more pleasant really quickly, driven in large part by the influence of the people who were creating the early social media platforms. Things got good enough that it was worth the time to sweat little things like the formatting of web addresses.

    Yep—we got so picky about design that one of the elements of a website that people wanted to control was the web address (URL) of the webpages themselves. At first, each of the posts on your blog would live at an address that was something like example.com/00000002.html, with the number going up each time you wrote a new post. But that long, nerdy-looking number offended a lot of people’s aesthetic sensibilities, so pretty soon addresses started to look like example.com/2004/04/story.html and that was a little better.

    Eventually, people wanted to have the whole title of their article show up in the web address. Part of this was just because it looked cool, but some folks had started to suspect that having those words in the address might help a blog post rank higher on Google. (Google was still a smaller player in the overall web search market at the time, but it was already by far the most popular search engine amongst internet geeks.)

    But here’s the thing: web addresses can’t have spaces in them. To include a full title with spaces in a web address for a blog, the spaces would either have to be removed (ugly!) or converted into something equivalent. Since we were one of the first to encounter this issue, our team decided to have our content management system use underscores, based on the rationale that underscores were the character that most closely resembled a blank space.

    The end result? Anybody who used our tools could write a blog post entitled “My Great Cookie Recipe” and it would live at an address that looked like example.com/2005/04/my_great_cookie_recipe.html. By contrast, the WordPress team thought that hyphens looked better, so blog posts published on their tool would look more like example.com/2005/04/my-great-cookie-recipe. Sure, these different tools made slightly different choices about which character to use, but such a subtle distinction couldn’t be meaningful, right?
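
    The three address styles described above can be sketched in a few lines. This is an illustrative reconstruction, not the actual code of any of the tools mentioned: the function names, the lowercasing, and the handling of accents and punctuation are all assumptions made for the example.

```python
import re
import unicodedata


def slugify(title, separator="_"):
    """Lowercase a title and join its words with the chosen separator."""
    # Assumption: accents are stripped and anything that is not a letter
    # or digit counts as a word boundary.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return separator.join(words)


def numeric_permalink(post_id):
    """Earliest style: a zero-padded counter that goes up with each post."""
    return f"example.com/{post_id:08d}.html"


def dated_permalink(year, month, title, separator="_", extension=".html"):
    """Later style: year and month folders, then the title as a slug."""
    return f"example.com/{year:04d}/{month:02d}/{slugify(title, separator)}{extension}"


if __name__ == "__main__":
    print(numeric_permalink(2))
    # example.com/00000002.html
    print(dated_permalink(2005, 4, "My Great Cookie Recipe"))
    # example.com/2005/04/my_great_cookie_recipe.html
    print(dated_permalink(2005, 4, "My Great Cookie Recipe", separator="-", extension=""))
    # example.com/2005/04/my-great-cookie-recipe
```

    The whole disagreement comes down to the default value of one argument, which is part of why it seemed so harmless at the time.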

    As it would turn out, we’d stumbled across a harbinger of how the entire web was about to change.

    The rise of SEO

    Just as the social media era of the web was taking off, Google’s rapidly-growing platform radically changed the nature of content and sharing on the web. Anybody with a website was starting to understand that ranking highly on Google was immensely valuable, and as Google’s ad platform boomed, seeing those paid results alongside “organic” search results made it even clearer that a high ranking had monetary value.

    Initially, information about how to rank better on Google was exchanged almost as folk knowledge: half urban legend or myth, half insights that were gleaned through painful experience but not documented anywhere. Soon, the dark arts of earning Google’s favor came to be known as “search engine optimization”, and what began as informal sharing of guesses about Google’s function started to grow into what became a multi-billion-dollar industry.

    Everybody Loves Dashes

    Even as SEO matured and formalized, Google had very little documentation and no designated ombudsman to handle questions about how to be in their search engine’s good graces. Eventually, early Googler Matt Cutts took up the mantle of representing the company to the community as advocate for best practices in search optimization, using his personal blog to explain company policies that had heretofore been opaque or inscrutable. There was a feel of Kremlinology to the way his minor public utterances would be parsed for any hints that outsiders could glean about Google’s inner workings. But just as often, Cutts would make clear pronouncements of What To Do, and these were received by the SEO community almost as religious edicts.

    One such declaration in the summer of 2005 came like a lightning bolt, a proclamation on Dashes vs. underscores:

    I often get asked whether I’d recommend dashes or underscores for words in urls. For urls in Google, I would recommend using dashes.

    There was a lot of nuance in Matt’s post, but pretty soon the perception for a lot of SEO people became “dashes good, underscores bad”. (The punctuation marks in URLs are hyphens, technically known as hyphen-minus, but sure, let’s call them dashes.) To most people in the industry, this settled things. Google had told us all what they preferred, and everybody wanted to rank highly in Google, so SEO experts fell in line. Everything was to be dashes, forevermore.
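
    This piece doesn’t spell out why one character would matter to a search engine. One plausible mechanism, assumed here purely for illustration and not a description of Google’s actual code, is a tokenizer that treats the underscore as part of a word, the way the regular-expression class \w does in many programming languages. The tokenize function and its flag are hypothetical.

```python
import re


def tokenize(slug, underscore_joins_words=True):
    """Split a URL slug into the words a toy indexer would see."""
    # \w matches letters, digits, and the underscore; [^\W_] drops the underscore.
    pattern = r"\w+" if underscore_joins_words else r"[^\W_]+"
    return re.findall(pattern, slug)


if __name__ == "__main__":
    print(tokenize("my-great-cookie-recipe"))
    # ['my', 'great', 'cookie', 'recipe']
    print(tokenize("my_great_cookie_recipe"))
    # ['my_great_cookie_recipe']
    print(tokenize("my_great_cookie_recipe", underscore_joins_words=False))
    # ['my', 'great', 'cookie', 'recipe']
```

    Under that assumption, a search for “cookie” finds the hyphenated address but not the underscored one, unless the indexer changes its rule.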

    But once you’ve trained a community that they constantly need to guess at the secret machinations of your algorithm, they’re not going to stop doing so just because you’ve made a public pronouncement.

    For years, despite Cutts’ clear statement, the choice of punctuation remained such a point of contention and debate that countless stories were written about how best to appease the fickle Googlebot. Eventually, discussion around hyphens and underscores in web addresses became so fraught and so persistent that six years after that initial blog post, Cutts made an entire YouTube video just about punctuation in web addresses on one of Google’s official channels. About 125,000 people have watched the whole video.

    Indexing the web, as it is

    While the burgeoning SEO community was debating how best to please Google, amongst our team of people building a content management system, we were having a completely different philosophical debate: should we be trying to appease Google?

    You see, the theory of how we felt Google should work, and what the company had often claimed, was that it looked at the web and used signals like the links or the formatting of webpages to indicate the quality and relevance of content. Put simply, your search ranking with Google was supposed to be based on Google indexing the web as it is.

    But what if, due to the market pressure of the increasing value of ranking in Google’s search results, websites were incentivized to change their content to appeal to Google’s algorithm? Or, more accurately, to appeal to the values of the people who coded Google’s algorithm?

    We found ourselves resistant to what felt like a coercive effect of Google’s rising domination, especially since Google’s own Blogger platform was a competitor of ours. That frustration came out in a debate over a single character: We were using _ because we thought it looked nicer, so why should we change to - just because Google liked it better? Weren’t they supposed to adapt to what we published on the web?

    Holding the line

    For a while, the team I was working on resisted changing our software to use dashes instead of underscores. My rationale was simple: Google has tons of money, so why should we change the design of our tools for free, just to make things easier for a big company like Google? The WordPress community made a more pragmatic call. Figuring (quite reasonably!) that users wanted to rank well in Google, they made sure their tool’s default was to use the punctuation that the search engine preferred.

    At a literal level, the technical differences here were trivial. But the different choices of punctuation reflected very different philosophies about how the web should work. Dashes vs underscores represented a profound question: Would we change our apps and our content to suit big companies like Google, or should those big companies accommodate us?

    Caving In

    Eventually, most people who were publishing on the web said they didn’t want to do anything to risk diminishing their Google ranking, and our team had pretty much no choice but to switch to letting people publish web addresses using dashes. I genuinely felt like we had caved in. Caring so much about a single punctuation mark was, of course, an absurd hill to die on, but having Google coerce us into changing our software, and our aesthetics, felt like the first step toward a slippery slope of further concessions.

    But the truth was even worse. Despite my misgivings about Google, I didn’t notice a more nefarious pattern that was established at the same time. A whole community had formed around trying to guess how Google’s algorithm worked, and that community very quickly built an entire infrastructure around reverse-engineering the algorithms that drive attention and popularity on the web.

    Google was teaching us that the way to win on the web is to game the algorithms of big companies.

    A few years later, Google changed their mind and said we could use either dashes or underscores, and people should use whatever they want. But by then it was too late; we’d all already fallen in line.


    Finally, the algorithmic arms race

    In that old era of the social web, the community’s shared knowledge of how to game algorithms was mostly used for harmless things. People would try to get more readers for their personal blogs, or pull off silly stunts like “Google bombing”, which was essentially just playing with getting a certain site to rank high in Google’s results for a particular term. It’s no wonder we thought it was no big deal if we changed our apps to make content that suited Google’s arbitrary rules. None of this stuff mattered that much, right?

    But by attaching monetary value to search ranking, what Google ended up catalyzing was a never-ending arms race, where they constantly updated their algorithm and each community on the web constantly tried to learn how to exploit the new mechanics. The stakes of the algorithmic arms race kept going up; instead of being about pulling off silly pranks, understanding how to appease Google became the cornerstone of multi-million-dollar marketing campaigns. Instead of being about one character in a web address, it became about publishing content that suited the algorithm, whether it was true or not. At first, the only people paying attention were nerds making content management systems, then a broader audience of people trying to optimize their search engine positioning.

    Eventually, though, movements across the political spectrum came to understand that knowledge of how to appease the algorithms that govern social media had profound social and cultural power. It wasn’t just marketers who figured out the best way to promote their ideas, it was trolls and activists and harassers and people on the fringes who wouldn’t have had any way to get the word out before—both for better and for worse. At that point, the rise of fake media markets was inevitable.


    By the time we realized that we’d gotten suckered into a never-ending two-front battle against both the algorithms of the major tech companies and the destructive movements that wanted to exploit them, it was too late. We’d already set the precedent that independent publishers and tech creators would just keep chasing whatever algorithm Google (and later Facebook and Twitter) fed to us.

    Now, the challenge is to reform these systems so that we can hold the big platforms accountable for the impacts of their algorithms. We’ve got to encourage today’s newer creative communities in media and tech and culture to not constrain what they’re doing to conform to the dictates of an opaque, unknowable algorithm. We have to talk about the choices we made in those early days, even at risk of embarrassing ourselves by showing how naive we were about the influence these algorithms would have over culture.

    And ultimately, we have to use the chance we’ve got now to underscore the lessons that we learned from the earliest days of the social web, that still resonate on billions of screens today. So much can come from a decision about just one character on the screen.

     