When, in middle school, I discovered that “Allan-A-Dale has no faggots for burning,” I became quite fascinated by how the meanings of words change over time. I didn’t exactly become an etymologist, but I did make an effort to understand words in their original context in order to better appreciate our reading assignments in school. It turns out that Shakespeare is much funnier if you get more of the jokes and puns.
Every social group from the smallest school clique to entire social castes twists word meanings to create their own slanguage and group identity. The coinages of influential groups often make it into mainstream usage, replacing old word meanings along the way. Eventually the new meaning overtakes the original so completely that few people, if any, recognize the original usage.
Consider the word decimate, which today is almost synonymous with devastate. Most people are surprised to be told they are using it wrong and that it really means “to kill every tenth man.” Of course, most people making that claim are surprised to be told it originally meant “collect a tithe” and that they are using it wrong. Though the word remains in common use, few non-scholars recognize its original meaning.
If we substitute time for distance, which Einstein assures us is valid, then this total replacement of word meanings over time can be thought of as a sort of red-shift. The word remains visible in common usage but what we observe is distorted over time. But not all words are distorted to the same degree and the vector for that difference is societal change. The amount of progress and rate of societal change between any two points in time can be measured by the red-shift in word meanings as older language recedes into the past.
In school I relied on my English Lit classes to feed my interest in language. Nearly all the red-shift I observed was from before my lifetime. A decade or so later, red-shifted language from within my own lifetime was still rare enough to be remarkable, but the effect was becoming commonplace.
For example, Disney parks used to sell ticket books with many A-Tickets for the rides nobody wanted to go on, a precious few E-Tickets for the premier attractions, and a smattering of intermediate-level tickets. In common parlance, an E-Ticket plane ride referred to First Class. I’ve held actual E-Tickets in my hands and watched the term morph into a colloquialism for luxury.
Eventually, E-Ticket came to mean electronic issuance of any type of ticket. Though that meaning is still widely recognized, ubiquitous network access and smartphone ownership will soon make E-Ticket an anachronism. With few exceptions it’s the only way tickets are issued anymore and we no longer need the qualifier. They are just tickets now. But along the way they displaced the original term E-Ticket completely. Try telling someone younger than about 50 that the new Tesla coupe is “an E-Ticket ride” and watch for the blank stare. That usage has red-shifted into history.
Fast forward to 2015 and root around the IT section of the dictionary and you will find plenty of examples of terms that were red-shifted within a decade of their coinage, some even faster. Not only is this unremarkable today, but it’s happening so quickly that meanings change before the words even make it to the mainstream. Such new coinages are like Tribbles – born pregnant and on their 2nd or 3rd generation before anyone notices them. The scope and rate of societal change are unprecedented in human history and it is awesome* to be here to witness it.
(Awesome as used here in the original meaning “that which inspires awe” rather than the current meaning roughly equal to “congratulations,” as in “My new TV gets Netflix,” to which the reply “Really? That’s awesome! Netflix and chill?” would be considered appropriate. Awesome, even.)
Consider the term “Cloud.” I was a programmer when I first heard the term and at the time it referred to several specific capabilities…
- Dynamic, run-time resolution of resource addresses on the network
- Consistent results regardless of query or responder location
- Ubiquitous availability of the cloud service within a given scope
- High availability of the service due to multiple providers being online
The global network of DNS servers was among the first-ever cloud services. You didn’t know or care where the DNS servers were, the IP address you got back was the same no matter which DNS server returned it, you could make the DNS query from anywhere on the network, and if any one DNS server died you never knew it; redundancy meant you still got a response.
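The four properties above can be sketched in a few lines. This is a toy model, not real DNS; the hostname, address, and resolver pool are all hypothetical, and each “resolver” is just a dictionary, but it shows why the caller neither knows nor cares which server answers.

```python
# Toy sketch of cloud-style resolution: redundant, location-independent,
# consistent answers, transparent failover. Not a real DNS client.

RECORDS = {"example.test": "192.0.2.10"}   # hypothetical zone data

# Three identical resolvers; the caller has no idea which one will answer.
resolvers = [dict(RECORDS) for _ in range(3)]

def resolve(name, pool):
    """Return the first answer from any live resolver in the pool."""
    for resolver in pool:
        if resolver is None:        # this resolver is down; try the next
            continue
        answer = resolver.get(name)
        if answer is not None:
            return answer
    raise LookupError(name)

# Every resolver gives the same answer regardless of which one responds...
assert resolve("example.test", resolvers) == "192.0.2.10"

# ...and killing one is invisible to the caller.
resolvers[0] = None
assert resolve("example.test", resolvers) == "192.0.2.10"
```

The caller’s experience is identical before and after the failure, which is exactly the property that made DNS cloud-like before the word existed.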
But we never referred to DNS as cloud and we didn’t design commercial business applications along those principles for many years. For a long time software was designed to run in one place. By the time the term Cloud was applied to software architecture, the idea that you might process data remotely, at an undetermined location, but that this would be more reliable than the old way, was a complete reversal of the then-prevailing approach. In the 2000s the term cloud was in common usage where I worked and at the conferences I attended. It was always understood to mean a redundant, reliable, ubiquitous network service, but it hadn’t yet gone mainstream.
Today “cloud” is roughly equivalent to “on the Internet.”
If it seems to have lost a little something in translation to the mainstream, that’s because it has. From a marketing perspective, knowing what cloud means is a lot less important than knowing that whatever it means is good. Or perceived as good. Since things like redundancy, reliability, and dynamic address resolution cost money, the meaning of the term is on a death-spiral towards bare-bones network connectivity. All we can reliably say today is that cloud has something to do with the Internet and the finer details are optional, if you are lucky. Some of that original meaning remains intact if you look hard enough, but it’s fading fast.
The same thing has happened to another IT term, Big Data. It too has red-shifted to the point that the original meaning is unknown in common usage. Co-option by marketers has stripped the term of all meaning. So it’s a bit disappointing but not at all surprising to see it referred to as pure branding with no substance.
In a post to the VRM List today, Doc Searls writes:
Big Data is IBM’s and McKinsey’s re-branding of what we’ve had since the Mainframe Age. Google Trends is revealing on the topic.
Alas, Google’s Ngram Viewer (which looks at the prevalence of words and phrases in books) only goes to 2008; but it does show how huge the term “data processing” was for decades, and puts in context why IBM needed to re-brand it as something buzz-able: link.
Hats off to them. It worked.
Well, that’s a bit of an oversimplification.
In 1993 we were well into the Mainframe Age and I was a programmer at Equifax. At the time the California Driver License File came on 40+ high-capacity data cartridges. Sorting had to be done in multiple passes because mainframe JCL (Job Control Language) allowed no more than 99 temporary sort work files, and not even Equifax had sufficient DASD (Direct Access Storage Device, what we now call “disk”) space to load the file from tape for sorting. Depending on your hardware and OS, file sizes were generally limited to 2GB, and that was state of the art.
Data storage on tape was expensive, on disk even more so, and anything that reduced the data storage requirement was desirable. The highest recognition I received during my time at Equifax was for figuring out a way to torture Syncsort into sorting down all 40+ carts of the California Driver License File in one pass. The technique saved us so much time in our batch processing cycle that it was considered proprietary technology and I was told not to reveal it even to IBM lest our competitors get it.
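To see why a cap on sort work files forces multiple passes, here is a generic external merge sort sketch. This is emphatically not the proprietary Syncsort technique described above (which was never disclosed); it is the textbook approach it improved upon, with hypothetical limits standing in for JCL’s 99-work-file ceiling and for available memory.

```python
# Generic external merge sort: split input into sorted runs ("work files"),
# then merge runs a few at a time until one sorted run remains. A hard cap
# on how many runs can be merged at once is what forces extra passes.
import heapq

MAX_WORK_FILES = 3   # hypothetical stand-in for JCL's 99-work-file limit
RUN_SIZE = 4         # hypothetical count of records that fit in memory

def make_runs(records):
    """Split the input into sorted runs, one per 'work file'."""
    return [sorted(records[i:i + RUN_SIZE])
            for i in range(0, len(records), RUN_SIZE)]

def merge_runs(runs):
    """Merge runs in groups of MAX_WORK_FILES; each loop is one pass."""
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + MAX_WORK_FILES]))
                for i in range(0, len(runs), MAX_WORK_FILES)]
    return runs[0]

data = [9, 3, 7, 1, 8, 2, 6, 5, 4, 0, 11, 10]
assert merge_runs(make_runs(data)) == sorted(data)
```

With a big enough input, the first merge pass cannot consume every run at once, so its output runs must be written back out and merged again: each extra pass rereads and rewrites the entire file, which is exactly the batch-cycle time a one-pass sort eliminated.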
Then, as throughout the early history of computing, the predominant approach to data was to store as little of it as possible. CPU cycles, storage and bandwidth were so expensive that any non-essential data or processing were stripped away.
The term Big Data in its original sense represented a complete reversal of the prevailing approach to data. Big Data specifically refers to the moment in time when the value of keeping the data exceeded the cost and the prevailing strategy changed from purging data to retaining it.
Of course, you had to make the data that you kept pay enough to offset the cost of keeping it. The original Big Data implementations were selective and whatever data was retained was expected to earn its keep. It was impractical to keep everything so experts who could guide projects through the selection and implementation were critical to success.
CPU cycles, storage and bandwidth are now so cheap that the cost of selecting which data to omit exceeds the cost of storing it all and mining it for value later. It doesn’t even have to be valuable today, we can just store data away on speculation, knowing that only a small portion of it eventually needs to return value in order to realize a profit. Whereas we used to ruthlessly discard data, today we relentlessly hoard it. Even if we don’t know what the hell to do with it. We just know that whatever data element we discard today will be the one we really need tomorrow when the new crop of algorithms comes out.
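The economic pivot described above (from purge-by-default to keep-by-default) is just a break-even comparison. The sketch below makes it concrete; every price and value in it is hypothetical, chosen only to show how the decision flips when storage becomes cheap relative to the cost of deciding what to discard.

```python
# Back-of-envelope model of the Big Data pivot: keep data when the net
# cost of retaining it drops below the cost of selecting what to purge.
# All dollar figures are hypothetical illustrations, not real prices.

def keep_or_purge(gb, storage_cost_per_gb, expected_value_per_gb,
                  selection_cost_per_gb):
    """Return 'keep' when retaining everything beats paying to select."""
    net_keep_cost = gb * (storage_cost_per_gb - expected_value_per_gb)
    purge_cost = gb * selection_cost_per_gb   # engineering effort to decide
    return "keep" if net_keep_cost < purge_cost else "purge"

# Old economics: storage dear, value speculative -- minimize data.
assert keep_or_purge(1000, storage_cost_per_gb=500.0,
                     expected_value_per_gb=5.0,
                     selection_cost_per_gb=50.0) == "purge"

# Today: storage nearly free, so selecting costs more than hoarding.
assert keep_or_purge(1000, storage_cost_per_gb=0.02,
                     expected_value_per_gb=5.0,
                     selection_cost_per_gb=50.0) == "keep"
```

The inputs changed by orders of magnitude; the arithmetic never did, which is why the strategy reversed so completely.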
In the same way E-Tickets are becoming tickets through ubiquity, Big Data is fast becoming data, as the TechCrunch article that prompted the VRM discussion explains:
But marketing materials, like fishermen, exaggerate. Most companies only have a fraction of the data they claim. And typically, only a small fraction of that fraction is useful for generating any non-trivial insight.
(From Big Data Doesn’t Exist by Slater Victoroff).
From our present vantage point so many orders of magnitude beyond the Big Data threshold, we’ve completely lost sight of the original meaning and significance of the term. Talking IT managers into flipping from a predominant strategy of aggressive data minimization to one of aggressive data retention used to take one hell of a business case and almost unbelievable ROI projections. The technology was new and not well understood and even the most conservative claims were unprecedented. It was a really big deal at the time.
So how do we get to the point where Big Data is nothing but buzz and marketing, as Searls claims? Or doesn’t exist at all, as Victoroff claims? The historical importance of that pivot point would be hard to overstate, but like cloud, Big Data was into its second or third iteration before it became mainstream. Perhaps the red-shift it seems to have suffered tells us less about technologists overlooking the historical context and more about the times in which we live:
The extent to which technological change is reshaping society, and the increasing rate of that change, is outstripping the ability of humans to comprehend it.
These really are amazing times and the one thing we can predict is that we will all be able to see the world change before our eyes. Gone are the days when the span of one lifetime might witness a few transformational inventions. These days, you blink and you miss something important. Even specialists today often have trouble keeping up with change in their own fields.
We may not quite have reached the singularity that Ray Kurzweil talks about, but once we get to the point where our technology advances faster than our collective understanding of it, is there any real difference? When change is so rapid and so broad that we can’t observe it directly, and it is easier viewed as a societal red-shift, perhaps the answer is that there’s no difference at all.