Digital Data, an American History

October 05, 2025

Many believe it is a recent phenomenon for computers and software to know and monitor everything about us. Not so, says Rahul Matthan in The Third Way:

“Since they were first created, computers have been designed to monitor, categorize and classify us. Everything that followed from there was just the natural consequence of that original objective.”

He elaborates. It started in 1890, when the US was due to conduct its census. By then, a census was taking so long to conduct and collate that it was barely finished before the next one was due! The task needed to be automated, and Herman Hollerith gave it a shot. First, he reduced the data to a standard format (age, sex, religion, occupation etc.). Second, he called for it to be digitized by giving census takers a card in which they had to punch holes to indicate answers. Each card was then fed into a machine that “read” the information by pressing a set of electrical pins against the card.

“The pins that passed through the (holes)… completed an electrical circuit, while those that were stopped by the paper did not.”

Not only did this speed up census taking enormously, it also allowed, for the first time ever, enormous volumes of data to be sliced and diced with “incredible specificity”. One could filter data by specific categories. People could be grouped by various attributes. Today that idea scares many, but back then it was just an incidental capability of the solution to the very real problem of speeding up the census.
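To make the mechanism concrete, here is a minimal Python sketch of the idea described above: answers in a standard format become holes on a card, pins "read" the holes, and the cards can then be counted or filtered. Everything in it is an assumption for illustration. The field names, the hole positions in `LAYOUT`, and the helper names `punch`, `read`, `tabulate` and `select` are hypothetical and are not the actual 1890 card layout or Hollerith's design.

```python
from collections import Counter

# Hypothetical card layout: each (field, answer) pair gets one hole position.
# The real 1890 card looked different; this is illustrative only.
LAYOUT = {
    ("sex", "male"): 0,
    ("sex", "female"): 1,
    ("age", "under 20"): 2,
    ("age", "20 to 39"): 3,
    ("age", "40 and over"): 4,
    ("occupation", "farmer"): 5,
    ("occupation", "clerk"): 6,
}


def punch(answers):
    """Turn a dict of answers into a card: the set of punched hole positions."""
    return frozenset(LAYOUT[(field, value)] for field, value in answers.items())


def read(card):
    """Press one pin per position; a circuit closes only where there is a hole."""
    closed = {pos for pos in LAYOUT.values() if pos in card}
    return {field: value for (field, value), pos in LAYOUT.items() if pos in closed}


def tabulate(cards, field):
    """Count cards by the answer punched for one field."""
    return Counter(read(card)[field] for card in cards)


def select(cards, **criteria):
    """Keep only the cards whose punched answers match every criterion."""
    return [
        card
        for card in cards
        if all(read(card).get(field) == value for field, value in criteria.items())
    ]
```

With a handful of cards made via `punch`, `tabulate(cards, "sex")` gives a count per answer, and `select(cards, sex="male", occupation="farmer")` pulls out one narrow group. That second call is the "slicing and dicing" capability in miniature.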

“Tabulating machines made it possible to process vast amounts of information relatively quickly.”

It would, as Matthan puts it, turn out to be the “dawn of the Information Age”. And its first big use was to identify citizens of Japanese origin in America after Pearl Harbor (World War II). Nazi Germany would use the same setup of punch-card-based information to identify Jews.

Fast forward to the Vietnam War. By now, computers (not personal ones, but room-sized mainframes) existed. The US realized human intelligence was needed in Vietnam. All kinds of information were collected from people on the ground: soldiers, villagers, city dwellers etc. The volume of information was far too much for humans to process, so computers began to be used to crunch that data.

When domestic unrest began in the US (protests against the Vietnam War, the civil rights movement etc.), the US government panicked. What if there was a secret communist or Soviet-backed conspiracy to overthrow the system? Well, those computer systems designed for processing human data in Vietnam could now be trained on US citizens to evaluate things. Some US news agencies raised concerns about the use of computers for domestic surveillance, the spectre of Big Brother in America, but most people dismissed it as science fiction.

Why then didn’t Big Brother become all-pervasive in America? Well, once the domestic unrest ended, politicians went back to spending government money on things that might win elections. Spending on systems of surveillance declined.

Private companies, though, started spending more on computers. But even as they gathered more data, especially with the rise of the Internet, individual companies realized that keeping data in a proprietary format, unreadable by others, was best from their individual perspectives. Data in the US thus became siloed.

In the early days of the Internet, in 1995, an online service provider named Prodigy was held liable for a defamatory comment posted by an anonymous user. A couple of lawmakers were aghast. If this was the precedent, they feared the Internet would never realize its potential as a place where information could flow freely. So they pushed for US legislation (it became Section 230 of the Communications Decency Act of 1996) saying that no provider of an “interactive computer service” could be treated as the publisher or speaker of content provided or created by a user of its site. You can see why this made sense. But it became the American philosophy for the Net, and since the US was first on the Net, it practically became the global philosophy of the Net. It is why sites like Facebook or WhatsApp generally cannot be held liable for content created or circulated by their users.

The early Internet started out with content being free. Today, most people wouldn’t dream of paying for content on the Net. Again, the early American precedent became the international norm. But content providers had to make money, and ads became that mechanism. Targeted ads, fine-tuned to individual preferences, were even better. This has led to the rise of what is called “surveillance capitalism”, where sites monitor and record everything you do. The ad-driven model also meant companies made more money the longer you stayed on the site, which led to the rise of personalized feeds and constant notifications. More recently, it has led to extreme polarization: if everyone only ever sees what they already like or believe, nobody hears a variety of views.

That then is the history of data and its attendant flaws… in America. Europe has chosen a different approach, and India is taking a third approach. We will look at those next.

Viswanathan K's Thoughts
