The Unlikely Story of UTF-8: The Text Encoding of the Web
Plan 9, Placemats, New Jersey Diners, and last minute ideas
June 22, 2023

If you are reading this on a computer -- of any kind -- odds are good that the words on the screen are all encoded using something called "UTF-8".

UTF-8 (or "Unicode Transformation Format - 8 bit") is, put simply, a format for encoding and storing text -- one which allows for far more text characters than the older "ASCII" encoding (which could only show a total of 95 printable characters).
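To make that difference concrete, here is a small Python sketch (the sample strings are my own, picked purely for illustration).  It counts the printable ASCII characters and shows how UTF-8 stores text that ASCII has no room for.

```python
# Printable ASCII runs from 0x20 (space) through 0x7E (tilde).
printable_ascii = [chr(code) for code in range(0x20, 0x7F)]
print(len(printable_ascii))  # 95

# Plain ASCII text is stored as the very same bytes in UTF-8.
print("diner".encode("utf-8") == "diner".encode("ascii"))  # True

# Characters outside ASCII take two to four bytes each in UTF-8.
sizes = {ch: len(ch.encode("utf-8")) for ch in ["é", "€", "😀"]}
print(sizes)
```

Note that ASCII text does not change at all when stored as UTF-8.  That backward compatibility is a big part of the story that follows.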

And UTF-8 is, quite simply, everywhere.

Nearly every major computer operating system heavily uses UTF-8 for handling text... likewise it is the standard for websites, with close to 100% of all webpages explicitly using UTF-8 for the text on the page.

[Image: The source for Wikipedia.  Like most of the web, using UTF-8.]

An argument could be made that UTF-8 is one of the most successful and widely adopted standards in all of computer history.

But this almost wasn't the case.

In fact, UTF-8 was created -- at the very last possible moment -- and it was first implemented in a computer system that most people don't even know existed.

X/Open's search for better text encoding

In the early 1990s, text encoding was... an issue.

While solutions for extended character sets (beyond simple ASCII characters) existed, they were less than ideal.  To put it mildly.  The most popular solution, known as UTF-1 (the original "UTF" from the ISO 10646 standard), suffered from serious performance issues... and often caused significant problems with software which used plain "ASCII" text (including UNIX file system paths).

Having a character encoding on UNIX systems that could cause problems with UNIX file systems?  Not good.

Obviously a new type of text encoding was needed.

So, in 1992, X/Open (originally known as the "Open Group for UNIX Systems", a consortium of UNIX vendors including Sun, HP, AT&T, IBM, and several others) set about the task of selecting a proper text encoding standard to be used across all of the UNIX world.

The proposal that gained the most traction was known as FSS/UTF (aka "File System Safe Universal Character Set Transformation Format").  Rolls off the tongue, right?

This proposal was both faster than the old text encoding standard... and, as the name suggests, it was "File System Safe".  Which was a big win.
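What "File System Safe" means in practice can be sketched in a few lines of Python.  This checks the property as it holds in today's UTF-8 (not the X/Open FSS/UTF draft, whose details aren't covered here): every byte of a multi-byte character has its high bit set, so it can never be mistaken for a "/" or a NUL byte in a UNIX path.  The function name and the sample path are made up for illustration.

```python
def multibyte_bytes_are_non_ascii(text: str) -> bool:
    """Return True if every non-ASCII character encodes to bytes >= 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True


# Hypothetical path mixing ASCII separators with a non-ASCII directory name.
path = "/usr/ken/日本語/placemat.txt"
raw = path.encode("utf-8")

print(multibyte_bytes_are_non_ascii(path))   # True
# The only "/" bytes in the encoded path are the real separators.
print(raw.count(b"/") == path.count("/"))    # True
# No stray NUL bytes sneak in either.
print(b"\x00" not in raw)                    # True
```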

Enter: The Plan 9 Nerds

Which brings us to September 2nd, 1992.  Sometime in the early evening.

The X/Open group was meeting, in Austin, Texas, to formally decide on the text encoding standard.

Looking to get some feedback on the proposal, some members of X/Open made a call to two legendary programmers -- Ken Thompson and Rob Pike -- who were working on the Plan 9 Operating System project at Bell Labs in New Jersey.


A little background...

Ken Thompson worked on MULTICS, co-created UNIX, and created the B programming language (the predecessor to C), among many other accomplishments.

Rob Pike, also a UNIX programmer, was the co-creator of the Blit terminal, co-author of multiple UNIX and programming books, and the creator of the first UNIX windowing system.

To call these two "absolute legends" in the world of computing would be, perhaps, a bit of an understatement.  At the time, the two were working together on a research operating system, at Bell Labs, called Plan 9.  An attempt to fix some of the perceived shortcomings of UNIX... by the creators of UNIX itself.


What happened next... after Ken Thompson and Rob Pike received that phone call?  Luckily, we have a detailed account... written by Rob Pike himself.

"We had used the original UTF from ISO 10646 to make Plan 9 support 16-bit characters, but we hated it.  We were close to shipping the system when, late one afternoon, I received a call from some folks, I think at IBM - I remember them being in Austin - who were in an X/Open committee meeting.  They wanted Ken and me to vet their FSS/UTF design."

Asking two legendary engineers for their input?  You can probably guess what happened next...

"Ken and I suddenly realized there was an opportunity to use our experience to design a really good standard and get the X/Open guys to push it out.  We suggested this and the deal was, if we could do it fast, OK."

That's right.  Ken and Rob had some ideas.  And the X/Open folks agreed to listen to those ideas... if they could get them something fast.

And, by fast, they really meant "immediately... like... right now."  Because the X/Open team were, quite literally, all gathered in Austin to decide on this... right then.

"Yeah.  I could eat."

Ken and Rob did what any good programmers would do when placed on an almost impossibly tight deadline -- and needing to come up with an amazing idea that could change the course of computing for decades to come... they went out to grab some grub.

"So we went to dinner, Ken figured out the bit-packing, and when we came back to the lab after dinner we called the X/Open guys and explained our scheme.  We mailed them an outline of our spec, and they replied saying that it was better than theirs (I don't believe I ever actually saw their proposal; I know I don't remember it) and how fast could we implement it?  I think this was a Wednesday night and we promised a complete running system by Monday, which I think was when their big vote was."

Remember.  This was 1992.

Which means, while laptops and such certainly existed, most people (even legendary programmers) did not have any sort of mobile, portable computers.  Certainly not the kind you could take out to a restaurant.

So what, pray tell, did they write their new text encoding design on?

A placemat from a New Jersey diner.

[Image: This is not the placemat that UTF-8 was designed on.]

Seriously.

"UTF-8 was designed, in front of my eyes, on a placemat in a New Jersey diner."

The boys, Ken and Rob, now had just a few days to get all of this done -- before the big vote on the new text encoding standard.  And they sure as heck didn't waste any time.

They got back from dinner, placemat in hand, and got to work.

"So that night Ken wrote packing and unpacking code and I started tearing into the C and graphics libraries.  The next day all the code was done and we started converting the text files on the system itself.  By Friday some time Plan 9 was running, and only running, what would be called UTF-8.  We called X/Open and the rest, as they say, is slightly rewritten history."
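For a sense of what "packing and unpacking code" involves, here is a minimal Python sketch of the bit-packing as it is defined in today's UTF-8 (one to four bytes per character).  To be clear: this is not Ken's code, which isn't reproduced here.  The function names `pack` and `unpack` are made up for illustration, and all validation (surrogates, overlong forms, malformed input) is left out.

```python
def pack(code_point: int) -> bytes:
    """Pack one Unicode code point into UTF-8 bytes (no validation)."""
    if code_point < 0x80:        # 7 bits:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def unpack(data: bytes) -> int:
    """Unpack the single code point stored in one well-formed sequence."""
    first = data[0]
    if first < 0x80:             # single ASCII byte
        return first
    if first < 0xE0:             # lead byte of a 2-byte sequence
        value = first & 0x1F
    elif first < 0xF0:           # lead byte of a 3-byte sequence
        value = first & 0x0F
    else:                        # lead byte of a 4-byte sequence
        value = first & 0x07
    for continuation in data[1:]:
        # Each continuation byte carries six more bits.
        value = (value << 6) | (continuation & 0x3F)
    return value


for ch in ["A", "é", "€", "😀"]:
    packed = pack(ord(ch))
    assert packed == ch.encode("utf-8")   # agrees with Python's own encoder
    assert unpack(packed) == ord(ch)      # and round-trips
    print(ch, packed.hex())
```

The whole trick is in the leading bits: the first byte says how long the sequence is, and every following byte starts with `10`, which is why no byte of a multi-byte character can ever look like plain ASCII.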

They converted an entire operating system over to a brand new -- just designed on a placemat -- text encoding format... in less than two days.

Here's the rough timeline:

  • Wednesday (Sep 2) evening: Dinner at a New Jersey diner.  Ken sketches out the idea on a placemat.
  • Wednesday night: Coding begins.
  • Thursday: Coding complete.
  • Friday: Entire Plan 9 operating system is now using "UTF-8".
  • Monday (Sep 7): X/Open group votes to use the Ken/Rob encoding design.

On Tuesday, September 8th, 1992 (at 3:22am), mere hours after the official vote to accept their text encoding design, Ken Thompson sent out the following email regarding Plan 9 now using UTF-8:

"The code has been tested to some degree and should be pretty good shape.  We have converted Plan 9 to use this encoding and are about to issue a distribution to an initial set of university users."

That's right.

Ken and Rob got a call asking for feedback on a Wednesday.  By the next Tuesday (at 3am) they were ready to ship a version of their Plan 9 OS with all the changes, and their designs had been voted on by the largest UNIX companies in the world.

Like I said.

[Image: A recent picture of the two legends, themselves.]

These guys are legends.

What about that placemat?

Considering the vast impact of UTF-8 on the world of computing... whatever happened to that original "design document" (aka "the placemat")?  It would certainly be of historic significance.

"I very clearly remember Ken writing on the placemat and wished we had kept it!"

Let this be a lesson to all of the programmers out there:

Keep all of your doodles, notes, and sketches you make for your projects... you never know when one of those projects will become critical to the entire world... making your quick sketch worthy of being in a museum.

Especially if it's on a placemat.  From a diner.  In New Jersey.


Copyright © 2023 by Bryan Lunduke.  All rights reserved.  The contents of this article are licensed under the terms of The Lunduke Content Usage License.
