A Bit More Detail

Assorted Personal Notations, Essays, and Other Jottings

Posts Tagged ‘internet’

[LINK] Aditya Mukerjee on the problems of Unicode on a multilingual Internet

At Model View Culture, Aditya Mukerjee has a post with the arresting title of “I Can Text You A Pile of Poo, But I Can’t Write My Name”. Mukerjee makes the convincing case that the underrepresentation of non-Western languages in Unicode, especially South and East Asian ones, is a serious problem for the Internet. That undeciphered scripts like Linear A are fully included is, well, odd.

My family’s native language, which I grew up speaking, is far from a niche language. Bengali is the seventh most common native language in the world, sitting ahead of the eighth (Russian) by a wide margin, with as many native speakers as French, German, and Italian combined.

And yet, on the Internet, Bengali is very much a second-class citizen – as are Arabic (#5), Hindi (#4), and Mandarin (#1) – any language which is not written with the Latin alphabet.

The very first version of the Unicode standard did include Bengali. However, it left out a number of important characters. Until 2005, Unicode did not have one of the characters in the Bengali word for “suddenly”. Instead, people who wanted to write this everyday word had to combine three separate, unrelated characters. For English-speaking teenagers, combining characters in unexpected ways, like writing ‘w’ as ‘\/\/’, used to be a way of asserting technical literacy through “l33tspeak” – a shibboleth for nerds that derives its name from the word “elite”. But Bengalis were forced to make similar orthographic contortions just to write a simple email: ত + ্ + ‍ = ‍ৎ (the third character is the invisible “zero width joiner”).

Even today, I am forced to do this when writing my own name. My name is not only a common Indian name, but one of the top 1,000 names in the United States as well. But the final letter has still not been given its own Unicode character, so I have to use a substitute.

A few other characters that were more common historically, though still used today, were also missing for the first decade of Bengali’s existence in Unicode. It’s tempting to argue that historical characters have no place in a character set intended for computers. On the contrary, this makes their inclusion even more vital: rendering historical texts accurately is key to ensuring their survival in the transition to the age of digital media.
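
The three-character workaround Mukerjee describes for the word “suddenly” can be inspected directly. What follows is only a minimal Python sketch, not anything from his post: it assumes the sequence is U+09A4, U+09CD, U+200D and that the dedicated character added in 2005 is U+09CE, and the variable names are made up for illustration.

```python
import unicodedata

# Older workaround: three separate code points typed in sequence
# (assumed here to be TA + VIRAMA + ZERO WIDTH JOINER).
legacy = "\u09A4\u09CD\u200D"

# Dedicated single character (assumed here to be U+09CE).
khanda_ta = "\u09CE"

# Print the code point and official Unicode name of each character.
for ch in legacy + khanda_ta:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Compare lengths, raw equality, and equality after NFC normalization.
print(len(legacy), len(khanda_ta))
print(legacy == khanda_ta)
print(unicodedata.normalize("NFC", legacy) == khanda_ta)
```

If those assumptions hold, the two spellings are different strings even though they are meant to display as the same letter, which is one reason text written with the workaround can be harder to search and sort.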

Written by Randy McDonald

August 4, 2015 at 5:37 pm

[URBAN NOTE] “Why do people use Spanish more than Chinese to Google in the GTA?”

The Toronto Star’s Peter Edwards reports on patterns in Google searches in Toronto. I would suggest another possible explanation: users of Chinese languages may use search engines other than Google, such as Baidu.

Toronto has a large Chinese community, but there’s not much Chinese-language Googling out of the GTA. But Toronto has a far smaller Spanish-language community, and Spanish is the most-used language for Googling from here, after the official languages of English and French.

Those are a few of the findings of a just-released study by the Google News Lab.

The study breaks down the estimated more than 3 billion searches a day globally by language and city for Berlin, Delhi, London, Madrid, New York, Paris, Sao Paulo, Shanghai and Toronto.

“It’s interesting,” McMaster University sociology professor Vic Satzewich said in an interview.
Satzewich, who has studied patterns of immigration, suggested ebbs and flows in immigration and tourism help explain the Googling patterns.

He suggested that the low number of Chinese-language Googlers from the GTA might be reflected in part by an effort by the government to attract immigrants who are strong in Canada’s two official languages.

The high Spanish-language Googling from the GTA could reflect an increase in temporary workers from Mexico and Guatemala over the past decade.

[WRITING] “The Web We Have to Save”

Earlier today, I linked to famous Iranian-Canadian blogger Hossein Derakhshan’s essay at Medium. Imprisoned in Iran for six years because of his blogging, he found the changes that hit the online world between 2008 and 2014 all the more visible. Original content, exemplified by the hyperlink, no longer matters nearly as much.

Six years was a long time to be in jail, but it’s an entire era online. Writing on the internet itself had not changed, but reading — or, at least, getting things read — had altered dramatically. I’d been told how essential social networks had become while I’d been gone, and so I knew one thing: If I wanted to lure people to see my writing, I had to use social media now.

So I tried to post a link to one of my stories on Facebook. Turns out Facebook didn’t care much. It ended up looking like a boring classified ad. No description. No image. Nothing. It got three likes. Three! That was it.

It became clear to me, right there, that things had changed. I was not equipped to play on this new turf — all my investment and effort had burned up. I was devastated.

[. . .]

The hyperlink was my currency six years ago. Stemming from the idea of the hypertext, the hyperlink provided a diversity and decentralisation that the real world lacked. The hyperlink represented the open, interconnected spirit of the world wide web — a vision that started with its inventor, Tim Berners-Lee. The hyperlink was a way to abandon centralization — all the links, lines and hierarchies — and replace them with something more distributed, a system of nodes and networks.

Blogs gave form to that spirit of decentralization: They were windows into lives you’d rarely know much about; bridges that connected different lives to each other and thereby changed them. Blogs were cafes where people exchanged diverse ideas on any and every topic you could possibly be interested in. They were Tehran’s taxicabs writ large.

Since I got out of jail, though, I’ve realized how much the hyperlink has been devalued, almost made obsolete.

Written by Randy McDonald

August 3, 2015 at 10:57 pm

[BLOG] Some Friday links

  • blogTO notes that you can now LARP at Casa Loma.
  • Centauri Dreams notes the odd reddish marks on the surface of Saturn’s moon Tethys.
  • Crooked Timber takes issue with David Frum’s misrepresentation of an article on Mediterranean migration.
  • The Dragon’s Gaze notes the discovery of the aurora of a nearby brown dwarf.
  • The Dragon’s Tales notes evidence of carbonation on the Martian surface and suggests the presence of anomalous amounts of mercury on Earth associated with mass extinctions.
  • Geocurrents maps the terrifying strength of California’s drought.
  • Language Hat notes that Cockney is disappearing from London.
  • Language Log notes coded word usage on the Chinese Internet.
  • Marginal Revolution links to a paper examining the effects of hunting male lions.
  • The Map Room links to new maps of Ceres and Pluto.
  • The Planetary Society Blog examines the Dawn probe’s mapping orbits of Ceres.
  • Progressive Download traces the migration of the aloe plants over time from Arabia.
  • Savage Minds notes how hacktivists are being treated as terrorists.
  • Window on Eurasia notes how the Ukrainian war is leading to the spread of heavy weapons in Russia, looks at Russian opposition to a Crimean Tatar conference in Turkey, suggests that the West is letting Ukraine fight a limited war in Donbas, and looks at the falling Russian birthrate.

[URBAN NOTE] “Free, fast Internet to come to New York City’s poor”

Wilson Dizard at Al Jazeera America writes about the new initiative by New York City to make sure the poor will have access to free and fast Internet.

New York City announced Thursday that it will install high-speed broadband service in two public housing projects later this year, at no charge to residents, as part of a broader effort to shrink the Internet access gap between rich and poor.

“Whether you’re a parent looking for a job, a child working on a school project … broadband access is no longer a luxury; it’s a necessity,” said Council Speaker Melissa Mark-Viverito in a statement. “This effort helps close the digital divide and addresses the needs of the nearly 3 million New Yorkers who do not have access to broadband Internet at home.”

The first housing projects to be wired under this effort are Queensbridge North and South, in Queens, followed by Red Hook East and West Houses, in Brooklyn, and Mott Haven, in the South Bronx. The city says it hopes to bring high-speed access to 16,000 through the $10 million effort, giving them an alternative to using library computers or browsing the Web on smartphones.

The contractor for the broadband service has yet to be selected, a city official said, and the mayor’s plan does not include subsidies for computers to access the Internet.

The move comes as federal regulators have started to treat Internet access as an essential household utility, rather than an optional communications service. Earlier this year, the Obama administration successfully pushed the Federal Communications Commission (FCC) to regulate Internet traffic more like a public utility, such as telephone communication.

Written by Randy McDonald

July 21, 2015 at 10:59 pm

[LINK] On the worldwide fanbase of the Esperanto language

In the post “Esperanto Fans”, Tyler Cowen of Marginal Revolution linked to an article in The Verge talking about how Esperanto lives even in the 21st century. The discussion at Marginal Revolution focuses on ways in which the language is still useful.

Like its vastly more successful digital cousins — C++, HTML, Python — Esperanto is an artificial language, designed to have perfectly regular grammar, with none of the messy exceptions of natural tongues. Out loud, all that regularity creates strange cadences, like someone speaking Italian slowly while chewing gum. William Auld, the Modernist Scottish poet who wrote his greatest work in Esperanto, was nominated for the Nobel Prize multiple times, but never won. But it is supremely easy to learn, like a puzzle piece formed to fit into the human brain.

Invented at the end of the 19th century, in many ways it presaged the early online society that the web would bring to life at the end of the 20th. It’s only ever been spoken by an assortment of fans and true believers spread across the globe, but to speak Esperanto is to become an automatic citizen in the most welcoming non-nation on Earth.

Decades before Couchsurfing became a website (or the word website existed), Esperantists had an international homestay service called Pasporta Servo, in which friendly hosts around the world listed their phone numbers and home addresses in a central directory available to traveling Esperantists. It may be a small, widely dispersed, and self-selected diaspora, but wherever you go, there are Esperantists who are excited that you exist.

It sounds hokey, but this is the central appeal of Esperanto. It’s as if the initial utopian vibes of the World Wide Web had never reached a wider audience. There’s no money, no power, no marketing, no prestige — Esperanto speakers speak Esperanto because they believe in it, and because it’s fun to speak a foreign language almost instantly, after a couple months of rolling the words around in your mouth.

The internet, though, has been a mixed blessing for Esperanto. While providing a place for Esperantists to convene without the hassle of traveling to conventions or local club meetings, some Esperantists believe those meatspace meet ups were what held the community together. The Esperanto Society of New York has 214 members on Facebook, but only eight of them showed up for the meeting. The shift to the web, meanwhile, has been haphazard, consisting mostly of message boards, listservs, and scattered blogs. A website called Lernu! — Esperanto for the imperative “learn!” — is the center of the Esperanto internet, with online classes and an active forum. But it’s stuck with a Web 1.0 aesthetic, and the forum is prone to trolls, a byproduct of Esperanto’s culture of openness to almost any conversation as long as it’s conducted in — or even tangentially related to — Esperanto.

Written by Randy McDonald

July 17, 2015 at 10:31 pm

[BLOG] Some Wednesday links

  • Crooked Timber’s John Holbo wonders about people who are foxes and hedgehogs, following Isaiah Berlin.
  • The Dragon’s Gaze links to one examination of carbon and oxygen in exoplanet atmospheres and links to another noting how white dwarfs eat their compact asteroid and other debris belts.
  • The Dragon’s Tales notes that the dinosaurs disappeared in the Pyrenees amidst environmental catastrophe.
  • Joe. My. God. notes that Liberty University is liable for helping a woman hide her child away from her lesbian partner’s custody.
  • Language Hat notes an apparent mistake in prose.
  • Language Log examines new frontiers in negative negation.
  • Languages of the World notes the role of Dante in establishing an Italian literary language.
  • Marginal Revolution wonders what books contain the most wisdom per page.
  • The Search notes one librarian’s experience with web archiving.
  • Torontoist shares photos of the Pan Am Games.
  • The Volokh Conspiracy argues that genetic engineering of babies for IQ will occur as soon as the technology becomes possible.
  • Window on Eurasia notes that support is growing for an enquiry into the Malaysian Airlines shootdown, notes military reform’s stagnation in Russia, and looks at a Crimean Tatar meeting in Turkey.
  • The Financial Times’ The World notes that Spain has come out of this round of Eurozone negotiations weaker.
