China Battles the “Information Barbarians”?

Ian Buruma has a thought-provoking article in yesterday’s Wall Street Journal entitled “China Battles the Information Barbarians”. The author puts Google management’s decision to stop cooperating with Chinese censors, and its threat to close down its operations in China, in historical context. According to Buruma, the Chinese government has battled Western Information Imperialism since the Jesuits first arrived in China in the 1600s. This latest round is just one more in a battle that has gone on for about 400 years between the desire of whichever Chinese government is in power to adopt Western technology and that same government’s wish, as gatekeeper, to keep out ideas it considers dangerous. How can you let in the ideas and technology you want but keep out “subversive” material? The article takes into account economics, technology, politics, and culture in terms of how the Chinese government manages the information accessible to its citizens and tames the data deluge.

China has an amazing history, but I’m skeptical of the idea that you can entirely control information. The author’s main point reminds me of the phrase, “have your cake and eat it, too”. I think the government can have some control over information, but I’d like to think they can only gatekeep so much. Networks have many layers to them and someone who knows what they are doing can bury a virtual network — or so I want to believe. I am not convinced that you can control what you let in and keep out regarding technology and information beyond a certain point — but don’t ask me to define what that point is.

The accompanying video demonstrates how Chinese Internet censorship is effected. For example, YouTube, Facebook, and Twitter have been inaccessible in China since spring 2009. Apparently, I’m wrong, and you can, indeed, control Internet access to a variety of information sources. (IP blocking, hello?)
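
For the curious, here is a minimal sketch of what IP blocking means in principle: a gatekeeper compares each destination address against a list of blocked ranges and drops anything that matches. This is only an illustration under my own assumptions, not a description of how any real national filter works. The address ranges are reserved documentation ranges rather than real sites, and the names BLOCKED_NETWORKS and is_blocked are hypothetical.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737),
# used here as stand-ins; they do not correspond to any real service.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(address: str) -> bool:
    """Return True if the destination address falls inside any blocked range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)
```

A gatekeeper running a check like this would refuse traffic to 192.0.2.10 but let traffic to 203.0.113.5 through. A list like this only works on addresses the gatekeeper can see, which is why the question of buried virtual networks still matters.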

What do you think? Is it possible to have your technology and control the effects, too? If you control the physical & virtual gates to information, can you truly control what goes in and out? And, as one commenter pointed out, how do Google’s actions benefit the average Chinese citizen? Are Westerners being “Information Imperialists”?

Data Privacy Day 2010 — Your Online Reputation

January 28th, 2010 was International Data Privacy Day.

The idea behind the day is to raise awareness of the need for data privacy, and to encourage “dialog among all of the stakeholders — businesses, individuals, government agencies, non-profit groups, academics, teachers and students –- to look more thoroughly at how advanced technologies affect our daily lives.”

So, what is Data Privacy Day? The authors at write:

Data Privacy Day is an international celebration of the dignity of the individual expressed through personal information. In this networked world, in which we are thoroughly digitized, with our identities, locations, actions, purchases, associations, movements, and histories stored as so many bits and bytes, we have to ask – who is collecting all of this – what are they doing with it – with whom are they sharing it? Most of all, individuals are asking ‘How can I protect my information from being misused?’ These are reasonable questions to ask – we should all want to know the answers.

(I brought up my own complicity in and ambivalence about the loss of privacy in my online life in a post earlier this week called “Facebook’s Bait & Switch and User Complicity”.)

As one part of this dialog about the need for personal data privacy, Microsoft conducted research to determine whether or not HR employees consider a candidate’s online reputation when deciding to hire or reject a candidate. The answer was a resounding, “yes”. Your online reputation matters a great deal. Microsoft published the results of this research about “how people manage the information they and others place on the Internet” online as an overview in .pdf and in more detail via .ppt.

Microsoft provides a guideline to take charge of your online reputation. They also interviewed people about whether or not your online reputation should affect your candidacy for employment. Their thoughts are available for viewing in the video below or online here (if the video won’t play).

What do you think? Should your online life affect your candidacy for a job?

Beginning a Series — Reviews of Open Data Sites

I will be reviewing English-language, government-sponsored open data sites as an offshoot of my doctoral work. I will begin with the “key” government sites compiled by the authors of The Guardian’s DataBlog in one of their inaugural posts.

Last week I reviewed, so while I may add a bit more detail to my initial review in a second post, I will not completely re-review it. The sites I will review in the upcoming weeks are:

I will also do some searching of my own and see what else I can locate that is an English-language, government-sponsored Open Data web site. However, if any of you know of any sites that I do not have listed above, please do send them to me! (And, “thanks!” in advance.)

So…what do I mean by “review”? I plan to examine the number and types of data sets made available, policies for use and re-use, “other” policies, and, the overall “look & feel” and usability of the site(s). I will also discuss “anything else” I find interesting.

Is there something in particular you’d like me to add to my review criteria?

Is “Information Management” Hype?

I enjoyed watching this video from 2008. It begins with a variety of quotes and mis-quotes by technology experts beginning in 1899, using an early 1900s moving pictures style of graphics. The author then uses images of streams of data and a catchy Elvis song to throw (unsourced) facts and figures out about data use, expected data use, and, data quality.

What caught my eye were the figures on data quality, or, more precisely, the lack thereof. Do you think that only 14% of organizations have completely accurate data? Should that figure be higher or lower?

Is Web 3.0 About Taming the Deluge of Data?

Social Path has an interesting post about how Web 3.0 is about “taming the deluge of data”. The author(s) wrote the post based on a presentation the author(s) had seen recently by Andrew Keen.

The author(s) write that three trends are defining “3.0”:

  • Aggregators: one point of entry to multiple social network sites;
  • Simple Sharing: easy ways to share something with friends and family on said multiple social network sites; and,
  • “Un-Sites”: a search engine re-direct will take you to an aggregation of online information related to that shop.

I agree with the first two points re: aggregating and sharing. I do think those will help users access and share their data. As for “un-sites”, well, I can see it working for the “hip” crowd, but I’m not so sure I’d take seriously a corporation or other organization that uses a “splat” method as an online presence. It is one way to organize your online presence, albeit an ugly-but-cute one.

Earlier in the post, the authors quote Andrew Keen.

The best explanation I’ve heard was from Andrew Keen, author of “The Cult of the Amateur.” In a recent Social Media Club presentation here in Birmingham, Andrew broke out the Web’s history like this:

Web 1.0: Mainstream media and retailers dominate, using traditional approaches to broadcasting and sales.

Web 2.0: Blogging, peer-to-peer sharing and Google empower the masses to communicate openly. The old guard struggles to remain relevant.

Web 3.0: Mainstreaming of social media creates a constant flow of information. Challenge for users and businesses alike is to harness the flood without drowning.

I don’t agree with Keen’s assessment of Web 1.0. I’m going to nitpick. First, Phase 1 of the Web was about Research & Development and allowing Public Access to the Internet and Web. It was about Berners-Lee creating hyperlinks and the development of SGML ==> HTML + XML. The Internet was originally a DARPA/ARPA project. Web 1.0 is about the government handing over the Internet to the private sector and opening it up to the business sector for use. Now, if Keen would like to use the term “public web”, then I would agree with his points regarding Phases 1-3.

In my opinion, there are 4 phases. Phase 1 is the R&D phase, development of the Web as part of the Internet infrastructure, and the transition from research organization access only to public access to the Web and Internet via private companies. This also means that “Web 2.0” is really “Web 3.0”, and that the next transition is to “Web 4.0”. I expect I am a minority in this opinion, and that my “Web 1.0” is simply “pre-Web 1.0” to most users.

Second, many of the major retailers were slow to make the jump to the Web, and didn’t begin to dominate it until the latter stages of Web 1.0/early 2.0, when they finally figured out what to do with a web site. Remember how quickly Microsoft had to scramble because the company’s executives didn’t see the coming of the public Internet and anticipate their customers’ interest in using it?

If we are, indeed, moving to the next phase of the Web, I would like to see some discussion related to the ephemeralness of it all. What do you keep? What do you throw away? Storage is cheap, and, indeed, getting cheaper, but why pay to keep petabytes of data/information stored, migrated, emulated, etc., when you neither need nor want it? What happens to your online life when you die and who controls it? How do you sift through all the chaff to find the wheat?

Aggregating your online presence via a search engine redirect is a nice trick. Can we also deal with some of the more serious questions on this round of the Web, rather than just more technical evolution? {Note: I love technology, but there are limits.}

I know, I know. We won’t, because it isn’t sexy.

The Digital Dilemma

In the fall of 2007, the Science and Technology Council of the Academy of Motion Picture Arts and Sciences (AMPAS) released a report entitled “The Digital Dilemma”. In a nutshell, the Council tackled the topic of archiving digital movies. They examined how this could be done, what the costs would be, and how these methods and costs compared to CMYK.

Over time, this report has proven to be one of my favorites. The authors miraculously kept it at 70 pages, but managed to cover a lot of information within those few pages. I also tip my hat to them for battling the politics between and within L.A. movie studios, so that they could output a usable document with a set of recommendations that can be adopted across and outside of the movie industry.

Facebook’s Bait & Switch and User Complicity

I admit I was shocked when Facebook announced last month that the new default for users would be a complete lack of privacy, unless you had already set, or now set, your privacy controls to shut out anyone but your friends. Librarians have a very strong notion of patron privacy that spills over even to us Information Scientists. Long before the company’s founder announced changes to the privacy of users, I had put controls on who could see what.

I was shocked to learn my friends list would be made public, along with other personal information. I could understand if my name and address were made public, but my friend list? That is the equivalent of “someone” printing one’s personal address book in a newspaper, page by page. I felt that Facebook’s founder had pulled a bait and switch. Somehow, I think that if Zuckerberg had established a “no privacy” policy at the outset, his company would not have succeeded.

An Information Management Fairy Tale

This is a story about a young dragon, Data Quality, who settles in a shire, far, far away. He seemed harmless at first, so he was ignored. Then he grew into a menace, and the villagers hired a knight to fight the dragon and found the fountain of knowledge.

And they all lived happily ever after.

[Thanks, Alex K. via @infoholic]

HM Government Opens Up Government Data to the Public

The British Government has released data sets to the public for use in either the public or private sectors at

Previously, the governments of the United States, Australia, and New Zealand had created data sites for use by the public, including commercial use. The primary idea behind the release of these data sets is that publicly funded data ought to be made available to the public for free for re-use. The site creators hope that individuals and businesses will use the data creatively to add economic value and generate new services. Sir Tim Berners-Lee and Professor Nigel Shadbolt led the project in the UK.

The Guardian has posted a video interview with Berners-Lee and Shadbolt. Shadbolt gave an example of one re-use of this data by the public: an online route-planning tool that helps cyclists avoid areas where cyclists have the most accidents. Both project leaders discuss how the project developed, why they wanted to put government data online, why the data was released for free, and their hopes for data re-use.

The Open Data Principles the creators state on the site are as follows:

  • Public data will be published in reusable, machine-readable form
  • Public data will be available and easy to find through a single easy to use online access point (
  • Public data will be published using open standards and following the recommendations of the World Wide Web Consortium
  • Any ‘raw’ dataset will be re-presented in linked data form
  • More public data will be released under an open licence which enables free reuse, including commercial reuse
  • Data underlying the Government’s own websites will be published in reusable form for others to use
  • Personal, classified, commercially sensitive and third-party data will continue to be protected.

Currently, the site is set up for users to run basic searches on just under 150 data sets. There are around 20 applications listed for use. I browsed through the available data sets. The available topics begin with 2008 Injury Road Traffic Collisions in Northern Ireland and end with a Youth Cohort Study & Longitudinal Study of Young People in England.

I look forward to following this project, seeing what data is added, and what re-uses of the data are made. I have not attempted to use any of the data sets, so I cannot report on any success or problems I have had with using them. If you have used or do use any of these data sets or applications, please let me know.

[Thanks, Jennifer M.]