Computer literacy should absolutely be universal in the adult population, and hopefully universally under development in the juvenile population. I’m not sure programming literacy is the most important component of that literacy, though. While there’s more than a little truth to the old saw “program or be programmed,” I’m currently more concerned about the level of data literacy in the public. Public policy discourse around counterterrorism, law enforcement and much else is full of talk about “inter-agency information sharing,” while the consumer issues getting a lot of airplay include “privacy policies,” “data brokers” and datasets said to have been stripped of “personally identifiable data.”

As an inoculation, or public health measure, I would recommend a high school course, strongly encouraged if not required, that introduces database concepts, perhaps using SQL (which seems to me to have a gentle and intuitive learning curve, though my mileage may be atypical). I would want this introduction to take students at least as far as the concept of a table join, because that is (in my opinion) where the “magic” of SQL happens. It is why having access to two datasets confers more than twice as much informational power (and information *is* power) as having access to one. Exercises for the kind of data literacy course I’m proposing would ask questions such as:
- How would you go about trying to infer individual identities from the records in a dataset that has been stripped of identifying information?
- How would you go about devising a system for calculating a numeric “score” for each of the [people, products, locations, etc.] in a dataset, where the goal is that higher scores should be predictive of higher probabilities of [a crime taking place, a loan going into arrears, a consumer making a purchase, etc.]?
- How would you go about building a recommendation engine? Again, I’d like to see the emphasis less on the coding and more on the choice of what data, and data relationships, to work into the recommendations and in what way.
One more thing: far too many of the programming courses I have taken (in conventional colleges and universities) rely heavily on quasi-business problems that are grossly oversimplified and unrealistic. I seem to remember an assignment along the lines of “write a simple reservation system for a simple hypothetical airline.” No wonder there’s no such thing as an entry-level job. I would hope that the datasets for the data literacy course would be empirical, which is to say, real-world data. I would also hope that at least some of the datasets would be large-ish. Keeping in mind that (unfortunately for my purposes) information does NOT want to be free, some class assignments may have to be data collection assignments, perhaps sending the students out to conduct surveys, keep a food diary, do some GPS surveying, or what have you.
I suppose, like many in my age cohort (older generation X) I was a very early adopter of the Internet and later the Web and tend to have “old school” attitudes about many of these things.
I think of web pages in general as containing two cosmetically similar (sometimes identical) kinds of things that serve different functions. One is links (`<a>` elements). The other is text spans styled to look the same as links, whose destinations are onclick handlers rather than URLs. I’m actually overjoyed that someone wishes to use this distinction for good rather than for something shady.
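For concreteness, a minimal sketch of the two look-alikes (the markup here is invented for illustration):

```html
<!-- A real link: hovering reveals the destination, and right-click,
     middle-click and "open in new tab" all behave as expected. -->
<a href="https://example.com/about">About</a>

<!-- A pseudo-link: styled to look identical, but the destination is
     buried in a script, invisible to hover and to the context menu. -->
<span class="looks-like-a-link"
      onclick="window.location.href = 'https://example.com/about';">About</span>
```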
But still, I’m old enough to remember that the original idea behind CSS was that styling should be the prerogative of the audience, not the developers. If it were as simple as a browser setting supplying a master CSS template laid over every page I visit, mine would be something like a sheet that makes every real link unmistakably look like one.
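A hypothetical sketch of such a template (the specific selectors and styles are illustrative, not a recommendation):

```css
/* Make every real link unmistakably a link. */
a[href] {
  color: #0000ee !important;
  text-decoration: underline !important;
}

/* Flag links that will open in a new tab. */
a[target="_blank"]::after {
  content: " \2197"; /* outbound arrow */
}
```

Notably, a user stylesheet can only target real elements and attributes: `span[onclick]` could flag inline handlers, but handlers attached from script are invisible to CSS, which is rather the point.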
As for the blank-target thing, I only ever follow links by clicking the right mouse button and selecting “open link in new tab.” It’s a grooved reflex with me, largely grooved by the stuff described elsewhere in the present comment. Nothing like pain to train the brain. My other grooved reflex is ALWAYS hovering over any link (or pseudo-link, as the case may be) before clicking, even right-clicking. But Facebook et al. even spoof the title attribute so that it doesn’t match the href.
I think my age cohort (so-called generation “X”) was the canary in the coal mine. I’m over 50 and haven’t yet managed to land a job that’s not part-time and/or temporary. Maybe it’s because I’m from a working-class background and didn’t know any better than to choose a liberal arts major. Maybe it’s because I’m something like fifth percentile in communication skills. But now upper-middle-class kids, even ones with communication skills, are settling for precariat jobs, or worse, piecework gigs. So the problem gets attention. Whether that attention will translate into public policy action, I do not know. It may already be too late. Usually when something becomes a problem for the upper middle class it gets recognized as a problem, but we may have reached the point where the threshold for having an opinion that matters starts in the upper upper class.
I learned in a recent discussion on Facebook that the semantic web was discussed during the Telluride Tech Fest back in 2002.
I’m not a member of the professional classes and don’t attend conferences and the like, but I certainly remember “semantic web” being quite the buzzword for a season or so. I looked it up a few times but never figured out for sure what it refers to. Just now I looked it up on Wikipedia, and it seems that distribution of machine-readable data is a large part of it. It seems to me that to the extent that machine readability is itself monetizable (and it seems to me to be VERY monetizable), such data will not be freely distributed. To quote James Alexander Levy, “For information to be free, the coordinates of the information must be free.” You can have all the speedy and public trials in the public record (as required by the US Constitution), but if machine readability of the public record is proprietary, there will be a business model for Intelius-like malignancies to offer $35-a-pop peeks at “the public record.”
I once downloaded and played with a MediaWiki plugin called Semantic MediaWiki. I found its markup schema too finicky and too labor-intensive to be useful. Perhaps if I had a data entry staff… And there you have it: workers deserve to be paid, so work product deserves intellectual property protections. But with data entry as a line of work degraded all the way down to Mechanical Turk-level piecework pay, you’d think machine readability would be too cheap to meter.
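For anyone who hasn’t seen it, Semantic MediaWiki works by hand-annotating facts inline as property::value pairs inside wiki links. The property names and figures below are made up for illustration, but the pattern is the real one, and every such fact has to be tagged by a human, which is where the labor goes:

```wikitext
Berlin has a population of about [[Has population::3700000]]
and is the capital of [[Capital of::Germany]].
```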
Perhaps it is, in terms of production costs, but the strategic advantages of information asymmetry (including asymmetric levels of machine readability) far outweigh the monetization opportunities of selling access to machine readability. To offer semantic web functionality as a product would be to leave money on the table, so instead of semantic web we have “big data.”