Tuesday, July 13, 2004

Commentary back in a week

I will be blogging only sporadically while I am out of the office attending Microsoft's Global Briefing in Hotlanta until July 21st. I'll try to gather some interesting non-confidential tidbits on cool MS technologies to share when I get back.

Spotlight on Spotlight

Mighty interesting read on Apple's new search engine and its management of metadata, a traditionally difficult but interesting computer science problem.
------------------------
Daring Fireball says Apple's Spotlight will be a real and well-thought-out product:
Daring Fireball: Spotlight on Spotlight: ...two years ago... Apple hired Dominic Giampaolo, renowned file system design expert and creator of the highly-regarded, metadata-rich Be File System.... [W]hat then has Giampaolo been working on?... Spotlight — which is, in the words of one WWDC attendee, Giampaolo’s “baby”.... [T]he aforementioned source who attended the Spotlight session at WWDC sent me the following report:
Spotlight is completely, relentlessly focused on files and files’ metadata. Files are the only object returned to Spotlight queries. Two aspects of Jobs’ keynote were thus misleading: The “spotlight” effect on System Preferences was wholly unrelated to Spotlight. Spotlight’s ability to show results from Apple Mail archives on Jobs’ machine was tantamount to a sham. Believe it or not, Tiger Mail has switched to an “exploded” Maildir-like storage format with a single message per file.
One implication of Spotlight’s file-centricity is that its ability to search “email” might not apply to clients other than Apple Mail — it’s the fact that the new Tiger version of Mail stores each message as a separate file that allows Spotlight to effectively return individual mail messages as search results. No other major mail client uses a one-message-per-file storage format.
Spotlight’s full-text search is outsourced to SearchKit, which will be considerably faster in Tiger (“3x indexing, 20x incremental search” over Panther). So, Spotlight has three places to look for information about files: its own hand-tuned substring-matching metadata store (built by Giampaolo, not part of Core Data or anything else), Carbon’s HFS+ catalog calls (so Spotlight will respond to searches for type and creator), and SearchKit’s full-text index.

Both metadata collection and full-text indexing depend on cooperating per-file-format Importers, either written by Apple or by third parties. Like Google, no matter how much text an Importer provides, Spotlight only cares about the first 100K of raw text. Importers are fired on every file the moment it is created, saved, changed, or moved, including when files are made available through a newly mounted drive. Performance is said to be excellent in every case except network-mounted home directories, which are bedeviling on several levels and on which they’re still working.
It’s through the default set of Importers that Spotlight is able to index and search format-specific metadata, such as the ID3 tags in MP3 files. What’s cool about this architecture is that Spotlight’s indexes will thus stay up-to-date automatically. All you need to do is save, move, or copy a file, and Spotlight’s metadata and content indexes will note the changes on-the-fly. Compare and contrast to the full-content file searching previously provided via Sherlock, which required periodic monolithic re-indexing of the content of your drives.
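To make the Importer idea concrete, here is a toy sketch in Python of how I picture it working. It is not Apple's code or API: every name in it (`ToyIndex`, `plain_text_importer`, `on_file_changed`) is made up for illustration, and the 100K cap is applied to characters rather than bytes to keep things simple. The point is only that a per-format importer runs whenever a file changes, so the index never needs a monolithic rebuild, and that queries return files and nothing else.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List, Set, Tuple

# Assumption: the "first 100K of raw text" cap, counted here in characters.
MAX_TEXT_CHARS = 100 * 1024

# Hypothetical importer signature: path -> (metadata, raw text).
Importer = Callable[[Path], Tuple[dict, str]]


def plain_text_importer(path: Path) -> Tuple[dict, str]:
    """Illustrative importer for .txt files: basic metadata plus full text."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return {"kind": "text", "size": path.stat().st_size}, text


@dataclass
class ToyIndex:
    importers: Dict[str, Importer] = field(default_factory=dict)
    metadata: Dict[Path, dict] = field(default_factory=dict)
    words: Dict[Path, Set[str]] = field(default_factory=dict)

    def register(self, suffix: str, importer: Importer) -> None:
        """Associate a file extension with an importer."""
        self.importers[suffix.lower()] = importer

    def on_file_changed(self, path: Path) -> None:
        """Call on create/save/move: re-import just this one file."""
        importer = self.importers.get(path.suffix.lower())
        if importer is None:
            return  # no importer for this format, so nothing is indexed
        meta, text = importer(path)
        self.metadata[path] = meta
        # Only the first MAX_TEXT_CHARS of text are indexed.
        self.words[path] = set(text[:MAX_TEXT_CHARS].lower().split())

    def on_file_removed(self, path: Path) -> None:
        """Drop a file from both stores."""
        self.metadata.pop(path, None)
        self.words.pop(path, None)

    def search(self, term: str) -> List[Path]:
        """Return files (the only result type) matching content or metadata."""
        needle = term.lower()
        hits = {p for p, w in self.words.items() if needle in w}
        for p, meta in self.metadata.items():
            # Substring match against metadata values.
            if any(needle in str(v).lower() for v in meta.values()):
                hits.add(p)
        return sorted(hits)
```

Register an importer with `register(".txt", plain_text_importer)`, call `on_file_changed(path)` whenever a file is saved, and `search("word")` reflects the change right away. A real system would hook file-system events instead of relying on explicit calls.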

Reading code is hard

Eric Lippert doles out good advice, sharing some of his best practices for coding and debugging.

Something cool from Microsoft

...about Potential and Passion.

Monday, July 12, 2004

Something funny from Sun

...found Inside Jack.

Slowly encircling

Israel’s illegal but unstoppable barrier

Economics of two Americas

Class warfare is being waged in the current elections, again.

Wal-Mart vs. Neiman Marcus adds analysis showing how economic growth is benefiting the wealthy while fast disappearing for the poor.
...the University of Michigan and the Conference Board both publish monthly gauges of consumer confidence...Both measures—produced by nonpartisan economists—find that Americans, on the whole, are confident—more optimistic, in fact, than they have been in two years. But they also found that while those with incomes above $50,000 have become more confident and optimistic, those with incomes below $50,000 have become less so.

The Conference Board shows the same split. In its most recent month, May, the index for over-$50,000 demographic was 112.1, the highest it's been since June 2002. But for those making under $50,000, confidence not only remains below its levels of July 2002, it has been falling in 2004. (Since January 2004, the confidence for the under $15,000 subset has fallen from 69.1 to 65.6; for the $15,000-$24,999 subset, from 85.2 to 69.3; for the $25,000-$34,999 subset, from 92.9 to 82.9; and for the $35,000 to $49,999 subset, from 95.2 to 93.6.)

...If the economy were undergoing a broad-based expansion, if a rising tide were lifting all boats equally, you might expect that trend to continue. But the views of the rich and poor are moving in opposite directions. The split results—the growing pessimism of the poor and the growing optimism of the rich—suggest the economy's improvement isn't helping everyone. That is bad news for a lot of Americans, but it may be good news for the Kerry-Edwards ticket.

Take CMU courses online

Carnegie Mellon is offering free courses through its Open Learning Initiative. Unlike MIT's OpenCourseWare, which has 700 courses available, Carnegie Mellon currently has only five, but they are fully interactive. Learn microeconomics here.

Sunday, July 11, 2004

On political accountability

...where I also add my 2 cents.

The closing of the American book

Read a forceful piece, apparently the #1 most emailed article at the New York Times right now, on the consequences of Americans choosing not to read. Cable networks are no doubt happy about this trend as long as more eyeballs stay glued to the tube. I wouldn't be surprised if this hurts our productivity, innovativeness, and work ethic over the long run, in turn producing a less attractive demographic picture: a population that is less educated and less well-off. What goes around comes around.