interbiznet.com
Home | ERN | Bugler | The Blogs | Blogroll | Advertise | Archives | Careers
Another SIP (March 24, 2005) - In response to the other day's article about Statistically Improbable Phrases, old friend and industry wonder child Dave Lefkow (of Jobster) writes:
Let's start by saying we're still "thinkin' on" the question. Search, as executed in the online recruiting process, has always left us with a case of dry mouth. We're incredibly certain that the best solution is a rigorous application process with qualifying questions and the opportunity for alarm-driven human intervention. You get real talent when it's fresh and on the line. The more a selection technique removes human intervention, the lower the resulting quality.

Search is an answer to a stupid question. Having too many resumes in the file and not knowing what they are is a front-end quality problem, not an overall process issue. Being data rich usually means being information poor. Search, in other words, is a technical response to a failed sourcing process.

The current reigning search technologies in our industry are descended from Resumix (RIP). The fundamental idea is that a structured lexicon can be developed for a range of niches. Think of a structured lexicon as a thesaurus of all of the possible meanings of a set of words. The development of a structured lexicon (whether manual or automated) depends on an initial set of assumptions about the meaning of this word or that word.

Our longstanding favorite "stump the search engine" question for these tools is the "Program Manager" question. We ask the vendor to show us how their tool discriminates between the Program Manager at a local non-profit and one in upper management at Boeing. One has a title granted to compensate for bad compensation. The other is a nearly immortal, mythical character with all of the power of a king. A resume for one job is an embarrassment for the other. We've never seen a search engine that could help with the problem. (We've seen plenty that claim it's technically possible with the right investment; it's just not implemented yet. That's what engineers always say when they mean no.)

SIP, as we understand it, comes from a different perspective.
Rather than assuming that it can define and capture meaning, a SIP index would simply look for the simplest array of things that make this document different from that one. A SIP, in other words, is a linguistic fingerprint. It's a glimpse of literary DNA without regard to meaning.

The SIP question is "What makes this document unique?" The answer takes the form: "Having looked at all books in the world, this document seems to be unique. The occurrence of the following phrases makes it observably distinct from any other document." This is a quantifiable, non-judgmental definition of the differences between documents.

Tomorrow: How SIP might realign search notions.

- John Sumser
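
The "linguistic fingerprint" idea above can be made concrete with a small sketch. The article does not say how any real SIP index is computed, so everything here is an assumption for illustration: the function names `phrases` and `improbable_phrases` are hypothetical, phrases are taken to be two-word sequences, and the score is simply a phrase's share of the document divided by its smoothed share of a background collection. Nothing in it tries to decide what a phrase means.

```python
import re
from collections import Counter


def phrases(text, n=2):
    """Return the n-word phrases (lowercased) found in text, in order."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def improbable_phrases(document, corpus, n=2, top=3):
    """Rank the phrases of `document` by how unexpected they are given `corpus`.

    Assumed scoring (not taken from the article): the phrase's share of the
    document divided by its add-one-smoothed share of the background corpus.
    """
    doc_counts = Counter(phrases(document, n))
    bg_counts = Counter()
    for other in corpus:
        bg_counts.update(phrases(other, n))

    doc_total = sum(doc_counts.values())
    bg_total = sum(bg_counts.values())
    vocab = len(set(doc_counts) | set(bg_counts))

    scores = {}
    for phrase, count in doc_counts.items():
        doc_share = count / doc_total
        bg_share = (bg_counts[phrase] + 1) / (bg_total + vocab)
        scores[phrase] = doc_share / bg_share

    # Highest ratio first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:top]


if __name__ == "__main__":
    # Made-up resume snippets, echoing the "Program Manager" question.
    background = [
        "program manager for a community food bank",
        "program manager for a youth sports league",
    ]
    resume = "program manager for the wide body airframe program"
    print(improbable_phrases(resume, background, top=5))
```

In this toy run the shared title "program manager" scores low because it appears throughout the background, while the phrases found only in the one resume rise to the top. That is the sense in which the fingerprint is quantifiable and non-judgmental: it counts differences rather than interpreting them.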
Copyright © 2013 interbiznet. All rights reserved.