AIR ( Part Two ) = Artificial Intelligence in Recruiting
----------------------------------------------------------------
A few notes that I made in the margins of a book ( written in 2006 ) that I read in 2007
Maybe by now ( in Oct 2016 ) , someone has already implemented the type of EXPERT SYSTEM that I conceived in my notes
If not , here is a great opportunity for some Indian Start Up !
I would be happy to guide , if requested
hemen parekh
hcp@RecruitGuru.com
28 Oct 2016
--------------------------------------------------------------------------------------------------------------------------------
Name of Book : Artificial Intelligence Application Programming ( second edition ) / 2006
Author : M Tim Jones
When read : 2007
--------------------------------------------------------------------------------------------------------------------------------
Page 3
We are planning that the "Weights" of each keyword get automatically adjusted / updated dynamically , with the arrival of each new resume in our database . We only provide the "seeds" , i.e. the starting weights
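
One way such self-adjusting weights might work is sketched below. The update rule is my assumption, not something worked out in the notes: each seed weight is treated as if it were backed by `seed_strength` imaginary resumes, and every real resume pulls the weight towards the fraction of resumes that actually contain the keyword. The class name, the parameter names and the single-word keywords are all hypothetical.

```python
from collections import Counter


class KeywordWeights:
    """Keyword weights that start from hand-picked seeds and drift with the data."""

    def __init__(self, seeds, seed_strength=10):
        self.seeds = dict(seeds)            # starting weights, each between 0 and 1
        self.seed_strength = seed_strength  # how many resumes a seed is "worth" (assumption)
        self.doc_count = Counter()          # number of resumes containing each keyword
        self.total = 0                      # number of resumes seen so far

    def add_resume(self, text):
        """Update the counts with one newly arrived plain-text resume."""
        words = set(text.lower().split())
        self.total += 1
        for keyword in self.seeds:
            if keyword in words:
                self.doc_count[keyword] += 1

    def weight(self, keyword):
        """Blend the seed with the observed share of resumes containing the keyword."""
        pseudo = self.seed_strength * self.seeds[keyword]
        return (pseudo + self.doc_count[keyword]) / (self.seed_strength + self.total)


weights = KeywordWeights({"java": 0.5, "cobol": 0.5})
for _ in range(10):
    weights.add_resume("Senior Java developer")
print(weights.weight("java"), weights.weight("cobol"))
```

With few resumes the seeds dominate; as the database grows, the observed frequencies take over.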
-----------------------------------------------------------------------------------------------------------------------------
Page 4
Inputs are "keywords" / Outputs are "Raw Scores"
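
A minimal sketch of that input / output relationship, assuming the Raw Score is simply the sum of the weights of those keywords that are present in the resume ( the binary presence / absence scheme from my Page 335 note ). The function name and the sample weights are made up for illustration.

```python
def raw_score(resume_text, weights):
    """Sum the weights of the keywords that occur at least once in the resume."""
    words = set(resume_text.lower().split())
    return sum(weight for keyword, weight in weights.items() if keyword in words)


sample_weights = {"java": 5, "sql": 3, "cobol": 2}  # hypothetical weights
print(raw_score("Java developer with SQL and Java experience", sample_weights))
```

Note that a keyword mentioned twice still counts once , since only presence matters here.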
------------------------------------------------------------------------------------------------------------------------------
Page 7
" Staffing Requirement Prediction System "
This would be of interest to us .
Job Advts posted ( on any job site ) are nothing but expressions of a Company's "Staffing Requirements"
If we have 5,000 job advts for a given Company ( over the last 3 years ) , we should be able to find a time-series trend and predict its "future" requirements
The simplest approach is extrapolation of past trends !
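
The extrapolation idea could be sketched as below : fit a straight line ( ordinary least squares ) through the number of job advts posted per period and extend it forward. The function names and the sample counts are hypothetical, and a straight line is only the crudest possible trend model.

```python
def fit_line(counts):
    """Least-squares slope and intercept for counts observed at periods 0, 1, 2, ..."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(counts, periods_ahead):
    """Extend the fitted line; a company cannot post a negative number of advts."""
    slope, intercept = fit_line(counts)
    n = len(counts)
    return [max(0.0, intercept + slope * (n + k)) for k in range(periods_ahead)]


quarterly_advts = [10, 12, 14, 16]   # made-up counts for one company
print(forecast(quarterly_advts, 2))
```

Seasonal hiring patterns would need something richer than a straight line , but this is the "simplest" case the note has in mind.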
-----------------------------------------------------------------------------------------------------------------------------
Page 46 - 47
We have talked of a concept like a "Rubik's Cube" , to place ( ie bring together ) on each "Face" , keywords belonging to a given "Skill" or "Function"
Goal = Getting all "same coloured" keywords ( squares ) onto the same face of the Cube
Path = Time taken ( shortest )
Each "face" of our virtual "Rubik's Cube" would have the SAME colour , if someone succeeds in virtually "Rotating" the layers along the 2 axes of freedom until all squares are the SAME colour - meaning , that all the squares contain keywords belonging to the SAME SKILL
We can "time" the game , to see which visitor manages to get all keywords ( squares ) on the SAME face in the shortest possible time - then give him recognition / credit by publishing his name on our web site
This could be a lot of fun - and could possibly draw a
lot of young kids to our web site !
There will be no log-in / registration for playing this
game . Just walk in and play .
Then download the COMPLETED / SUCCESSFUL image ( with
time taken ) & email to all friends to prove your cleverness !
----------------------------------------------------------------------------------------------------------------------------
Page 69
Addition / deletion of "which" and "how many" keywords would result in a Raw Score better than the "best found" solution / resume ?
-------------------------------------------------------------------------------------------------------------------------------
Page 70
What keywords ( elements ) are always found "together" ?
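
A small sketch of how this question could be answered by counting. I am assuming that "always together" means : every resume that contains one keyword of the pair also contains the other. The function name and the `min_support` cut-off ( to ignore pairs seen only once ) are my own additions.

```python
from collections import Counter
from itertools import combinations


def always_together(resumes, min_support=2):
    """Return keyword pairs that never occur apart, given one keyword set per resume."""
    single = Counter()
    pair = Counter()
    for keywords in resumes:
        unique = sorted(set(keywords))
        single.update(unique)
        pair.update(combinations(unique, 2))
    # A pair is "always together" when its joint count equals both individual counts.
    return sorted(
        (a, b) for (a, b), n in pair.items()
        if n >= min_support and n == single[a] == single[b]
    )


sample = [
    {"java", "spring", "sql"},
    {"java", "spring"},
    {"sql", "oracle"},
]
print(always_together(sample))
```

On a real database one would probably relax "always" to "in more than X % of resumes" , which only changes the final comparison.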
--------------------------------------------------------------------------------------------------------------------------------
Page 72
Is one resume a "solution space" in which particles ( ie Keywords ) are swarming around ?
--------------------------------------------------------------------------------------------------------------------------------
Page 91
* Master set of 5,000 keywords
* Function-wise sets of 100 keywords
* "Skills" and "Functions" are classifications
--------------------------------------------------------------------------------------------------------------------------------
Page 92
* Read my notes on "Expert System" / Eg : Finding keywords which have NEVER occurred before in ANY resume
  How will the s/w KNOW that it is a NEW keyword ? And then KNOW , to which NEW skill / function it belongs ?
* Clusters are Functions / Skills / Cities / Designation Levels / Edu / Exp etc
* Obviously WIPRO / INFOSYS / TCS / Satyam , belong to a well-defined "Customer Set"
  They all have "common attributes" ie:
  > Similar / identical job advts posted
  > Similar / identical resumes searched
--------------------------------------------------------------------------------------------------------------------------------
Page 93
* Each "subset" can be , resumes belonging to :
  > Same FUNCTION
  > Same SKILL
  > Same CITY
  > Same EDU...etc
* Corporates are "purchasing" resumes from our web site and we will have exhaustive data on their "purchases" , ie:
  > No and Types of resumes transferred to Folders / Opened & Viewed
--------------------------------------------------------------------------------------------------------------------------------
Page 98
* We too are planning a "Recommendation System" - which would recommend Job Advts to Jobseekers and resumes to Recruiters
* If the WIPRO HR manager short listed / viewed / interviewed such and such candidates , could the same candidates also be of interest to the INFOSYS HR manager ?
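
The WIPRO / INFOSYS question could be tried out with a simple "recruiters who looked at the same candidates" recommender , sketched below. This is one standard collaborative-filtering approach ( Jaccard overlap between viewing histories ) , swapped in by me for illustration ; the function names and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of candidate ids, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(viewed, recruiter, top_n=3):
    """Suggest candidates seen by similar recruiters but not yet by this one."""
    mine = viewed[recruiter]
    scores = {}
    for other, theirs in viewed.items():
        if other == recruiter:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # nothing in common, so no evidence of shared taste
        for candidate in theirs - mine:
            scores[candidate] = scores.get(candidate, 0.0) + similarity
    # Highest score first; ties broken by candidate id for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


viewed = {
    "wipro": {"c1", "c2", "c3"},
    "infosys": {"c1", "c2"},
    "unrelated": {"c9"},
}
print(recommend(viewed, "infosys"))
```

The same function would work for recommending Job Advts to Jobseekers , with jobseekers as keys and viewed advts as values.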
-----------------------------------------------------------------------------------------------------------------------------
Page 113
* I have written some rules .
----------------------------------------------------------------------------------------------------------------------------
Page 142 ( Chapter on "Ant Algorithm" )
* Can we develop an "agent" ( ant ) for each "job advt" of each Corporate Subscriber , which travels to "distant places" ( different job sites ) and safely brings back "Food" ( resumes ) to its own "Nest" ( folder ) ?
* Millions of Ants ( s/w agents ) let loose on the Web Network , each programmed to find a "specific" type of resume - then , when it finds one , passing on this data to the "next" ( adjoining ) ant / agent , i.e. communicating
  Then becoming "free" again to search for the next resume for which it is programmed
A resume may pass from one agent to another ( maybe several hundred / thousand ) , till it finally gets received by the Ant / Agent who was "programmed" to find it in the first place !
So , it is not necessary that an Agent Ant finds only that resume for which it is programmed , as long as it finds any ONE resume anywhere & COMMUNICATES ( passes it on to the next / adjacent Ant )
Like a packet switched network ?
All packets getting assembled by the DESTINATION ANT
Parameters stored getting matched with the arriving resume's parameters
--------------------------------------------------------------------------------------------------------------------------------
Page 143
In GuruGem ( improved HARVESTER ) , we are trying to find web-records containing ,
> Company Name
> Designation ,
but from a single "source" , viz: Google
Our S/W agent "travels" to Google , finds & brings back the results ( food )
But , could we possibly design / devise millions of S/W Agents , each "programmed" to roam the web ( or predetermined URLs ) and find ,
> One of the thousands of "Designations" , OR
> One of the thousands of "Company Names" , OR
> One of the lakhs of "Executive Names" , OR
> Any combination of the above ,
AND , then bring back the results ?
* Since most job sites permit access to the Job Advt Database to any visitor , without need of a Password , it may be much easier for Ant Agents to roam Job sites ( like search engine spiders ? ) and bring home suitable job advts ( what we do , in a limited manner )
-----------------------------------------------------------------------------------------------------------------------------
Page 165
Game = Online Recruitment / Searching for Employers / Searching for Candidates
Characters = Recruiters / Jobseekers
Agents = Job Advts / Resumes
Environment = Virtual Job Fair / Virtual Employment Exchange ( read my notes )
--------------------------------------------------------------------------------------------------------------------------------
Page 207
I believe Cyril used such an algorithm , to read a
plain text resume
The goal was to pick out the "Address" from a plain text resume
The software did manage to accurately find the "Address" ( from anywhere in the resume ) , after about 72 hours of continuous "Exploring / Processing / Learning"
This was nearly 8 years ago !
I am sure , by now , far more powerful Neural Net "Shells" can be freely downloaded , which would parse a plain text resume to find accurately , ALL the fields / values , which we need to create a structured database ( 27 Jan 2007 )
It is simply a question of experimenting .
And now , we have no shortage of hardware
------------------------------------------------------------------------------------------------------------------------------
Page 230
Someday , ( when we have a million resumes on our web site ) , we will have many "sub sets" of candidates who started their careers ( first job ) , at ,
> same AGE , and with
> same EDU Quali
Question is :
How did their careers "evolve" ?
What Salary and Designation did each of them reach / achieve , after 5 years / 10 years / 15 years of experience ?
Was there a "pattern" ?
Did their career paths run "parallel" or did they diverge ?
If the paths did diverge , what other "factors" ( eg : Employer Companies ? ) influenced such divergence ?
Did some plateau out after a time , whereas others continued to climb the Desig / Salary ladder ?
{ see my hand-drawn graph on this page for better grasp }
-------------------------------------------------------------------------------------------------------------------------------
Page 300
* With an ever expanding set of "Rules" , it will be possible for a "Rule based" system to delete faked / forged resumes , where a candidate is telling a lie re: some "facts"
* I have listed several such rules . See my notes on "Expert System" - also my handwritten notes in the margin of the book "Expert Systems"
* In each resume , each and every field value is a "fact"
  Therefore , "Facts" are , Age / Exp / Desig Level / Edu Quali / City / Skill keywords ( found in a resume ) / Salary / Company Name ( Employer )
* One can start with simple rules such as ,
  > "Experience ( years )" cannot exceed "Age"
  > "Age" cannot be less than "16"
  > "Edu level" cannot be less than "10th Std / SSC"
  > "MD ( Managing Director )" cannot be less than "GM ( General Manager )"
  > "Post Graduate" cannot be less than "Graduate"
  > "Salary" cannot be less than "Rs 1,000 per month"
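
A minimal sketch of how a few of the simple rules above might be coded , with each rule as a ( description , test ) pair applied to the "facts" of one resume. The field names , the education ranking table and the function name are my assumptions ; the designation rule ( MD vs GM ) is left out because it needs a ranking of designations that these notes do not spell out.

```python
# Hypothetical ranking of education levels, lowest first.
EDU_RANK = {"Below 10th": 0, "10th Std / SSC": 1, "Graduate": 2, "Post Graduate": 3}

# Each rule: (what is wrong, function returning True when the resume breaks the rule).
RULES = [
    ("experience exceeds age",
     lambda r: r["exp_years"] > r["age"]),
    ("age below 16",
     lambda r: r["age"] < 16),
    ("education below 10th Std / SSC",
     lambda r: EDU_RANK[r["edu"]] < EDU_RANK["10th Std / SSC"]),
    ("salary below Rs 1,000 per month",
     lambda r: r["salary_per_month"] < 1000),
]


def violations(resume):
    """Return the descriptions of all rules that this resume's facts break."""
    return [description for description, is_broken in RULES if is_broken(resume)]


suspect = {"age": 25, "exp_years": 30, "edu": "Graduate", "salary_per_month": 500}
print(violations(suspect))
```

Because the rules live in a plain list , the "ever expanding set" only means appending more pairs ; whether a flagged resume is deleted or merely sent for review is a separate decision.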
--------------------------------------------------------------------------------------------------------------------------------
Page 335
In developing Function Profile graphs , we are simply depending on the "presence" or "absence" of any given keyword ( Binary State ) , then assigning a "Weightage" ( either a given amount or ZERO ) , depending upon presence or absence
But , in the real world , it may turn out that a candidate with a lower "Raw Score" ( because of many "absent" keywords ) , turns out to be a better choice ( higher interview score in IIT ? ) , than a candidate having a higher raw score ( where most keywords were present )
In such a scenario , should we try a "FUZZY LOGIC" algorithm ?
--------------------------------------------------------------------------------------------------------------------------------
Page 352 { Chapter on NATURAL LANGUAGE PROCESSING }
" Dragon Naturally Speaking " s/w package
Read my notes : ARDIS / ARGIS
-------------------------------------------------------------------------------------------------------------------------------
Page 372
Bayesian spam filter ?
Based on a starting database of "unwanted" keywords in the email messages
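
For reference , a bare-bones sketch of how such a filter is commonly built : a naive Bayes classifier with add-one smoothing , trained on messages already labelled as wanted or unwanted. This is the textbook technique as I understand it , not the code from the book ; the class name , the labels and the training messages are made up.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny two-class naive Bayes text filter with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled message ('spam' or 'ham') to the starting database."""
        self.msg_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Estimated probability that the message is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.msg_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(vocab)
            score = math.log((self.msg_counts[label] + 1) / (total_msgs + 2))
            for word in text.lower().split():
                if word in vocab:  # words never seen in training carry no evidence
                    score += math.log((counts[word] + 1) / denom)
            log_score[label] = score
        return 1 / (1 + math.exp(log_score["ham"] - log_score["spam"]))


spam_filter = NaiveBayesFilter()
spam_filter.train("win free prize now", "spam")
spam_filter.train("free money win", "spam")
spam_filter.train("meeting agenda attached", "ham")
spam_filter.train("project meeting tomorrow", "ham")
print(spam_filter.spam_probability("free prize"))
```

The same machinery could label resumes as "relevant / not relevant" for a given function , with skill keywords taking the place of the "unwanted" email keywords.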
-------------------------------------------------------------------------------------------------------------------------------
Page 376
We must experiment with Kurzweil's paraphrasing
software
Could Ray Kurzweil's " Paraphrasing Software
" , be based on this ?
See web site of Kurzweil .
Using this , can we " rewrite " a resume (
like creating a step-brother ) from a given " Sample Resume " ?
If such " paraphrasing / re-writing " of
resumes can be done online ( automatically ) on our web site , then we could
add one more element of FUN - even if the rewritten resume contains some absurd
text !
In fact , such absurdity may lend an element of FUN !
[ Author's Words : The source for parsing text into bigram chains and then building sentences from them , is surprisingly simple . The bigram is implemented as a simple two-dimensional array . Each dimension is represented by unique words that were parsed from the text ]
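
The bigram idea in the author's words can be tried in a few lines. The sketch below is mine , not the book's source : instead of the two-dimensional array the author describes , it uses a dictionary that maps each word to the list of words seen right after it , which amounts to the same table stored sparsely. The function names and the sample text are hypothetical.

```python
import random
from collections import defaultdict


def build_bigrams(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for first, second in zip(words, words[1:]):
        chain[first].append(second)  # repeats are kept, so frequent pairs are picked more often
    return chain


def generate(chain, start, max_words=20, rng=None):
    """Walk the chain from a start word, choosing each next word at random."""
    rng = rng or random.Random()
    out = [start]
    while len(out) < max_words and chain.get(out[-1]):
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)


sample_resume = "managed a team of engineers and managed a portfolio of projects"
chain = build_bigrams(sample_resume)
print(generate(chain, "managed", max_words=10))
```

Every pair of neighbouring words in the output did occur somewhere in the sample resume , which is why the result sounds half-sensible and half-absurd.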
--------------------------------------------------------------------------------------------------------------------------------
Page 385
Read page 376
If we succeed in paraphrasing / rewriting a text resume
online , we could generate / develop a " Subliminal / Subconscious "
function profile graph ! ( - since keywords would have got changed )
This could be fantastic !
--------------------------------------------------------------------------------------------------------------------------
Page 388
Eg : Agents of jobseekers negotiating salary / terms etc with agents of recruiters
Virtual Job Fair will be / can be a "Negotiating Platform" for ,
> Buyer's Agents ( Recruiters )
> Seller's Agents ( Jobseekers )
--------------------------------------------------------------------------------------------------------------------------------
Page 389
Only yesterday we discussed that in , " Post Job
" form , recruiter will add :
" Put into my folder , all future / incoming
resumes , having percentile score of > ( greater than ) XYZ "
Now , he has created an Agent which checks incoming
resumes daily & puts into folder " Resumes of Interest "
Also , an email will go out to the recruiter concerned
( alert )
* Like our GuruGrab ?
[ Author's words : Given mobility , an Agent could be
despatched to the remote database to automatically filter the results and then
return only what was required by the end user ]
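
The recruiter's standing instruction ( "put into my folder all incoming resumes with a percentile score greater than XYZ" ) might look like this as a daily job. It assumes that a resume's percentile is the share of already-scored resumes with a strictly lower Raw Score ; the function names and the sample numbers are hypothetical , and the email alert is left out.

```python
from bisect import bisect_left


def percentile(score, sorted_scores):
    """Percentage of existing scores that are strictly below the given score."""
    return 100.0 * bisect_left(sorted_scores, score) / len(sorted_scores)


def run_agent(incoming, existing_scores, threshold):
    """Return ids of incoming resumes whose percentile exceeds the threshold."""
    ranked = sorted(existing_scores)
    folder = []  # stands in for the "Resumes of Interest" folder
    for resume_id, score in incoming:
        if percentile(score, ranked) > threshold:
            folder.append(resume_id)
    return folder


database_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
todays_resumes = [("r1", 95), ("r2", 40)]
print(run_agent(todays_resumes, database_scores, threshold=80))
```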
--------------------------------------------------------------------------------------------------------------------------------
Page 440
A few weeks back, there was a report of a mechanical
spider-legged robot , which learned on its own to change its " gait "
when one of the legs was broken !
--------------------------------------------------------------------------------------------------------------------------------
Page 441
Maybe we can download this , if it is useful in our narrow domain of Online Recruitment
[ Author's words : Cycorp has recently released OpenCyc
, which is an open-source version of the CYC technology.
OpenCyc includes a knowledge base ( 6,000 concepts with
60,000 assertions ), the CYC inference engine , and a number of language
bindings and APIs to support software development with knowledge base ]
-----------------------------------------------------------------------------------------------------------------------------