Note dt : 09 June 1998
My first note on ARDIS (Artificial Resume Deciphering Intelligent Software) was written on 01 Dec 1996.
Some 18 months later, I sent the following note to Yogesh / Cyril, who had translated my notes / U-Is / logic into www.3pJobs.com ... and launched it on 14 Nov 1997 - some 10 months before GOOGLE got launched officially !
--------------------------------------------------------------------------------------
Uploaded : 04 Nov 2016
-----------------------------------------------------------------------------------------
Yogesh / Cyril ,
ARDIS
While discussing the "Data Capture & Query" Module (Module # 1) a few days back, we also talked about the "Knowledge Base" already available with us. This knowledge base has been acquired / created over the last 8 years.
This knowledge base comprises English-language:
* Words
* Phrases
* Sentences
* Paragraphs
As far as "Words" are concerned, I myself worked on categorizing them into different "Categories".
This was nearly 12 months ago, using the software tool "TELL ME", developed by Cyril.
In this connection , I enclose Annex A / B / C / D
Under "TELL ME", I have already categorized over 15,000 words into some 60 different "Categories".
Some of these are shown in Annex C.
In addition, Cyril had developed another simple method, under which I could quickly categorize:
* P = Person's Name ("Name" of a person)
* C = Company Name
* Q = Edu Qualification of an individual
* L = Name of a Location (mostly, a CITY)
As far as these 4 categories (out of the 60-odd categories of words) are concerned, I have already covered:
----------------------------------------------------------
FREQUENCY ................ No of Words Covered
----------------------------------------------------------
> 100 ............................ 7,056
51 - 100 ......................... 3,913
26 - 50 .......................... 5,880
11 - 25 ......................... 13,246
----------------------------------------------------------
TOTAL ........................... 30,246
----------------------------------------------------------
(See Annex A). These are ISYS-indexed words.
So, under both the tools combined, I might have already categorized over 30,000 words.
Over the last 5 / 6 weeks, we have already scanned / OCRed and created .txt files of some 13,573 pages of bio data.
And this population is growing at the rate of some 300 pages per day.
We talked about a simple piece of software which will pick out ALL the words (except for "common" words) in each of these pages, then compare each such word with the "Knowledge Base" of 30,000 words which I have already categorized.
If a "match" is found, the word is transferred to the respective "category" and marked "KNOWN".
If there is "no match", the word gets tagged as "NEW" and gets highlighted in the .txt file.
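
The match-making step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the tiny knowledge base, the list of "common" words, the function names and the [[NEW:...]] highlight marker are all hypothetical, and the real knowledge base would hold the 30,000-odd categorized words.

```python
import re

# Hypothetical sample entries; category codes follow the P / C / Q / L scheme.
KNOWLEDGE_BASE = {"siemens": "C", "bombay": "L"}
# Hypothetical list of "common" words to be skipped.
COMMON_WORDS = {"a", "an", "the", "in", "at", "of", "and"}


def tag_words(page_text, knowledge_base, common_words):
    """Tag every non-common word on a page as KNOWN (with category) or NEW."""
    known = {}  # word -> category, for words found in the knowledge base
    new = []    # words with no match, in order of first appearance
    for word in re.findall(r"[a-z]+", page_text.lower()):
        if word in common_words:
            continue
        if word in knowledge_base:
            known[word] = knowledge_base[word]
        elif word not in new:
            new.append(word)
    return known, new


def highlight_new(page_text, new_words):
    """Wrap each NEW word in a marker so that it stands out in the .txt file."""
    def mark(match):
        word = match.group(0)
        return f"[[NEW:{word}]]" if word.lower() in new_words else word
    return re.sub(r"[A-Za-z]+", mark, page_text)


page = "Ramesh worked at Siemens in Bombay"
known, new = tag_words(page, KNOWLEDGE_BASE, COMMON_WORDS)
print(known)
print(highlight_new(page, new))
```

Matching is done on lower-cased words here, so "Bombay" and "bombay" count as the same entry; whether case should matter is left open.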
Now, anytime a consultant is viewing that page on the screen and comes across a "NEW"-marked word whose "Meaning / Category" he knows, he will have a simple "Tool" (on that very screen) with which he can go ahead and "categorize" that word. This TOOL could perhaps be "TELL ME".
We should debate whether we also give the "rights" to any consultant to "ADD" a new CATEGORY itself.
It should be possible for any number of consultants to work on this TOOL simultaneously, from their own individual work-stations, whenever time permits or whenever they are "Viewing" a .txt page for any reason.
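
The consultant's categorizing step could look something like the Python sketch below. It is not how "TELL ME" works: the function name and the may_add_category flag are hypothetical, and the flag only models the open question of whether a consultant may add a new category. Simultaneous use from several work-stations would need a shared store, which is not shown.

```python
def categorize_word(knowledge_base, categories, word, category,
                    may_add_category=False):
    """File a NEW word under a category chosen by the consultant.

    may_add_category is a hypothetical switch: when False, only
    existing categories may be used; when True, the consultant is
    allowed to create the category on the spot.
    """
    if category not in categories:
        if not may_add_category:
            raise PermissionError(f"category {category!r} does not exist")
        categories.add(category)
    knowledge_base[word.lower()] = category
    return knowledge_base


categories = {"P", "C", "Q", "L"}   # the 4 quick categories from the note
knowledge_base = {}

categorize_word(knowledge_base, categories, "Pune", "L")
categorize_word(knowledge_base, categories, "lathe", "Machine",
                may_add_category=True)
print(knowledge_base)
```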
This arrangement would "multiply" the effort several times as compared to my doing it "single-handedly" !
PLUS, it has the advantage of using the knowledge of several persons having different academic / experience backgrounds.
We could also consider hiring "Experts" from different "Functional Areas" to carry out this categorization in a dedicated manner.
Now that we have 13,573 pages ready (for this simple "match making" process), we could seriously consider hiring such "experts".
We could even take "Text Books" on various SUBJECTS / CATEGORIES, prepare an INVENTORY of all words appearing in each book and put them in the SUBJECT category.
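
A rough Python sketch of such a text-book inventory follows. The sample text, the subject name and the function names are made up for illustration, and the rule that an already-categorized word keeps its existing category is an assumption, not something decided in this note.

```python
import re
from collections import Counter


def build_inventory(book_text, common_words):
    """Count every non-common word appearing in a book's text."""
    words = re.findall(r"[a-z]+", book_text.lower())
    return Counter(w for w in words if w not in common_words)


def add_to_subject(knowledge_base, inventory, subject):
    """File inventory words under the subject category.

    Assumption: a word that is already categorized keeps its
    existing category and is skipped.
    """
    added = []
    for word in inventory:
        if word not in knowledge_base:
            knowledge_base[word] = subject
            added.append(word)
    return added


text = "The ledger records the debit and the credit. Debit the ledger."
inventory = build_inventory(text, {"the", "and"})
knowledge_base = {"records": "General"}
print(add_to_subject(knowledge_base, inventory, "Accounting"))
```

The counts kept in the inventory could later be used to decide which words deserve attention first, much like the frequency bands used for the ISYS-indexed words.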
Many innovations are possible - if only we could make a
beginning
Such a beginning is possible now
Let us give this a serious thought and discuss soon
regards,
hcp