tnezami All American 8972 Posts user info edit post |
I have a list of about 30 words that I need to count the frequency of in about 50 different documents.
I'm sure I'll have to go one document at a time, but is there a faster way than CTRL+F EACH WORD one at a time?
Can I put the entire list of these 30 keywords in and have WORD or some free downloadable program find them in the document? 4/1/2007 3:28:07 PM |
drunknloaded Suspended 147487 Posts user info edit post |
i dont know if this will help but one time i downloaded google desktop
like a few days later when i searched something on google i noticed it had like 3 of my documents listed with the keywords from my google search...so like couldnt you get google desktop, search for the terms and it would bring up which documents have said words?
[Edited on April 1, 2007 at 3:32 PM. Reason : \/ prolly not the best one though ] 4/1/2007 3:30:36 PM |
tnezami All American 8972 Posts user info edit post |
hmm...that's an option... 4/1/2007 3:31:26 PM |
benz240 All American 4476 Posts user info edit post |
copernic desktop search has a much more powerful client 4/1/2007 3:33:38 PM |
tnezami All American 8972 Posts user info edit post |
is it relatively easy to use/set up for this type of thing? 4/1/2007 3:34:19 PM |
benz240 All American 4476 Posts user info edit post |
it will work, just might get pretty tedious if you are talking about hundreds of instances of the word. Basically the copernic client (i just tried this out) will allow you to search for instances of a word (or words) within certain types of files within one folder, for instance. Then when you click on one of the results, it will display the contents in the preview pane below, with a button for each search term...click on that button and it will cycle through all the instances in that particular doc. It doesn't have a function (that I'm aware of) to summarize the findings...so basically this will be a little easier than Ctrl+F. 4/1/2007 3:44:32 PM |
tnezami All American 8972 Posts user info edit post |
sweeet...i just downloaded it...now I just need to figure out how to get it to search within a specific folder
[Edited on April 1, 2007 at 3:49 PM. Reason : got it.] 4/1/2007 3:45:25 PM |
CSC4EVER Starting Lineup 63 Posts user info edit post |
Why not just write a perl script to parse the documents and build a hash for each one, incrementing a counter each time one of the words occurs? 4/1/2007 4:22:32 PM |
Perlith All American 7620 Posts user info edit post |
^ Overengineering the problem. And, perl isn't something that could be passed on easily from Person A to Person B. (Not saying not a good solution, but may not be the right solution).
If this is something you'll be doing frequently, I would encourage you to find a solution that indexes and updates what you are looking for automatically. Not sure what's out there, but think in the long-term if you are going to do this more than once. 4/1/2007 5:15:59 PM |
CSC4EVER Starting Lineup 63 Posts user info edit post |
A program that does something similar could be whipped up fairly quickly in java, with a swing gui slapped on it, and it would be just as portable. If he wants results fairly quickly, a perl script would probably take no more than 10-15 lines, maybe even less. I don't necessarily consider that overengineering the problem. 4/1/2007 5:19:06 PM |
CSC4EVER Starting Lineup 63 Posts user info edit post |
Hmm, here is some code I whipped up real quick in perl. It sorta does what I think you would be looking for. With some modification you should be able to get your results fairly quickly...
@files = <*.txt>;
foreach $file (@files) {
    open myfile, $file;
    %words = ();    # reset the counts so each document is tallied separately
    print $file."\n--------------------------\n";
    while ($line = <myfile>) {
        chomp $line;
        @words = split / /, $line;
        for ($i = 0; $i < scalar(@words); $i++) {
            $words{$words[$i]}++;
        }
    }
    close myfile;
    foreach $word (sort keys %words) {
        print $word." ".$words{$word}."\n";
    }
    print "\n";
} 4/1/2007 5:31:48 PM |
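The snippet above tallies every word in each file, but the original question only needs counts for a fixed list of 30 keywords per document. A minimal sketch of that narrower version follows; the @keywords list is a placeholder (swap in the real 30 words), and the whole-word, case-insensitive matching is an assumption about what "frequency" should mean here, not something stated in the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder keyword list -- replace with the actual 30 words.
my @keywords = qw(apple banana cherry);

# Count how often each keyword appears in a string of text.
# Matching is case-insensitive and on whole words only (an assumption).
sub count_keywords {
    my ($text, @words) = @_;
    my %counts = map { $_ => 0 } @words;
    for my $w (@words) {
        # \b anchors keep "cat" from matching inside "catalog";
        # \Q...\E escapes any regex metacharacters in the keyword
        my @hits = $text =~ /\b\Q$w\E\b/gi;
        $counts{$w} = scalar @hits;
    }
    return \%counts;
}

# Walk every .txt file in the current directory and print a
# per-document frequency table for just the keyword list.
for my $file (glob '*.txt') {
    open my $fh, '<', $file or die "can't open $file: $!";
    local $/;    # slurp mode: read the whole file in one go
    my $counts = count_keywords(scalar <$fh>, @keywords);
    close $fh;

    print "$file\n--------------------------\n";
    print "$_ $counts->{$_}\n" for sort keys %$counts;
    print "\n";
}
```

Dropping the output into a tab-separated format instead of the dashed header would make it easy to paste the 50 documents' counts straight into a spreadsheet.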
WolfAce All American 6458 Posts user info edit post |
Haha, stupid tdub putting a smiley face in teh perl
Quote : | "while ($line = <myfile>" |
And yeah the first thing I thought of when I read the first post was perl script, but really you could do one nearly as easily in any language.
[Edited on April 1, 2007 at 7:45 PM. Reason : ]4/1/2007 7:43:19 PM |
FenderFreek All American 2805 Posts user info edit post |
Seems we all think alike. I had something like ^^ in mind as well, so I'd go that route. It's quick, easy, and cross-platform. (Though I'm no Perl god, so mine wouldn't have been as neat and concise.)
Can't go wrong with some Perl, B. 4/1/2007 8:25:37 PM |