Arab13 Art Vandelay 45180 Posts |
Ok, here at work I have a bunch (read: hundreds to thousands) of simple .txt files that have several columns.
The 5th column has numbers under a consistent label I will call a PTID.
I need a way to grep all the data for each PTID # in a line and compile it together with the data associated with the same PTID # across multiple files.
so it would go from many files that look like this
PROTOCOL #, LABID (# and text), ASSAYID #, SPECID #, PTID #, VISITNO #, VISITDAY #, DRAWDT #, ASSAYRUN (text), ANTIGEN (text), DILUTION #, READING #, READRANG #, NOANT (#), NOANTRAN #, TITER #, TESTDT #, RELIABLE (text), REPLACE, MODDT (text), COMMENTS (text)
each being a column
to this
PTID# - all of the data associated with that PTID#, in some order, gathered from every file
8/16/2006 9:40:01 AM |
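What the thread is asking for can be sketched in Python (a Perl script would do the same job; the glob pattern, the tab delimiter, and the exact `PTID` column label are assumptions here, not from the original files):

```python
import csv
import glob
from collections import defaultdict

def aggregate_by_ptid(pattern, ptid_col="PTID"):
    """Collect every row from every file matching `pattern`,
    grouped by the value in the PTID column."""
    rows_by_ptid = defaultdict(list)
    for path in glob.glob(pattern):
        with open(path, newline="") as f:
            # assumes tab-delimited files with a header row
            reader = csv.DictReader(f, delimiter="\t")
            for row in reader:
                ptid = row.get(ptid_col)
                if ptid:
                    rows_by_ptid[ptid].append(row)
    return rows_by_ptid
```

The returned dict maps each PTID to all of its rows across every file, which could then be written out in whatever order the compilation needs.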
MiniMe_877 All American 4414 Posts |
perl + 5min work = $profit 8/16/2006 9:52:18 AM |
agentlion All American 13936 Posts |
excel - macro to import files, then some formulas or another macro or pivot table to aggregate all the data 8/16/2006 9:57:31 AM |
Arab13 Art Vandelay 45180 Posts |
yeah i was thinking perl or excel.... 8/16/2006 10:00:09 AM |
agentlion All American 13936 Posts |
to see how to do the import, open Excel and start recording (Tools > Macro > Record New Macro), then open the .txt file and import it using tab-delimited or whatever. Then look at the macro that was generated; it will probably be just a single function call for opening a file. You'll have to surround that in a loop that passes in the names of the files. Then you'll have to add in the actions for the columns you want (do a lot of macro recording, performing actions, then reading the generated code). 8/16/2006 10:05:57 AM |
Perlith All American 7620 Posts |
Assuming your PTID is your primary key, are the columns different in each text file? If not, how are you going to differentiate Protocol # in Text File A from Protocol # in Text File B? 8/16/2006 11:09:10 AM |
joe17669 All American 22728 Posts |
Quote: "perl + 5min work = $profit"
is perl really that powerful/quick/efficient? (assuming someone knows what they're doing?)
I think I need to learn this 8/16/2006 11:11:21 AM |
agentlion All American 13936 Posts |
yes, Perl can grep/search/regex/whatever 1000 text files in no time. 8/16/2006 11:18:47 AM |
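As an illustration of that kind of bulk scan, here is a Python sketch (again, Perl would be the natural fit for the same loop; the file pattern and the regex are made up for the example):

```python
import glob
import re

def grep_files(pattern, regex):
    """Yield (path, line) for every line matching `regex`
    across all files matching the glob `pattern`."""
    compiled = re.compile(regex)
    for path in glob.glob(pattern):
        with open(path) as f:
            for line in f:
                if compiled.search(line):
                    yield path, line.rstrip("\n")
```

Because the files are only read line by line, scanning hundreds or thousands of small text files like this finishes quickly on any modern machine.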
Arab13 Art Vandelay 45180 Posts |
indeed. no, all the columns should be the same for all the files.... but not all columns will have entries....
pretty sure.... there are several dozen different 'formats' that i need to look through to verify that....
and yes, Perl, if scripted correctly, can do that.... 8/16/2006 2:34:10 PM |
qntmfred retired 40726 Posts |
perl for sure 8/16/2006 3:21:08 PM |
Arab13 Art Vandelay 45180 Posts |
rather, there will be 4 'compilations' of the data, each having a different form; each of the forms pertains to specific file sets, so all the data for each will be uniform in organization 8/17/2006 1:03:41 PM |
Arab13 Art Vandelay 45180 Posts |
fuck, i had an Excel IF formula set up where it would remove duplicate PTIDs like this
12040190 12040190 12040190 12040196 etc
and turn it into
12040190
12040196
but i fucked it up somehow and lost the IF statement.... and i had it working too.... 8/31/2006 9:29:51 AM |
Arab13 Art Vandelay 45180 Posts |
n/m fixed it 8/31/2006 9:42:16 AM |
agentlion All American 13936 Posts |
assuming you have your list starting in A1, the formula starting in B2 and filled down is =IF(A2=A1,"",A2). That will replace the duplicates with blanks. 8/31/2006 9:45:33 AM |
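The same adjacent-duplicate removal that formula performs can be sketched in Python (this assumes, as the formula does, that the PTID list is already sorted so duplicates sit next to each other):

```python
def drop_adjacent_duplicates(ptids):
    """Keep a PTID only when it differs from the one before it,
    mirroring =IF(A2=A1,"",A2) filled down a sorted column
    (except that duplicates are dropped rather than blanked)."""
    out = []
    for i, p in enumerate(ptids):
        if i == 0 or p != ptids[i - 1]:
            out.append(p)
    return out
```

For example, `drop_adjacent_duplicates(["12040190", "12040190", "12040190", "12040196"])` returns `["12040190", "12040196"]`.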