Grandmaster All American 10829 Posts user info edit post |
Basically I have a bunch of Belarc Advisor (installable program) printouts and saved HTML files that need to be put into an Excel file. Every time I start, I'm like fuck this redundant data entry bullshit. Does anyone have thoughts on a more efficient way of doing this? The information I'm after is very basic: computer name, processor, OS, RAM, HDD, etc. 1/14/2009 1:33:46 PM |
darkone (\/) (;,,,;) (\/) 11610 Posts user info edit post |
Write a perl script to parse all the data and compile it all in one delimited file that you can then just import into Excel. 1/14/2009 1:39:08 PM |
Grandmaster All American 10829 Posts user info edit post |
yeah even if I could stumble my way through a VB script I'd probably be ok, but alas programming is simply not my forte. I need to find a program out there that already does this. 1/14/2009 2:07:46 PM |
smoothcrim Universal Magnetic! 18966 Posts user info edit post |
windows or *nix? 1/14/2009 3:16:50 PM |
Novicane All American 15416 Posts user info edit post |
a simple java program could work. I think java reads line by line, just have it parse each line based on whatever you need. Lots of examples on their api. 1/14/2009 3:24:05 PM |
synapse play so hard 60939 Posts user info edit post |
http://www.google.com/search?hl=en&rlz=1C1GGLS_enUS291US308&q=hardware+inventory+network&btnG=Search http://www.freedownloadmanager.org/downloads/network_inventory_software_software/
*shrug*
http://www.lansweeper.com/ 1/14/2009 3:45:40 PM |
Grandmaster All American 10829 Posts user info edit post |
I totally must have been googling the wrong phrases.
I barely got a C in intro to java 1/14/2009 3:56:22 PM |
synapse play so hard 60939 Posts user info edit post |
well that lansweeper looks pretty cool
at least they have pretty graphics and it looks like for $150 you get a ton more functionality if you need it... 1/14/2009 4:02:32 PM |
Grandmaster All American 10829 Posts user info edit post |
yeah unfortunately it only looks applicable on a domain. which a lot of the locations are on, but some are not. (I think you need administrator access to grab machine and system info remotely) 1/14/2009 4:06:53 PM |
Tiberius Suspended 7607 Posts user info edit post |
Quote : | "write a perl script blah blah blah ellipses ellipses" |
correct answer... depending on how the HTML files are formatted and laid out, this may be as simple as stripping the HTML tags and replacing \n with , to create a CSV record
@list_files = <*.html>;
foreach $record_name ( @list_files ) {
    open( $record_fh, "<", $record_name ) || die "oh shit";
    while ( <$record_fh> ) {
        s/\n/,/;                    # replace newline with ,
        s/<.*".*">(.*)<\/.*>/$1/;   # remove the open element and close element tags
        print;
    }
    close( $record_fh );
    print "\n";                     # end of record
}
treat the above as pseudo-code; it will almost certainly run, but those two substitutions are untested and might make inaccurate assumptions 1/14/2009 8:00:39 PM |
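If perl isn't your thing, the same tag-strip-and-join idea fits in a short Python script too. This is a rough sketch under the same assumptions (every .html in the current directory is one record, and naive tag stripping is good enough for this data; the function name is made up):

```python
import glob
import re

def html_to_csv_row(path):
    """Crude translation of one report: strip all tags, then join
    the remaining non-empty lines with commas to make one CSV row."""
    with open(path) as fh:
        text = fh.read()
    text = re.sub(r"<[^>]*>", "", text)   # drop every tag wholesale
    fields = [line.strip() for line in text.splitlines() if line.strip()]
    return ",".join(fields)

for name in glob.glob("*.html"):
    print(html_to_csv_row(name))
```

Same caveats as the perl skeleton: if the reports have stray newlines or extra text, the columns will drift, and you'd tighten the regexp to target only the elements that hold data.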
Tiberius Suspended 7607 Posts user info edit post |
you know, go ahead and drop the ".*" part
that would be to assume the tag has at least one property defined and that its value is double quoted, which is a bold assumption 1/14/2009 9:17:24 PM |
synapse play so hard 60939 Posts user info edit post |
BOLD!
I'm guessing that if he isn't cool with vb he's not cool with perl 1/14/2009 10:44:01 PM |
smoothcrim Universal Magnetic! 18966 Posts user info edit post |
Quote : | "windows or *nix?" |
either way, I wrote both. shell/batch scripts are the way to go. you'll need local admin to run the calls on either. you can use WMI + nmap and a wrapper script to do a discovery + scan. WMI can output to html, xml (preferred), and plaintext. 1/15/2009 12:18:03 AM |
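To flesh out the WMI side a bit: `wmic ... /format:list` emits blank-line-separated `Key=Value` blocks, which are trivial to parse. A rough Python sketch (the helper names are made up; `wmic` only exists on Windows, so the `query` call won't run elsewhere, but the parser is plain text handling):

```python
import subprocess

def parse_wmic_list(output):
    """Parse `wmic ... /format:list` output: records are separated by
    blank lines, fields are Key=Value, one per line."""
    records, current = [], {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            if current:               # blank line ends a record
                records.append(current)
                current = {}
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    if current:
        records.append(current)
    return records

def query(wmic_class, props):
    # Windows-only, e.g. query("os", ["Caption", "CSName"])
    out = subprocess.check_output(
        ["wmic", wmic_class, "get", ",".join(props), "/format:list"],
        text=True)
    return parse_wmic_list(out)
```

Wrap that in a loop over hostnames from an nmap sweep and you have the discovery + scan smoothcrim describes, assuming you have admin on the targets.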
Tiberius Suspended 7607 Posts user info edit post |
Quote : | "I'm guessing that if he isn't cool with vb he's not cool with perl" |
The skeleton I gave there is a great start towards doing what he wants: it will load all .html files in a directory and perform a basic translation to .CSV, which can be redirected to a file and loaded in Excel. It's kind of like the tools being proposed here, except it's free, right there, can be easily modified, and doesn't require collecting the data a second time.
That said, if there are superfluous newlines or extra data in some files the columns may be a bit off, which would require tweaking the substitutions to better identify html elements that contain data. That might take 5-10 additional characters in the regexp, and if he wants to paste a sample report from one of the workstations I'm sure someone else or myself would be more than happy to update the snippet.
It would probably be less effort at this point to install perl, run the provided script on the data he already has, and tweak the substitutions a bit if necessary than to completely scrap the data he has, migrate one or more remote locations to AD to use the suggested tools, etc etc. Perl isn't just superglue for Unix, it's primarily a tool for data extraction and report generation. This scenario is pretty much exactly what the language was written to handle. 1/15/2009 8:52:53 AM |
Grandmaster All American 10829 Posts user info edit post |
I definitely appreciate all of the input. I didn't have time to submit the html of my PC this morning. I wanted to include the entire thing, so if I missed anything that would be considered a breach of security, someone here let me know before it's taken advantage of. The HTML generated by Belarc is (ComputerName).html
http://www.elusivity.com/(Servosity).html
Quote : | "It would probably be less effort at this point to install perl, run the provided script on the data he already has, and tweak the substitutions a bit if necessary than to completely scrap the data he has, migrate one or more remote locations to AD to use the suggested tools, etc etc. Perl isn't just superglue for Unix, it's primarily a tool for data extraction and report generation. This scenario is pretty much exactly what the language was written to handle." |
well fuckin' said.
I probably have 30 reports right now with another 50 to come. My boss wants me to manually enter them, but even if it takes me more time in the long run, it's exponentially less "work", and I always try to figure out this type of solution. After reading all of this, I definitely should reconsider learning at least some basic (lolzpun) language skills.
[Edited on January 15, 2009 at 12:10 PM. Reason : .] 1/15/2009 12:05:06 PM |
Tiberius Suspended 7607 Posts user info edit post |
It looks like each category is surrounded by a <FONT> tag, headings are in either <B> or <I> tags that can be filtered pretty easily, and most of the categories separate items with <BR> tags, but there's a variable number of items per category. In this case there are 5 volumes, while I imagine most systems have 1-2. I suppose the most straightforward approach would be to read all of the records to find the largest # of items in each category (e.g. the computer with the most volumes), then re-read all of the records to generate the CSV, padding records that had fewer than the max items for that category with blank columns. That is to say, a system with only 2 volumes might have a col for vol1, a col for vol2, and then 3 blank columns appended so that the next category lined up on the same column as your system with 5 volumes... but I guess it'd be prudent to ask how much of the data you need and how the spreadsheet should be laid out
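That two-pass pad-to-the-widest idea, sketched in Python for illustration. It assumes the per-category item lists have already been pulled out of each report; names like `records` and `categories` are hypothetical:

```python
def pad_records(records, categories):
    """Two passes over parsed reports: first find the max item count per
    category, then emit rows padded with blanks so every row has the
    same number of columns and categories line up."""
    # pass 1: widest list seen for each category across all records
    widths = {c: max(len(r.get(c, [])) for r in records) for c in categories}
    # pass 2: emit rows, padding short categories with empty columns
    rows = []
    for r in records:
        row = []
        for c in categories:
            items = r.get(c, [])
            row.extend(items + [""] * (widths[c] - len(items)))
        rows.append(row)
    return rows
```

So a machine with 2 volumes next to one with 5 gets 3 blank columns appended before the next category starts.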
[Edited on January 15, 2009 at 10:32 PM. Reason : *] 1/15/2009 10:31:03 PM |
Grandmaster All American 10829 Posts user info edit post |
O/S
Office Version (not that important)
System Model
Processor
RAM
Drive Space
that's pretty much it, I guess. The keys and such would be nice, but not if it's going to create problems.
Spreadsheet can be laid out in any way that appears organized or that is conducive to manipulation. 1/15/2009 10:55:33 PM |
Grandmaster All American 10829 Posts user info edit post |
Anyone want to get this script working, or walk me through it, before I concede defeat and start countless hours of data entry? Tiberius, your help has been more than I would have anticipated from a thread like this, but I don't know enough about the language to debug the errors from that first go-around, and I don't have the adderall to sit down for hours of trial and error.
I can take the legwork out of making whatever DVDR's you guys have queued next or throw you some other form of FXP'd pirated goodz. 1/22/2009 9:38:25 AM |
WolfAce All American 6458 Posts user info edit post |
Quote : | "I'm guessing that if he isn't cool with vb he's not cool with perl" |
Yeah, even if you're real rough with programming, hacking your way through some regex scripting in perl to learn it will be well worth the effort in the long run. 1/22/2009 9:55:18 AM |
Tiberius Suspended 7607 Posts user info edit post |
sorry I got busy and haven't been able to follow up with this
if nobody else has chimed in I'll post something working tomorrow 1/22/2009 10:17:35 AM |
Tiberius Suspended 7607 Posts user info edit post |
#!/usr/bin/perl
@list_files = <*.html>;
foreach $record_name ( @list_files ) {
    open( $record_fh, "<", $record_name ) || die "oh shit";
    print "$record_name\n";
    @lines = <$record_fh>;
    close( $record_fh );
    $record = join( "\n", @lines );
    $record =~ m/\<TD.*\>Operating System.*?<\/TR>.*?<TR.*?FONT SIZE.*?>(.*?)<\/FONT><\/TD>.*?<TD.*?FONT SIZE.*?>(.*?)<\/FONT>.*?<TR.*?Processor.*?\/TR>.*?<TR.*?FONT SIZE.*?>(.*?)<\/FONT>/ms;
    $os = $1; $model = $2; $cpu = $3;
    $record =~ m/<TR.*Drives.*<\/TR>.*?([0-9]+\.[0-9]+) Gigabytes Usable.*?<.*?FONT SIZE.*?>([0-9]+) Megabytes Installed/ms;
    $disk = $1; $mem = $2;
    print "$os,$model,$cpu,$disk,$mem\n";
}
The regular expressions should probably be broken up more for efficiency's sake, but I am too lazy. I also made several assumptions about the format of the file that may not hold true, but it parses the example you gave:
tiberius@terminus ~/projects/belarc-parser/data $ ../parse.pl
(Servosity).html
Windows Vista Ultimate (x64) (build 7000),Gigabyte Technology Co., Ltd. 965P-DS3 Enclosure Type: Desktop,3.00 gigahertz Intel Core 2 Duo 64 kilobyte primary memory cache 2048 kilobyte secondary memory cache,737.15,4094
tiberius@terminus ~/projects/belarc-parser/data $ ls
(Servosity).html
[Edited on January 23, 2009 at 4:58 PM. Reason : .] 1/23/2009 4:56:46 PM |
Grandmaster All American 10829 Posts user info edit post |
Will ActivePerl work for this? I haven't had shell access to a unix box since I ditched IRC.
C:\Users\Esoteric\Desktop>belarc.pl (Chmkn004).html ,,,, 1/23/2009 5:30:01 PM |
Tiberius Suspended 7607 Posts user info edit post |
hm... I dunno... in theory perl is perl, but PM me a link to the perl distro you're using and that file, I'll do a quick sanity check over here on my Win2k3 VM 1/23/2009 11:45:27 PM |
Grandmaster All American 10829 Posts user info edit post |
I might have found someone's AutoIt solution, but for some reason it's throwing errors.
http://www.autoitscript.com/forum/index.php?s=&showtopic=20472&view=findpost&p=288605 1/27/2009 11:22:44 AM |
Grandmaster All American 10829 Posts user info edit post |
I have zipped and uploaded two html files (one that works and one that does not), the parsing script, the Belarc installer, and the AutoIt install file. /PARSE/ is hardcoded as the directory it currently looks in for the html files.
I took out the installed software table and messed up some of the formatting on the html file that works, so don't take that into consideration if anyone decides to take a look at this.
Regardless, I appreciate the help.
http://www.elusivity.com/PARSE.zip
Got in contact with the author, I believe he's going to help.
[Edited on January 27, 2009 at 1:58 PM. Reason : .] 1/27/2009 1:38:31 PM |