Message Boards » Tech Talk » I got featured on Hackaday.com...
timbo
All American
1003 Posts

http://hackaday.com/2012/06/14/penny-auction-hacking-put-on-your-statisticians-hat/

If you look at my code, please be nice.

6/19/2012 10:39:31 AM

BigMan157
no u
103352 Posts

That's pretty neat. I know a guy trying to start his own version of one of those sites.

Why'd you use Selenium?

6/19/2012 12:11:20 PM

timbo
All American
1003 Posts

I tried using BeautifulSoup and urllib, but when they parsed the page, the containers with bidding information were empty (the values are filled in afterwards by an AJAX script, which a plain HTTP fetch never runs). The only scraping module that would actually recover the values I wanted was Selenium.
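The empty-container problem can be reproduced without any network access. This is only a minimal sketch: the markup, the `current-price` and `last-bidder` ids, and the `SpanText` class are made up for illustration and are not taken from the real auction site. It uses the standard library's `html.parser` in place of BeautifulSoup, but the effect is the same for any parser that does not execute scripts.

```python
from html.parser import HTMLParser

# Hypothetical page source as the server sends it: the bid containers are
# empty placeholders that a script would fill in later via AJAX.
STATIC_HTML = """
<div class="auction">
  <span id="current-price"></span>
  <span id="last-bidder"></span>
  <script>/* AJAX call populates the spans after load */</script>
</div>
"""


class SpanText(HTMLParser):
    """Collect the text found inside every <span> that carries an id."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None  # id of the span we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            span_id = dict(attrs).get("id")
            if span_id:
                self._current = span_id
                self.values[span_id] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.values[self._current] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


parser = SpanText()
parser.feed(STATIC_HTML)
static_values = parser.values
print(static_values)
```

Both spans come back as empty strings because nothing ran the script. A real browser driven by Selenium executes it, which is why the values show up there.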

The script basically works by opening an auction in a window, recovering the bidding info, then refreshing every 10 seconds. After the auction ends, the data is organized and dumped into neat files, plus a final summary file that contains all the auction goodies you would want to know (number of bidders, number of bids per user, auction length, etc).
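The poll-then-summarize flow described above could look roughly like this. It is a sketch, not the actual script: `fetch_snapshot` is a hypothetical callable standing in for the Selenium step, and the bid format (a `(timestamp_seconds, username)` tuple) is an assumption made so the example runs on canned data.

```python
import time
from collections import Counter


def poll_auction(fetch_snapshot, interval_s=10, sleep=time.sleep):
    """Call fetch_snapshot() every interval_s seconds until the auction ends.

    fetch_snapshot is a hypothetical stand-in for the browser step. It must
    return (new_bids, ended), where new_bids is a list of
    (timestamp_seconds, username) tuples.
    """
    bids = []
    while True:
        new_bids, ended = fetch_snapshot()
        bids.extend(new_bids)
        if ended:
            return bids
        sleep(interval_s)


def summarize(bids):
    """Reduce raw bids to bidder count, bids per user and auction length."""
    per_user = Counter(user for _, user in bids)
    times = [t for t, _ in bids]
    return {
        "bidders": len(per_user),
        "bids_per_user": dict(per_user),
        "length_s": max(times) - min(times) if times else 0,
    }


# Canned snapshots instead of a live page; sleep is stubbed out so the
# example finishes instantly.
snapshots = iter([
    ([(0, "alice"), (4, "bob")], False),
    ([(12, "alice")], False),
    ([(25, "carol"), (27, "alice")], True),
])
bids = poll_auction(lambda: next(snapshots), sleep=lambda s: None)
summary = summarize(bids)
print(summary)
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without a browser or a 10 second wait.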

6/19/2012 12:20:43 PM

qntmfred
retired
40432 Posts

kick-ass dude, nice job.

you probably mentioned it in one of your blog posts, but why aren't you just scraping directly with http requests? why bother with selenium? also, you could much more easily analyze the data if you use a real database and not just csv/excel
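To illustrate the database suggestion, here is a small sketch using SQLite from the Python standard library. The `bids` table, its columns, and the sample rows are all hypothetical; the point is only that per-auction statistics become a single query once the data is in a table.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bids (auction_id INTEGER, bidder TEXT, bid_time REAL)"
)

# Hypothetical scraped rows: (auction id, bidder name, seconds since start).
conn.executemany(
    "INSERT INTO bids VALUES (?, ?, ?)",
    [
        (1, "alice", 0.0),
        (1, "bob", 4.0),
        (1, "alice", 12.0),
        (2, "carol", 3.0),
    ],
)

# Distinct bidders and total bids per auction in one statement.
rows = conn.execute(
    """
    SELECT auction_id, COUNT(DISTINCT bidder), COUNT(*)
    FROM bids
    GROUP BY auction_id
    ORDER BY auction_id
    """
).fetchall()
print(rows)
conn.close()
```

The same question against a pile of CSV files would mean loading and joining every file by hand first.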



[Edited on June 19, 2012 at 12:30 PM. Reason : nm just read ^ still, i can't imagine you couldn't scrape it if you know the right ajax]

6/19/2012 12:23:52 PM

timbo
All American
1003 Posts

You're right about scraping the AJAX requests directly. There was a way to do it, but it required individual cookies that the server generated for each auction (http://pennystats.blogspot.com/2012/04/very-interesting-find.html). The data that came back was messy, and I honestly didn't know how to generate valid auction cookies so I could scrape the requests directly.
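For reference, the direct approach would have roughly this shape once a valid cookie is in hand. Everything specific here is an assumption: the `example.com` URL, the `auction_session` cookie name, and the JSON layout are invented, and the sketch does not address the hard part, which is obtaining a cookie the server accepts. The request is only built, never sent.

```python
import json
import urllib.request


def build_ajax_request(url, cookie_name, cookie_value):
    """Build (but do not send) a request carrying a server-issued cookie."""
    return urllib.request.Request(
        url,
        headers={
            "Cookie": f"{cookie_name}={cookie_value}",
            "Accept": "application/json",
        },
    )


def parse_payload(raw):
    """Turn a hypothetical JSON response into (time, user) tuples."""
    data = json.loads(raw)
    return [(b["t"], b["user"]) for b in data["bids"]]


# Placeholder endpoint and cookie; neither comes from the real site.
req = build_ajax_request(
    "https://example.com/auction/123/bids", "auction_session", "abc123"
)

# Canned response body in the assumed format.
sample = '{"bids": [{"t": 0, "user": "alice"}, {"t": 4, "user": "bob"}]}'
parsed = parse_payload(sample)
print(req.get_header("Cookie"), parsed)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, which only helps if the cookie is one the server actually issued.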

Selenium offered a turn-key solution that just worked, so I just decided to go with it.

[Edited on June 19, 2012 at 12:42 PM. Reason : .]

6/19/2012 12:39:28 PM

BigMan157
no u
103352 Posts

i too would have approached it with php/curl and DBed the data, but hey, if you got a solution that works for you why not

6/19/2012 12:44:18 PM

timbo
All American
1003 Posts

I am pretty sure the majority of people who scrape this data use PHP and dump it into a database (http://www.allpennyauctions.com/).

Another benefit of doing it that way would be that I could use a significantly less powerful server to scrape data. Right now I have a dual-core Xeon server (3.3 GHz) with 8 GB of RAM chugging away, and it can only scrape about 2000-2500 auctions per day. I think if I upped the RAM to 16 GB I could probably grab them all at once.
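A back-of-the-envelope way to reason about those numbers is sketched below. Two things are assumptions and not facts from the post: the 3 hour average auction length is a placeholder, and the idea that throughput is memory-bound and scales linearly with RAM is only a guess at why more RAM would help. Both function names are hypothetical.

```python
def concurrent_windows(auctions_per_day, avg_auction_hours):
    """Little's law: items in flight = arrival rate * time in system."""
    return auctions_per_day * avg_auction_hours / 24


def scaled_throughput(auctions_per_day, ram_gb_now, ram_gb_new):
    """Assumes throughput is memory-bound and scales linearly with RAM."""
    return auctions_per_day * ram_gb_new / ram_gb_now


# 2000-2500 auctions/day comes from the post; 3 hours is a made-up average.
low = concurrent_windows(2000, 3)
high = concurrent_windows(2500, 3)

# Doubling RAM from 8 GB to 16 GB under the linear-scaling assumption.
doubled_low = scaled_throughput(2000, 8, 16)
doubled_high = scaled_throughput(2500, 8, 16)
print(low, high, doubled_low, doubled_high)
```

Whether 16 GB covers every auction depends on how many auctions the site runs per day, which the post does not say.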

6/19/2012 12:50:07 PM

mildew
Drunk yet Orderly
14177 Posts

http://pennystats.blogspot.com/2012/04/first-post-in-what-could-be-quite.html


That pop up next to the scroll bar is annoying as shit

6/19/2012 12:54:29 PM

timbo
All American
1003 Posts

I usually use a scrolly mouse so I never noticed. I can see how that would be annoying.

The worst part is that Blogger doesn't allow you to modify their "Dynamic" theme, so there's nothing I can do about it.

6/19/2012 12:57:55 PM

synapse
play so hard
60916 Posts

Very nice work.

6/19/2012 3:25:39 PM

xienze
All American
7341 Posts

You'd have to venture over to Java, but HtmlUnit would give you a way to run the page's JavaScript.

6/19/2012 6:44:43 PM

Hiro
All American
4673 Posts

This thread is epic. Great work timbo

6/19/2012 7:17:51 PM

Moox
All American
612 Posts

How can I use this to make money?

6/20/2012 12:41:39 AM

timbo
All American
1003 Posts

You need to break down the data and look at stuff you want to target. Then look for the best time to try and win.

The charts of the day are useful for doing this. This one in particular.
http://pennystats.blogspot.com/2012/06/pennystats-chart-of-day-61112.html
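One way to do that kind of breakdown is to bucket finished auctions by weekday and hour and compare average final prices. This is a sketch on invented data: the four sample auctions and the `mean_price_by_slot` helper are hypothetical, and it is not necessarily how the linked chart was produced.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_price_by_slot(auctions):
    """Average final price per (weekday, hour); Monday is weekday 0."""
    slots = defaultdict(list)
    for end_time, price in auctions:
        slots[(end_time.weekday(), end_time.hour)].append(price)
    return {slot: mean(prices) for slot, prices in slots.items()}


# Invented sample: (auction end time, final price in dollars).
auctions = [
    (datetime(2012, 6, 16, 3, 10), 1.25),   # Saturday, 3 AM hour
    (datetime(2012, 6, 16, 3, 40), 1.75),
    (datetime(2012, 6, 13, 20, 5), 6.50),   # Wednesday, 8 PM hour
    (datetime(2012, 6, 13, 20, 30), 7.50),
]

by_slot = mean_price_by_slot(auctions)
cheapest = min(by_slot, key=by_slot.get)  # slot with the lowest mean price
print(by_slot, cheapest)
```

With real data you would also want enough auctions per slot before trusting the averages.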

6/20/2012 9:18:32 AM

Moox
All American
612 Posts

So basically I should log in in the middle of the night on weekends, buy $50 gift cards, and sell them to Plastic Jungle?

That simple?

6/21/2012 4:00:08 AM

timbo
All American
1003 Posts

That was my theory. But 5000+ people have read my blog since then, so I dunno if it is still applicable. You could always use my software to mine the data and see if those statistics are still accurate.

[Edited on June 21, 2012 at 1:29 PM. Reason : spelling]

6/21/2012 1:28:47 PM

© 2024 by The Wolf Web - All Rights Reserved.
The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.