joe17669 All American 22728 Posts user info edit post |
Early last year, a few of us got together and developed a wiki-based knowledgebase covering the different types of research we have been doing. It turned out to be unexpectedly popular among "regular" employees (i.e. those who aren't doing research), and we received lots of questions about how our stuff works and why, so we also developed a section of the same wiki that teaches the fundamentals of electric power. It's very basic, in the sense that we even explain some things in terms of water pipes and water towers.
The wiki existed only on our local intranet, and then was later opened up to the other power businesses in the same company. Once this was done, the load was almost overwhelming at times and brought the server to a screeching halt. We can't get our wiki up on the main company servers because the wiki software hasn't been thoroughly tested with their systems, and they are afraid it might cause problems. They also don't use PHP on any of their stuff. Instead, we're running it on a dedicated computer here on our floor. It's a Dell Precision Workstation with dual Xeon processors, 4GB of RAM, and a gigabit ethernet port (it's an old computer we used to run simulations on). We have XP Server on there, and use Apache 1.3.33 with PHP and MySQL.
Since none of us have a good grasp of how to most efficiently turn a computer into a webserver, I thought I would ask here. We aren't sure if the slowdowns are caused by the computer and OS, or the wiki software that we run.
We decided to go with a wiki package called DokuWiki (http://wiki.splitbrain.org/wiki:dokuwiki) because of its simple wiki markup code and because it didn't require a database. Everything is done with flat text files and images, and stored in a directory hierarchy which makes backing up very easy.
We looked at the success of Wikipedia and thought about using their software, but the wiki markup wasn't nearly as simple as DokuWiki's (simple is good, so other people can help contribute if they wish).
Recently, the site has been getting between 325,000 and nearly 400,000 hits per month. While that sounds like a lot, spread over a whole month it doesn't seem overwhelming, and I would think that a server such as the one we have could handle it.
Do you think that the server might in fact be the issue? Is it just not powerful enough, or are we running the wrong OS/httpd combination? Windows is installed because we are pretty comfortable with it, and we know hardly anything about installing a flavor of Linux and getting it set up as a webserver. Or could it possibly be the wiki software and its use of flat text files? Would switching to software that uses a MySQL database give us much of an improvement?
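A rough sanity check on those numbers can be sketched as below. It assumes a 30-day month, and the "busy hours" figures (8 working hours a day, 22 working days a month) are made-up assumptions for illustration, not anything measured on this server. Note also that log analyzers such as awstats count a "hit" for every file requested, so the number of actual page views is likely lower.

```python
# Back-of-the-envelope request rates from the monthly hit counts quoted above.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def average_hits_per_second(hits_per_month):
    """Average rate if the hits were spread evenly over the whole month."""
    return hits_per_month / SECONDS_PER_MONTH


def busy_hours_hits_per_second(hits_per_month, hours_per_day=8, days_per_month=22):
    """Rate if ALL hits landed inside working hours (hypothetical worst case)."""
    busy_seconds = hours_per_day * 60 * 60 * days_per_month
    return hits_per_month / busy_seconds


if __name__ == "__main__":
    for hits in (325_000, 400_000):
        print(
            f"{hits} hits/month: "
            f"{average_hits_per_second(hits):.3f}/s on average, "
            f"{busy_hours_hits_per_second(hits):.3f}/s if squeezed into working hours"
        )
```

Even under the squeezed-into-working-hours assumption, this comes out to well under one request per second on average, although short bursts could of course be much higher.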
We keep some pretty cool website statistics using the awstats log analyzer, and here are a couple of screenshots:
The year-to-date listing by month of the number of hits and bandwidth
Not really important, but this is the browser distribution for the current month. Interestingly enough, the computers the company gives us only come with IE installed, yet Firefox makes up the majority of the hits.
I appreciate any advice on what we could do to speed things up. It isn't bad right now, but if we keep expanding things, it may become a larger problem in the future. Thanks. 8/11/2006 1:59:25 PM |
Noen All American 31346 Posts user info edit post |
yea you really should move to linux, that box is way more than enough to handle serving pages. Also you should DEFINITELY move to Apache 2.x, it runs a shit ton faster in both Windows and Linux (multi-threaded)
Also, does the wiki use caching at all? if not you may want to implement a caching scheme to reduce processing load.
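For anyone unsure what a caching scheme means here, a minimal sketch of the idea follows: keep the rendered HTML next to the source file and only re-parse when the source is newer than the cached copy. This is an illustration only, not how DokuWiki (or any particular wiki) actually implements it; `render`, `get_page`, and the `.html` cache naming are all hypothetical.

```python
from pathlib import Path


def render(markup):
    """Stand-in for an expensive wiki-markup parser (hypothetical)."""
    return "<p>" + markup.replace("\n\n", "</p><p>") + "</p>"


def get_page(source, cache_dir):
    """Return rendered HTML for `source`, re-parsing only when it has changed."""
    cached = cache_dir / (source.name + ".html")
    # Serve the cached copy if it is at least as new as the source file.
    if cached.exists() and cached.stat().st_mtime >= source.stat().st_mtime:
        return cached.read_text(encoding="utf-8")
    # Otherwise parse the markup again and refresh the cache.
    html = render(source.read_text(encoding="utf-8"))
    cached.write_text(html, encoding="utf-8")
    return html
```

With something like this in place, a page that is read far more often than it is edited costs one file read per request instead of a full parse.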
[Edited on August 11, 2006 at 2:02 PM. Reason : .] 8/11/2006 2:02:00 PM |
Arab13 Art Vandelay 45180 Posts user info edit post |
off topic
what sort of research? 8/11/2006 2:04:13 PM |
1 All American 2599 Posts user info edit post |
Quote : | "Not really important, but this is the browser distribution for the current month. Interestingly enough, the computers the company gives us only come with IE installed, yet Firefox makes up the majority of the hits." |
Use a script to block access from firefox. 8/11/2006 2:06:52 PM |
joe17669 All American 22728 Posts user info edit post |
Noen, looking at the documentation of the wiki, it does have caching capabilities, and caching is enabled by default (so we probably didn't turn it off).
Quote : | "DokuWiki speeds up browsing through the wiki by caching parsed files. So if a currently cached version of a document exists this cached copy is delivered instead of parsing the data again. On editing and previewing no cache is used." |
Arab13, I'm doing research on next-generation power system control mechanisms to help promote the stability, quality, and reliability of the power grid, on all three levels of generation, transmission, and distribution (I mainly work on transmission though). We're working on using computational intelligence (like neural networks, fuzzy logic, genetic algorithms, and particle swarm optimization) to replace the slower, conventional control systems in an attempt to better automate and control the grid.
1, nah, we don't need to do that. I just thought it was interesting that a lot of people have gone and downloaded and installed Firefox and prefer to use it over the already-available MSIE. The company doesn't really care what we install on the computers, as long as it is legal and doesn't fuck with their network infrastructure. 8/11/2006 2:09:56 PM |
esgargs Suspended 97470 Posts user info edit post |
Everyone in this thread is dead wrong.
The bottleneck isn't the operating system but your data storage. Serving a big website using flat files is disastrous. You really need to use a database server. I would go with optimized non-ISAM MySQL, or if you're comfortable using it, Microsoft SQL Server Express Edition.
Run the Microsoft Management Console and take statistics of your disk activity. If the site goes down it means that the hard drive is having a hard time!
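As a rough complement to watching the disk counters, one could also time how long it takes to read the wiki's page files straight off the disk. The sketch below assumes the pages are plain `.txt` files under a single directory; the function name and the example path in the usage note are hypothetical.

```python
import time
from pathlib import Path


def time_flat_file_reads(pages_dir, pattern="*.txt"):
    """Read every matching file once; return (file_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for path in Path(pages_dir).rglob(pattern):
        path.read_bytes()  # contents are discarded, only the read time matters
        count += 1
    return count, time.perf_counter() - start
```

Calling it as `count, seconds = time_flat_file_reads("C:/path/to/wiki/data/pages")` gives a crude upper bound on the raw file-reading cost. The operating system's file cache will make a second run faster than the first, so the first run after a reboot is the more pessimistic number.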
There are lots of open source wiki platforms out there that use a database.
search wikipedia for "wiki software" 8/11/2006 2:44:27 PM |
Perlith All American 7620 Posts user info edit post |
Isolate the bottlenecks (easier said than done) and fix the problems there. Upgrading software + packages will probably help short-term, but not long-term.
joe ... exactly how BIG do you think this wiki will continue to grow? From a quick glance (and I'm by no means an expert on this), DokuWiki doesn't look like it will scale for what you guys are doing. I tend to agree with gargs about the need for a database, on a dedicated database server. It might be worth convincing management now for additional $$$/time to research and develop a long-term solution. My $0.02. 8/11/2006 3:37:58 PM |
Stein All American 19842 Posts user info edit post |
Your problem is the Wiki software.
There's no question. The computer you're using is more than capable of serving that few pages per month; it shouldn't even be remotely slow if you're using anything coded worth a damn. 8/11/2006 4:16:47 PM |
darkone (\/) (;,,,;) (\/) 11610 Posts user info edit post |
^ I agree. That's not enough traffic to slow down that machine. 8/11/2006 4:24:23 PM |
esgargs Suspended 97470 Posts user info edit post |
I would suggest using a Java container to serve a Wiki 8/11/2006 4:25:46 PM |
Noen All American 31346 Posts user info edit post |
Guys, it uses a caching scheme, which, if it works, virtually rules out the flat-file implementation as the issue.
First thing to do is push to Apache 2 on Windows and see how that affects the load.
Next thing is to run some load testing on it from another machine. Apache comes with the ab (ApacheBench) load tester, which can be used for some basic stuff.
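A typical ab run looks something like `ab -n 1000 -c 10 http://yourserver/` (`-n` is the total number of requests, `-c` the number sent concurrently; the hostname is a placeholder). If ab isn't handy on the test machine, the same basic idea can be sketched in a few lines of Python. This is only a crude stand-in under stated assumptions, not a replacement for ab: `timed_get` and `load_test` are hypothetical names, and any failed request simply raises an exception instead of being counted.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url):
    """Fetch `url` once and return how long the full response took, in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start


def load_test(url, total_requests=100, concurrency=10):
    """Send `total_requests` GETs, `concurrency` at a time, and summarize timings."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_get, [url] * total_requests))
    elapsed = time.perf_counter() - started
    return {
        "requests": len(latencies),
        "requests_per_second": len(latencies) / elapsed,
        "median_seconds": latencies[len(latencies) // 2],
        "slowest_seconds": latencies[-1],
    }
```

Running it against a cached wiki page and then against a plain static file on the same server should show whether the time is going into the wiki software or into the web server itself.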
Quote : | "Run the Microsoft Management Console and take statistics of your disk activity. If the site goes down it means that the hard drive is having a hard time!" |
I do agree with this, check the Performance MMC in Admin Tools, see what's going on with the hardware.
Quote : | "There's no question. The computer you're using is more than capable of serving that few pages per month; it shouldn't even be remotely slow if you're using anything coded worth a damn." |
Yea, if he runs some ab trials, it will tell him real quick if this is the problem.
Quote : | "I would suggest using a Java container to serve a Wiki" |
Worst idea ever. 8/11/2006 6:17:33 PM |
esgargs Suspended 97470 Posts user info edit post |
You'd be amazed by the performance of J2EE applications on weak hardware 8/12/2006 4:20:59 PM |
Noen All American 31346 Posts user info edit post |
not even close to PHP in small scale hardware performance. 8/12/2006 10:05:10 PM |
jackleg All American 170957 Posts user info edit post |
but
multi-threading!!!1 8/12/2006 10:14:33 PM |
esgargs Suspended 97470 Posts user info edit post |
PHP can't handle loads. The folks at Flickr are/were using PHP, and had constant issues.
I expect better from you, NOEN 8/12/2006 11:51:49 PM |
Noen All American 31346 Posts user info edit post |
The folks at Flickr aren't running a single quad core CPU workstation server.
Java scales a hell of a lot easier for enterprise than PHP, but for a single all-inclusive machine, PHP is a MUCH better performance option. 8/13/2006 12:02:50 AM |
esgargs Suspended 97470 Posts user info edit post |
Care to be more specific?
JSPs reach near static run time. Couple that with Hibernate and other persistent state techniques and you have MUCH, MUCH better database performance. PHP is very very bad with databases. 8/13/2006 3:12:09 AM |
Noen All American 31346 Posts user info edit post |
The JVM has a pretty massive system footprint. For a single workstation running a JVM and a database, versus one running PHP and a database, PHP destroys it.
btw, there are also plenty of ways to speed up PHP's performance outside the code. Also, if I recall correctly (and please correct me), Flickr was originally coded on PHP 4.x. PHP 5 is a pretty dramatic step forward in scalability and runtime performance. 8/13/2006 12:49:39 PM |
esgargs Suspended 97470 Posts user info edit post |
I see where you might be going with the footprint issue, but that's hardly a concern for a single application running on a server with 4GB RAM. Footprint becomes an issue when you have multiple applications running on the same server (shared hosting) because Java doesn't work well in that scenario (explaining the expensive Java based hosts). As regards execution time, once you have a Java container set up, it even beats native C code in some instances.
You're right about Flickr. PHP5 is better but it still suffers from memory issues.
[Edited on August 13, 2006 at 1:14 PM. Reason : .] 8/13/2006 1:07:17 PM |
Noen All American 31346 Posts user info edit post |
On a single machine, any decent production JVM will bring the machine to its knees under moderate load. Combine it with a database and all the web-server duties and it gets to be a problem. Do all this in Windows and you can forget about it.
I definitely agree though, properly done Java is ridiculously fast, just not the best for small scale apps.
PHP5 has memory issues due to programmers not knowing what the fuck they are doing, versus the PHP4 problem of horrible referencing, instancing AND programmers not knowing what the fuck they are doing. 8/14/2006 9:27:57 AM |
Maugan All American 18178 Posts user info edit post |
Quote : | "You'd be amazed by the performance of J2EE applications on weak hardware" |
Holy shit what kind of idiocy do you profess?
Our enterprise J2EE JVM-deployed application is slower than goat nuts. Real shame too, because it's built on top of a blazing fast database. 8/14/2006 10:18:14 AM |
joe17669 All American 22728 Posts user info edit post |
I think we're going to try three different things, each a little more drastic than the previous, just to see what we can do.
1. We upgraded to Apache 2, and we had a little trouble getting the httpd.conf file to recognize our installation of PHP and the virtual domain we had set up to point to the site. Things are running a little better, but I think that there is much more room for improvement.
2. The installation of Windows doesn't appear to be very gummed up, but perhaps there's a lot of stuff going on in the background that is slowing it down. We figured we might try formatting and reinstalling XP Server and only installing Apache 2 and PHP on it to see how that does.
Quote : | "yea you really should move to linux, that box is way more than enough to handle serving pages. Also you should DEFINITELY move to Apache 2.x, it runs a shit ton faster in both Windows and Linux (multi-threaded)" |
3. We're going to play around with a second computer we have, and install a flavor of Linux on there and see if we can figure out how to migrate the Wiki over to it for testing and all that good stuff. If we feel comfortable enough with it, we'll install Linux on the original server and continue to use it.
A couple of questions... What distribution of Linux should we use, considering its efficiency for running a webserver and the fact that none of us are Linux pros? I've toyed around with building Apache 2 on my PowerBook through the command line, and it went pretty well. I guess it isn't much different on a Linux machine. I've done some Googling around to see what distribution is best, but everything I find is pretty much written by fanboys of that particular distribution, and each one is super 1337 and better than every other one.
Secondly, the server we have runs headless, and we just use Microsoft's RDC to connect into it and configure it. Is it possible to get a Linux-based GUI up over a remote connection? I know we can ssh in pretty easily, but doing everything in the command line is a little beyond our scope right now.
Thanks everyone. 8/31/2006 4:14:52 PM |
Noen All American 31346 Posts user info edit post |
Really, any distro except Red Hat will work well. Red Hat is pretty good too, but they use a different methodology for everything it seems (Linux people feel free to correct me here, this has just been my experience).
The really good news is that virtually any Linux distro will have Apache 2.x packaged with it, so you can build and install it during the initial setup/installation of the OS. Many also have MySQL 5.x prepackaged for your convenience. As far as remote administration goes, pretty much all the stuff you will want to do to Apache/the box/MySQL will have to be done on the command line or through a web interface, so SSH'ing in should be more than sufficient. 8/31/2006 4:21:03 PM |