performance

First Version of LoadTest.IO Released

The very fresh and raw 0.1 version of http://loadtest.io is now available for your enjoyment.

What is It?

LoadTest.io is a multi-threaded load tester with auto-discovery. It is ideal for load tests that target breadth. You give LoadTest.io some initial URLs and it crawls the rest of your website, auto-discovering links as it goes. It also tries not to repeat already-tested URLs, so that it reaches the maximum number of unique URLs in the shortest time possible.

Because of this, LoadTest.io is a very good tool for testing websites that employ caching. Such websites cannot be adequately tested by stress-testing tools that only hit a limited number of URLs.
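
A minimal sketch of the auto-discovery idea, not LoadTest.io's actual code: a queue of URLs still to visit plus a set of URLs already seen, so each unique URL is requested only once. The `discover` function, the `fetch` callable and the in-memory `SITE` pages are hypothetical stand-ins. A real tester would fetch over HTTP from several threads and record response times.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover(seeds, fetch, max_urls=1000):
    """Breadth-first crawl that visits each unique same-host URL at most once."""
    hosts = {urlparse(seed).netloc for seed in seeds}
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_urls:
        url = queue.popleft()
        html = fetch(url)  # hypothetical: returns the page body as str, or None
        visited.append(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" so duplicates collapse.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc in hosts and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical three-page site held in memory instead of real HTTP.
SITE = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a> <a href="/b#top">B</a>',
    "http://example.com/b": '<a href="http://other.example/x">external</a>',
}

visited = discover(["http://example.com/"], SITE.get)
print(visited)
```

Starting from a single seed, the sketch reaches all three pages, requests each one once, and ignores the link to the external host.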

Why Would You Care?

The majority of tools currently available do not have an auto-discovery feature, which means they can only test a fixed set of URLs. But any reasonably built web system has some kind of caching, so after the first hit, any subsequent hit to the same URL tests only your cache, not the web system (application layer, database, etc.). Such a test can be unrealistically optimistic and misleading. Real traffic from real users will not just hit 20 hand-picked URLs on your website.
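
To make the argument concrete, here is a toy simulation under loud assumptions: an idealized cache keyed by URL that never expires or evicts, and made-up request counts. The `backend_hits` function and the URL patterns are hypothetical and only illustrate the reasoning above.

```python
def backend_hits(requests):
    """Count requests that miss a simple URL-keyed cache and reach the backend."""
    cache = set()
    hits = 0
    for url in requests:
        if url not in cache:
            hits += 1        # cache miss: application layer and database do the work
            cache.add(url)   # later requests for this URL are served from cache
    return hits


TOTAL = 1000
fixed = [f"/page/{i % 20}" for i in range(TOTAL)]  # 20 hand-picked URLs, repeated
crawled = [f"/page/{i}" for i in range(TOTAL)]     # every request is a new URL

print(backend_hits(fixed), "of", TOTAL, "requests reached the backend (fixed set)")
print(backend_hits(crawled), "of", TOTAL, "requests reached the backend (unique URLs)")
```

With the fixed set, only the first request for each of the 20 URLs exercises the web system. With unique URLs, every request does.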

LAMP Settings for a High-Performance Small Server

With the growth of VPS hosting, you can now get a (very) small virtual private server (VPS) for a very affordable price. However, when you get a server with something like 256MB or 512MB of RAM and a portion of CPU power, using the default MySQL/PHP/Apache settings is a pretty bad idea.

Performance and scalability tuning of a server is more of an art than a science, in the sense that there are no ready-to-use formulas. Optimal server settings depend on many unique factors: the web-app code, the traffic to the site and the site's information architecture, among other things. It's virtually impossible to really optimize server settings without a thorough understanding of the web application and a lot of testing.

That said, you are not going to run newsweek.com or huffingtonpost.com on a 256MB slice. Also, the default settings are typically so far off that it is possible to give you a much better starting point. I'd like to share with you some settings that have worked well for me. I am assuming you are running a small site or a blog, with traffic of several thousand page views per day, using WordPress, Drupal or something of that kind. I highly recommend getting at least a 512MB VPS, but these settings are better than the defaults for a 256MB server as well.

The following variables, from various settings files, have been modified from their default values.

Twitter Spitting on Ruby On Rails Performance

Very interesting: Twitter is abandoning Ruby on Rails due to claimed scalability problems:
http://www.techcrunch.com/2008/05/01/twitter-said-to-be-abandoning-ruby-...

Beware: eAccelerator and PHP ZLIB Compression

On numerous LAMP servers we run, we repeatedly experienced an odd and disturbing problem. Users were fed some "random" ASCII text instead of a rendered page, leaving them in complete frustration. After rigorous investigation, we came to the conclusion that the problem lies in the combination of PHP's zlib compression and eAccelerator. The odd part, which greatly complicated debugging, was that only some users experienced the problem.

eAccelerator is a "free open-source PHP accelerator, optimizer, and dynamic content cache". It's also one of the most widely used and stable implementations. The performance gains from eAccelerator are very impressive, especially since there is no code change involved and the cache is transparent - data is real-time.

zlib compression is a PHP feature that compresses output (pages) and can significantly decrease both traffic usage and page load times.

Unfortunately, the two do not work well together.

Blazing Fast Grep

It was an unexpected and accidental finding when I discovered today that Perl-compatible grep is much faster than the default one. I was trying to grep a 145MB text file:

grep -i 'someword' largefile.log - 14 seconds
grep -iP 'someword' largefile.log - under 1 second
grep -iP 'someword.*?' largefile.log - under 1 second

Perl-compatible regexp search was more than an order of magnitude faster in this test!

It is not surprising that the two modes may be using different algorithms. However, since the Perl-compatible mode is more generic, more complex and inclusive of the simpler cases, it makes you wonder why they would bother. Why not just map the simpler case onto the more generic, Perl-compatible one and have both of them fast? I guess it is one more glaring example of over-engineering waste, in this case in a Linux classic :) I, for one, am going to always use the "-P" option from now on.
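
If you want to repeat the comparison on your own machine, a small timing harness along these lines may help. It is only a sketch: `time_command`, `compare` and the script name are hypothetical, and it assumes a `grep` on the PATH that supports `-P` (GNU grep built with PCRE support). Results can differ with the grep version, the locale and whether the file is already in the OS file cache, which is why each command is run several times and the fastest run is kept.

```python
import subprocess
import sys
import time


def time_command(args, repeats=3):
    """Run a command several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(args, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


def compare(path, pattern):
    """Time default and Perl-compatible case-insensitive grep on the same file."""
    return {
        "grep -i": time_command(["grep", "-i", pattern, path]),
        "grep -iP": time_command(["grep", "-iP", pattern, path]),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python grep_bench.py largefile.log someword
    for label, seconds in compare(sys.argv[1], sys.argv[2]).items():
        print(f"{label}: {seconds:.2f} s")
```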

Install PECL Memcache with XAMPP and PHP4

XAMPP is an absolutely wonderful, packaged, self-contained distribution of Apache, MySQL, PHP and tons of hard-to-install PHP extensions. Not only does it make a sysadmin's life easier by solving 99.9% of LAMP problems out of the box, but it also allows PHP vendors to create packaged distributions of complex systems.

However, even with a long list of packaged extensions, there may obviously be a need to install an additional one. PECL Memcache, the PHP client for the memcached distributed cache server, is a very likely candidate on high-load systems.

Unfortunately, PECL is broken in a vanilla XAMPP installation with PHP4. Typically you won't be able to use "pecl" directly, and when you try to install manually with phpize you will get an error like:
" PHP Warning: Unknown(): Unable to load dynamic library '/opt/lampp/lib/php/extensions/no-debug-non-zts-20020429/memcache.so' - /opt/lampp/lib/php/extensions/no-debug-non-zts-20020429/memcache.so: undefined symbol: OnUpdateLong in Unknown on line 0"

Five Nines Availability - Kaizen of Performance and Scalability

As far as Enterprise Architects are concerned, there is no single characteristic of a software system more important than performance and scalability. Or, at least, there should not be. Sadly, in reality, performance is often an afterthought. Inexperienced development teams look at it with disdain and refuse to "waste" time on performance tuning. They amuse themselves with "feature enhancements" and only remember performance when things start to fall apart. That moment usually comes before the champagne glasses from the celebration of the "successful" launch get a chance to dry.

Boosting Firefox Performance

If you noticed that your Firefox installation takes significant time to connect to websites, there may be an easy solution. It will take you 30 seconds and save you a lot of time/annoyance in the future.

So:

  • Type "about:config" in the address field and have Firefox open it as if it were an Internet URL. The Firefox configuration page will appear.
  • In the "Filter" text field, type "dns" to show only the DNS-related settings.
  • Note the setting named "network.dns.disableIPv6". Its value should be "true" (IPv6 should be disabled). If not, click on the value field to make it true.

Restart Firefox and enjoy faster connection times.

Drupal 5 Performance Benchmark vs Drupal 4.7

Dries recently published Drupal 5 performance benchmark results.

I would like to share some of my thoughts about the subject.

First, of course, we should appreciate that Dries took the time to perform the test. Some results are better than none. However, it is only natural that an objective audience meets benchmark results with a certain level of skepticism. I won't be an exception. Personally, I especially do not like or trust black-box performance tests. If you tell me that A is faster than B, tell me why; show me the specific part of A that makes it faster. That makes the results much more interesting, trustworthy and educational.

Extreme Drupal Performance for Authenticated Sites

Caching is an often-used, non-code fix for performance problems. No matter what kind of cache you use, though (query cache, app cache, etc.), you do need a mechanism to repopulate the cache once it is invalidated by the system. And you definitely do not want an unfortunate flesh-and-blood user, the first to hit the non-cached page, to be the one who triggers caching. You'd much rather have a cron job do that.

Pretty simple - put a cron in place that hits your site every minute. Job well done.

Yeah, but what if your website is accessible only through authentication? You have to allow cron to pass through that somehow. And here is how:

  1. Add a cron user to the system, with no privileges other than authentication and page viewing. Let's name it "cron".
  2. Enable "path" module and create a special alias for the home page (or page you need recached). Let's say: "node" -> "cronnode".
  3. Install "securesite" module and secure path "cronnode". This will allow cron to use HTTP Authentication for this path. You need a special path to not disrupt your flesh-n-blood users.
  4. Set up a cron like the following:
    */1 * * * * /usr/bin/wget -q -O /dev/null --user cron --password cpass http://example.com/cronnode
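
If you need to warm more than one page, the same request can be scripted. The following is only a sketch of an alternative to the wget line above: `build_request` and `warm` are hypothetical helpers, the credentials and URL are the placeholder values from the steps above, and it assumes the secured path accepts an HTTP Basic Authorization header sent up front.

```python
import base64
import urllib.request


def build_request(url, user, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def warm(paths, base="http://example.com", user="cron", password="cpass"):
    """Fetch each path once so its cache entry is rebuilt; return HTTP status codes."""
    statuses = {}
    for path in paths:
        request = build_request(base + path, user, password)
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(request, timeout=30) as response:
            response.read()  # discard the body, like wget -O /dev/null
            statuses[path] = response.status
    return statuses


# Build (but do not send) the request for the aliased, secured path.
request = build_request("http://example.com/cronnode", "cron", "cpass")
print(request.full_url)
```

Calling `warm(["/cronnode"])` from a script run by cron would then play the role of the wget command.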
    

Enjoy

P.S. One more thing: every session opened is logged by both the user and securesite modules, so if you mind filling up your log with (probably) meaningless messages, you may want to do something about it in:

user_login_submit() in modules/user.module
and
securesite_init() in securesite/securesite.module