The Foundthru site has been built to demonstrate the functionality and performance of the various plugins available from www.wpintense.com.
- FoundThru contains over 800,000 affiliated products
- This server is a $40 per month Digital Ocean server
- The server stack is the Rocket Stack
- The site is uncached so you can see and feel the raw page speed performance. Caching should not be relied upon to solve performance issues – it should only be used to solve traffic spikes
- Back-end admin is as fast as the front-end shop
- Free-text search happens in milliseconds
- Filters are applied in milliseconds
- The underlying database is still MySQL – i.e. no need for constantly out of date synchronisation with Elastic Search or other such over-engineered solutions
How FoundThru was built
The products were almost all loaded using Datafeedr. Datafeedr provides access to pretty much all affiliates which provide affiliate data feeds. Some other products were loaded manually to demonstrate some other plugin functionality.
The server stack is the WPI Rocket Stack which you can learn more about and build yourself for free here.
The entire WPI Performance Plugin pack is in use on this site. This is important for multiple reasons.
Why Caching is not enough to solve scalability issues
A quick note re: caching – I’ve deliberately left caching OFF from this site so you can see the raw performance for yourself. If you have a large site, caching is NOT the answer to your scalability issues because of the following:
- Googlebot searches behave very differently to users, so pages are typically not cached for googlebot. If googlebot thinks you have slow page speed, your SEO rank is negatively affected.
- If you have a lot of filters, you cannot possibly cache all combinations – it would take too much disk space. Priming the cache would take too long too anyway.
- If you have poor underlying performance, for example a page taking 10 to 30 seconds to load, your server DISK and CPU are being THRASHED. That means your server will struggle to provide any good level of response for other users, even if the pages they request are actually cached. Additionally, table scans that cause all data to be read for a single query invalidate all the benefits of MySQL query caching – i.e. they flush the db cache. By transforming your underlying performance to sub-second, THEN you can add page-level caching so that pages frequently hit by users get cached and not only can you survive googlebot and regular user use, you can also survive traffic spikes.
Why vanilla WooCommerce doesn’t scale well
WooCommerce comes with very badly coded sidebar widgets. These widgets do two bad things that affect scalability :
- A lot of their widget code and shortcode code causes table scans which means they read ALL products in your database on many/most pages
- Some of their widgets (e.g. price filter) parse all matching products using PHP.
The two above things are a recipe for disaster once you have a lot of entries in your database, which I discuss in more detail here: https://www.wpintense.com/performance/
I wrote the Faster Woo Widgets plugin to replace the WooCommerce widgets. They include lots of great features and functionality, as well as performing incredibly fast. They also include something called progressive caching which means this:
If you have a category such as:
Books -> Fiction -> Fantasy
With Faster Woo Widgets, when the Books top level category has been counted once, it does not need to be counted again until products are added/deleted or altered in the shop.
That means when a user visits the Fiction sub-category, the widget re-uses the query results for all the top-level categories which have already been counted. Similary, when the user visits the Fantasy category, if there are any sub-categories then only those will need to be counted. i.e. a LOT of server-resource is spared. On top of that, the queries themselves are faster and use the DATABASE directly to perform counts instead of relying on PHP code.
Faster Woo Widgets includes category, attributes, price filters, colour filters, label filters and many other cool options including Ajax. So not only does this plugin provide fast widgets, it also provides the best functionality we’ve found from any widget plugin out there.
Why vanilla WordPress doesn’t scale well
WordPress is generally pretty fast, but becomes slow on archive pages for these key reasons:
- WP_Query uses LEFT JOINs to check for existence of entries in wp_postmeta and/or wp_term_relationships
- WP_Query is then forced to use a GROUP BY command to grab unique post IDs
- WP_Query forces a sort, even when not required, typically by DATE DESC
- There are indexes missing from key tables. Many plugins, for example, will join to wp_postmeta and check for meta_value without using a meta_key. WordPress presumes that all use of wp_postmeta will be a join on wp_posts.ID = wp_postmeta.post_id – this is not the case and sticking their heads in the sand about this is not helping.
The LEFT JOIN issue means if you visit your /shop/ archive, WP_Query does a LEFT JOIN for ALL your products against all their matching wp_postmeta and/or wp_term_relationships. It then sorts these values too and finally presents you with the top 20 to display on page.
If you have 800,000 products, this means 800,000 items are joined with 800,000 postmeta rows and 800,000+ (depends how many categories per product) term_relationships.
By rewriting these queries to use WHERE EXISTS instead of LEFT JOIN we can eliminate the need for the GROUP BY to get unique post IDs. This then provides the opportunity to remove the sort operation and be left with a query which frequently only reads 20 – 40 rows from your database to provide page 1 of your archive.
In particular, when loading products in /shop/ there is no benefit to loading the most recently added products. Better to just use the natural database sort order.
I wrote Scalability Pro to optimise the underlying WordPress performance. It includes options to rewrite LEFT JOINs to WHERE EXISTS as well as many more options to remove or alter poorly written WordPress functionality that you are likely not using.
Why the default Free Text Search doesn’t scale
The free text search that comes with WordPress doesn’t scale AT ALL. This is because the underlying SQL uses a LIKE operator. For example, a search for ‘yeti microphone’ will use something like:
WHERE description LIKE ‘%yeti microphone%’
This % symbol means wildcard. Having a wildcard at the beginning of a LIKE statement makes it impossible for any database to use any index that might exist on that column. It has been coded this way to ensure WordPress works on as old tech as possible.
Fulltext indexes are the real solution here – but old versions of MySQL did not have fulltext indexes, in particular in Innodb. That is no longer the case, and anyway, you should use PerconaDB or MariaDB instead of MySQL. Both of these have fulltext indexes available.
Fulltext indexes are designed for:
- Better text matching including word-stemming
- Faster text matching – e.g. 100ms response on 800,000+ products on this store – you can go try it yourself
I wrote Super Speedy Search to provide ultra-fast free-text search. It fixes the default search pages as well as providing an ajax-based WooCommerce free-text widget for your sidebar.
Why pagination and most (all?) other infinite scroll plugins don’t scale
Including product counts forces all those products to be counted. Most users don’t make it past page 1 of your archive – they will go straight to the filters. And with the prevalence of smart phones and tablets these days, scrolling continuously down the page is a better option for most users.
Previous attempts by others at infinite scroll fail for this key reason:
- When a user finds a product they like and click to it, then click the BACK button, they end up back at the top of your last archive page.
Many plugins have attempted to solve this by loading page numbers into your browsers history, but then your user ends up confused by being half way down the page with no products above. Some plugins solve that by having infinite scroll upwards which is frankly confusing.
Many plugins don’t attempt to solve it, and instead the scroll position is at the bottom of the screen and page 2 is attempted to be loaded. But then they are in the wrong position and what if the product they last clicked was actually on page 3 or page 4 anyway? Your user has now left your site. Additionally, your server has now been under additional unnecessary load loading these 2 extra pages which were already loaded seconds ago. Page caching can help, but still it’s not required.
Infinite scroll is required to eliminate pagination and pagination needs to be eliminated for scalability reasons, so I solved the problems above.
When a user clicks through to a product page from your archive, my Auto Infinite Scroll plugin saves the current archive to the browsers storage system. It also saves the scroll position.
That means, when the user clicks to a product and then clicks the back button on their browser, the archive page (with the code I’ve written in the plugin) loads the archive from the browser storage and scrolls instantly to the position the user was at previously.
This means, back MEANS back. The user clicks back and even if they were on page 20 of your infinite scroll, that entire archive is loaded INSTANTLY with no additional server resource required. They are scrolled down to the correct position too. That means the user has the experience they EXPECT and can continue on their shopping experience.
Additionally, Auto Infinite Scroll is coded to require zero configuration. Install it, de-activate any other infinite scroll features (e.g. other plugins or features from your theme) and voila, auto infinite scroll with archive-browser-storage for brilliant back-button experience!
You can learn more about Auto Infinite Scroll here.
What speeds are we talking?
FoundThru is FAST. It’s uncached, and it’s a $40 per month server. It has 800,000 products and every page loads in under 1 second including DB and PHP time.
I’m not talking GT Metrix or pingdom speed, because those speeds are pretty much irrelevant and academic. I’m talking TTFB – time to first byte – which is the same as saying ‘page load speed’.
Try it out – browse the site – search for products – select whatever filters you like. See if you can find a slow page.