Maximize your eshop's potential with the wisdom of your most treasured asset: Your customers!
Maximize your eshop's potential with the wisdom of your most treasured asset: Your customers!

Optimizing DB storage and file system in a Woocommerce e-shop with 4,000,000 images



I had just completed an eshop with 10k products/50 attributes that had achieved 100% GT Metrix score, 91% on Pingdom, >80 in mobile Google pagespeed (99% on desktop) and was relaxing observing an increase in its sales by over 150% in the first couple of months since it went live, but a friend didn’t let me rest:

He had in mind another, more daunting, challenge: To do the same for a shop in the car parts business starting with 200k products and eventually ending up exceeding 1 million.

The challenge of hundreds of thousands of products in WordPress/Woocommerce:

As you know the core of woocommerce product catalogue is basically the wp_posts and wp_postmeta tables. Although the wp_postmeta provides great flexibility via the meta_keymeta_value pairs, in the context of e-shops you may end up with over 50 meta_keys (even with empty values) so, for 200k products, your wp_postmeta table can have in excess of 10 million rows!

Obviously, you can trim it by removing all the entries that have empty values, in our case we reduced our 10million rows table to less than 5 million by doing just this.

Adding indexes in the tables that matter in your database is another technique that can speed up your site. There are numerous plugins that do just that, if you are uncomfortable in adding your own keys. One such plugin is Index WP MySQL For Speed

Here I am going to share with you how I dealt with the… elephant in the room: the optimal handling of millions of images and thumbnails in such a large installation that, nowadays, we tend to overlook when estimating the consumption of resources (as we take for granted the performance of the file systems) while also adding enormous latency to your site’s rendering.

Optimizing the Media Gallery

Car part eshops have over 200k products, thus about 300k images. If your theme is “interesting”, (i.e. has many different views) each image may have 5–7 thumbnails more than the three WP native thumbnails. So you will end-up with roughly 3 million images.

As your supplier-provided images are usually large jpegs, you may find worthwhile to optimize them to more modern formats, such as webp. So, you will end up with about 6 million images.

Such a number of files is bound to overwhelm any filesystem: If you place 6 million images in one folder your system will get a lot of “latency” in serving them: Just try to place over 100k images in any folder and try to open it. You will wait for many seconds (if not minutes). Caching or CDN may help but still you have to acknowledge that organizing your images in folder with up to 20k items in each you will get a huge performance boost.

In this article you will learn:

1. Where WordPress (and Woocommerce) stores the information about images, attachments and thumbnails
2. The optimal way to convert our images to webp
3. How to manipulate these tables in order to change the default type to webp
4. How to manipulate these tables in order to “relocate” our images to multiple folders
5. Basic bash commands that will move the physical images to the folders we have set in our WordPress.

Let’s begin:

1. Where WordPress (and Woocommerce) stores the information about images, attachments and thumbnails

WP stores information about the images of a post/product in only tables and just three fields/properties:

wp_posts in the column guid

wp_postmeta in the meta_value of the meta_key _wp_attachment_metadata and the meta_value of the meta_key _wp_attached_file

To complete our task -that is bound to multiply our performance — we will work with just five fields in these two tables:

wp_posts.id
wp_posts.guid
wp_postmeta.post_id
wp_postmeta.meta_key 
and
wp_postmeta.meta_value

The two tables are associated with each other via the wp_posts.id and wp_postmeta.post_id fields i.e. the wp_postmeta knows for which wp_post.id it refers to via the post_id field.

Example data of a single image that have do be dealt with are:

In wp_posts and entry in guid field:

https://yourdomain.com/wp-content/uploads/2022/03/0001159-5-organo-antipsyktikoy-mpatarias-17220-11805-02.jpg

And in wp_postmeta

35579 in meta_key _wp_attached_file we have a meta_value:
2022/03/0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02.jpg

35579 in meta_key _ wp_attachment_metadata we have a meta_value:
a:6:{s:5:”width”;i:800;s:6:”height”;i:800;s:4:”file”;s:67:”2022/03/0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02.jpg”;s:8:”filesize”;i:48621;s:5:”sizes”;a:17:{s:6:”medium”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–300×300.jpg”;s:5:”width”;i:300;s:6:”height”;i:300;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:23150;}s:9:”thumbnail”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–150×150.jpg”;s:5:”width”;i:150;s:6:”height”;i:150;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:20421;}s:12:”medium_large”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–768×768.jpg”;s:5:”width”;i:768;s:6:”height”;i:768;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:37970;}s:8:”td_150x0″;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–150×150.jpg”;s:5:”width”;i:150;s:6:”height”;i:150;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:20421;}s:15:”td_150x0_retina”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–300×300.jpg”;s:5:”width”;i:300;s:6:”height”;i:300;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:23150;}s:8:”td_300x0″;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–300×300.jpg”;s:5:”width”;i:300;s:6:”height”;i:300;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:23150;}s:15:”td_300x0_retina”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–600×600.jpg”;s:5:”width”;i:600;s:6:”height”;i:600;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:31422;}s:8:”td_696x0″;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–696×696.jpg”;s:5:”width”;i:696;s:6:”height”;i:696;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:35005;}s:10:”td_265x198″;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–265×198.jpg”;s:5:”width”;i:265;s:6:”height”;i:198;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:21869;}s:10:”td_741x486″;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–741×486.jpg”;s:5:”width”;i:741;s:6:”height”;i:486;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:30486;}s:21:”woocommerce_thumbnail”;a:6:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–300×300.jpg”;s:5:”width”;i:300;s:6:”height”;i:300;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:23150;s:9:”uncropped”;b:0;}s:18:”woocommerce_single”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–600×600.jpg”;s:5:”width”;i:600;s:6:”height”;i:600;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:31422;}s:29:”woocommerce_gallery_thumbnail”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–100×100.jpg”;s:5:”width”;i:100;s:6:”height”;i:100;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:19664;}s:12:”shop_catalog”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–300×300.jpg”;s:5:”width”;i:300;s:6:”height”;i:300;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:23150;}s:11:”shop_single”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–600×600.jpg”;s:5:”width”;i:600;s:6:”height”;i:600;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:31422;}s:14:”shop_thumbnail”;a:5:{s:4:”file”;s:67:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–100×100.jpg”;s:5:”width”;i:100;s:6:”height”;i:100;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:19664;}s:28:”dgwt-wcas-product-suggestion”;a:5:{s:4:”file”;s:65:”0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02–64×64.jpg”;s:5:”width”;i:64;s:6:”height”;i:64;s:9:”mime-type”;s:10:”image/jpeg”;s:8:”filesize”;i:19201;}}s:10:”image_meta”;a:12:{s:8:”aperture”;s:1:”0″;s:6:”credit”;s:0:””;s:6:”camera”;s:0:””;s:7:”caption”;s:0:””;s:17:”created_timestamp”;s:1:”0″;s:9:”copyright”;s:0:””;s:12:”focal_length”;s:1:”0″;s:3:”iso”;s:1:”0″;s:13:”shutter_speed”;s:1:”0″;s:5:”title”;s:0:””;s:11:”orientation”;s:1:”1″;s:8:”keywords”;a:0:{}}}

2. The optimal way to convert our images to webp

For starters there are a lot of plugins out there that can convert your images to webp. Webp-Express is my favourite for smaller sites. However, for such volumes of images processing your images via a plugin in the server can take weeks — literally! Estimate that in a decent host each image conversion can take 1–2 seconds. So we are talking about 2 million seconds (do the math to figure out the exact days that are needed in your case).

Here is an alternative proposition:

Download the freeware XN Convert so that you can convert locally your jpeg to webp at a rate of about 10,000 images per minute over 333 times faster — (this is how long it took me utilizing 40/48 cores in my Threadripper). Please convert your images maintaining the same name and just appending the .webp in your files in the same directory structure. E.g. the image

03573573336.jpg will become 03573573336.jpg.webp

Recommended settings:

Having done the conversion we are ready to proceed with the more interesting stuff in the database.

3. ­­ How to manipulate these tables in order to change the default type to webp

So our first step is rather trivial. We have to rename in our tables the .jpg extension to .jpg.webp

This is done with rather trivial instructions:

Update wp_posts set guid=Replace(guid,’.jpg’, ‘.jpg.webp’) where guid like ’%.jpg%’

Update wp_postmeta set meta_value=Replace(meta_value,’.jpg’, ‘.jpg.webp’) where meta_value like ’%.jpg%’

Please notice that the meta_value when in meta_key=’ _wp_attachment_metadata’ also contains the mime type of the thumbnail/attachment. So we have to alter this as well:

Update wp_postmeta set meta_value=Replace(meta_value,’image/jpeg’, ‘image/webp’) where meta_value like ’%image/jpeg%’

In our example our image “details” have now been modified as:

https://yourdomain.com/wp-content/uploads/2022/03/0001159-5-organo-antipsyktikoy-mpatarias-17220-11805-02.jpg.webp in wp_posts.guid field,

and

2022/03/0001159–5-organo-antipsyktikoy-mpatarias-17220–11805–02.jpg.webp in the wp_postmeta.meta_value field (when meta_key is ‘_wp_attached_file’) and in a similar manner in the wp_postmeta.meta_value when the meta_key is ‘wp_postmeta.meta_value

Tip: having done this, the jpg images are now irrelevant, so we can delete them to halve the number of images:

On terminal you just run this command:

find . -name “*.jpg” -type f -delete

It is a good practice to first run it without the -delete for extra peace of mind:

find . -name “*.jpg” -type f

4. How to manipulate these tables in order to “relocate” our images to optimally sized multiple folders

First, let’s create a scenario of the task at hand so that the code that you will see will make more sense.

Scenario:

Let’s assume that all our images were imported in the uploads/2022/03 folder.

Let’s assume that our images have as their name the product sku. Say they start with something like:

00006786868.jpg
03573573336.jpg etc.

As we have zillions of them, we would like to place them in a structure like:

uploads/accessories/0000/
uploads/accessories/0357/

So effectively we are going to split our 200k original files in 10000 directories that say will have approximately 200 original images each /or 2000 images per folder inclusive of the thumbnails / and/or 4000 images per folder if we also opt to convert them to webp.

But lets stick to our original example:

Our original image in webp format is located here:

https://yourdomain.com/wp-content/uploads/2022/03/0001159-5-organo-antipsyktikoy-mpatarias-17220-11805-02.jpg.webp

But we desire to position it here:

https://yourdomain.com/wp-content/uploads/accessories/0001/0001159-5-organo-antipsyktikoy-mpatarias-17220-11805-02.jpg.webp

Our task therefore is to replace the “/2022/03/” with “/accessories/0001/”

As “accessories” is a fixed string so there is not much to it.

In order to get the first four characters of the original image we are going to use the substring command in SQL

The syntax is subring(original_string, starting position, number of characters to pick)

update wp_posts set guid=replace(guid, ‘/2022/03/’, Concat(“/accessories/”, substr(guid ,45,4), “/” )) where guid like “%/2022/03/%webp” and CHAR_LENGTH (guid) < 247 ;

In my case the starting position of the 0001159… in wp_posts.guid was the 45th. In your domain this may be different so you have to count the characters with care.
As the meta_value field has a maximum length of 255, just to be on the safe side I am checking for the length of the image path so that it does not exceed this limit also taking into account the additional characters of the new path — so I am checking for 247 instead of 255.

So now our images is properly stored in the wp_posts.guid as desired:

https://yourdomain.com/wp-content/uploads/accessories/0001/0001159-5-organo-antipsyktikoy-mpatarias-17220-11805-02.jpg.webp

Now let’s update our wp_postmeta entries:

update `wp_postmeta` fm
inner join `wp_postmeta` ff on ff.post_id=fm.post_id
set fm.meta_value = Replace(fm.meta_value, “2022/03/”, Concat(“accessories/”, substr(ff.meta_value ,
47,4), “/” ))
where fm.meta_key=’_wp_attachment_metadata’ and ff.meta_key=’_wp_attached_file’ and fm.meta_value like ‘%accessories/%’;

This is a tad more complex query in order to modify in one go the meta_values for both the meta_key=’ _wp_attached_file’ and for meta_key=’ _wp_attachment_metadata’ — We are using this join because we can extract with certainty from the wp_posts.guid the first four characters of the file name.

Before you run this query make sure that you change your substring starting value, it is bound to be different from the original value — now the entry in the wp_posts.guid field has been changed from “/2022/03/” to the new directory “/accessories/”.

5. Basic bash commands that will move the physical images to the folders we have set in our WordPress.

Now that we have prepared the database lets move the physical files to the desired destination.

Just enter bash (#!/bin/bash) and run this command:

for f in *.jpg.webp;
do
firstFourChars=”${f:0:4}”;
mkdir -p — “../../accessories/$firstFourChars “;
mv — “$f” “../../accessories/$firstFourChars”;
done

Having done this, just regenarate all your thumbnails using in WP CLI:
wp media regenerate — only-missing — yes

Conclusion:

Now you can enjoy minimal latency in the serving of your images and your images are 30–60% smaller in size than their jpg equivalents, saving disk space for your hosting package as well as bandwidth for your CDN package as well as the bandwidth of your mobile users. The guaranteed much faster rendering of the page — is well rewarded not just in speed metrics but primarily by the users themselves who will double/triple their spending in your website, just on the basis of its performance.

The evidence of our experience with faster “new” e-shops that have identical aesthetics and prices with the “old” is undisputable. Sounds unbelievable but it is true!

Rest assured that these few days of optimization and tuning will have the largest impact in your website’s bottom-line.

Leave a Reply