Follow us on Twitter! @loadintheloop
Access the git commit here.
Sponsorship provided by blackfire.
LITL Episode 1
[00:00:00] Eric Hileman: I can go first, I'll introduce us.
[00:00:03]Ivan Chepurnyi: Okay. Start it.
[00:00:08] Eric Hileman: Yeah, so this is Load In The Loop with Ivan and Eric. I am CEO and co-founder of MageMojo Magento hosting we specialize in hosting for Magento on AWS and we run into a lot of performance issues? So that's why we have Ivan here and I'll let you introduce yourself.
[00:00:34] Ivan Chepurnyi: Yeah. Hello everyone. My name is Ivan. I'm running a performance consultancy as an independent consultant with my own BB in the Netherlands . I work a lot with merchants and solution partners and also, I run into a lot of performance issues not only in a custom code built by others, but also a lot of performance issues in Magento core itself.
[00:01:03] Is this the show is going to be more about performance and about the issues we face and how to actually to fix them. So it's going to be a little bit of fun and a little bit of educational material for everyone. So I hope everyone going to enjoy it. So Eric, can you tell me a little bit about the project we're going to review today.
[00:01:27] Eric Hileman: Yeah, so we have this M1 site, fireplace, fireplace stores and it's a customer of ours and what we've been seeing is: they are running into slow site issues and the slow site when we look at it, we see a lot of long running PHP-FPM processes and when we attach to those and we S Trace them, it looks like it's just core URL rewrite over and over and over and over and over and over which looks like a Load in the Loop.
[00:01:58] Ivan Chepurnyi: so far. I was already browsing this website. And actually I already found the load in the loop but let's just explain a little bit the process how we go through it. Right? So first of all, we try to profile every single page we see and let's start with the home page here. We see there's actually a lot of brands listed and models of the fireplaces as far as I understand. I'm not a good at fireplace products, but looks like a business-oriented website like B2B right? Or something similar like it's very simple. It has nothing to do with presenting product with pictures and so on. It's more kind of for people who know their job and just finding products by part numbers, right?
[00:02:53] Okay, so we actually got a very nice sponsorship from Blackfire so [00:03:00] big shout out to them and they provide us licenses to be able to profile your stores. Live here on the show and give you valuable advices? So thanks to Blackfire and we are right now using Blackfire in order to find issues on this customer's website.
[00:03:22] So let's start with profiling right? So I'm going to profile homepage and when you start profiling right you already can see if it's fast page or not. Here, we see it's incredibly fast because. Blackfire usually runs kind of ten iterations and then it gives you an average profile results based on ten iterations.
[00:03:46] And in this case, we see that yeah, it's very fast. It's on average half a second to render homepage. That is a good result. There is nothing actually to see here and in [00:04:00] profile. So let's go. Further on so let's just click on any of the links here and a home page with Brands. So I'm going to go into this one.
[00:04:10] Yeah seems also like a simple page. Let's just run a profiler really quick.
[00:04:17] Eric Hileman: It's loading pretty fast.
[00:04:20] Definitely cache is on.
[00:04:23] Ivan Chepurnyi: Looks like cache has nothing to do here because as far as I see if there would be a full page cache on or something like that. We would see less database queries here, 77 database queries, and it's a good result for average category page.
[00:04:41] So I have a feeling like yeah, this is just page without much of the data. If this customer would sell configurable products. We could see much more database queries here because configure products are very prone to load in the Loops. So [00:05:00] here we have also, I see some kind of a drop-down. Let's analyze this one a little bit.
[00:05:07] And see it's hard to get a quick over here. Okay. So here I see there is kind of Brands so you can select the brand and then we see there is two last items that give me a little bit of hope that we can find some...
[00:05:25] Eric Hileman: Yeah complete model listing, yeah, that looks like a lot are in stock. Yeah checking
[00:05:30] Ivan Chepurnyi: Yeah, so in stock Fireplace parts and fireplace parts complete model list and let's start with in stock fireplace parts. Let's see what's going on on that page. Okay. So it lists quite a few products, let's run a profile here. And let's see what it's going to show us. And I'm gonna grab some sips of coffee.
[00:05:58] Eric Hileman: This'll take a little bit.
[00:06:00]Ivan Chepurnyi: So I see we even didn't hit the first request. So you remember homepage like it was like one after one after one after one and here we have to wait a couple of seconds. So it's an indication that there is something happening over there.
[00:06:20] Eric Hileman: How much overhead does Blackfire add in this profiling and does it exclude that when we look at results?
[00:06:27] Ivan Chepurnyi: No, unfortunately not. So usually I would say it's something like 50% because Blackfire needs to put hooks on every single method to function, locale, and so on in order to know how much time it took. And in order to create a report how many times this particular method has been called or this particular function has been called what kind of database query was running and so on so it takes a lot of time to do it for Blackfire.
[00:07:02] So usually it kind of doubles the execution time. So the numbers that you see. Actually in Blackfire you can just divide them by 2 and then you get more or less reasonable result and in case if you just want to know the actual uncached result, you can open always debug Tools in your browser and check what is the waiting time of the response from the server. It would be the best indication of how much time it took to render your page. So here we see like it takes 7 Seconds. What is interesting, I was expecting actually to see here a lot of database queries. Yeah, but here, right it looks very suspicious because there is actually only 77 database queries.
[00:07:47] There is nothing happening on iO side as well. So let's analyze here. So here we see that big memory used. This is 16 megabytes almost nothing. So CPU time is actually the biggest time here and here we see the probably there's some kind of CPU bound process as it happens. On this particular customer website, right?
[00:08:14] So what what can we really tell from this result? Right? We cannot tell anything unless we actually dive deep into the code what I really like in Blackfire that actually can tell you the piece of the part of the page that actually slows you down a lot because they have this amazing view called timeline.
[00:08:39] And in Timeline, you can go into complete rendering process. It shows you actually kind of like a waterfall of every single PHP functions that's gets executed. So here right away or right out of the bat. I see that. This customer has not disabled session locking in redis. So...
[00:09:06]Eric Hileman: That's the usleep.
[00:09:07]Ivan Chepurnyi: Yeah, so that's it usleep and this is actually exclusive time of the usleep. So it adds at least one point four seconds right on his page
[00:09:19] Eric Hileman: Can you talk a little bit about that? What is that usleep in the redis session? Trying to get a lock?
[00:09:26] Ivan Chepurnyi: Yeah, so it's trying to get a lock. It's actually very bad to lock implementation because usleep will put your process to sleep. And if you have a lot of concurrent processes. You are actually not eliminating any kind of race condition possibilities by usleeping because you are just expected somewhere in the future, you will wake up and the lock will be gone.
So you can process with or proceed with the request. And in this case, we see that it's you sleeps at least one, two, three, four, five five times. So 5 times you sleep. It means the process was checking five times if the lock was released and only on the fifth time. It didn't see any awake and it was able actually to get the data from redis session.
[00:10:19] However, locks locks itself are redundant on reading the data from session storage because when you write into a session storage, it's an atomic operation because it's only one key itself and and looking over there and checking for possibility of the lock is completely redundant. So I usually always disable session locks on reddis.
[00:10:50] Eric Hileman: Yeah, so that's a change you can make on the redis side to disable the locks.
[00:10:56]Ivan Chepurnyi: It's Magento. I think it's Magento 1, right? So in Magento 1, it's app/etc/local.xml dot XML. So when you configure redis as you have one of those flags and one of those flags is actually disabled working.
[00:11:09] Eric Hileman: Oh, yeah, right.
[00:11:10] Ivan Chepurnyi: Yeah, so you can put it over there and another one that stands out a lot we can see it's huge green bar here in the bottom, right? It takes another six seconds. And this is Major review model review append summary. And here where I usually, as reviewer of the websites have performance issues, need to get an access to the core code or to the code that actually I see here that gets executed. So for instance if it would be a custom of functionalities that was built on his website. I will get in contact with the customer and get the complete code base with all the extensions installed everything in order to be able to look at the code and see what actually happened over there because backfire only gives us hints.
[00:12:04] And the actual problem you need to dive deep and analyze more and here for instance, we see that it's actually a core functionality of Magento, in Magento 1 actually that produces a huge performance bottleneck. It's called append summary so we can just go online. Let's say we can go into GitHub. And there is this nice OpenMage project.
[00:12:32] I think right? And we can go into my Magento mirror. We can go into particular modules that is an interest for us. So I'm going to go into the review again. What was it? It was model review. Okay, so it was model-review, right? And what we can go here we can go into this append summary. So let's take a look at this one.
[00:13:05] So if we read through this code what stands out here, right so it receives a collection and this is a collection of products. If as we see from the method signature. Does this collection of products is used in order to collect all the IDS of products that might have reviews being loaded additionally for this item then if there are no products in collections, then it just returns: reasonable. And then if there are some products it actually adds an entity IDs as a filter for retrieving data from another collection called review summary collection. So this one contains a summary of the review like the rating of the product and not total number of reviews. And so on. And here we can see that yeah, it also adds a store filter and then loads it and then it walks through this collection of products Okay, so.
[00:14:18] Here I see there is a very small amount of I/O operation. You see it's only 227 nanoseconds. So it's very small. So this is probably only two database queries that are happening over there and these 2 database queries. It's actually our load of the product collection and load of the summary collection most probably.
[00:14:45] Aunt here then we can see that there's actually an asset for each and well what I mean about necessary shy means that there is one collection gets iterated through and then inside of it w, actually iterate through another collection of the data, right? You see here, there's actually two for each statements.
[00:15:06] And this is already weak red sign because if we take a look at Big O notation, it's kind of standard for mathematicians and computer scientists to explain complexity of an algorithm. And in this case, this kind of two loops in order to find particular data Based on data in these two Loops is called Brute Force search.
[00:15:41] So Brute Force search and it's when you actually visit every single item from one collection of data and every single item from another collection of data in order to find if these two items are related to each other. This is called Brute Force and this is Brute Force approach is actually what's causing the slow down for this customer.
[00:16:07] Right. So the problem here is that the Magento core code in this particular area actually does two necessary iterations and these two necessary actions are completely redundant because of one single reason. Let's try to actually edit a file and just show what the real implementation should look like because yeah, we can't edit the core, but we can do it for educational purpose right to explain how you actually can fix this problem very easily and very simply so the customer probably would have to override this model in order to fix this performance bottleneck.
[00:16:53] So here we have a collection of products. But what's interesting in this case, we're actually not interested into iterating all the products because we anyway have to assign only by this entity PK value, a value to the product if there is some data available for it. So what I can do I can actually remove this Loop completely.
[00:17:17] And when I'm doing it, then I only iterate through available summaries of the reviews. So if there is a review we're going to assign it to product if there is no review we just don't do anything. So we're just not iterating not not doing any additional redundant work. And here Magento core collection actually has already a very nice method we can use so we can use here instead of this condition.
[00:17:51] We can create another one. We can say that item equals collection get item [00:18:00] by ID. And this get item by ID will use a hash map inside of the Magento product collection in order to find product by ID so here we just specify this summary, get BK, yeah, it's entity BK value. Okay. So this is the name of the method so here is actually a very simple fix for this functionality. So we check if we get an item by ID and if we get an item by ID, we just set to this item this rating summary and just applying this very simple change would remove a lot of time from this rendering so I don't want to actually run this 10X iteration profile for this page because I already know it's going to take a while and probably is going to take the rest of the recording if we just run it, let it run 10 times.
[00:19:07] So I'm gonna use a feature here in Blackfire called disabled aggregation. This would allow me to run a profile only once instead of 10 times. So let's wait till it finishes running and let's go back to the code and discuss it a little bit and here is the code that we actually modified so we just walk through the summary if it exists if it doesn't exist, we don't do anything as this one was doing already. It was just iterating through all the products and then iterating through all the summaries. So basically. Even if there were no summaries to deal with, Magento was still iterating through every single product anyway in this specific method.
[00:19:58] And here, right, we can see that we just replace the situation with simple lookup by hashmap. And then we just set a rating summary to an item itself. So this one would replace this brute force look up or search by actual Big O notation OM. So it will only execute it number of times the number of items we have in original collection because hashmap lookup is 01 so it means the complexity is very low.
[00:20:41] And the brute force look up has a complexity of big OM in power of M. So where M is number of one collection and M is number of other collection and it means that they actually have an exponential number of times with the number of data we have, we have bigger number of iterations and operations we have to perform.
[00:21:09] So that's why it's called Brute Force search. So let's take a look. Is it finished? Yes. So here we can see that. Yes, the time is actually exponential because we just increased the data set a little bit but it's already like three times more right? From the point of view of CPU time. And yeah, here are all products we can see it has this timeline here and let's see.
[00:21:42] What's happening here. Yeah. Now we see that this append summary is actually what takes most of the time. So, before in this profile, right, it was taking five seconds and here it takes all 28 seconds in order to render this page.So I can tell for sure that this customer of yours will be very happy by not paying too much in future for his all product category page.
[00:22:15] Right? And and yeah, I think like there is some merit why no one actually notices issue before in Magento core. So it's kind of unusual when you actually show all of your catalog on a single category page.
[00:22:35] Eric Hileman: Yeah. I'm also curious why they're calling append summary. It looks like it's just the link. Do they even really need to be calling append summary? That's a review summary, right?
[00:22:50] Ivan Chepurnyi: Yeah, it's just a standard Magento Behavior. So when you are on a category page, and when you load a list of the products, and you have Magento review module enabled, then Magento tries to add a review item to every single product you have in your collection and with the case of this customer: unfortunately, he has too many products. And yes, there is not much database operations because majority of the issues I deal with is when actually someone calls database query for every single product of every single configurable assigned to it. So. This is very interesting problem was actually to find. So thank you very much for this very nice use case because I was really surprised and with my open eyes when I saw actually this number of database queries because for me like 30 seconds and you have only 75 database queries?
[00:23:53] Yeah, what's going on there? And actually supporting this issue in the core code was a very nice exercise.
[00:24:04] Eric Hileman: Is there anything out there already to override? Like I'm surprised nobody's like created an extension or something that would optimize that?
[00:24:15] Ivan Chepurnyi: Yeah, it sounds like not a lot of people are actually facing this issue because not not my diversion is actually showing all the catalgog.
[00:24:23]Eric Hileman: 80,000 products on a category page..
[00:24:26] Ivan Chepurnyi: Yeah, so I would even say like 30 seconds. It's probably a good result for this kind of number of products. So yeah in general. Yeah, how many nodes does this customer have to to actually bootstrap in order to support both to browser/website.
[00:24:58] Eric Hileman: Oh, yeah, that's... We've had to Auto scale it in there. They're not even... The traffic's very low and it's hit triggering auto-scaling. So, you know, on Stratus we pay for that. So the customer just pays per Google analytics session and Google does a good job filtering out Bots. But our team still has to restart PHP fpm for them once in a while because it gets hung.
[00:25:25] Ivan Chepurnyi: Yeah, so basically yeah, it's very interesting. So we actually don't make any money with this customer, right?
[00:25:34] Eric Hileman: No, no and the customer, and the customer gets upset too because they come to us and they say well, "Why isn't... What's wrong with the server? Why isn't the server working? Right? You know something must be wrong on the server" and I'm like well, I don't think so.
[00:25:48] Everything looks good.
[00:25:50] Yeah, I've seen a lot of looped things...
[00:25:52] Ivan Chepurnyi: It's very interesting thing. You know, like all the time get surprised when a customer actually decides to migrate from one hosting to another. Just because the minuscule of the difference between performance because maybe there is some different MySQL configuration, or redis configuration but usually like migrating from one hosting to another doesn't make sense unless you try to optimize your software, because if your application is slow, if your application doing this stupid things like we just say alright?
[00:26:25] Eric Hileman: It comes down to support.
[00:26:27] Ivan Chepurnyi: Yeah, you know know. No, no matter what kind of hosting you have maybe yeah, if you migrate to AWS with the most recent CPUs, maybe it will be a little bit faster. But yeah, why would you pay for these CPU Cycles anyway, if you can actually make much more efficient algorithm?
[00:26:51] Eric Hileman: I think a lot of customers can't figure that out. They don't have the knowledge between the server and the code so they look at the code and they don't , like they see that but they don't. Like for example, our team will connect with S Trace and I don't know any other host that will do that. They'll actually go into the code say it's this file, it's this line number, you know, our team will attach S trace to the PHP fpm process and profile and says, "there's definitely something looping going on here."
[00:27:22] So then our team would continue investigating it until they figured it out. But now that we have this podcast, we can pull out these examples that we would normally do and that's the main reason people change host is because of the deep level of Magento knowledge.
[00:27:37] They kind of keep hopping from one host to the next until they hit one that has the tools and the team and the time to put into it to help them understand how that's hitting the server because the developers themselves usually don't get that kind of deep level of understanding of the server; where the server meets the code or the rubber meets the road.
[00:28:01] Ivan Chepurnyi: Yeah what I see as a problem in general in developers like lot of developers think in point of view of objects. So no one thinks about the actual service layer itself and what data you need in order to render this part of page. They think, "Hey, which object I need to grab from Magento in order to render this part of the page?" And then they end up actually looping through a lot of objects where they only need need to let's say only one property out of the table.
[00:28:34] You know like no no one thinks about hey, if I need only one property of the object, maybe I don't need to have the token. Maybe I have to create another service module that will do a lightweight query with number of IDs of the entity. I'm trying to get data from you know, it's it's always kind of finding a balance and I'm really surprised at not that much people actually pay attention to these kind of problems. You know? I have met so many different solution partners and I was working also out of whose merchants and when merchants come to me and tell me hey, can you please help us out with the performance? I try to make it as much as possible clear to session... Solution providers that they're using, how to actually avoid those issues in the future.
[00:29:29] Eric Hileman: You do train though right?
[00:29:30] You do onsite training?
[00:29:31] Ivan Chepurnyi: Yeah, I do on-site trainings for solution providers. But when I get paired with a merchant, Merchants usually don't want to pay for training of solution provider. But I anyway try to at least have one two hour meeting after performance audit with possible fixes that I applied for a merchant website in order to explain to them what I actually found, how I found it, how I use those tools, and how they should avoid those issues in the future. Because iIdon't want to be a person solving stupid performance issues. I want I want to you know, work on the more complex solutions, create some nice services, you know, like build microservices that perform very well. I don't want to go in the merchant website and fixing all those load in the loops.
[00:30:28] I want to see less of those in our ecosystem.
[00:30:32] Eric Hileman: So do you work with agencies, then you'll go on-site and do training what their developers to try to...?
[00:30:38] Ivan Chepurnyi: Yeah. Yeah. There's the most recent one was Imagination Media in Florida. It was very nice trip. Just right after Meet Magento New York. Ali asked me to come over and help them out with one of the merchants and we had actually a week of on-site performance optimization.
[00:31:00] So we were doing performance optimizations of one of their customer projects directly with them and it was interesting because we actually reduced the number of database queries that was happening on their Magento to store from something like 3000 to 60. You can imagine. you know, how many extensions we fixed and how much core code we also fixed.
[00:31:33] Eric Hileman: That's next.
[00:31:34] Ivan Chepurnyi: Only one week, only one week. You know, like it's crazy because those issues are easy to spot and easy to fix. It just usually about hey, here we have to get the data from the index here, we have to get the data from something... Maybe from a simple database query we have to do for list of products not for every single product.
[00:32:04] So we just fix those parts. And as well, you know, there is a lot of things that are related to unnecessary HTTP calls as well or for instance a nice use case. I think we found with amasty modules right? When amasty modules also time check their version from a file system by loading composer Json file...
[00:32:36] Eric Hileman: Oh, yeah...
[00:32:36] Ivan Chepurnyi: And it's like adding like half a second to the page load time because composer validates all the JSON structures runs a lot of CPU logic that was there and it's just terrible so people just... that do other things and for instance, I remember we were also doing something as well with your team. We were doing training.
[00:33:00] For meet Magento New York and with your team, we actually hit one of your customer websites and it was having some kind of an extension that was just logging its own version number into the lock file, but in order to obtain this version number it was using not the constant, it was using a value from configuration file in model dot XML and this module dot XML when you try to access this information Magento has to hit uncached version of the configuration.
[00:33:40] So it has to walk through every single module that is installed in Magento, read the XML file, mesh it all together and then inside of this big XML structure, it has to pick up this single module and return you back this version number. So sometimes developers never profile their own code that they write.
[00:34:05] So extension developers just don't care like hey, here's an API. They don't think about what's happening behind the API. And this is so it was a bit of a problem because some API have different groups. For instance that API for version retrieval of the database? It's meant to be used only during the setup process.
[00:34:30] So when you set up a grade or where you're doing a set up DB status checks, but it's never meant to be used on the front end because on the front end you already get updated database. You're never comparing your versions of the database schema. So it's it's kind of things that just happens to be and you have to find them and then just outburst on Twitter and tell everyone. Hey guys don't do it.
[00:35:03] Eric Hileman: That's you know part of this... Is to call out the extension vendors and I know they can you found one extension issue and then found a GitHub report of it from years ago about the issue and it's still hasn't been fixed.
[00:35:24] Ivan Chepurnyi: So yeah, I think we need to need to make a little bit of an announcement before we close the show, right?
[00:35:32] So we need to tell people that they can actually submit their stores for review. Right. Yeah, so guys if... And girls go and submit your stores. If you are with MageMojo it's going to be very easy because MageMojo has Blackfre installed. If you are not at MageMojo, we have to do a little bit of migration..
[00:36:01] to a temporary... yeah a temporary migration toMageMojo in order to be able to profile your website. But if you already on MageMojo it just a matter of minutes in order to start and review your website. So feel free to contact MageMojo or contact me. I can forward all the requests to MageMojo.
[00:36:27] Eric Hileman: Yeah. Yeah. What's your what's your email address Ivan?
[00:36:30] Ivan Chepurnyi: Oh, it's email@example.com.
[00:36:39] Eric Hileman: You can reach us LITL@MageMojo.com... Load in the loop. You want to if you want to get reviewed? Be happy to do that for you.
[00:36:57] Ivan Chepurnyi: Yeah, and you always can reach us out on Twitter as well.
[00:37:01] So it's I think is the fastest way to get our attention.
[00:37:07] Eric Hileman: What's your Twitter handle?
[00:37:09] Ivan Chepurnyi: The same as my name: @ivanchepurnyi. But yeah, I think most of the people can find me on Twitter because a lot of people know who I am.
[00:37:22] Eric Hileman: Yeah, you have more followers than I do. That's just my name @ericvhileman.
[00:37:31] Ivan Chepurnyi: So I think it's time to close the show. Yeah, and thanks everyone for tuning in. I hope everyone likes the show and till the next time bye.