Before we dive into the nuts and bolts of how GMS is implemented and how it is architected, let's first understand at a high level what's going on in terms of page fault handling now that we have this community service underlying GMS. In this picture, I am showing you two hosts, host P and host Q, and you can see that the physical memory on host P is divided into a local part and a global part. Similarly, the physical memory on host Q is divided into a local part and a global part. We already mentioned that these parts are not fixed in size; the split fluctuates depending on the activity, and that's what we are going to illustrate through example situations.

The most common case is that a process running on P page faults on some page X. When that happens, you have to find out if this page X is in the global cache of some node in the cluster. Let's say that this page happens to be in the global cache of node Q. So to handle this page fault, the GMS system will locate the page: oh, this particular page is on host Q. So it will go to host Q, and host Q will then send the page X over to node P. Clearly, if there was a page fault, that means the memory pressure on host P is increasing, and therefore it is going to add X to its current working set. That is, its local allocation of the physical memory is going to go up by one. But it cannot go up by one without getting rid of something here, because the sum of the local and global parts is the total amount of physical memory available on this node. Therefore, what P is going to do is pick the oldest page in its global part and send it over to node Q. In other words, what we are doing so far, as far as host Q is concerned, is saying, well, X happens to be currently in my global cache, so I send it to host P. And host P says, well, my working set is increasing.
Therefore, I have to shrink my community service, and we are going to reduce the global part by one: pick the oldest page, let's say it's Y, and send it over to host Q, so that host Q can host this new page Y in its global cache. The key takeaway for you is that, for this particular common case, the memory pressure on P is increasing, so the local allocation of the physical memory goes up by one and the global allocation, the community service part, goes down by one on host P. Whereas on host Q, the split remains unchanged, because all that we have done is trade Y for X.
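The bookkeeping in this common case can be sketched in a few lines of code. This is only an illustrative model of the accounting described above, not the actual GMS implementation; the names `Host` and `handle_fault`, and the use of Python lists for the page sets, are my own assumptions for the sketch.

```python
# Illustrative sketch (not the real GMS code) of the common-case page fault:
# a process on P faults on page X, which is found in Q's global cache.

class Host:
    def __init__(self, name, local_pages, global_pages):
        self.name = name
        self.local = list(local_pages)       # working set of local processes
        self.global_ = list(global_pages)    # community-service cache, oldest first

def handle_fault(p, q, page):
    """Page `page` faulted on host p and currently sits in q's global cache."""
    q.global_.remove(page)      # Q sends page X over to P
    p.local.append(page)        # P's working set (local part) grows by one...
    victim = p.global_.pop(0)   # ...so P evicts the oldest page Y in its global part
    q.global_.append(victim)    # Q hosts Y in its global cache in place of X
    return victim

# Example: P faults on "x", which is in Q's global cache.
P = Host("P", ["a", "b"], ["y", "z"])
Q = Host("Q", ["c"], ["x", "w"])
victim = handle_fault(P, Q, "x")
```

After the swap, P's local part has grown by one and its global part shrunk by one, so P's total physical memory is unchanged; Q's split is also unchanged, since it simply traded Y for X.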