Click on the imagery below to see it in full screen and use the arrows to navigate around the area. Google Street View, you should check out InstantStreetView. Binary accounts for you. It can even be accessed from the web browser on your mobile device. Why not explore some of the best places on Earth with Street View? This material, and other digital content on this website, may not be reproduced, published, broadcast, rewritten or redistributed in whole or in part without prior express written permission from PUNCH. Google Street View app you can use.
You can use request to edit road segments to suggest a new location be reviewed and possibly added at some point in the future. You can click on that to view it in fullscreen so you can move around and start exploring. Pegman, then that means Street View is not available for that location. The Instant Street View site is great if you want to look at a specific location immediately, but if you know how to use Google Maps already, then you can easily switch to Street View from there too if the location you want to look at has been photographed by the Street View team. So you plugged in your home address and got nothing. Consider checking back in a few months or so to see if your house or a particular address has been added to Street View. If what you type in is too vague, a dropdown list of options will appear as suggested locations that match your entry. Some rural areas are still being mapped.
The 360 imagery closest to him will appear below. Start by accessing Google Maps by navigating to google. Google Maps as a way to contribute, so that you can help users see more of what they want to see in those locations. When you find the right place, you can use your mouse to click and drag around to change direction, and use the arrows at the bottom to move backward, forward or sideways. We can only send, in this case, four, which is the previous value for this congestion window. So this is the only value that has changed, so in this case, I would only send this one key value pair, which is great. Let me close the connection. The speeds are increasing. This is the graph that is the key graph that got Google to start thinking about this seriously and actually start the work on SPDY back in, 2009, I guess, or 2008 even.
It was like, get this resource, version number. Thankfully now we have some automation for a lot of that, but I still know people who do this by hand, which is kind of sad. Most recently it has actually been updated just in the last year to 10 packets. Oh, this thing that I received belongs to that stream. This is something that we need to do a lot to optimize. We only send 20 kilobytes. This is another actually interesting opportunity for servers to do a smarter job, and intermediaries as well, in terms of what are the right algorithms for doing the eviction of these headers, so on and so forth. TCP Slow Start is a feature, not a bug. If you follow all of that, that has to happen within, well, ideally milliseconds.
Google is trying to figure out, how do we make our product fast on mobile? Then we send the request. This is not even to your actual server. Actually, let me go back. After that, we actually just extend it. That also has negative costs. You had to reboot the whole thing. If we saturate all of our links, we can just dig another tunnel and put more fiber; we can just bond the different links and get more throughput. Compared to what we were building five or even ten years ago, the web today looks completely different.
HTTP is not fast enough. The type basically gives you what type of a frame is being communicated here. It just terminates the connection. So these are the current limitations. This is an entire frontier of performance that I think not a lot of people are paying attention to today. Earlier this month, we actually had an interop testing session in Hamburg. We like to write cool things.
First we have to open the TCP connection, which is the SYN and SYN ACK. What if I have those three assets in a cache? How many people here are familiar with Slow Start? We are limited to 6 connections. All the stuff should be handled in HTTP, which is how it should have been to begin with. Basically there needs to be a mechanism to figure out where you currently are and which tower is currently servicing you. So it kind of sucks. Java client, you send the request to the server, the server can push multiple responses back to your client and you can do smart things with it. And once a year we basically run an analysis of what is the average or the median page load time on mobile versus desktop. Thankfully, as I mentioned, 4G and LTE deployment, for once, North America is leading deployment of this.
So most of the shift in this latency is not represented across the world. So they construct that map and then they start pushing these assets to future clients. We have the best performance. North America, which is good news for us here. So what if it could send you all three requests at once? So you actually send all these key value pairs to the server. One, we need to talk about how TCP works.
SPDY, we remove all that logic. And this is just your last mile latency with DSL. Multiple streams can flow over a single TCP connection, which is the same thing as saying multiple requests can flow over a TCP connection. And some of those things have worked out and have been great, and some of them have not. TLS time, which takes another couple of round trips. So control plane, the idea here is, first before you can send anything from your mobile phone, you actually have to talk to a tower to get permission to send data. Then we abort the connection, which sucks. Is this such a big change?
Google, of course, has been using SPDY for years now. Each stream, the client and server, whenever they create a stream they declare an ID on it, like 1, 3, 5, 7, 9, etc. We know that we have problems in the HTTP layer. It sends the packet to the serving gateway, and the role of the serving gateway is to figure out where you are on the mobile network. Because as technologists we like to solve things. New York City to London. On each request, you just toggle those bits.
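To make the stream ID idea concrete, here is a minimal sketch in Python. It assumes the client takes the odd numbers (1, 3, 5, ...) and the server takes the even ones, so both ends can open streams on the same connection without ever colliding. The class name `StreamIdAllocator` and its method are hypothetical helpers for illustration, not part of any SPDY or HTTP library.

```python
class StreamIdAllocator:
    """Hands out stream IDs for one end of a connection (hypothetical helper)."""

    def __init__(self, is_client: bool) -> None:
        # Assumption: client-initiated streams are odd, server-initiated are even.
        self._next_id = 1 if is_client else 2

    def next_stream_id(self) -> int:
        stream_id = self._next_id
        self._next_id += 2  # stay on the same parity for the next stream
        return stream_id


client = StreamIdAllocator(is_client=True)
server = StreamIdAllocator(is_client=False)

client_ids = [client.next_stream_id() for _ in range(5)]
server_ids = [server.next_stream_id() for _ in range(3)]
print(client_ids)
print(server_ids)
```

Because each side only ever increments by two, neither end has to ask the other which IDs are free before opening a new stream.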
Which I think is a surprise to many people, engineers included. We have Japan leading the pack. This is going to work. We record how the page is constructed, like how many JavaScript files, CSS files, etc. So what ends up happening is we end up opening a lot of these TCP connections, and we never ever use the actual, or frequently, I should say, not never. This effort actually predates January 2012. Basically, this company has figured out that, hey, there are traders in New York, there are traders in London, that care about latency.
So, you do that. All of the previous optimizations like positioning your data closer to the user still apply. Each of those TCP connections has a memory buffer, which is quite costly in many cases. One keeps an even number. TCP connection, this is all for nothing, this is completely useless. What have we done over the last decade?
MessagePack has its own RPC layer. There are two parts of this. Then there is a stream identifier. But the first one is not finished. But you already know that the client, or the server in this case, already has all these values from a previous request. So this is the exponential growth. Internet fast as a whole. So that kind of makes sense, right?
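The idea of only sending the header values that changed since the previous request can be sketched roughly like this in Python. This is a simplified illustration, not the actual SPDY or HTTP 2.0 header compression scheme: the function name `header_delta` and the sample headers are made up, and a real encoder also has to keep both ends' header tables in sync.

```python
_MISSING = object()  # sentinel so a header that was absent always counts as changed


def header_delta(previous: dict, current: dict):
    """Return (changed, removed): what the peer needs to rebuild `current`
    given that it already holds `previous` from the last request."""
    changed = {
        name: value
        for name, value in current.items()
        if previous.get(name, _MISSING) != value
    }
    removed = [name for name in previous if name not in current]
    return changed, removed


first_request = {
    ":method": "GET",
    ":path": "/index.html",
    "user-agent": "example-browser/1.0",
    "accept-encoding": "gzip",
}
second_request = dict(first_request, **{":path": "/app.js"})

changed, removed = header_delta(first_request, second_request)
print(changed)
print(removed)
```

In this example only the path differs between the two requests, so that one key value pair is all that would go on the wire for the second request.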
This is exactly the reason also why WebSockets work over TLS for mobile and other cases and they break for a lot of clients, especially on mobile, when running over vanilla HTTP. Hey, previously I was servicing the JavaScript request but the JavaScript request has higher priority. But then there were a couple of attacks discovered against it, basically security problems. Our last mile latency is x milliseconds. With HTTP, originally we started with a very simple protocol. Two, it actually has negative implications on the memory use. So how does push work, CDNs and intermediate cache? One keeps an odd number. So that tells you that if you want to run out and upgrade your connection and buy into the advertising of newest, fastest whatever offered by your local provider, your page is not going to load faster.
Which kind of sucks, right? You want to compress the data, etc. So we can send up to 15 kilobytes of data, which is a significant improvement over the previous value, which was three or four packets. So this is completely automated, which is the nice part about it. Once you know the length, you can figure out the type. We want to preserve what we have. So these are the kinds of things that need to be fixed at the server layer.
But basically most of the countries are near or well above the five megabit per second limit. We just said look, just gzip through the damn thing. Get request for example. It was just literally one line. Stubby or something else within a company. So we can send you 15 kilobytes of data, then we have to pause. This was definitely a surprise even to us when we ran these experiments. It basically starts a negotiation with a local tower. If you actually dig deep they will show you these numbers in there.
TCP connection, how do you rate limit or control the allocation of resources between those flows? If it goes above that, significantly above that, above one second, you basically lose the context. MME instance which is basically like a user database. So, how do we make the switch as seamless as possible? But if you control both the client and server, go for it. Let me just shard that across ten different domains. This is a big thing. And as you saw previously in Akamai slides that I showed you earlier, most of the people, an average in the US, is over five megabits per second. So the basic idea with HTTP pipelining is, by default, HTTP provides no multiplexing in the sense that you send a request and you must block and wait until you get the response. So today, actually as of July 2013, we actually have the first implementation draft.
This is just something we have to deal with. We just want to use one TCP connection because that is, in fact, the best way to get the best throughput. So all of that needs to be replaced. JavaScript file, which is critical, because we need it to render the page, and we have a couple of image assets, which are not critical. And if you do the math here, you will figure out that an average request is about 14 kilobytes per request. Right off the bat, what are we trying to solve?
So you send the packet from the external network. To keep the user engaged, the task must complete within 1000 milliseconds. Now what you can do is, you can send a naked request. So this is basically how do we make HTTP work better with TCP? Look, so why is this problem? We want to have everything fast. Hey, six requests in parallel.
You may get an upgrade in your quality, but your page is not going to load faster. We send the request. The towers broadcast the signal. JavaScript to deliver smarter applications now. That also requires quite a bit of plumbing and architecture in terms of how do you deliver that to client. Our pages are much, much bigger than that.
We will just keep one constant and we will just increase bandwidth and see how that affects the page load time. So once again, this is something that both web developers and server developers need to really carefully think about, like how do we leverage this new thing that we just never had in HTTP before? We are terraforming earth between New York and Chicago to build faster links. ALPN adds a mechanism into TLS negotiation where you can actually negotiate the application protocol that you want to use during the time of the handshake. We said, look, HTTP requests are expensive, especially small ones. TCP packet loss happens. Well, then you can actually cancel the stream.
The server needs to be smart. New York and back in 43 milliseconds. And Google servers, we try to position them as close as we can to all the ISPs for this exact reason. Hey, I have a video stream and I have this stream. Each frame can have a number of custom flags that each frame defines. We would like to see it improve. Grab all the JavaScript, stuff it into app. So there are some interesting examples of people innovating in this space.
What are the resources you should push? But then also Facebook, Twitter and others started picking it up. We can just make these requests. Then, finally, you embed the headers. Hey, this is a JavaScript file. HTTP connections to the client. So these are the stats.
JavaScript and CSS and images. So the good news is it is a little bit smaller on mobile, so we are optimizing for mobile, but nonetheless. Ilya Grigorik is a web performance engineer at Google, where he spends his days and nights on optimizing the web stack, and driving adoption of performance best practices. TLS, TLS optimization is critical because a TLS handshake is actually very costly. At least, we clarified a lot of the caching. Is it a headers frame or a data frame or something else?
So these sprites are actually occupying quite a bit of memory on mobile devices, which is a problem, actually, for a lot of mobile devices. It is a problem. By January 2012, we had Chrome, we had Firefox supporting it. We anonymize all that data. So my phone is sleeping right now. But the other one that I think is also surprising to a lot of people is slower execution. But if you have an intermediary, then it can be smart about it too. Most clients in most languages, especially, actually, the default HTTP clients are terrible. The serving gateway has no idea. We know that this thing works.
So hopefully by now I have convinced you that latency is, in fact, a problem. This is literally a talk on its own, but just stay with me. Today we do a lot of interesting tricks, hacks, if you want to call them that, in the browser to try and kind of game the system and figure out which requests do we send because we have a limited number of requests, etc. We wanted to render our page in one second. We want to preserve the ecosystem and make it as seamless as possible, ideally, to migrate. If I send the image request first, will I get them back quickly? The end result of all of this is actually, well, simpler applications. And specifically the fact that we have a very strong rollout of 4G networks across North America. We have to do the socket connect. If you click on a button we want to respond to you within 100 milliseconds.
TCP performance part becomes even more important in many regards, so you should definitely upgrade your Linux kernels, make sure that you have the latest TCP window or congestion control in place. Oh, mobile networks are so unpredictable. It has just been stable, which is not good. We can write the spec, but the servers need to get smarter. HTTP upgrade flows, if you guys are familiar with WebSocket. But of course, one of the gotchas here is the server needs to be smart about this. They are the two components of speed. The only reason it exists is to work around this limitation, which is imposed intentionally by the browser vendors to say that too many connections actually hurt you, all right, because it causes congestion. Every year, we actually use Google Analytics.
And our pages are not one request. Actually, Microsoft has built a server implementation, so Chrome and Firefox were testing against the Microsoft server. It just knows that generally this person seems to be in the San Francisco area. So all of that kind of in a simple picture here. Like, how do the mobile networks work? That communication with the tower actually takes anywhere from hundreds of milliseconds up to seconds.
So FCC, for the past couple of years, has actually been doing a report or a yearly study, which has finally started to capture some of this data. Basically, the theoretical limit is 35 milliseconds. This is just a fresh request. Which is kind of sad. The flow outbound from your device is a little bit simpler. But then you look at latency. At some point, packet loss will happen, at which point we will restart this algorithm. We want to get the best performance out of a single TCP connection.
Mountain View earlier today. Part of the thinking, right now at least, is that CDNs can actually be providing this push. Never, ever would they advertise such a thing. Today the web is basically built on this model right here, which is sequential, and our only work around is to just open multiple connections. Not limit, just five megabit per second threshold here. So this is data from 2007 to basically the beginning of 2013, and you can see that there is a strong trend towards basically increasing throughput, or bandwidth across the world. So this project costs about half a billion dollars.
So the cool part about this, these graphs are actually comparing 2012 to 2013. Two, is we want to address head of line blocking. It turns out it actually, in large part, explains the variability that a lot of people experience with high latency variability. We have 31 bits of priority space reserved for that, so you can be very sophisticated, if you want, in how you prioritize this kind of thing. And bandwidth matters for things like Youtube videos and HD videos, you like to watch Netflix, what have you. So why does this affect HTTP in particular?
The network will save us, the 4G. You start with one megabit of throughput, and the page loads in about three seconds. So this is a big problem on mobile. TCP performance optimizations done since then. And how your server implements the logic of when to increment that window is completely up to you. Oh, this user is under attack. This just creates more and more latency. So when you connect to a server we will open up to six parallel connections, which means that we can transport, at most, six requests in parallel, or get six responses in parallel. We can continue increasing bandwidth, but latency is a problem. It takes multiple roundtrips to do that, so you have to pay attention to your certificate size, you have to optimize your record sizes.
It makes it very, very efficient to transfer this meta data. We can keep it nice. So we just increment those. Linux kernels have updated their CWNDs to start with 10 packets. So TCP forms are very, very important. Let me walk you through this. Then the last is, of course, crowd favorite, resource inlining.
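To see why the starting congestion window matters so much for small transfers, here is a rough back-of-the-envelope sketch in Python. It models an idealized Slow Start only: no packet loss, the window doubles every round trip, and each packet carries 1460 bytes of payload. The 1460-byte figure and the function name `round_trips_to_send` are assumptions for illustration, not values taken from any particular kernel.

```python
import math

MSS_BYTES = 1460  # assumed payload bytes per TCP packet


def round_trips_to_send(total_bytes: int, initial_cwnd_packets: int = 10) -> int:
    """Round trips needed to deliver `total_bytes` under idealized slow start
    (no loss, congestion window doubles after every round trip)."""
    packets_left = math.ceil(total_bytes / MSS_BYTES)
    cwnd = initial_cwnd_packets
    round_trips = 0
    while packets_left > 0:
        packets_left -= cwnd  # send a full window, then wait for the ACKs
        cwnd *= 2             # exponential growth phase
        round_trips += 1
    return round_trips


for start in (4, 10):
    print(start, round_trips_to_send(20 * 1024, initial_cwnd_packets=start))
```

Under these assumptions a 20 kilobyte response needs three round trips when the window starts at 4 packets and two when it starts at 10, and anything up to 10 packets (roughly 15 kilobytes) fits in a single round trip.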
These are kind of the core, these are the best practices that we preach that every site should do. We want to have one TCP connection, which has the pipe wide open and we can just push as much data as we can. Then you actually have to route it on the external network. On 3G networks, on the old generation 3G networks, it literally takes seconds to do that. We want to improve the end user perceived latency. Keep that as separate because that will help you. And all of this is just to send a single TCP packet. We have to wait until the entire file arrives and only then can we start executing that file.
That number has been updated. Chrome team and the Make the Web Faster team at Google. JavaScript, which kind of seems silly. TCP Slow Start in a nutshell. Your servers will have to maintain fewer TCP connections, which is a big deal for people that are running servers that have to handle a lot of TCP connections. This is just one TCP packet to send a notification.
This is a big, general problem across the web. First of all, we have TCP congestion control and avoidance, and specifically we have this feature called TCP Slow Start. So this is just a one time startup cost when your phone has been idle. Basically what happened was in 2008, 2009, at Google, we looked at this latency and bandwidth study. So another component that is often forgotten is latency. Latency is so variable. Servers need to respect priorities.
So this is an entire talk on its own. So, we want to be here. The server can interleave the frames and it can use one TCP connection to deliver them in parallel. We know that, and we know that latency matters for traders where nanoseconds count. It will help make our clients faster. He has extensive technology background including experience running his own internet hosting company. First of all, one TCP connection. And over the lifetime of the connection, you build up this header space and you can basically be very efficient in how you encode and decode this kind of thing. We actually added this feature called pipelining.
They just look for strings in the byte stream and just swap them out. So a lot of user traffic is migrating to mobile phones, something that we see in spades at Google. It has a single connection and it can split all of those requests and responses into individual frames. But the problem with packet loss is when that does happen, we reduce the size of the window, the congestion window, in many cases, fairly significantly. Unfortunately, one of the things that has not worked out is HTTP pipelining. The first thing that you read is the length of the frame, and at that point, you know exactly what you need to do to parse this. Google Analytics collects navigation timing data, which is basically the real user timing data from the clients when they access your page. The video stream can easily saturate my link, but I want to limit it at this amount of throughput.
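Reading a length-prefixed frame header like the one described here can be sketched in a few lines of Python. The exact layout below is an assumption loosely modeled on the early HTTP 2.0 drafts (a 16-bit length, an 8-bit type, 8 bits of flags, then one reserved bit plus a 31-bit stream identifier); the names `FrameHeader` and `parse_frame_header` are hypothetical, and real field widths vary between drafts.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # payload length in bytes
    frame_type: int  # e.g. headers frame vs. data frame
    flags: int       # per-frame-type flag bits
    stream_id: int   # which stream this frame belongs to


# Assumed layout: 16-bit length, 8-bit type, 8-bit flags, 32-bit word
# whose top bit is reserved and whose low 31 bits are the stream ID.
HEADER_FORMAT = "!HBBI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def parse_frame_header(data: bytes) -> FrameHeader:
    """Parse the fixed-size header at the start of `data`."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least %d bytes for a frame header" % HEADER_SIZE)
    length, frame_type, flags, word = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    return FrameHeader(length, frame_type, flags, word & 0x7FFFFFFF)


# Build a sample frame: 5-byte payload, type 1, flags 0x4, stream 3.
raw = struct.pack(HEADER_FORMAT, 5, 1, 0x4, 3) + b"hello"
header = parse_frame_header(raw)
payload = raw[HEADER_SIZE:HEADER_SIZE + header.length]
print(header, payload)
```

Because the length comes first and the header is a fixed size, the receiver always knows how many bytes to read next and which stream to hand them to, which is what lets frames from different streams be interleaved on one connection.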
There are definitely bigger requests for things like images, but most of the other assets that we download are very small, and we download them across many connections. TCP is optimized for bulk and long transfers of data, whereas a lot of our actual traffic is short and bursty. The client opens multiple requests. It comes in into the mobile carrier, and the mobile carrier basically has one global router, which is the packet gateway. So these are the high level goals for the protocol. There is going to be commercial support for this kind of stuff.
It gets updated in the user database, user database gets back to serving gateway, serving gateway can then forward the packet to the actual tower, the tower delivers it to your phone. For that we have SPDY. We send eight kilobytes, and then we send the rest. So there are a lot of things that we can do in this space. That request will block for a long time, and the other two are blocked too. So first of all, one of the big challenges that we have on the web today is making stuff faster. The original specs actually said you sent one packet.
So all of this takes a lot of time, and what typically ends up happening is, if you do the math for HTTP Archive, coming back to that original thing that we looked at. This is just in terms of HTTP headers, which is significant. But the switch is not going to happen overnight. Especially in the case if you have server processing time and others. There are banners plastered everywhere. And the interesting number here is, of course, this: 86 and 57; so 86 requests. We get an acknowledgement.
We have clients that we can rev. So this has been our hack of the decade. There are actually multiple problems at multiple layers. CSS or JavaScript file. Because otherwise, you just create more and more contention or more blocking. Originally in SPDY, we actually started with just straight up gzip. We incur the server processing time. This is just your last mile latency.
Going end to end, we have the DNS lookup. CSS file and other things. Servers need to get smarter. Before we were like 2x compared to desktop versus mobile. So flow control allows you to do that. Because it turns out we need to tweak our protocols to make them better to work around this problem. So how do we pick this number? So this is a very simple experiment that we set up. Some of the opportunities and some of the stuff that still is ongoing and needs to be done, smarter servers. So this is kind of an optimistic scenario.
We just end up doing a couple of round trips, which are very expensive. This kind of thing needs to happen at the routing layer and all the other layers within the system. The server can do that and the question is how do you negotiate the stream IDs? Not require multiple connections, so we want to eliminate the need to have domain sharding. Sharding, in more ways, helps clients that have more bandwidth, which are your desktop clients, but it hurts people on slower connections, like mobile phones, because it causes congestion, it causes more retransmissions. That breaks the protocol in spectacular ways.
So efficiency is actually a big optimization concern here. TCP packet loss has to happen for TCP to work properly. Bandwidth really matters there, and the good news is we can actually get more bandwidth. For example, the current nginx implementation does not respect priorities. Your video will be streaming better. Stuff is getting faster. So this is this graph right here.
So page load time here is in milliseconds. So it terminates your TCP connection right there at the PGW, and the PGW actually looks at a bunch of rules, like, should I be forwarding this type of traffic, etc. So I work at Google as the slide says. This infrastructure is going to be built. You can see that in 2013, the latency, especially on mobile, has decreased significantly. That tells you approximately the number of connections that we open.
This is very important to me, so serve this with a higher priority than the image, which I sent you previously. So if we build a shorter cable, literally, there are a bunch of cables there, but if we take a slightly more direct route between these cities, specifically 300 miles shorter, then we can save about five milliseconds of latency. HTTP clients in the past. We like to make things fast. So basically at that point it was becoming a de facto standard and we said, look, there should be a more formal spec around this. There were instances, reports at the time, when this congestion collapse was reached that some packets would literally take a day to get to the other person on the other end.
Web performance, solved deal. But the rest of the world is basically flat. Because you can have a pathological case, or you can have an image file that takes a long time to generate, like a minute, and then all of your requests are piled up behind it. Do routers listen to this or just the servers? We have constraints at the TCP layer. Then there is the actual content download. It would be nice if we could get them fast, but I should be able to display text first. So if you have data that is larger than 16 kilobytes, you would just split it across multiple data frames.
So, for example, Akamai has a nice site, Akamai IO, where you can basically go in and type any country and look at average bandwidth, at least as seen by Akamai. Google data, but our theory, and I think we have good reasons to believe that this is why this is true, is that this is dominated by North America. At the beginning, we basically picked the latest SPDY draft and used that as a base. It will help make service more efficient, and actually reduce latency for a lot of users. So you can optimize, you can get the best bandwidth and you can also get the best throughput. You send request one.
So let us vary these two things independently. LTE Advanced, which is the true 4G, if you will. Then there is CSS and HTML. Flow control is kind of interesting. HTML page, and you should really split your JavaScript bundles, right? So we send four kilobytes.
Then I get it back. But you can actually open streams from both ends. So that sucks, and that sucks because we are already within a small constant factor of the maximum speed. Send it to the server. You could do whatever you need to do to generate those three responses, and then you just send us the data for three responses back. Turns out that an average page ends up talking to about 15 distinct hosts on the web today, which is quite large. So all of this is to say we are going to continue to see improvements in bandwidth.
In practice, this is what ends up happening very frequently. We can just do a shorter cable between two endpoints. We get up to a window of, like, 45 kilobytes, or 60 kilobytes, and we stop there. How do we solve it at the protocol level? And this turns out to be a huge challenge on mobile, where just sending one request can take somewhere in the order of a second. We had a lot of big sites. It should result in faster delivery. ISP, basically the POP box at the ISP. How do you determine that?
We have to do HTTP request. We can keep our code modular.