What happens when you type google.com in your browser and press Enter
When you type https://www.google.com in your browser and press Enter, a series of events starts and, at the end, you are able to browse the Google homepage. There are two important concepts that we must have clear before we go further: the computers connected to the internet are called clients and servers.
The clients are the typical web user’s internet-connected devices (for instance, your laptop or your smartphone) and web-accessing software available on those devices (usually a web browser like Firefox or Chrome). The servers are computers that store webpages, sites, or apps. When a client device wants to access a webpage, a copy of the webpage is downloaded from the server onto the client machine to be displayed in the user’s web browser.
There are some technical names that are important to present before we keep going.
Internet connection: This connection allows us to send and receive information on the web.
TCP/IP: A suite of protocols used by devices to communicate over the Internet and most local networks. It is named after two of the original protocols — the Transmission Control Protocol (TCP) and the Internet Protocol (IP). TCP provides apps a way to deliver (and receive) an ordered and error-checked stream of information packets over the network. (Hoffman, 2021).
DNS: The Domain Name System translates human-readable domain names (for example, www.amazon.com) to machine-readable IP addresses (for example, 192.0.2.44). (What is DNS?, 2021).
HTTP: HTTP stands for Hypertext Transfer Protocol, and it is a protocol — or a prescribed order and syntax for presenting information — used for transferring data over a network. Most information that is sent over the Internet, including website content and API calls, uses the HTTP protocol. There are two main kinds of HTTP messages: requests and responses. (Why is HTTP not secure?, 2021).
HTTPS: The S in HTTPS stands for “secure.” HTTPS uses TLS (or SSL) to encrypt HTTP requests and responses, so instead of the readable text of a request, an attacker who intercepts the traffic would see a bunch of seemingly random characters. (Why is HTTP not secure?, 2021).
HTTPS is HTTP with encryption. The only difference between the two protocols is that HTTPS uses TLS or Transport Layer Security to encrypt normal HTTP requests and responses. As a result, HTTPS is far more secure than HTTP. A website that uses HTTP has http:// in its URL, while a website that uses HTTPS has https://. (Why is HTTP not secure?, 2021).
TLS: TLS is a security protocol that provides privacy and data integrity for Internet communications. Implementing TLS is a standard practice for building secure web apps. (What is TLS (Transport Layer Security)?, 2021).
SSL: Secure Sockets Layer (SSL) is a security protocol that provides privacy, authentication, and integrity to Internet communications. SSL eventually evolved into Transport Layer Security (TLS). (What is SSL? | SSL definition, 2021).
In the below image you can see the main differences between HTTP and HTTPS.
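To make the idea of "HTTP with encryption" more concrete, here is a minimal sketch of the client side of a TLS connection using Python's standard ssl module. The function names make_client_context and open_tls_connection are our own and only illustrative; a real browser does considerably more work than this.

```python
import socket
import ssl
from typing import Optional


def make_client_context() -> ssl.SSLContext:
    """Build a TLS context with certificate and hostname verification enabled."""
    context = ssl.create_default_context()
    # Refuse the legacy SSL and early TLS protocol versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_tls_connection(host: str, port: int = 443) -> Optional[str]:
    """Open a TCP connection, perform the TLS handshake, and return the negotiated version."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as tcp_socket:
        # wrap_socket performs the handshake and checks the server certificate for `host`.
        with context.wrap_socket(tcp_socket, server_hostname=host) as tls_socket:
            return tls_socket.version()
```

Calling open_tls_connection("www.google.com") needs a working internet connection; everything the client sends after the handshake travels encrypted.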
Quick review of the information flow (so far)
- Once you type a web address into your browser, the browser asks a DNS server for the real address (the IP address) of the server that the website lives on.
- The browser sends an HTTPS request message to the server, asking it to send a copy of the website to the client. This message, and all other data sent between the client and the server, is sent across your internet connection using TCP/IP. The request reaches the server’s IP address on TCP port 443, the default port for HTTPS.
- If the server approves the client’s request, the server sends the client a “200 OK” message, which means “Everything is OK”, and then starts sending the website’s files to the browser as a series of small chunks called data packets.
- The browser assembles the small chunks into a complete web page and displays it to you.
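The steps in this review can be sketched with a few small Python functions. This is only an illustration under our own assumptions: the function names are hypothetical, the request is the simplest possible GET, and with HTTPS the request bytes would travel inside an encrypted TLS connection rather than as plain text.

```python
import socket
from typing import List, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_target(url: str) -> Tuple[str, str, int, str]:
    """Split a URL into scheme, host, port, and path, filling in the default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


def resolve(host: str, port: int) -> List[str]:
    """Ask the system resolver (and through it, DNS) for the host's IP addresses."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def build_get_request(host: str, path: str) -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> Tuple[int, str]:
    """Extract the status code and reason phrase from a raw HTTP response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason
```

For https://www.google.com, describe_target yields port 443 and the path "/", and a response that starts with "HTTP/1.1 200 OK" parses to the code 200 and the reason "OK".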
The above procedure was the general idea of the communication between the client and the server. Now we are going to explain in detail how that communication is created and how the different parts interact with each other. (How the Web works, 2021).
Going through the servers
Once the browser sends an HTTPS request message to the server, the first thing that this communication will find before the server is a firewall. In the image below you will find a simplified chain of interaction in which you can see the client, the firewall and the different kinds of servers used in an application infrastructure. (Application Infrastructure, 2021).
A firewall is a piece of hardware or software that filters incoming network traffic to keep malware and attackers out. Firewalls are used to prevent unauthorized users from accessing unauthorized assets and services, ensuring that users only interact with the application through the web portal and do not have access to any other application infrastructure components. Firewalls can also identify and block packets with mismatched IP addresses, that is, packets that say they’re coming from one location but don’t have an IP address that backs up that claim. (Rubens, 2021). The HTTPS request message passes through the firewall, which accepts traffic on TCP port 443.
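As a rough sketch of the filtering described above, the following Python code models a firewall rule set that only lets TCP traffic on port 443 through and drops packets whose source address does not match where they arrived from. The Packet fields, the interface labels and the 10.0.0.0/8 internal range are assumptions made for this example, not a description of any real firewall product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    source_ip: str
    protocol: str          # "tcp" or "udp"
    destination_port: int
    arrived_on: str        # hypothetical interface label: "internet" or "internal"


# Hypothetical policy: only HTTPS traffic may reach the application.
ALLOWED = {("tcp", 443)}
# Private range assumed to be used only inside the infrastructure.
INTERNAL_NETWORK = ip_network("10.0.0.0/8")


def allow(packet: Packet) -> bool:
    """Return True if the packet may pass the firewall."""
    # A packet from the internet that claims an internal source address is treated as spoofed.
    if packet.arrived_on == "internet" and ip_address(packet.source_ip) in INTERNAL_NETWORK:
        return False
    return (packet.protocol, packet.destination_port) in ALLOWED
```

With this policy our HTTPS request passes, while an SSH attempt on port 22 or a packet with a forged internal address is dropped.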
Behind the firewall you will find one of the most important parts in an application infrastructure. This part is the load balancer. A load balancer acts as the “traffic cop” sitting in front of your servers and routing client requests across all servers capable of fulfilling those requests in a manner that maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance. If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it. (What is Load Balancing, 2021).
In this manner, a load balancer performs the following functions:
· Distributes client requests or network load efficiently across multiple servers
· Ensures high availability and reliability by sending requests only to servers that are online
· Provides the flexibility to add or subtract servers as demand dictates
Below you can observe a load balancing diagram.
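The three functions listed above can be illustrated with a small Python class. It is a simplified sketch: the class and method names are hypothetical, and a real load balancer would run its own periodic health checks instead of being told which servers are online.

```python
class ServerPool:
    """Tracks a group of servers and hands out only those that are online."""

    def __init__(self):
        self._online = {}    # server name -> True if the last health check passed
        self._counter = 0    # used to rotate through the online servers

    def add(self, name):
        """Add a server; it starts receiving requests right away."""
        self._online[name] = True

    def remove(self, name):
        """Take a server out of the group, for example when demand drops."""
        self._online.pop(name, None)

    def report_health(self, name, is_online):
        """Record the result of a health check for a known server."""
        if name in self._online:
            self._online[name] = is_online

    def pick(self):
        """Choose the server for the next request, skipping offline servers."""
        candidates = [name for name, ok in self._online.items() if ok]
        if not candidates:
            raise RuntimeError("no server is online")
        choice = candidates[self._counter % len(candidates)]
        self._counter += 1
        return choice
```

If web1 is reported offline, pick() keeps returning the remaining servers, and a server added later with add() starts receiving requests without any other change.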
Load balancing can be performed in several ways. The two most important are hardware and software load balancing. Hardware load balancing, a typical strategy, uses dedicated hardware devices to handle network traffic and distribute it across servers. The majority of dedicated hardware load balancers run Linux. The main disadvantage of load balancing with hardware is that it is very expensive. Software-based load balancing, on the other hand, is a very effective and reliable method for distributing load between servers. The software balances requests, usually on a Linux platform, using a wide variety of algorithms for server allocation.
One of the most common ways to implement a load balancer is with HAProxy. HAProxy, short for High Availability Proxy, is a free, very fast and reliable reverse proxy offering high availability, load balancing, and proxying for TCP and HTTP-based applications. HAProxy is used when too many concurrent connections would saturate the capacity of a single server. Instead of connecting directly to a single server that processes all the requests, the client connects to an HAProxy instance. Based on a load-balancing algorithm, the HAProxy instance acts as a reverse proxy and forwards the request to one of the available endpoints. The different methods to configure a load balancer are based on different load-balancing algorithms. (How To Implement Load Balancing In Database Servers?, 2018).
Here are five of the most common load balancing methods:
· Round Robin: The Round Robin method relies on a rotation system to sort network and application traffic. An inbound request is delegated to the first available server, and then the server is bumped to the bottom of the line. This method is particularly useful when working with servers of equal value.
· IP Hash: In this straightforward load balancing technique, the client’s IP address simply determines which server receives its request.
· Least Connections: As its name states, the least connections method directs traffic to whichever server has the fewest active connections. This is helpful during heavy traffic periods, as it helps maintain even distribution among all available servers.
· Least Response Time: The least response time method directs traffic to the server with the fewest active connections and the lowest average response time.
· Least Bandwidth: This application load balancer method measures traffic in megabits per second (Mbps), sending client requests to the server with the least Mbps of traffic.
Below you can see the five most common load balancing methods. (What Is Server and Application Load Balancing? Types, Configuration Methods, and Best Tools, 2020)
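To show how little logic each of these methods needs at its core, here is a minimal Python sketch of the five of them. The function names and the shape of the input data are assumptions for illustration. In particular, breaking ties between equal connection counts by response time is only one possible reading of the Least Response Time method, and real load balancers such as HAProxy implement these ideas with many more details.

```python
import hashlib
import itertools


def round_robin(servers):
    """Return an iterator that hands out servers in a repeating rotation."""
    return itertools.cycle(servers)


def ip_hash(servers, client_ip):
    """Map a client IP to the same server every time, as long as the list is unchanged."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def least_connections(active_connections):
    """Pick the server with the fewest active connections (dict: name -> count)."""
    return min(active_connections, key=active_connections.get)


def least_response_time(stats):
    """Pick by fewest connections, then lowest average response time.

    stats maps a server name to a tuple (active connections, average response in ms).
    """
    return min(stats, key=lambda name: stats[name])


def least_bandwidth(mbps):
    """Pick the server currently serving the least traffic (dict: name -> Mbps)."""
    return min(mbps, key=mbps.get)
```

Round Robin and IP Hash need no information about the servers' current state, while the three "least" methods depend on measurements that the load balancer has to collect continuously.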
Before we continue, it is important to explain the role of web servers and application servers. These two types of servers are usually deployed together when the objective is fulfilling user requests for content from a website.
A web server stores and delivers the content for a website (such as text, images, video, and application data) to clients that request it. That content is called static content. The most common types of clients are a web browser program or a mobile application, which request data from the website when a user clicks on a link or downloads a document on a page displayed in the browser. (What Is a Web Server?, 2021). The request takes the form of a Hypertext Transfer Protocol (HTTP) message, as does the web server’s response. A few examples of web servers are Resin and Apache Tomcat.
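Serving static content essentially means mapping the path in the request to a file on disk. The sketch below shows that mapping in Python; the function name locate_static_file and the choice of index.html as the default document are assumptions for this example, not the behavior of any specific web server.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


def locate_static_file(document_root: Path, request_path: str) -> Optional[Tuple[Path, str]]:
    """Return (file path, content type) for a request, or None if nothing safe matches."""
    root = document_root.resolve()
    relative = request_path.lstrip("/") or "index.html"
    candidate = (root / relative).resolve()
    # Reject paths such as "/../secret" that would escape the document root.
    if root not in candidate.parents:
        return None
    if not candidate.is_file():
        return None
    content_type, _encoding = mimetypes.guess_type(candidate.name)
    return candidate, content_type or "application/octet-stream"
```

A None result would normally be turned into a "404 Not Found" response, while a match is sent back as the body of a "200 OK" response with the detected content type.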
An application server is a type of server that hosts applications. Its fundamental job is to provide its clients with access to what is commonly called business logic, which generates dynamic content; that is, it’s code that transforms data to provide the specialized functionality offered by a business, service, or application. The website of a bank is a typical example. An application server’s clients are often applications themselves and can include web servers and other application servers. Communication between the application server and its clients might take the form of HTTP messages, but that is not required as it is for communication between web servers and their clients. (What Is an Application Server vs. a Web Server?, 2021). A few examples of application servers are WebSphere, JBoss, and WebLogic. (Difference Between Web Server and Application Server, 2021).
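To contrast this with static content, here is a minimal sketch of dynamic content written as a Python WSGI application, where the response body is computed for each request. The balances, the user query parameter and the function name balance_app are invented for the example, and Python's WSGI is used only for illustration; the application servers named above are Java products.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory data standing in for a bank's account records.
BALANCES = {"alice": 120.50, "bob": 35.00}


def balance_app(environ, start_response):
    """A tiny WSGI application: the body depends on the data and on the request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    user = query.get("user", [""])[0]
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if user not in BALANCES:
        start_response("404 Not Found", headers)
        return [b"unknown user"]
    # This is the "business logic": turning stored data into a response.
    body = f"{user} has a balance of {BALANCES[user]:.2f}".encode("utf-8")
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library can serve it with wsgiref.simple_server.make_server("", 8000, balance_app).serve_forever().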
Some key differences between web servers and application servers are:
· A web server is responsible for accepting HTTP requests from clients and sending back HTTP responses, whereas an application server exposes business logic to its clients and generates dynamic content.
· Web servers mainly produce hypertext documents, static or dynamic, while application servers generate content by performing computation on the data provided.
· A web server consumes fewer resources, such as CPU and memory, than an application server.
· A web server supports the HTTP and HTTPS protocols, while an application server supports HTTP and HTTPS as well as RPC/RMI protocols.
· A web server provides an environment to run web applications, while an application server provides an environment to run web applications together with enterprise applications.
· A web server does not support database connection pooling, whereas an application server supports database connection pooling and manages backend logic such as calculations and database processing. (Brent, 2022)
In a typical deployment, a website that provides both static and dynamically generated content runs web servers for the static content and application servers to generate content dynamically. A reverse proxy and load balancer sit in front of one or more web servers and one or more web application servers to route traffic to the appropriate server, first based on the type of content requested and then based on the configured load-balancing algorithm. Most load balancer programs are also reverse proxy servers, which simplifies web application server architecture. (What Is an Application Server vs. a Web Server?, 2021)
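The two-step routing described in this paragraph can be sketched as follows. This is a simplified illustration: the class name, the list of static file extensions and the use of round robin inside each group are our assumptions, and a real reverse proxy usually decides from configured rules rather than from file extensions alone.

```python
import itertools

# Hypothetical rule: these extensions are treated as static content.
STATIC_EXTENSIONS = (".html", ".css", ".js", ".png", ".jpg", ".svg")


class ReverseProxy:
    """Routes a request first by content type, then by a load-balancing algorithm."""

    def __init__(self, web_servers, app_servers):
        self._web = itertools.cycle(web_servers)
        self._app = itertools.cycle(app_servers)

    def route(self, request_target: str) -> str:
        path = request_target.split("?", 1)[0]   # ignore the query string
        # Step 1: choose the group of servers from the type of content requested.
        group = self._web if path.lower().endswith(STATIC_EXTENSIONS) else self._app
        # Step 2: choose one server inside the group (round robin here).
        return next(group)
```

With two web servers and one application server, requests for /logo.png and /style.css alternate between the web servers, while /search?q=x goes to the application server.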
At this point we know that when we type a website such as www.google.com, the request from the client will be handled by the web server, which will be in communication with the application server to create the dynamic content. However, there is a very important component of the application infrastructure that we must cover to complete the answer to the original question: What happens when you type google.com in your browser and press Enter? This last component is the database server.
A database server is a machine running database software dedicated to providing database services. It is a crucial component in the client-server computing environment where it provides business-critical information requested by the client systems. A database server consists of hardware and software that run a database.
The software side of a database server, or the database instance, is the back-end database application. The application represents a set of memory structures and background processes accessing a set of database files.
The hardware side of a database server is the server system used for database storage and retrieval. Database workloads require a large storage capacity and high memory density to process data efficiently. These requirements mean that the machine hosting the database is usually a dedicated high-end computer.
Database servers have several use cases. Some of them are:
· Dealing with large amounts of data regularly.
· Managing the recovery and security of the DBMS.
· Providing concurrent access control.
· Storing applications and non-database files.
The database server stores the Database Management System (DBMS) and the database itself. Its main role is to receive requests from client machines, search for the required data, and pass back the results. The DBMS provides database server functionality, and some DBMSs (e.g., MySQL) provide database access only via the client-server model. Other DBMSs (such as SQLite) are used for embedded databases. Clients access a database server through a front-end application that displays the requested data on the client machine, or through a back-end application that runs on the server and manages the database.
The ODBC (Open Database Connectivity) standard provides the API allowing clients to call the DBMS. ODBC requires necessary software on both the client and server sides. In a master-slave model, the database master server is the primary data location. Database slave servers are replicas of the master server that act as proxies. (What Is a Database Server & What Is It Used For?, 2021).
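The request, search and result cycle of a database can be tried out with SQLite, the embedded DBMS mentioned above, through Python's standard sqlite3 module. This is only a sketch: the pages table and its contents are invented for the example, and an embedded database runs inside the application process, whereas a client-server DBMS such as MySQL would receive the same kind of query over the network.

```python
import sqlite3
from typing import Optional


def create_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up table of pages."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE pages (path TEXT PRIMARY KEY, title TEXT)")
    connection.executemany(
        "INSERT INTO pages (path, title) VALUES (?, ?)",
        [("/", "Home"), ("/about", "About us")],
    )
    connection.commit()
    return connection


def find_title(connection: sqlite3.Connection, path: str) -> Optional[str]:
    """Send a query to the DBMS and pass back the result, or None if there is no match."""
    # The "?" placeholder lets the DBMS handle the value safely instead of pasting it into the SQL.
    row = connection.execute("SELECT title FROM pages WHERE path = ?", (path,)).fetchone()
    return row[0] if row else None
```

The application code only states what data it needs; how the rows are stored and searched is left to the DBMS.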
Here is an example of server setup in which we have a load balancer, a pair of application servers and a couple of database servers in a master-slave model. (Anicas, 2014).
Now we understand that when we type the name of a website and press Enter, our request eventually reaches the database server, which searches for the required data and passes the results back to the application that asked for them.
Summing up
Finally, in the above image, we show the general flow of the data when you type google.com in your browser and press Enter. Below you will see how the information flows through the internet and the different servers to get a response:
1. Once you type a web address into your browser, the browser asks a DNS server for the real address (the IP address) of the server that the website lives on.
2. The browser sends an HTTPS request message to the webserver, asking it to send a copy of the website to the client. This message, and all other data sent between the client and the webserver, is sent across your internet connection using TCP/IP.
3. The HTTPS request message will pass the firewall and arrive at the load balancer which, depending on the algorithm used, will select the most convenient web server to send the message to.
4. If the web server approves the client’s request, the server sends the client a “200 OK” message, which means “Everything is OK”, and then starts sending the website’s files to the browser as a series of small chunks called data packets.
5. The browser assembles the small chunks into a complete web page and displays it to you.
6. The web server handles static data requests, but you want to use an interactive tool. Because this is a dynamic data request, the web server transfers it to an application server.
7. The application server receives the HTTPS request and converts it into a servlet request. A servlet is a server-side Java component that handles requests and produces responses; here it carries the exchange between web and application servers.
8. The servlet reaches the database server, and the app server receives a servlet response.
9. The app server translates the servlet response into HTTPS format for client access.
10. The information goes back through the load balancer to your computer, the browser assembles it and displays it to you.
Upon receiving a servlet request from a web server, the application server processes the request and responds to the web server via servlet response. Because application servers primarily work with business logic requests, the web server translates the servlet response and passes an HTTPS response accessible to the user. (Ingalls, 2021).
There you have it. Now you know what happens when you type google.com in your browser and press Enter. As you can see, although the response seems immediate, the request you send travels a long way and passes through a lot of different parts such as firewalls, load balancers and different kinds of servers. It is important to know this path because, if you do not get a response from a website, you now have the tools to try to guess what the problem could be.
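As a starting point for that kind of guessing, the following Python sketch checks the first stages of the path one by one and reports the first one that fails. The function name diagnose and its return values are our own invention, and the resolve and connect parameters exist only so that the network functions can be swapped out; it does not look behind the load balancer, which is not visible from the client.

```python
import socket
import ssl


def diagnose(host, port=443, resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Return the name of the first stage that fails: "dns", "tcp", "tls", or "ok"."""
    try:
        resolve(host, port)
    except OSError:
        return "dns"    # the name could not be translated into an IP address
    try:
        tcp_socket = connect((host, port), timeout=5)
    except OSError:
        return "tcp"    # a firewall may block the port, or the server may be offline
    with tcp_socket:
        try:
            context = ssl.create_default_context()
            with context.wrap_socket(tcp_socket, server_hostname=host):
                return "ok"
        except OSError:
            return "tls"    # the handshake or the certificate check failed
```

For example, diagnose("www.google.com") on a machine without a working DNS configuration would be expected to return "dns".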
Sources:
Hoffman, C. (2021, August 30). The Internet of Things (IoT): An Overview. What’s the Difference Between TCP and UDP?
AWS Amazon. (2021). What is DNS?
Cloudflare. (2021). Why is HTTP not secure? | HTTP vs. HTTPS.
https://www.cloudflare.com/learning/ssl/why-is-http-not-secure/
Cloudflare. (2021). What is TLS (Transport Layer Security)?
https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/
Cloudflare. (2021). What is SSL? | SSL definition.
https://www.cloudflare.com/learning/ssl/what-is-ssl/
Mozilla. (2021). How the Web works.
Atatus. (2021, June 7). Application Infrastructure.
Rubens, P. (2021, August 30). Types of Firewalls Explained.
Nginx. (2021). What Is Load Balancing?
ServerAdminz Limited. (2018, February 12). How To Implement Load Balancing In Database Servers?
Dnsstuff. (2018, February 12). What Is Server and Application Load Balancing? Types, Configuration Methods, and Best Tools
Nginx. (2021). What Is a Web Server?
Nginx. (2021). What Is an Application Server vs. a Web Server?
BYJU’S Exam prep. (2021). Difference Between Web Server and Application Server.
Brent, M. (2022, January 15). Web Server vs Application Server: What is the Difference?
PhoenixNap. (2021, May 31). What Is a Database Server & What Is It Used For?
Anicas, M. (2014, May 30). 5 Common Server Setups For Your Web Application.
Ingalls, S. (2021, May 21). What is an Application Server?