
Web cache

Article snapshot taken from Wikipedia, available under the Creative Commons Attribution-ShareAlike license.

A Web cache (or HTTP cache) is a system for optimizing the World Wide Web. It is implemented both client-side and server-side. Caching multimedia and other files can reduce overall delay when browsing the Web.


A forward cache is a cache outside the web server's network, e.g. in the client's web browser, in an ISP, or within a corporate network. A network-aware forward cache only caches heavily accessed items. A proxy server sitting between the client and web server can evaluate HTTP headers and choose whether to store web content. A reverse cache sits in front of one or more web servers, accelerating requests from the Internet and reducing peak server load.
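The header-based storage decision described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not any particular proxy's API; the function name and the simplified header parsing are assumptions made for the example:

    def is_storable(method: str, status: int, headers: dict) -> bool:
        # Illustrative sketch: a shared cache deciding whether a response
        # may be stored, restricted to the simplest case (GET, 200 OK).
        if method != "GET" or status != 200:
            return False
        cc = headers.get("Cache-Control", "").lower()
        if "no-store" in cc or "private" in cc:
            return False          # the server forbade shared caching
        # Store only when the server gave explicit freshness information.
        return "max-age=" in cc or "s-maxage=" in cc or "Expires" in headers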

In August 1991 Tim Berners-Lee announced the birth of WWW technology and encouraged scientists to adopt and develop it. Soon after, those programs, along with their source code, were made available to people interested in their usage. Although the source code was not formally licensed or placed in the public domain, CERN informally allowed users and developers to experiment and further develop on top of them.

This is a very brief history of web server programs, so some information necessarily overlaps with the histories of the web browsers, the World Wide Web and the Internet; therefore, for the sake of clarity and understandability, some key historical information reported below may be similar to that found in one or more of those history articles.

Apache thus suffered, even more, from the competition of commercial servers and, above all, of other open-source servers which had meanwhile achieved far superior performance (mostly when serving static content) since the beginning of their development, and which, at the time of the Apache decline, could also offer a long list of well-tested advanced features. In fact, a few years after 2000, not only commercial and highly competitive web servers such as LiteSpeed emerged, but also many other open-source programs of excellent quality and very high performance, among which Hiawatha, Cherokee HTTP server, Lighttpd and Nginx should be noted, together with derived/related products also available with commercial support. Around 2007–2008, most popular web browsers increased their previous default limit of 2 persistent connections per host-domain (a limit recommended by RFC 2616) to 4, 6 or 8 persistent connections per host-domain, in order to speed up the retrieval of heavy web pages with lots of images and to mitigate the shortage of persistent connections dedicated to dynamic objects used for bi-directional notification of events in web pages.

WebDAV for Exchange has been extended by Microsoft to accommodate working with messaging data. Exchange Server versions 2000, 2003 and 2007 support WebDAV. However, WebDAV support was discontinued in Exchange 2010 in favor of Exchange Web Services (EWS), a SOAP/XML based API. As part of the Windows Server Protocols (WSPP) documentation set, Microsoft published …

The website's root directory may be specified by a configuration file or by some internal rule of the web server, using the name of the website, which is the host part of the URL found in the HTTP client request. Path translation to the file system is performed for several types of web resources. The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (Host) website root directory.
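As an illustration of this append step, here is a minimal Python sketch; the function name and the root-directory value are assumptions for the example. It also shows why ".." segments must be neutralized, so a request cannot escape the website tree:

    import posixpath

    DOCUMENT_ROOT = "/home/www/website"   # assumed (Host) website root directory

    def translate_path(url_path: str) -> str:
        path = url_path.split("?", 1)[0]   # ignore any query string
        # Collapse "." and ".." segments while keeping the path anchored
        # at "/", so "/../../etc/passwd" cannot climb out of the root.
        safe = posixpath.normpath("/" + path.lstrip("/"))
        return DOCUMENT_ROOT + safe

    print(translate_path("/docs/index.html"))    # /home/www/website/docs/index.html
    print(translate_path("/../../etc/passwd"))   # /home/www/website/etc/passwd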

For example, the Expires response header gives a date when the document becomes stale, and the Cache-Control: max-age directive tells the cache how many seconds the response is fresh for. Validation can be used to check whether a cached response is still good after it becomes stale. For example, if the response has a Last-Modified header, a cache can make a conditional request using the If-Modified-Since header to see if it has changed.
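The freshness rule can be made concrete with a small Python sketch. The function and its simplified header parsing are illustrative assumptions, not a library API; real caches follow the much more detailed rules of the HTTP caching RFC:

    from datetime import datetime, timezone
    from email.utils import parsedate_to_datetime

    def is_fresh(headers: dict, age_seconds: float) -> bool:
        # Prefer Cache-Control: max-age over the Expires header.
        for directive in headers.get("Cache-Control", "").split(","):
            name, _, value = directive.strip().partition("=")
            if name == "max-age" and value.isdigit():
                return age_seconds < int(value)
        if "Expires" in headers:
            expires = parsedate_to_datetime(headers["Expires"])
            return datetime.now(timezone.utc) < expires
        return False   # no freshness information: revalidate before reuse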

In 2015, new RFCs published the HTTP/2 protocol version, and as the implementation of the new specification was not trivial at all, a dilemma arose among developers of less popular web servers (e.g. those with a usage share lower than 1%–2%) about adding or not adding support for that new protocol version.

A web server program plays the role of a server in a client–server model by implementing one or more versions of the HTTP protocol, often including the HTTPS secure variant and other features and extensions that are considered useful for its planned usage. The complexity and the efficiency of a web server program may vary a lot depending on several factors.

The former can usually be served faster and more easily cached for repeated requests, while the latter supports a broader range of applications. Technologies such as REST and SOAP, which use HTTP as a basis for general computer-to-computer communication, as well as support for WebDAV extensions, have extended the application of web servers well beyond their original purpose of serving human-readable pages.

Web server programs are able to translate a URL path (all or part of it) that refers to a physical file system path into an absolute path under the target website's root directory.



A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.
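This request/response cycle can be seen in miniature with Python's standard http.server module. The port number and page content are arbitrary choices for the demo, and such a toy server is only a sketch of the behavior described above:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class HelloHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Answer every GET request with the same tiny static page.
            body = b"<html><body>Hello</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), HelloHandler).serve_forever()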

WebDAV is defined in RFC 4918 by a working group of the Internet Engineering Task Force (IETF).

"This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document or a GIF image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet." In practice, web server programs that implement advanced features beyond simple static content serving (e.g. a URL rewrite engine or dynamic content serving) usually have to figure out how each URL has to be handled.

These capabilities, along with the multimedia features of NCSA's Mosaic browser (which was also able to manage HTML FORMs in order to send data to a web server), highlighted the potential of web technology for publishing and distributed computing applications. In the second half of 1994, the development of NCSA httpd stalled to the point that a group of external software developers, webmasters and other professional figures interested in that server started to write and collect patches, made possible by the NCSA httpd source code being publicly available.

The WebDAV protocol provides a framework for users to create, change and move documents on a server. The most important features include the maintenance of properties about an author or modification date, namespace management, collections, and overwrite protection. Maintenance of properties includes such things as the creation, removal, and querying of file information. Namespace management deals with the ability to copy and move web pages within a server's namespace.

Tim Berners-Lee's original vision of the Web involved a medium for both reading and writing. In fact, Berners-Lee's first web browser, called WorldWideWeb, could both view and edit web pages; but, as the Web grew, it became a read-only medium for most users. Whitehead and other like-minded people wanted to transcend that limitation. The meetings resulted in the formation of an IETF working group, because the new effort would lead to extensions to HTTP, which the IETF had started to standardize.

This statement freed web server developers from any possible legal issue about the development of derivative work based on that source code (a threat that in practice never existed). At the beginning of 1994, the most notable among new web servers was NCSA httpd, which ran on a variety of Unix-based OSs and could serve dynamically generated content by implementing the POST HTTP method and the CGI to communicate with external programs.

For searching and locating, the DAV Searching and Locating (DASL) working group never produced an official standard, although there are a number of implementations of its last draft. Work continued as a non-working-group activity. The WebDAV Search specification attempts to pick up where the working group left off, and was published as RFC 5323 in November 2008.

Such a reverse cache is usually a content delivery network (CDN) that retains copies of web content at various points throughout a network. The Hypertext Transfer Protocol (HTTP) defines three basic mechanisms for controlling caches: freshness, validation, and invalidation. These are specified in the headers of HTTP response messages from the server. Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client.

At the beginning of 1995 those patches were all applied to the last release of the NCSA source code and, after several tests, the Apache HTTP Server project was started. At the end of 1994, a new commercial web server, named Netsite, was released with specific features. It was the first of many similar products that were developed, first by Netscape, then by Sun Microsystems, and finally by Oracle Corporation. In mid-1995, the first version of IIS was released, for the Windows NT OS, by Microsoft.



Collections deal with the creation, removal, and listing of various resources. Lastly, overwrite protection handles aspects related to the locking of files. WebDAV takes advantage of existing technologies such as Transport Layer Security, digest access authentication and XML to satisfy those requirements. Many modern operating systems provide built-in client-side support for WebDAV. WebDAV began in 1996 when Jim Whitehead worked with the World Wide Web Consortium (W3C) to host two meetings to discuss the problem of distributed authoring on the World Wide Web with interested people.

They did so not only because they had the workforce and the time, but also because usually their previous implementation of the SPDY protocol could be reused as a starting point, and because the most used web browsers implemented HTTP/2 very quickly for the same reason. Another reason that prompted those developers to act quickly was that webmasters felt the pressure of ever-increasing web traffic and really wanted to install and try, as soon as possible, something that could drastically lower the number of TCP/IP connections and speed up accesses to hosted websites.

Many CDNs and manufacturers of network equipment have replaced this standard HTTP cache control with dynamic caching. In 1998, the Digital Millennium Copyright Act added rules to the United States Code (17 U.S.C. § 512) that exempt system operators from copyright liability for the purposes of caching. This is a list of server-side web caching software.

Web server

Although web server programs differ in how they are implemented, most of them offer the same set of common basic features, along with a few more advanced and popular ones. A web server program, when it is running, usually performs several general tasks. Once an HTTP request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not. This requires many other steps, including security checks. Web server programs usually perform some type of URL normalization on the URL found in most HTTP request messages. The term URL normalization refers to the process of modifying and standardizing a URL in a consistent manner.

This marked the entry, in the field of World Wide Web technologies, of a very important commercial developer and vendor that has played, and still plays, a key role on both sides (client and server) of the web. In the second half of 1995, the CERN and NCSA web servers started to decline (in global percentage usage) because of the widespread adoption of new web servers, which had a much faster development cycle along with more features, more fixes applied, and better performance than the previous ones.

WebDAV extends the set of standard HTTP verbs and headers allowed for request methods; the added verbs include PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK and UNLOCK. The properties of the WebDAV protocol are name–value pairs, in which a "name" is a Uniform Resource Identifier (URI) and the "values" are expressed through XML elements. Furthermore, the methods to handle the properties are PROPFIND and PROPPATCH. The WebDAV working group produced several works. For versioning, the Delta-V protocol under the Web Versioning and Configuration Management working group adds resource revision tracking, published in RFC 3253.
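A PROPFIND request can be issued with nothing more than Python's standard http.client; the host name, path and Depth value below are assumptions for the example:

    import http.client

    # Ask a WebDAV server for all DAV: properties of a collection's members.
    body = ('<?xml version="1.0" encoding="utf-8"?>'
            '<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>')
    conn = http.client.HTTPConnection("webdav.example.com")
    conn.request("PROPFIND", "/docs/", body=body,
                 headers={"Depth": "1", "Content-Type": "application/xml"})
    resp = conn.getresponse()
    print(resp.status, resp.reason)   # a success is reported as 207 Multi-Status
    print(resp.read().decode())      # XML listing each resource's properties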

As work began on the protocol, it became clear that handling both distributed authoring and versioning together would involve too much work, and that the tasks would have to be separated. The WebDAV group focused on distributed authoring, and left versioning for the future. (The Delta-V extension added versioning later; see the Extensions section below.)

The WebDAV working group concluded its work in March 2007, after the Internet Engineering Steering Group (IESG) accepted an incremental update to RFC 2518. Other extensions left unfinished at that time, such as the BIND method, have been finished by their individual authors, independent of the formal working group.

In fact, supporting HTTP/2 often required radical changes to a server's internal implementation due to many factors: practically always-required encrypted connections, the capability to distinguish between HTTP/1.x and HTTP/2 connections on the same TCP port, the binary representation of HTTP messages, message priority, compression of HTTP headers, the use of streams (also known as TCP/IP sub-connections) and related flow control, etc.

Between 1996 and 1999, Netscape Enterprise Server and Microsoft's IIS emerged among the leading commercial options, whereas among the freely available and open-source programs the Apache HTTP Server held the lead as the preferred server (because of its reliability and its many features). In those years there was also another commercial, highly innovative and thus notable web server, called Zeus (now discontinued), that was known as one of the fastest and most scalable web servers on the market, at least until the first decade of the 2000s, despite its low percentage of usage.


In 2020–2021 the HTTP/2 implementation dynamics (by top web servers and popular web browsers) were partly replicated after the publication of advanced drafts of the future RFC about the HTTP/3 protocol. The following technical overview should be considered only as an attempt to give a few very limited examples about some features that may be implemented in a web server, and about some of the tasks that it may perform, in order to sketch a sufficiently wide scenario about the topic.

On an Apache server, this is commonly /home/www/website (on Unix machines, usually it is /var/www/website). For example, for a static file request such as http://www.example.com/path/file.html, the resulting file system path would be /home/www/website/path/file.html.

WebDAV

WebDAV (Web Distributed Authoring and Versioning) is a set of extensions to the Hypertext Transfer Protocol (HTTP) which allows user agents to collaboratively author contents directly in an HTTP web server by providing facilities for concurrency control and namespace operations, thus allowing the Web to be viewed as a writeable, collaborative medium and not just a read-only medium.

There are several types of normalization that may be performed, including the conversion of the scheme and host to lowercase. Among the most important normalizations are the removal of "." and ".." path segments and the addition of trailing slashes to a non-empty path component. "URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client."
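A minimal sketch of these normalization steps in Python (illustrative only; real servers implement the full RFC 3986 algorithm):

    import posixpath
    from urllib.parse import urlsplit, urlunsplit

    def normalize(url: str) -> str:
        parts = urlsplit(url)
        # Remove "." and ".." segments from the path.
        path = posixpath.normpath(parts.path) if parts.path else "/"
        # normpath drops a trailing slash, but "/a/" and "/a" may be
        # distinct resources, so restore it when it was present.
        if parts.path.endswith("/") and not path.endswith("/"):
            path += "/"
        # Lowercase the scheme and the host.
        return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                           path, parts.query, parts.fragment))

    print(normalize("HTTP://Example.COM/a/./b/../c/"))   # http://example.com/a/c/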

Berners-Lee started promoting the adoption and the usage of those programs, along with their porting to other operating systems. In December 1991, the first web server outside Europe was installed at SLAC (U.S.A.). This was a very important event, because it started trans-continental web communications between web browsers and web servers. In 1991–1993, the CERN web server program continued to be actively developed by the www group; meanwhile, thanks to the availability of its source code and the public specifications of the HTTP protocol, many other implementations of web servers started to be developed.

The ETag (entity tag) mechanism also allows for both strong and weak validation. Invalidation is usually a side effect of another request that passes through the cache. For example, if a URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated.
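Validation can likewise be sketched with Python's standard http.client. The host, path and stored validator values are assumptions for the example; a real cache would have saved them from an earlier response. The same cache would drop this stored entry if a POST, PUT or DELETE for the URL passed through it:

    import http.client

    cached_etag = '"abc123"'                               # saved from a prior response
    cached_last_modified = "Tue, 01 Oct 2024 00:00:00 GMT"

    conn = http.client.HTTPConnection("www.example.com")
    conn.request("GET", "/page.html", headers={
        "If-None-Match": cached_etag,               # ETag-based validation
        "If-Modified-Since": cached_last_modified,  # date-based validation
    })
    resp = conn.getresponse()
    if resp.status == 304:
        print("Not Modified: the cached copy can be reused")
    else:
        print("Changed: replace the cached entry with the new response")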

Within a year, these changes, on average, nearly tripled the maximum number of persistent connections that web servers had to manage. This trend (of increasing the number of persistent connections) definitely gave a strong impetus to the adoption of reverse proxies in front of slower web servers, and it also gave one more chance to the emerging new web servers, which could show all their speed and their capability to handle very high numbers of concurrent connections without requiring too many hardware resources (expensive computers with lots of CPUs, RAM and fast disks).

In March 1989, Sir Tim Berners-Lee proposed a new project to his employer CERN, with the goal of easing the exchange of information between scientists by using a hypertext system. The proposal, titled "HyperText and CERN", asked for comments, and it was read by several people.

And so a few developers of those web servers opted not to support the new HTTP/2 version, at least in the near future, for several main reasons. Instead, developers of the most popular web servers rushed to offer the availability of the new protocol.

One or more configuration files of the web server may specify the mapping of parts of the URL path (e.g. initial parts of the file path, the filename extension and other path components) to a specific URL handler (a file, a directory, an external program or an internal module). When a web server implements one or more of these advanced features, the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the file system), because it can refer to a virtual name of an internal or external module processor for dynamic requests.
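A mapping of this kind can be sketched as a small dispatch table in Python; the prefixes, extensions and handler names are hypothetical placeholders, not any server's real configuration:

    # Path prefixes map to handlers, as do filename extensions.
    HANDLERS = {
        "/cgi-bin/": "external-program",   # run as an external CGI process
        ".php": "php-module",              # pass to an internal module processor
        ".html": "static-file",            # serve directly from the file system
    }

    def pick_handler(url_path: str) -> str:
        for key, handler in HANDLERS.items():
            prefix_rule = key.startswith("/")
            if prefix_rule and url_path.startswith(key):
                return handler
            if not prefix_rule and url_path.endswith(key):
                return handler
        return "static-file"               # default handler

    print(pick_handler("/cgi-bin/search"))   # external-program
    print(pick_handler("/index.php"))        # php-module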

The hardware used to run a web server can vary according to the volume of requests that it needs to handle. At the low end of the range are embedded systems, such as a router that runs a small web server as its configuration interface. A high-traffic Internet website might handle requests with hundreds of servers that run on racks of high-speed computers. A resource sent from a web server can be a pre-existing file (static content) available to the web server, or it can be generated at the time of the request (dynamic content) by another program that communicates with the server software.


At the end of 1996, there were already over fifty known (different) web server software programs available to everybody who wanted to own an Internet domain name and/or to host websites. Many of them were short-lived and were replaced by other web servers. The publication of the RFCs about protocol versions HTTP/1.0 (1996) and HTTP/1.1 (1997, 1999) forced most web servers to comply (not always completely) with those standards. The use of TCP/IP persistent connections (HTTP/1.1) required web servers both to increase the maximum number of concurrent connections allowed and to improve their level of scalability.

For calendaring, CalDAV is a protocol allowing calendar access via WebDAV. CalDAV models calendar events as HTTP resources in iCalendar format, and models calendars containing events as WebDAV collections. For groupware, GroupDAV is a variant of WebDAV which allows client/server groupware systems to store and fetch objects such as calendar items and address book entries instead of web pages. For MS Exchange interoperability, WebDAV can be used for reading, updating and deleting items in a mailbox or public folder.

In April 1993, CERN issued a public official statement declaring that the three components of Web software (the basic line-mode client, the web server and the library of common code), along with their source code, were put in the public domain.

Apache was the most used web server from mid-1996 to the end of 2015 when, after a few years of decline, it was surpassed initially by IIS and then by Nginx. Afterward, IIS dropped to much lower percentages of usage than Apache (see also market share). From 2005–2006, Apache started to improve its speed and its scalability level by introducing new performance features (e.g. the event MPM and a new content cache). As those new performance improvements were initially marked as experimental, they were not enabled by many of its users for a long time.

In October 1990 the proposal was reformulated and enriched (with Robert Cailliau as co-author), and finally it was approved. Between late 1990 and early 1991, the project resulted in Berners-Lee and his developers writing and testing several software libraries along with three programs, which initially ran on NeXTSTEP OS installed on NeXT workstations. Those early browsers retrieved web pages written in a simple early form of HTML from web server(s) using a new basic communication protocol that was named HTTP 0.9.
