HTTP Header
Published: 2019-05-24


Unacceptable Browser HTTP Accept Headers (Yes, You Safari and Internet Explorer)

When a web browser makes a request, it sends headers that tell the server what it is looking for. One of these headers is the Accept header. The Accept header tells the server what file formats, or more correctly MIME-types, the browser is looking for. Let's take a look at Firefox's Accept header:

GET /page/routing-in-recess-screencast HTTP/1.1
Host: RecessFramework.org
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Let's translate Firefox's request to English:

Dear RecessFramework.org,
I want the resource "/page/routing-in-recess-screencast" and I want it in an HTML or XHTML format. If you cannot serve me this way, I'll take "/page/routing-in-recess-screencast" in XML instead. If you can't even give it to me in XML, well, I'll take anything you've got!
Love,
Firefox

The Accept header gives the browser a chance to tell the server which format it wants for a resource. Because the browser sends a list of options, this content negotiation happens in a single request. One of the key design goals of HTTP is to minimize back-and-forth communication. The browser could ask for each of these formats one at a time, but that would be wasteful.

How does the browser specify the preference "give me HTML/XHTML before XML before */*"? Preference is indicated by the "relative quality parameter" (q) and its value (qvalue), seen in application/xml;q=0.9,*/*;q=0.8. Here's how the HTTP spec defines it:

Each media-range MAY be followed by one or more accept-params, beginning with the "q" parameter for indicating a relative quality factor. The first "q" parameter (if any) separates the media-range parameter(s) from the accept-params. Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.

As brilliant as the spec is, it is a terrible read. What's going on is simple:

  1. Every item's default preference value is 1.
    1: html, xhtml, xml, *
  2. If an item specifies q=X, its preference value is X.
    0.9: xml
    0.8: */*
    1: html, xhtml
  3. Order by preference value in descending order.
    1: html, xhtml
    0.9: xml
    0.8: *

The only other major detail is that when two items are tied, the more specific one wins. For example, if both application/xml and */* had a preference of 0.9, application/xml would still come first. Firefox chooses to make it explicit that */* is less preferred by giving it a preference of 0.8. Firefox's Accept header is sensible and well thought out. Opera's is too. Other browsers: not so much.
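
The three steps and the tie-break rule above can be sketched in a few lines of Python. This is a minimal illustration, not the parser from any particular framework: the function name parse_accept is made up for this example, and it ignores media-type parameters other than q.

```python
def parse_accept(header):
    """Return (media_range, q) pairs ordered from most to least preferred."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0]
        if not media_range:
            continue
        q = 1.0  # step 1: every item defaults to a preference of 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)  # step 2: an explicit q=X overrides the default
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not wanted"
        # Specificity for tie-breaking: type/subtype > type/* > */*
        main, _, sub = media_range.partition("/")
        if main == "*":
            specificity = 0
        elif sub == "*":
            specificity = 1
        else:
            specificity = 2
        entries.append((media_range, q, specificity, position))
    # Step 3: highest q first, then most specific, then original order
    entries.sort(key=lambda e: (-e[1], -e[2], e[3]))
    return [(media_range, q) for media_range, q, _, _ in entries]


firefox = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
for media_range, q in parse_accept(firefox):
    print(q, media_range)
```

For Firefox's header this prints text/html and application/xhtml+xml at 1.0, then application/xml at 0.9, then */* at 0.8, matching the ordering worked out by hand above.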

What in The Header Were You Thinking WebKit?

Don't relax yet IE, you're up next, and you're even more egregious. So, what's wrong with WebKit, the lauded engine behind Safari and Chrome? Let's take a look:

GET /page/restful-php-framework HTTP/1.1
Host: RecessFramework.org
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,
        text/plain;q=0.8,image/png,*/*;q=0.5

Note: Accept split to two lines for width. At a quick glance it doesn't look too different from Firefox's. Let's try it again in English just to be sure.

Dear RecessFramework.org,
I want the resource "/page/restful-php-framework" and I want it in an XML, XHTML, or PNG format. If you cannot serve me this way, I'll take "/page/restful-php-framework" in HTML or, failing that, plain text instead. If you can't do that for me I'll take whatever!
Thanks,
WebKit

Really WebKit? The browsing engine most responsible for killing XHTML prefers XHTML over HTML! It would also prefer PNG over HTML. That's a little embarrassing, but what is worse: Safari and Chrome prefer XML over HTML (and, ambiguously, over XHTML, too). WebKit's Accept header forces web developers to work against the HTTP spec.

Suppose you are Twitter and want to be a good internet citizen following the HTTP spec. You've got a resource called a Tweet that can be represented as HTML or XML or JSON. You wouldn't want Safari users to get an XML copy of a Tweet by browsing around, so you have to actively ignore WebKit's Accept header preferring XML above all else. (Aside: It turns out Twitter's REST API ignores many REST/HTTP best practices like the Accept header, anyway, but that's another story for another post.)
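
To see why a spec-following server would hand WebKit an XML document, here is a rough sketch of server-side negotiation. It assumes the server can produce HTML, XML, and JSON for the same resource; the helper names (media_ranges, quality, negotiate) and the AVAILABLE list are invented for illustration, and ties are broken in favor of whichever type the server lists first.

```python
def media_ranges(header):
    """Yield (media_range, q) pairs from an Accept header (q defaults to 1)."""
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)  # assumes a well-formed qvalue
        yield fields[0], q


def quality(mime, header):
    """q of the most specific media range matching mime, or 0.0 if none match."""
    best_specificity, best_q = -1, 0.0
    main = mime.partition("/")[0]
    for media_range, q in media_ranges(header):
        if media_range == mime:
            specificity = 2  # exact type/subtype
        elif media_range == main + "/*":
            specificity = 1  # type/*
        elif media_range == "*/*":
            specificity = 0  # wildcard
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(available, header):
    """Pick the available type with the highest q; ties go to the earliest entry."""
    best = max(available, key=lambda mime: quality(mime, header))
    return best if quality(best, header) > 0 else None


# Hypothetical list of representations, in the server's own order of preference.
AVAILABLE = ["text/html", "application/xml", "application/json"]

FIREFOX = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
WEBKIT = ("application/xml,application/xhtml+xml,text/html;q=0.9,"
          "text/plain;q=0.8,image/png,*/*;q=0.5")

print(negotiate(AVAILABLE, FIREFOX))  # text/html
print(negotiate(AVAILABLE, WEBKIT))   # application/xml
```

With Firefox's header the server settles on HTML. With WebKit's header, application/xml carries q=1 while text/html only gets 0.9, so an honest negotiation returns XML to someone who is just browsing.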

From the WebKit team:

Most WebKit-based browsers (and Safari in particular) would probably do a better job rendering HTML than XHTML or generic XML, if only because the code paths are much better tested. So the Accept header is somewhat in error. On the other hand, this isn't a hugely important bug, and we design our Accept header mainly to give the best compatibility on Web sites, since content negotiation is not really used much in the wild. Our current header was copied from an old version of Firefox.

Internet Explorer Accepts Polluting the Internet

We've covered Firefox and WebKit. Now let's talk about Internet Explorer. The IE team made great strides with IE8. Unfortunately, its Accept header is downright ugly:

GET /book/html/index.html HTTP/1.1
Host: RecessFramework.org
Accept: image/jpeg, application/x-ms-application, image/gif,
        application/xaml+xml, image/pjpeg, application/x-ms-xbap,
        application/x-shockwave-flash, application/msword, */*

This is the Accept header for IE8 on a Windows 7 machine. One peculiarity is the "application/msword" MIME-type. Office isn't installed but the Word Document Viewer is. This made me wonder, what does IE's Accept header look like on a machine with Office installed? Brace yourselves:

GET /book/html/index.html HTTP/1.1
Host: RecessFramework.org
Accept: image/gif, image/jpeg, image/pjpeg, application/x-ms-application,
        application/vnd.ms-xpsdocument, application/xaml+xml,
        application/x-ms-xbap, application/x-shockwave-flash,
        application/x-silverlight-2-b2, application/x-silverlight,
        application/vnd.ms-excel, application/vnd.ms-powerpoint,
        application/msword, */*

Ok, now let's translate to English:

Dear RecessFramework.org,
I want the resource "/book/html/index.html". Now, bear with me, I'm Internet Explorer and Office is installed so I can accept this resource in a lot of formats, in this order of preference: GIF, JPG, Progressive JPG, ClickOnce Application, Microsoft XPS Document, XAML, XAML Browser Application (XBAP), Flash, Silverlight 2, Silverlight 1, Excel Document, PowerPoint Document, or a Word Document. If you can't give me "/book/html/index.html" in any of those formats then give me anything you've got!
Thanks,
Internet Explorer

There are two things wrong with this picture. The lesser evil: IE has a hook for other applications to insert new MIME-types into its Accept header. This means that if a resource could be represented on the server as a Word Document or as an HTML document, Word as an application can inject behavior into IE so that the Word format always has higher precedence than HTML. All an application has to do is modify the registry (HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Accepted Documents). (Hear that Cisco? You could increase internet consumption if you stuck a couple of WebEx MIME-types in IE's Accept header.)

The greater evil is that IE sends this ~200-300 byte Accept header for every single browsing request. 250 bytes isn't much, but at internet scale, on every request from the most popular browser, it adds up. Internet Explorer's Accept header emissions pollute the information superhighway. Let's do a back-of-napkin calculation. Given Google's daily search volume and IE's share of the browser market, that's roughly 162 million IE requests on Google a day, for 38GB worth of garbage internet traffic. On Google searches alone, IE pollutes the internet with over a terabyte of traffic every month in its Accept header. Anyone want to estimate what this number looks like across the rest of the internet?
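
The napkin math is easy to check. The sketch below takes the two figures quoted above (162 million IE requests a day and about 250 bytes per Accept header) and assumes a 30-day month and binary units (1 GB = 2^30 bytes); both of those are my assumptions, not part of the original estimate.

```python
ie_requests_per_day = 162_000_000  # IE requests to Google per day, from the text
accept_header_bytes = 250          # midpoint of the ~200-300 byte header
days_per_month = 30                # assumption: a 30-day month

bytes_per_day = ie_requests_per_day * accept_header_bytes
gib_per_day = bytes_per_day / 2**30
tib_per_month = gib_per_day * days_per_month / 1024

print(f"{gib_per_day:.1f} GiB of Accept headers per day")
print(f"{tib_per_month:.1f} TiB of Accept headers per month")
```

That works out to a little under 38 GB a day and a bit over a terabyte a month, in line with the figures above.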

Update 1: An IE team Program Manager says: "I strongly recommend that developers not list MIME types here." Yet Silverlight and Office do. Whoops.

Update 2: IE doesn't send the extended header on *every* request; it sends */* for refreshes and some subsequent visits.

It is not just wasted bandwidth that is the problem, it is wasted server processing, too. If a server or framework wants to follow through on the HTTP protocol, it must be sure it can't respond with any of the requested formats before it can respond with HTML. Bottom line: IE's Accept header is extremely ugly.

If WebKit is Foolish and IE is Prodigal how valuable is the Accept header?

This was the question I asked myself about half-way through writing the Accept parsing and content-negotiation code going into the next release of the Recess framework.

Content-negotiation with the Accept header is an interesting idea in principle that is hard to use properly in practice because browsers misuse it. As stated, Twitter's REST API doesn't use the Accept header for content-negotiation; it uses extensions on the URL such as '.json' and '.xml'. Frameworks can enhance performance by ignoring the Accept header and relying on '.xml'-like extensions. As such, the next release of the Recess framework, too, will disable Accept header based content-negotiation by default.
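
For comparison, extension-based format selection takes only a few lines. The following is a rough sketch and not how Twitter or any framework actually implements it: the FORMATS table, the text/html default, and the example path are assumptions made for illustration.

```python
import posixpath

# Hypothetical mapping from URL extension to the MIME-type to serve.
FORMATS = {".json": "application/json", ".xml": "application/xml"}
DEFAULT_FORMAT = "text/html"


def format_from_path(path):
    """Split a request path into (resource_path, mime_type) using its extension."""
    base, extension = posixpath.splitext(path)
    mime_type = FORMATS.get(extension.lower())
    if mime_type is None:
        # Unknown or missing extension: keep the path whole and serve HTML.
        return path, DEFAULT_FORMAT
    return base, mime_type


print(format_from_path("/statuses/123.json"))
print(format_from_path("/page/restful-php-framework"))
```

The first call yields the resource path /statuses/123 with application/json, and the second falls back to text/html. No header parsing is involved, which is where the performance argument comes from.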

So, when would you want to parse Accept headers for content negotiation? When your consumers are respectful of HTTP and REST (RESpecTful!). This could mean rich internet applications (RIAs). It could also mean other servers consuming your RESTful API.

Bottom line: If you're building APIs for other developers to consume, consider using Accept-based content-negotiation. If you're building consumer facing web apps: ignore the Accept header until WebKit and IE get their acts together.

Reposted from: http://oohbi.baihongyu.com/