Common HTTP Implementation Quirks
By Tom Hacohen (@TomHacohen)
Svix is the enterprise-ready webhooks sending service. With Svix, you can build a secure, reliable, and scalable webhook platform in minutes. Looking to send webhooks? Give it a try!
At Svix, we send a lot of webhooks every month to many different webhook consumers, which means we see a lot of different HTTP services and servers. This is a list of a few common HTTP implementation quirks you should probably be aware of.
We previously shared a video called HTTP oddities that covered some, but not all, of the content below.
Case-Sensitive Headers
The HTTP specification clearly states that header names are case-insensitive. This means that whether you write Authorization or authorization, they should both be interpreted in the same manner. The key word here is "should", as not all implementations get this right. Some treat differently capitalized header names as distinct headers, often leading to unexpected behavior.
The problem is that this is a fairly easy mistake to make, which is why it pops up in a lot of different implementations, especially web frameworks. Hash maps (dictionaries) are usually a good enough data structure for headers, so many frameworks just used them, and the people using those frameworks then looked up keys with the normal (case-sensitive) map operations.
The flip side of this issue is libraries that automatically normalize HTTP header names to lowercase (Hyper, I'm looking at you), which makes it very hard to interact with systems that expect a specific casing.
While this issue has been fixed in HTTP/2, where header names are required to always be lowercase, it's still quite prevalent with HTTP/1.1.
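To make the hash map pitfall concrete, here is a minimal sketch of a header store that looks names up case-insensitively while remembering the casing it was given. The class name CaseInsensitiveHeaders is made up for this example and is not taken from any particular framework.

```python
class CaseInsensitiveHeaders:
    """Header store that matches names case-insensitively but keeps the given casing."""

    def __init__(self):
        # Maps lower-cased name -> (name as most recently set, value).
        self._items = {}

    def __setitem__(self, name, value):
        self._items[name.lower()] = (name, value)

    def __getitem__(self, name):
        return self._items[name.lower()][1]

    def __contains__(self, name):
        return name.lower() in self._items

    def items(self):
        # Return (name, value) pairs with the casing they were set with,
        # for peers that expect a specific capitalization on the wire.
        return list(self._items.values())


headers = CaseInsensitiveHeaders()
headers["Authorization"] = "Bearer example-token"
print(headers["authorization"])
print(headers.items())
```

Keeping the original casing alongside the normalized key is one way to stay compatible with both kinds of peers: lookups never depend on capitalization, and serialization does not silently lowercase anything.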
Invalid Header Names
Certain characters, like spaces, are not allowed in header names. However, not all implementations enforce this, which means you have to be prepared to handle such headers when they show up.
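As a rough illustration, a header name can be checked against the "token" grammar that the HTTP specification uses for field names. The helper name is_valid_header_name is an assumption for this sketch; what to do with an invalid name (reject, drop, or pass through) depends on your use case.

```python
import re

# The "token" characters allowed in field names: letters, digits, and
# the punctuation ! # $ % & ' * + - . ^ _ ` | ~
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_header_name(name: str) -> bool:
    """Return True if `name` is a non-empty string made only of token characters."""
    return _TOKEN_RE.fullmatch(name) is not None


for candidate in ["Content-Type", "X_Custom.Header", "Bad Header", "Bad:Header", ""]:
    print(repr(candidate), is_valid_header_name(candidate))
```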
Header ordering
Some middleboxes, like certain proxies or firewalls, have been programmed with simplistic parsers that expect headers in a specific order. Even though this behavior deviates from the standard, it can lead to issues if headers aren't ordered in the way these devices expect.
This is especially prevalent with the Host header, which many implementations require to be the first header. When it isn't first, they just fail with a 400 (Bad Request).
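If you control how requests are serialized, one defensive option is to always emit Host first. The sketch below builds the head of an HTTP/1.1 request by hand. The function name build_request_head and its parameters are hypothetical, and most HTTP client libraries do not expose header ordering this directly.

```python
def build_request_head(method, path, host, headers):
    """Serialize the request line and headers, always placing Host first."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in headers:
        if name.lower() == "host":
            continue  # Host was already emitted as the first header.
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


head = build_request_head(
    "POST",
    "/webhook",
    "www.example.com",
    [("Content-Type", "application/json"), ("Content-Length", "2")],
)
print(head.decode("ascii"))
```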
Headers can appear more than once
Yes, HTTP headers can appear more than once in an HTTP message. When this happens, the values of those headers are usually considered as a single value, combined with commas.
For example:
Accept: text/plain
Accept: text/html
Is equivalent to:
Accept: text/plain, text/html
Some implementations may not handle it correctly, and will instead overwrite one value with the other.
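A minimal sketch of the combining behavior described above, assuming the headers arrive as a list of (name, value) pairs. The function name combine_headers is made up for this example. Note that Set-Cookie is a well-known exception that cannot safely be joined with commas, so a real implementation would need to keep its values separate.

```python
def combine_headers(raw_headers):
    """Merge repeated headers into one comma-separated value per name."""
    combined = {}
    for name, value in raw_headers:
        key = name.lower()  # header names are case-insensitive
        if key in combined:
            # Append instead of overwriting the earlier value.
            combined[key] = f"{combined[key]}, {value}"
        else:
            combined[key] = value
    return combined


print(combine_headers([("Accept", "text/plain"), ("Accept", "text/html")]))
```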
Headers can have multiple values
Similarly to the previous point, HTTP headers may include multiple values which some naive implementations may not support.
For example, the below will fail with a naive implementation that just checks that the Accept header value is equal to text/plain.
Accept: text/plain, text/html
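Instead of comparing the whole header value, a less naive check splits the value into its items first. The helper below is a simplified sketch: the name accepts is hypothetical, and it deliberately ignores wildcards (such as */*) and quality values, which a complete implementation would have to handle.

```python
def accepts(accept_header: str, media_type: str) -> bool:
    """Return True if `media_type` is listed in a comma-separated Accept value."""
    wanted = media_type.lower()
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8" and any surrounding whitespace.
        candidate = item.split(";", 1)[0].strip().lower()
        if candidate == wanted:
            return True
    return False


print(accepts("text/plain, text/html", "text/plain"))
print(accepts("text/html", "text/plain"))
```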
Query params and arrays
There are multiple ways to pass array query parameters in the URL, and you have to make sure you and the client agree on which one to use. This is probably one of the crazier examples on the list, just because of the number of different ways you can do it. There are probably more, but these are the ones I've encountered.
1. Using non-indexed arrays
This is one of the more straightforward options: you just pass the params separately and mark them as an array. This can still trip up implementations that assume there is only one item:
https://www.example.com?arr[]=foo&arr[]=bar&arr[]=baz
2. Using indexed arrays
Slightly more error-prone than the previous one, but still quite simple. You just pass the params based on the index in the array. I'm not sure if the actual index passed even matters or if it's just a convention, but it usually looks something like this:
https://www.example.com?arr[0]=foo&arr[1]=bar&arr[2]=baz
3. By passing the same param multiple times
This is very similar to the first one, where you just pass the same name multiple times. But unlike the first one, this doesn't make it obvious it's an array. It just looks like a normal parameter, and it's very easy to accidentally repeat it. This version is the default of the OpenAPI spec which makes it much more common than it should be:
https://www.example.com?arr=foo&arr=bar&arr=baz
4. By comma delimiting the array
This one is how most people I spoke with think query parameter arrays behave, and is fairly intuitive. The main problem with this one is that it's quite easy to try to naively serialize (or deserialize) an array, failing to correctly escape the commas in parameter values while doing so. This one is also the shortest of the bunch:
https://www.example.com?arr=foo,bar,baz
5. By pipe delimiting the array
This one is the same as the last one, but it uses the pipe (|) character as the delimiter instead of a comma:
https://www.example.com?arr=foo|bar|baz
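The five styles above can be read with a small parser like the sketch below. The function name parse_array_param and the style labels are invented for this example. It also makes two assumptions: indexed arrays are sorted by their index, and delimited values are split before percent-decoding so that an escaped comma or pipe inside a value survives.

```python
import re
from urllib.parse import urlsplit, unquote_plus


def parse_array_param(url, name, style):
    """Return the values of query parameter `name`, read using the given array style."""
    # Keep values percent-encoded for now so escaped delimiters are not split.
    raw = []
    for part in urlsplit(url).query.split("&"):
        if part:
            key, _, value = part.partition("=")
            raw.append((unquote_plus(key), value))

    if style == "brackets":  # arr[]=foo&arr[]=bar
        return [unquote_plus(v) for k, v in raw if k == f"{name}[]"]
    if style == "indexed":  # arr[0]=foo&arr[1]=bar
        pattern = re.compile(re.escape(name) + r"\[(\d+)\]")
        found = []
        for k, v in raw:
            match = pattern.fullmatch(k)
            if match:
                found.append((int(match.group(1)), unquote_plus(v)))
        return [v for _, v in sorted(found)]
    if style == "repeated":  # arr=foo&arr=bar
        return [unquote_plus(v) for k, v in raw if k == name]
    if style in ("comma", "pipe"):  # arr=foo,bar or arr=foo|bar
        delimiter = "," if style == "comma" else "|"
        for k, v in raw:
            if k == name:
                return [unquote_plus(item) for item in v.split(delimiter)]
        return []
    raise ValueError(f"unknown array style: {style}")


print(parse_array_param("https://www.example.com?arr=foo,bar,baz", "arr", "comma"))
```

The style has to be passed in explicitly because the URL alone is ambiguous: arr=foo,bar could be a two-item array or a single value that happens to contain a comma.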
TLS Cipher Negotiation
As part of the TLS connection flow, the client and the server need to agree on a set of ciphers they will use for the connection. Many libraries and servers only support a subset of the allowed ciphers, dropping old and insecure ones, and not yet supporting newer ones. This is fine in most cases, and they can find a cipher to agree on.
However, when a very old server meets a cutting-edge, "we-only-want-next-gen-ciphers" SSL library, the two may fail to find a matching cipher, causing the server to reject the connection because cipher negotiation failed.
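When debugging this kind of failure, it can help to compare the ciphers your client is willing to offer with the ones the server is known to support. The sketch below uses Python's ssl module to list the client side. The helper name shared_cipher_names is made up, and the server's cipher list is assumed to be something you obtained separately (for example from its configuration). Real negotiation also depends on the protocol version and on preference order, so an empty result is a hint rather than proof.

```python
import ssl


def shared_cipher_names(client_ctx, server_cipher_names):
    """Return the cipher names enabled on the client that the server also lists."""
    client_names = {cipher["name"] for cipher in client_ctx.get_ciphers()}
    return sorted(client_names & set(server_cipher_names))


ctx = ssl.create_default_context()
enabled = [cipher["name"] for cipher in ctx.get_ciphers()]
print(f"client offers {len(enabled)} ciphers")

# Hypothetical server list: in practice, take this from the server's config.
print(shared_cipher_names(ctx, enabled[:2]))
```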
Incomplete certificate chains
We talked about this at length in our previous post about incomplete certificate chains. Some servers have misconfigured certificates or certificate chains, which can cause TLS clients to fail certificate validation and therefore reject the connection.
Basic Authentication
There are many ways to authenticate HTTP requests. A common oldie but goodie is HTTP Basic Auth, which conveniently supports passing the username and password embedded in the URL like so: https://username:password@www.example.com.
One thing to be aware of is that not every HTTP client library supports this out of the box, and you may need to implement it yourself for it to work. Another gotcha is that both the username and the password are optional, so your implementation needs to handle URLs where either one is missing.
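If your HTTP client does not handle credentials in the URL, they can be extracted by hand and sent as an Authorization header. The sketch below is one way to do it with the standard library. The function name basic_auth_from_url is hypothetical, and treating a missing username or password as an empty string is an assumption that matches the "both are optional" behavior described above.

```python
import base64
from urllib.parse import urlsplit, unquote


def basic_auth_from_url(url):
    """Return (url_without_credentials, authorization_header_value_or_None)."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url, None  # No credentials embedded in the URL.

    # Either part may be missing; treat a missing part as an empty string.
    username = unquote(parts.username or "")
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")

    # Strip "user:pass@" from the network location before sending the request.
    clean_netloc = parts.netloc.rpartition("@")[2]
    clean_url = parts._replace(netloc=clean_netloc).geturl()
    return clean_url, f"Basic {token}"


print(basic_auth_from_url("https://username:password@www.example.com"))
print(basic_auth_from_url("https://username@www.example.com"))
print(basic_auth_from_url("https://www.example.com"))
```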
Closing words
HTTP is ubiquitous, with implementations in (almost?) every programming language. With so many different implementations, in so many different environments, and with HTTP having been around for as long as it has, quirks and incompatibilities are bound to happen, especially in areas of the spec that haven't been strictly defined.
It's therefore the responsibility of implementations that can be used against a variety of different server implementations (like libraries and webhook implementations) to be flexible enough to ensure smooth operations in all circumstances.
For more content like this, make sure to follow us on Twitter, GitHub or RSS for the latest updates for the Svix webhook service, or join the discussion on our community Slack.