When dealing with URL encoding in .NET Core 2.1, there are two APIs: System.Net.WebUtility.UrlEncode and System.Web.HttpUtility.UrlEncode. What's the difference between them, and which one should we prefer? I did some research today, and here are my findings.
Test Results
First, let's look at some tests. I compared two pairs of methods that exist on both the WebUtility class and the HttpUtility class: UrlEncode/UrlDecode and HtmlEncode/HtmlDecode. The only difference I found is that the UrlEncode(string) function produces different output:
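// Sample input (an assumption; any string containing reserved characters such as ':' or '/' will do).
var test = "https://example.com/?q=a b";
var webencode = System.Net.WebUtility.UrlEncode(test);
var httpencode = System.Web.HttpUtility.UrlEncode(test);
Console.WriteLine($"WebUtility.UrlEncode: {webencode}");
Console.WriteLine($"HttpUtility.UrlEncode: {httpencode}");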
The result is that WebUtility.UrlEncode() outputs UPPERCASE hex digits for the encoded characters, while HttpUtility.UrlEncode() outputs lowercase.
Why the heck is that?
Thanks to Microsoft for open sourcing .NET, we can find out the root cause by looking into the .NET Core source code here: https://github.com/dotnet/corefx
If you want to see for yourself, here are the source code locations:
WebUtility class: "\corefx\src\System.Runtime.Extensions\src\System\Net\WebUtility.cs"
HttpUtility class: "\corefx\src\System.Web.HttpUtility\src\System\Web\HttpUtility.cs"
WebUtility
I found the UrlEncode method at line 408 (this may change if Microsoft updates the source code). Near the end, it internally calls the GetEncodedBytes() method, which then calls the IntToHex() method.
Finally, here is the cause: IntToHex() computes the letter digits by casting from the UPPERCASE character 'A', which is why every encoded character is returned in UPPERCASE.
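To make the effect concrete, here is a minimal sketch of a nibble-to-hex helper that uses 'A' as its base. It paraphrases the idea rather than copying the corefx source verbatim, and the class name UpperHexSketch and the EncodeByte helper are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical illustration class; not part of .NET.
public static class UpperHexSketch
{
    // Maps a nibble (0..15) to a hex digit, using 'A' as the base for 10..15.
    public static char IntToHex(int n)
    {
        if (n < 0 || n > 15)
            throw new ArgumentOutOfRangeException(nameof(n));

        return n <= 9
            ? (char)(n + '0')
            : (char)(n - 10 + 'A'); // uppercase base, so the result is 'A'..'F'
    }

    // Percent-encodes a single byte, high nibble first.
    public static string EncodeByte(byte b)
    {
        return "%" + IntToHex((b >> 4) & 0xF) + IntToHex(b & 0xF);
    }
}
```

With this sketch, the byte for '/' (0x2F) is written as %2F, with an uppercase F.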
HttpUtility
I did similar research on HttpUtility and found that it ultimately calls the System.Web.Util IntToHex() method, which casts from a lowercase 'a' instead. This explains why it returns lowercase letters.
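A lowercase-producing helper differs only in the base character. The sketch below is again a paraphrase under that assumption, not a verbatim copy of the corefx code; LowerHexSketch and EncodeByte are hypothetical names.

```csharp
using System;

// Hypothetical illustration class; not part of .NET.
public static class LowerHexSketch
{
    // Maps a nibble (0..15) to a hex digit, using 'a' as the base for 10..15.
    public static char IntToHex(int n)
    {
        if (n < 0 || n > 15)
            throw new ArgumentOutOfRangeException(nameof(n));

        return n <= 9
            ? (char)(n + '0')
            : (char)(n - 10 + 'a'); // lowercase base, so the result is 'a'..'f'
    }

    // Percent-encodes a single byte, high nibble first.
    public static string EncodeByte(byte b)
    {
        return "%" + IntToHex((b >> 4) & 0xF) + IntToHex(b & 0xF);
    }
}
```

Here the same byte 0x2F is written as %2f, so one character in the helper decides the casing of every escape sequence.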
My guess
I don't know whether this is by design, but having two versions of IntToHex() does not make sense to me. I would prefer an optional parameter that lets API users control the casing of the encoded output.
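Until such a parameter exists, a small wrapper can approximate it. This is only a sketch: the UrlCasing class and its upperCase parameter are hypothetical, and it assumes that every '%' in the output of WebUtility.UrlEncode starts a two-digit escape (a literal '%' in the input is itself encoded as %25).

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of .NET.
public static class UrlCasing
{
    // Matches one percent-escape such as %2F or %2f.
    private static readonly Regex PercentEscape =
        new Regex("%[0-9A-Fa-f]{2}", RegexOptions.Compiled);

    // Encodes with WebUtility, then rewrites only the escape sequences
    // to the requested casing. Other characters are left untouched.
    public static string UrlEncode(string value, bool upperCase = false)
    {
        if (value == null)
            return null;

        string encoded = WebUtility.UrlEncode(value);
        return PercentEscape.Replace(encoded, m =>
            upperCase ? m.Value.ToUpperInvariant() : m.Value.ToLowerInvariant());
    }
}
```

Calling UrlCasing.UrlEncode("a/b") gives a%2fb, while UrlCasing.UrlEncode("a/b", upperCase: true) gives a%2Fb. Only the escapes change case, so letters in the original value keep their casing.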
Which one should we prefer to use?
In short, I use lowercase for URLs across my entire system, so I will choose HttpUtility.UrlEncode() to encode the URLs.
On Windows, UPPERCASE or lowercase in URLs does not matter. On Linux it is different, and the wrong casing can result in a 404 page. Besides, the hashes of the UPPERCASE string and the lowercase string are different. If some part of the system validates a URL's hash, the check can fail because of case sensitivity, regardless of whether it runs on Windows or Linux.
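The hash problem is easy to reproduce. The sketch below assumes SHA-256 purely for illustration (the validation in your system may use a different algorithm), and UrlHashDemo is a hypothetical class name.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical illustration class; not part of .NET.
public static class UrlHashDemo
{
    // Returns the SHA-256 hash of the UTF-8 bytes of the value as a hex string.
    public static string Sha256Hex(string value)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // True only when both strings hash to the same value.
    public static bool SameHash(string a, string b)
    {
        return string.Equals(Sha256Hex(a), Sha256Hex(b), StringComparison.Ordinal);
    }
}
```

UrlHashDemo.SameHash("a%2Fb", "a%2fb") returns false even though both strings decode to the same value, which is why a hash check written against one encoder can reject output from the other.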
For SEO, some people may have heard that lowercase is better, but for Google, casing does not impact rankings. You can check the discussion here.
The point is that you should use consistent URL casing across your own system, and be aware of whether the other systems you talk to treat URL casing differently.