I recently built a URL forwarding service named “Link Forwarder” using .NET Core 2.2. It’s open source, and a preview version is currently deployed to my subdomain “go.edi.wang”. This article shares how I built it and what I learned.
To help you understand the system and walk through the code, please check my GitHub repository for this project: https://github.com/EdiWang/LinkForwarder
The Problem
Resources on the internet sometimes change their URLs. For example, when I created my website 10 years ago, a typical blog post URL looked like “https://myolddomain.net/viewarticle.aspx?id=123”. Some friends of mine referenced my posts on their websites or sent the URLs to other people. A few years later, I got a new domain and launched a new blog system that changed the URL of that blog post to “https://edi.wang/post/2009/1/1/an-old-article”. The old URLs became obsolete, so every existing link that pointed to my old blog now fails. My blog makes no money, so that’s OK.
But this problem can also happen to an enterprise’s products, especially client-side systems and applications. If you put a support link into a product that is installed on client computers and the link changes someday, you will have to push an update to all your clients just to change that link.
To solve this problem, I would like to follow Microsoft’s example. Microsoft created “go.microsoft.com”, which uses a static ID that never changes to redirect to an actual URL that can change over time. For example, https://go.microsoft.com/fwlink/?linkid=2049807 points to the help document of the new Chromium-based Edge browser, whose current URL is https://microsoftedgesupport.microsoft.com/hc/en-us. If the document’s URL changes, Edge doesn’t have to change its built-in help link; Microsoft only needs to update its database to change the target URL of link ID 2049807. The “go.microsoft.com” service is used throughout Microsoft products.
This is the basic idea of Link Forwarder.
Basic Workflow
An administrator creates a token URL (e.g. https://go.edi.wang/fw/e66fad1e) for a valid URL (e.g. https://www.some-website.com/1234/abcd/1.html). Users who open the token URL are then redirected to the original URL. On each successful redirection, the user’s browser User Agent and IP address are recorded, so that the administrator can view reports and get insights from them.
Figure: Report
Figure: Create/Edit Link
Figure: Share Link
Not a URL Shortener
Link Forwarder is similar to a URL shortener, but it is not one. The key differences are:
- A URL shortener’s goal is to create URLs that are as short as possible, so it is usually deployed to a very short domain name. Link Forwarder doesn’t care if it’s deployed to a long domain name.
- Most URL shorteners don’t allow editing the origin link after it is created. Link Forwarder is built for links that change, as described in “The Problem” section.
Not So Simple
A link forwarder is not just mapping a token to a URL. The following problems need to be considered.
It needs to be fast and able to handle a certain amount of traffic.
My current design caches valid redirections, so repeated requests for the same token won’t query the database every time.
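A minimal sketch of that caching idea follows. It assumes a hypothetical CachedLinkResolver class and a database lookup passed in as a delegate; neither name comes from the Link Forwarder repository. Only tokens that resolve to a URL are cached, and entries expire after a fixed time so that edits to a link eventually take effect. In an ASP.NET Core application, IMemoryCache would be the more usual building block than the hand-rolled dictionary used here to keep the example dependency-free.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not taken from the Link Forwarder source.
public class CachedLinkResolver
{
    private readonly ConcurrentDictionary<string, (string Url, DateTime ExpiresUtc)> _cache =
        new ConcurrentDictionary<string, (string Url, DateTime ExpiresUtc)>();
    private readonly Func<string, Task<string>> _queryDatabase;
    private readonly TimeSpan _timeToLive;

    public CachedLinkResolver(Func<string, Task<string>> queryDatabase, TimeSpan timeToLive)
    {
        _queryDatabase = queryDatabase;
        _timeToLive = timeToLive;
    }

    // Returns the origin URL for a token, or null when no record exists.
    public async Task<string> ResolveAsync(string token)
    {
        if (_cache.TryGetValue(token, out var entry) && entry.ExpiresUtc > DateTime.UtcNow)
        {
            return entry.Url; // cache hit: no database round trip
        }

        string url = await _queryDatabase(token);
        if (url != null)
        {
            // Cache only successful lookups; misses go to the database each time.
            _cache[token] = (url, DateTime.UtcNow.Add(_timeToLive));
        }

        return url;
    }
}
```

The time-to-live is the trade-off to tune: a longer value saves more database queries, but a link that an administrator edits keeps redirecting to the old URL until its entry expires.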
How should it deal with an invalid token, or a valid token with no matching URL?
For an invalid token, stop the request. For a token that is valid but has no record in the database, redirect the user to a preset default URL.
The system needs to protect users from potentially harmful links.
For example, suppose the Link Forwarder’s database is compromised and a link is changed to point to “https://127.0.0.1/some-virus”, something the attacker was previously able to install on the user’s machine. The user could then be attacked. Other values such as “/abc” or “123” are also treated as invalid URLs and will not trigger a redirection.
Detecting internet URLs that may host malicious code is currently outside the design scope, but we could integrate a third-party service to screen links.
The system needs to protect itself.
Links that point to the system itself may cause an infinite redirection loop and take down the server.
For example:
https://go.edi.wang/fw/a points to https://go.edi.wang/fw/b, and https://go.edi.wang/fw/b points back to https://go.edi.wang/fw/a.
A similar scenario can happen with another Link Forwarder or a similar forwarding system deployed to another domain, even with loops that span multiple hops.
Although modern browsers stop this kind of redirection loop, an attacker can work around this by not using a modern browser or not using a browser at all.
Links that point to the server’s own domain are easy to identify and stop. For a third-party redirection loop, however, I can’t figure out a reliable way to identify and block the requests. As a workaround, I limit the number of requests to the same token from the same IP address within a certain time period, which is explained later in this article.
Redirection Workflow
The following diagram explains the URL redirection flow.
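The rules stated in the previous section can be condensed into a small decision function. This is only a sketch built from those rules: the ForwardOutcome enum, the ForwardDecision class, and the three delegates are hypothetical names, and the sketch leaves out caching, the IsEnabled flag, request tracking, and rate limiting, so it should not be read as the actual controller code.

```csharp
using System;

// Hypothetical outcome type for illustration only.
public enum ForwardOutcome
{
    RejectRequest,      // the token format is invalid
    RedirectToDefault,  // well-formed token, but no record in the database
    BlockUnsafeUrl,     // the stored URL fails verification
    RedirectToOrigin    // everything checks out
}

public static class ForwardDecision
{
    public static ForwardOutcome Decide(
        string rawToken,
        Func<string, bool> isTokenWellFormed,
        Func<string, string> findOriginUrl,
        Func<string, bool> isUrlSafe)
    {
        // 1. Stop requests whose token is not in the expected format.
        if (!isTokenWellFormed(rawToken))
        {
            return ForwardOutcome.RejectRequest;
        }

        // 2. A valid token without a record goes to the preset default URL.
        string originUrl = findOriginUrl(rawToken);
        if (originUrl == null)
        {
            return ForwardOutcome.RedirectToDefault;
        }

        // 3. Never redirect to a URL that fails verification.
        if (!isUrlSafe(originUrl))
        {
            return ForwardOutcome.BlockUnsafeUrl;
        }

        return ForwardOutcome.RedirectToOrigin;
    }
}
```

Passing the checks in as delegates keeps the ordering of the rules visible in one place and makes each branch easy to exercise in a unit test.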
Database Design
To perform redirections and track user events, we just need two tables. My database engine of choice is Microsoft SQL Server LocalDB for development and Microsoft Azure SQL Database for production.
SQL Script:
IF NOT EXISTS(SELECT TOP 1 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'Link')
CREATE TABLE [Link](
    [Id] [int] IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [OriginUrl] [nvarchar](256) NULL,
    [FwToken] [varchar](32) NULL,
    [Note] [nvarchar](max) NULL,
    [IsEnabled] [bit] NOT NULL,
    [UpdateTimeUtc] [datetime] NOT NULL)

IF NOT EXISTS(SELECT TOP 1 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'LinkTracking')
CREATE TABLE [LinkTracking](
    [Id] UNIQUEIDENTIFIER PRIMARY KEY NOT NULL,
    [LinkId] [int] NOT NULL,
    [UserAgent] [nvarchar](256) NULL,
    [IpAddress] [varchar](64) NULL,
    [RequestTimeUtc] [datetime] NOT NULL)

IF NOT EXISTS(SELECT TOP 1 1 FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS WHERE CONSTRAINT_NAME = N'FK_LinkTracking_Link')
ALTER TABLE [LinkTracking] WITH CHECK ADD CONSTRAINT [FK_LinkTracking_Link] FOREIGN KEY([LinkId])
    REFERENCES [Link] ([Id])
    ON UPDATE CASCADE
    ON DELETE CASCADE

ALTER TABLE [LinkTracking] CHECK CONSTRAINT [FK_LinkTracking_Link]
ASP.NET Core Application Design
This article does not list every detail of the application code. For the complete reference, please check the project’s GitHub repository: https://github.com/EdiWang/LinkForwarder
LinkForwarder.Web
The ASP.NET Core MVC application that serves as the entry point. It handles URL redirection, link validation, authentication with local accounts or Azure AD, creating and editing links, and viewing reports.
LinkForwarder.Services
Defines CRUD operations against the database and the report queries through the ILinkForwarderService interface and its implementation, LinkForwarderService. The ITokenGenerator explained later is also in this project.
LinkForwarder.Setup
Runs a SQL script to set up the database for a new server. This is only used on the first run of the system.
Key Points in Link Forwarder
Generating Tokens
The path segment after “/fw/” is a token, which is used to find the origin URL in the database. I don’t use Link.Id because Id can change when migrating a database or merging databases from multiple servers, while the token stays the same.
The system uses the ITokenGenerator interface for token generation.
public interface ITokenGenerator
{
    string GenerateToken();
    bool TryParseToken(string input, out string token);
}
GenerateToken() is used to create a new token whenever a new URL is submitted.
TryParseToken() is used for validating token format for a client request.
Currently, the only implementation of the ITokenGenerator interface is ShortGuidTokenGenerator. It takes the first 8 characters of a GUID as the token.
public class ShortGuidTokenGenerator : ITokenGenerator
{
    private const int Length = 8;

    public string GenerateToken()
    {
        // Take the first 8 characters of a new GUID.
        return Guid.NewGuid().ToString().Substring(0, Length).ToLower();
    }

    public bool TryParseToken(string input, out string token)
    {
        token = null;

        // Only the length is checked; a null input is rejected instead of throwing.
        if (input == null || input.Length != Length)
        {
            return false;
        }

        token = input;
        return true;
    }
}
Note: In this example, TryParseToken() is not always reliable, because there’s no way to tell whether an 8-character string is part of a GUID. You can certainly create another token generator with your own rules that allows accurate token validation.
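As one possible stricter variant, the sketch below still takes 8 characters from a GUID but also checks that an incoming token consists only of hexadecimal digits. The HexTokenGenerator name is hypothetical and not part of the repository. The check cannot prove that a string came from a GUID, but it does reject anything outside the alphabet the generator can produce. Lowercasing the input is an assumption made here so that each token has a single canonical form.

```csharp
using System;
using System.Linq;

public interface ITokenGenerator
{
    string GenerateToken();
    bool TryParseToken(string input, out string token);
}

// Hypothetical alternative generator for illustration only.
public class HexTokenGenerator : ITokenGenerator
{
    private const int Length = 8;

    public string GenerateToken()
    {
        // The "N" format is 32 lowercase hex digits without hyphens.
        return Guid.NewGuid().ToString("N").Substring(0, Length);
    }

    public bool TryParseToken(string input, out string token)
    {
        token = null;
        if (string.IsNullOrEmpty(input) || input.Length != Length)
        {
            return false;
        }

        // Normalize case, then accept only characters the generator can emit.
        string lowered = input.ToLowerInvariant();
        if (!lowered.All(IsHexDigit))
        {
            return false;
        }

        token = lowered;
        return true;
    }

    private static bool IsHexDigit(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
    }
}
```

With this variant, a value such as “865c8gyi” fails validation because “g”, “y”, and “i” are not hexadecimal digits, while “E66FAD1E” is accepted and normalized to “e66fad1e”.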
Creating New Link
First, we need to prevent creating new tokens for an existing URL. For an existing URL, instead of generating a new token, we can find the old record and return the old token. Before returning it, we also need to verify the existing token again to make sure the data is good. For example, a hacker could change the token in the database to some malicious string, and I don’t want that string to end up appended to the URL.
So again, a more reliable TryParseToken() than my current design would be better.
Second, we need to prevent generating tokens that already exist. A full GUID is reliably unique, but a partial GUID isn’t.
Based on these two facts, the code for creating a new link will be:
// Reuse the existing token when the URL has already been registered.
const string sqlLinkExist = "SELECT TOP 1 FwToken FROM Link l WHERE l.OriginUrl = @originUrl";
var tempToken = await conn.ExecuteScalarAsync<string>(sqlLinkExist, new { originUrl });
if (null != tempToken)
{
    if (_tokenGenerator.TryParseToken(tempToken, out var tk))
    {
        _logger.LogInformation($"Link already exists for token '{tk}'");
        return new SuccessResponse<string>(tk);
    }

    // The stored token failed validation: log it and fall through to create a fresh one.
    string message = $"Invalid token '{tempToken}' found for existing url '{originUrl}'";
    _logger.LogError(message);
}

// Keep generating until the token is not already taken.
const string sqlTokenExist = "SELECT TOP 1 1 FROM Link l WHERE l.FwToken = @token";
string token;
do
{
    token = _tokenGenerator.GenerateToken();
} while (await conn.ExecuteScalarAsync<int>(sqlTokenExist, new { token }) == 1);

_logger.LogInformation($"Generated Token '{token}' for url '{originUrl}'");

var link = new Link
{
    FwToken = token,
    IsEnabled = isEnabled,
    Note = note,
    OriginUrl = originUrl,
    UpdateTimeUtc = DateTime.UtcNow
};

const string sqlInsertLk = @"INSERT INTO Link (OriginUrl, FwToken, Note, IsEnabled, UpdateTimeUtc)
                             VALUES (@OriginUrl, @FwToken, @Note, @IsEnabled, @UpdateTimeUtc)";
await conn.ExecuteAsync(sqlInsertLk, link);
return new SuccessResponse<string>(link.FwToken);
Verify Redirection URL
The system uses the ILinkVerifier interface to verify the URL before sending it to the client. There are 3 invalid statuses:
- Invalid Format: e.g. “865c8gyiB”
- Local URLs: e.g. “/some-path”
- Self-Referencing URL: e.g. “https://go.edi.wang/some-path”
public enum LinkVerifyResult
{
    Valid,
    InvalidFormat,
    InvalidLocal,
    InvalidSelfReference
}

public interface ILinkVerifier
{
    LinkVerifyResult Verify(string url, IUrlHelper urlHelper, HttpRequest currentRequest);
}
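We can take advantage of ASP.NET Core MVC’s IUrlHelper interface to detect local URLs. The format check uses the IsValidUrl() extension method shown after this code, and the self-reference check compares the URL with the current request’s host and scheme.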
public LinkVerifyResult Verify(string url, IUrlHelper urlHelper, HttpRequest currentRequest)
{
    if (!url.IsValidUrl())
    {
        return LinkVerifyResult.InvalidFormat;
    }

    if (urlHelper.IsLocalUrl(url))
    {
        return LinkVerifyResult.InvalidLocal;
    }

    if (Uri.TryCreate(url, UriKind.Absolute, out var testUri))
    {
        // Same host and scheme as this server, and not just the site root.
        if (string.Compare(testUri.Authority, currentRequest.Host.ToString(), StringComparison.OrdinalIgnoreCase) == 0
            && string.Compare(testUri.Scheme, currentRequest.Scheme, StringComparison.OrdinalIgnoreCase) == 0
            && testUri.AbsolutePath != "/")
        {
            return LinkVerifyResult.InvalidSelfReference;
        }
    }

    return LinkVerifyResult.Valid;
}
To check if a URL is in the valid format:
public enum UrlScheme
{
    Http,
    Https,
    All
}

public static bool IsValidUrl(this string url, UrlScheme urlScheme = UrlScheme.All)
{
    bool isValidUrl = Uri.TryCreate(url, UriKind.Absolute, out var uriResult);
    if (!isValidUrl)
    {
        return false;
    }

    switch (urlScheme)
    {
        case UrlScheme.All:
            isValidUrl &= uriResult.Scheme == Uri.UriSchemeHttps || uriResult.Scheme == Uri.UriSchemeHttp;
            break;
        case UrlScheme.Https:
            isValidUrl &= uriResult.Scheme == Uri.UriSchemeHttps;
            break;
        case UrlScheme.Http:
            isValidUrl &= uriResult.Scheme == Uri.UriSchemeHttp;
            break;
    }

    return isValidUrl;
}
IP Rate Limit
The redirection endpoint (/fw/{token}) is limited to a maximum of 30 requests per minute for a single IP address.
[Route("/fw/{token}")]
public async Task<IActionResult> Forward(string token)
This configuration in appsettings.json controls the IP limit rule:
"IpRateLimiting": {
  "EnableEndpointRateLimiting": true,
  "StackBlockedRequests": false,
  "RealIpHeader": "X-Real-IP",
  "ClientIdHeader": "X-ClientId",
  "HttpStatusCode": 429,
  "GeneralRules": [
    {
      "Endpoint": "*:/fw/*",
      "Period": "1m",
      "Limit": 30
    }
  ]
}
For a more complete introduction to how IP rate limiting is implemented, please check my previous blog post IP Rate Limit for ASP.NET Core.
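To make the “30 requests in one minute” rule concrete, here is a small fixed-window counter. It is an illustrative sketch only: the FixedWindowRateLimiter class is a hypothetical name, it is not how the rate limiting middleware configured above works internally, and it never evicts old entries, which a production limiter would need to do.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical fixed-window limiter for illustration only.
public class FixedWindowRateLimiter
{
    private sealed class Window
    {
        public DateTime StartUtc;
        public int Count;
    }

    private readonly ConcurrentDictionary<string, Window> _windows =
        new ConcurrentDictionary<string, Window>();
    private readonly int _limit;
    private readonly TimeSpan _period;
    private readonly Func<DateTime> _utcNow;

    // The clock is injected so the window logic can be tested without waiting.
    public FixedWindowRateLimiter(int limit, TimeSpan period, Func<DateTime> utcNow)
    {
        _limit = limit;
        _period = period;
        _utcNow = utcNow;
    }

    // Returns true when the request is allowed, false when it should get HTTP 429.
    public bool TryAcquire(string clientIp)
    {
        DateTime now = _utcNow();
        Window window = _windows.GetOrAdd(clientIp, _ => new Window { StartUtc = now });

        lock (window)
        {
            // Start a new window once the current period has elapsed.
            if (now - window.StartUtc >= _period)
            {
                window.StartUtc = now;
                window.Count = 0;
            }

            if (window.Count >= _limit)
            {
                return false;
            }

            window.Count++;
            return true;
        }
    }
}
```

A limiter created with a limit of 30 and a period of one minute allows the first 30 calls from an address, rejects the 31st, and allows calls again once the minute has passed. Other addresses are counted separately.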
Get Insights from User Agent
A typical User Agent string looks like this:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.12 Safari/537.36 Edg/76.0.182.6
In order to get information from it, I use a library called UAParser.
var uaParser = Parser.GetDefault();

string GetClientTypeName(string userAgent)
{
    ClientInfo c = uaParser.Parse(userAgent);
    return $"{c.OS.Family}-{c.UA.Family}";
}
This code lets me group the data by OS and browser family. For example, users on Windows 7 + Chrome 60 and users on Windows 10 + Chrome 62 are both grouped as Windows-Chrome, so the final pie chart won’t show too many fragmented series.
var q = from d in userAgentCounts
        group d by GetClientTypeName(d.UserAgent)
        into g
        select new ClientTypeCount
        {
            ClientTypeName = g.Key,
            Count = g.Sum(gp => gp.RequestCount)
        };
Still Work to Do
The Link Forwarder project is at an early stage. There are plenty of improvements and new features I can think of, such as providing a REST API for third parties, adding tags to manage links, or even using Blazor when ASP.NET Core 3.0 is released. It’s an open source project, so I welcome everyone to help make it better.