URLs With PowerShell
Regex… Even the PowerShell Masters struggle with it from time to time. I helped a friend of mine with some URL chaos. The URL he had was a software download and the token was inside the URL. Yeah, it was weird but totally worth it. There were many different ways to handle this one. But the Matches was the best way to go. What was interesting about this interaction was I could later in another script. Unlike my other posts, this one’s format is going to be a little different. Just following along.
The URL (Example)
"https://.download.software.net/windows/64/Company_Software_TKN_0w6xBqqzwvw3PWkY87Tg301LTa2zRuPo09iBxamALBfs512rSgomfRROaohiwgJx9YH7bl9k72YwJ_riGzzD3wEFfXQ7jFZyi5USZfLtje2H68w/MSI/installer"
Here is the string example we are working with. Inside the software installer, we have the name of the software, “Company_Software_” and the token, “0w6xBqqzwvw3PWkY87Tg301LTa2zRuPo09iBxamALBfs512rSgomfRROaohiwgJx9YH7bl9k72YwJ_riGzzD3wEFfXQ7jFZyi5USZfLtje2H68w” The question is how do you extract the token from this URL? Regex’s Matches we summon you!
Matches is one of the regex’s powerhouse tools. It’s a simple method that allows us to search a string for a match of our Regex. This will allow us to pull the token from URLs with PowerShell. First, it’s best to link to useful documentation. Here is the official Microsoft documentation. Sometimes it’s helpful. Another very useful tool can be found here.
PowerShell’s Regex can be expressed in many different ways. The clearest and most concise way is to use the -match flag.
$String -match $Regex
This of course produces a true or false statement. Then if you call the $matches variable, you will see all of the matches. Another way of doing this is by using the type method.
[regex]::Matches($String, $Regex)
This method is the same as above, but sometimes it makes more sense when dealing with complex items. The types method is the closest to the “.net” framework.
The Regex
Next, let’s take a look at the Regex itself. We are wanting to pull everything between TKN_ and the next /. This was a fun one.
'_TKN_([^/]+)'
The first part is the token. We want our search to start at _TKN_. This clears out all of the following information automatically: https://.download.software.net/windows/64/Company_Software. A next part is a group. Notice the (). This creates a group for us to work with. Inside this group, we are searching for all the characters inside []. We are looking for Everything starting at where we are, the TKN_ to a matching /. We want all the information so we place a greedy little +. This selects our token. This regex produces two matches. One with the word _TKN_ and one without. Thus we will want to capture the second output, aka index of 1.
$String -match '_TKN_([^/]+)'
$Matches[1]
Another way to go about this is the method-type model.
$Token = [regex]::Matches($String, '_TKN_([^/]+)') | ForEach-Object { $_.Groups[1].Value }
It same concept, instead this time we are able to parse the output and grab from group one.
Replace Method
Another way to go about this is by using replace. This method is much easier to read for those without experience in regex. I always suggest making your scripts readable to yourself in 6 months. Breaking up the string using replace makes sense in this light. The first part of the string we want to pull out is everything before the _TKN_ and the TKN itself. The second part is everything after the /. Thus, we will be using two replaces for this process.
$String -replace(".*TKN_",'')
Here we are removing everything before and the TKN_ itself. The .* ensures this. Then we wrap this code up and run another replace.
$Token = ($String -replace(".*TKN_",'')) -replace('/.*','')
Now we have our token. This method is easier to follow.
Conclusion
In Conclusion, parsing URLs With PowerShell can be tough, but once you get a hang of it, life gets easier. Use the tools given to help understand what’s going on.
Additional Regex Posts
Image created with midjourney AI.