Good afternoon!
I've been asked to help our recruiters submit a list of queries to Google Search.
The text file is ~100 lines, with each line being the complete string to search for, e.g.:
site:linkedin.com AND (Company AND Technology) AND (analyst OR engineer OR developer OR administrator) AND NOT (jobs OR hiring) AND Location:detroit, mi
If you open a web browser and paste this text into the search box, Google returns the results and displays the result count, e.g.:
About 70 results (0.55 seconds)
That count is all the data I really need.
I've written the script below. If I uncomment the very last line, start $Uri, a new browser tab opens for each line in the text file, showing the query, and the page contains the “About ## results” text.
My script uses a regex to extract this value. Unfortunately every pass fails, because when the request is made from the script, the returned HTML does not contain the result count.
If you uncomment #Out-File -FilePath .\test.html -InputObject $WebResponse.Content, the script writes test.html for each query. When you open test.html in the browser of your choice, it does not contain the “About ## results” text.
I have tried both Invoke-RestMethod and Invoke-WebRequest, with the same result.
I'm sure this has something to do with how Google perceives the PowerShell Invoke-* client. I have tried forcing the User-Agent, but the results do not change. I don’t have much hair left to pull out.
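To show what the regex is meant to capture, here is a minimal sketch that runs it against the text the browser displays instead of a live response. The sample strings are typed in by hand (the second one is made up), and the second pattern, which also accepts thousands separators, is only an assumption about what larger counts might need; it is not part of the script.

```powershell
# Sample text as shown in the browser (typed by hand, not fetched from Google)
$sample = 'About 70 results (0.55 seconds)'

# Pattern from the script: captures a run of digits between "About" and "results"
$count = [regex]::Match($sample, 'About (\d*) results').Groups[1].Value
Write-Host "Captured:" $count

# Assumed variant: also accept thousands separators, then strip them
$large = 'About 1,230 results (0.61 seconds)'   # hypothetical sample
$countLarge = [regex]::Match($large, 'About ([\d,]+) results').Groups[1].Value -replace ',', ''
Write-Host "Captured:" $countLarge
```

So the pattern itself seems fine for the text I see in the browser; the problem is that this text never shows up in the HTML the script receives.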
Anyone have any thoughts or suggestions?
Thank you in advance!
function Get-GoogleCSEQueryString {
    param([string[]] $Query)
    Add-Type -AssemblyName System.Web # To get UrlEncode()
    # URL-encode each term and join the terms with '+'
    $QueryString = ($Query | ForEach-Object { [Web.HttpUtility]::UrlEncode($_) }) -join '+'
    # Return the query string
    $QueryString
}

# Wrap in @() so a one-line file is still treated as an array of lines
$Technology_Variables = @(Get-Content ".\Technologies.txt")
## Example file format
## site:linkedin.com AND (Company AND Technology) AND (analyst OR engineer OR developer OR administrator) AND NOT (jobs OR hiring) AND Location:detroit, mi
$j = $Technology_Variables.Count
for ($i = 0; $i -lt $j; $i++)
{
    Write-Host "Searching for external talent" $i $Technology_Variables[$i]
    $QueryString = Get-GoogleCSEQueryString $Technology_Variables[$i]
    Write-Host "Google Query" $QueryString
    $Uri = "https://www.google.com/search?q=$QueryString"
    # $WebResponse = Invoke-RestMethod -UseBasicParsing -Uri $Uri
    $WebResponse = Invoke-WebRequest -UseBasicParsing -Uri $Uri
    #Out-File -FilePath .\header.html -InputObject $WebResponse.Headers
    #Out-File -FilePath .\test.html -InputObject $WebResponse.Content
    # Extract the count from "About ## results"; this is empty when the text is missing
    $resultsearch = [regex]::Match($WebResponse.Content, 'About (\d*) results').Groups[1].Value
    Write-Host "About" $resultsearch
    # Pause between requests
    Start-Sleep -Seconds 1
    ##Open each query in a web browser
    # start $Uri
}