__doPostBack call in a webpage to download a file

Hello All

I am trying to download a CSV file from a webpage. The CSV file is hidden behind a JavaScript postback link, and I am having difficulty downloading its contents. I get the entire page HTML instead of the data in the CSV. Could someone provide some pointers on how to fix it?

$url = "https://www.mcxindia.com/market-data/bhavcopy"

# First request establishes the session (cookies)
Invoke-WebRequest $url -SessionVariable session -UseBasicParsing

# Second request is parsed so the form and its fields can be read
$addUserSite = Invoke-WebRequest $url -WebSession $session
$addUserForm = $addUserSite.Forms[0]

# Point the postback at the "Export to CSV" link
$addUserForm.Fields["__EVENTTARGET"] = 'ctl00$cph_InnerContainerRight$C001$lnkExpToCSV'
$addUserForm.Fields["__EVENTARGUMENT"] = ''

$filename = "C:\temp\Data.txt"

Invoke-WebRequest -Uri $url -Method Post -Body $addUserForm.Fields -WebSession $session -UseBasicParsing -OutFile $filename

I have tried to base it on the instructions in this post: https://sqlrus.com/2018/01/javascript-postback-download-via-powershell/

I have never used this kind of method for downloading files, but while trying to download large files at regular intervals for a project I came across a similar article. It might be of some help to you.

mridul7arya68, the link you provided is the one NR is already referencing.

NR, there are simply sites that are coded to disallow this kind of automation, for whatever reason the owners chose. Meaning they only allow reaching such things via real human interaction, to prevent bots and the like from hammering their site / resources, etc.

Now, I am not saying this is the case for the site you are hitting, but I’ve run into this more and more. Sure, it’s irritating / disappointing, but the site owners / devs can do what they want, regardless of how it impedes anything I am trying to do via automation to make my life or a corporate process easier.

What you are doing does, of course, get the raw page, but not what the interactive human action of clicking the link gets.
The data you are after is there in that CDATA block in the page source, so you’ll have to parse all of that out yourself.
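Just to make that concrete, a minimal sketch of what the parsing could look like is below. The regex, and the assumption that the embedded data turns out to be JSON, are mine and not confirmed against the live page, so adjust to whatever you actually see in the source.

# Minimal sketch (untested against the live site): fetch the page once and
# extract whatever sits inside a <![CDATA[ ... ]]> section with a regex.
$url  = 'https://www.mcxindia.com/market-data/bhavcopy'
$page = Invoke-WebRequest -Uri $url -UseBasicParsing

$m = [regex]::Match($page.Content, '<!\[CDATA\[(.*?)\]\]>', 'Singleline')
if ($m.Success) {
    $raw = $m.Groups[1].Value
    # If the embedded data is JSON this converts it; if it is something
    # else, swap this line for the appropriate parser.
    $data = $raw | ConvertFrom-Json
    $data | Export-Csv -Path 'C:\temp\Data.csv' -NoTypeInformation
}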

Yet all the headers in the download are dynamically pulled from a table embedded in the page, so you’d have to deal with those manually as well, at least from what I can see when I step through the page code end to end. Also, those field values you are trying to post have no values in them at all, so there is nothing to hit. The file gets generated from those hidden VIEWSTATE* items.
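For what it’s worth, here is the shape of that approach spelled out: copy the hidden ASP.NET fields off the page into the POST body (the same idea as the sqlrus article). I can’t promise the server accepts a replayed postback like this, for the reasons above; treat it as a sketch.

# Replay the postback with the hidden fields copied from the page.
$url  = 'https://www.mcxindia.com/market-data/bhavcopy'
$page = Invoke-WebRequest -Uri $url -SessionVariable session -UseBasicParsing

# Collect every hidden input (__VIEWSTATE* and __EVENTVALIDATION live here)
$body = @{}
$page.InputFields |
    Where-Object { $_.type -eq 'hidden' -and $_.name } |
    ForEach-Object { $body[$_.name] = $_.value }

# Point the postback at the CSV export link
$body['__EVENTTARGET']   = 'ctl00$cph_InnerContainerRight$C001$lnkExpToCSV'
$body['__EVENTARGUMENT'] = ''

Invoke-WebRequest -Uri $url -Method Post -Body $body -WebSession $session -UseBasicParsing -OutFile 'C:\temp\Data.csv'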

So, it may be best for you to re-code this to just use the IE DOM, or a GUI tool like WASP or Selenium, and click the link to download the file.
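If you go the IE DOM route, the bare bones look roughly like this. The element id is my guess at the ASP.NET client id, so verify it against the rendered page, and you’d still have to deal with IE’s download prompt afterwards.

# Rough IE DOM sketch. The id passed to getElementById is an assumption --
# inspect the rendered page and adjust the lookup.
$ie = New-Object -ComObject 'InternetExplorer.Application'
$ie.Visible = $true
$ie.Navigate('https://www.mcxindia.com/market-data/bhavcopy')

# Wait for the page (and its scripts) to finish loading
while ($ie.Busy -or $ie.ReadyState -ne 4) { Start-Sleep -Milliseconds 500 }

# Click the "Export to CSV" link; IE then raises its normal download prompt
$link = $ie.Document.getElementById('ctl00_cph_InnerContainerRight_C001_lnkExpToCSV')
if ($link) { $link.click() }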

Thanks Mridul

Thank you postanote for taking the time to post the reasoning and the detailed explanation. I will now look to use GUI tools to download the file. The only reason I was trying to avoid GUI automation is my thinking that it will not be able to run as a background process and would require the account to be logged in and the GUI available.

Yeppers, I get that, but there is a good deal happening behind the scenes which you cannot control with what you are doing. That file is not static; it’s being built dynamically by a function called from that interactive click.

All the stuff you are after is embedded in parent and child divs, which are just a pain to deal with.

@NR I have no idea about GUI automation. Kindly post how you work through the problem; it would be a great way for me to learn from it.

Thanks postanote. I agree pursuing it via a PowerShell script is going to be tedious.

@Mridul, postanote has mentioned a few automation tools you can use for the GUI automation. He has provided links to a WordPress blog to get a quick intro and start developing a few scripts. It is a very good intro to GUI automation using PowerShell.

The idea here is to use these modules, in this case, to mimic the human action of clicking on a link and downloading a file.
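As a starting point, a minimal sketch with the Selenium module from the PowerShell Gallery might look like the following. The cmdlet names are from the v3-era module, and the element id is my guess at the ASP.NET client id, so verify both on your machine.

# Rough Selenium-module sketch (Install-Module Selenium). Check
# Get-Command -Module Selenium if your installed version differs.
Import-Module Selenium

$driver = Start-SeChrome
Enter-SeUrl -Driver $driver -Url 'https://www.mcxindia.com/market-data/bhavcopy'

Start-Sleep -Seconds 5   # crude wait for the page scripts to finish

# The id below is an assumption -- confirm it in the browser dev tools
$link = Find-SeElement -Driver $driver -Id 'ctl00_cph_InnerContainerRight_C001_lnkExpToCSV'
Invoke-SeClick -Element $link   # Chrome drops the CSV into its download folder

# Give the download a moment, then close the browser
Start-Sleep -Seconds 10
$driver.Quit()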