So it all started with a pretty basic project of mine.
I needed data for Indian stock indices such as SENSEX and NIFTY. I came across multiple resources, including two free APIs from Upstox and Angelone. But the issue was that they required regenerating the token every day, plus the documentation was hell. I know I could have made it through, but I chose not to go that route.
I also pondered over some projects such as this, and some international APIs. But unfortunately those weren't working either.
So I decided to do the worst of all: DIY. But then I realised that Indian stock exchanges aren't quite individual-friendly. Basically, you cannot get the data directly without a hefty fee. So I decided to scrape! Since I knew my usage wouldn't be extreme, I figured it would be okay without proxies and all.
So I tried https://pptr.dev/ in Node.js, but due to some issues I decided to go the Python route. The most popular tool/package I came across was https://www.selenium.dev/ . Within a few hours I was able to set up what I needed locally, and the server was working fine with uvicorn and fastapi.
Now Docker comes into the picture. To deploy something like a browser with its drivers and all, you need maybe a virtual machine, or containerization, which is the way to go for most of us. I started with something free and a bit automatic: railway.app .
So the starting point is creating a Dockerfile. It is essentially a checklist for the server: these are the only things I need in my container, so just install/copy these, and at the end comes the command to run them.
With the help of ChatGPT, I was able to write a Dockerfile quickly, but since I hadn't worked with Docker previously, I had no idea what was right or wrong. I started deploying, and then I started facing a bombardment of errors.
After 10-20 iterations I managed to deploy it error-free, but then my browser (Chrome at that time) started crashing. Basically the project was useless. Then I tried again and again. Once I used chrome-for-testing by downloading it in the container, making it executable, and then specifying the binary path in the code. It also didn't work. I also tried to use the chrome-drivers manually. Failed.
In between I faced a hell of a lot of issues but couldn't make it work. Then somewhere I read that railway.app might not be scraping-friendly. So I started the hunt for alternatives. I found that education.github.com offers a $13/month credit for Heroku. I opted for it. Setting it up was hellishly bad because there wasn't any proper material (a few videos on YouTube weren't that helpful, but they made me understand a few basics of Heroku). Then I was finally able to deploy it and surprise... it didn't work. Same Chrome crashing issue again.
Finally, I changed to Firefox and man, it worked on the first go.
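For anyone curious what the Firefox switch looks like in code, here is a minimal sketch, assuming Selenium 4 and a Firefox install that Selenium can locate inside the container. It is not necessarily the code in the repository; the function names (firefox_arguments, build_firefox_driver, fetch_page_title) are made up for illustration.

```python
def firefox_arguments(headless=True):
    """Return the command-line arguments to start Firefox with.

    Hypothetical helper: a server or container has no display,
    so the browser is started headless by default.
    """
    args = []
    if headless:
        args.append("-headless")
    return args


def build_firefox_driver(headless=True):
    """Create a Selenium-controlled Firefox instance."""
    # Imported here so this module still loads on machines without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in firefox_arguments(headless):
        options.add_argument(arg)
    return webdriver.Firefox(options=options)


def fetch_page_title(url):
    """Open a page, read its title, and always close the browser afterwards."""
    driver = build_firefox_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        # quit() shuts the browser down so it does not keep eating memory
        # on a small free-tier machine.
        driver.quit()
```

Calling fetch_page_title with the URL of the page you want to scrape is enough to check that the browser starts at all inside the container, which was exactly the step that kept failing for me with Chrome.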
Here's a screenshot of deployments and commit messages.
My Takeaways.
For limited hardware or a free tier: don't use Chrome. Use something lightweight or try to figure out a solution with just a webview.
For anything serious, don't use WSL (Windows Subsystem for Linux). It creates a hell of a lot of artificial problems.
If you are using Python's virtual environment, none of its folders like lib or bin should be in the final commit. They just irk Heroku. If the project is not so big and complex, please avoid virtual environments.
Learn a bit of Docker (maybe some articles/videos).
ChatGPT can sometimes create problems, especially when it is not aware of the latest version.
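On the virtual environment point, a quick check before committing can save a failed deploy. This is a rough sketch using only the standard library; find_virtualenvs is a hypothetical helper, and it assumes the environment was created with Python's venv module, which writes a pyvenv.cfg file at the environment root.

```python
from pathlib import Path


def find_virtualenvs(project_root):
    """Return directories under project_root that look like virtual environments.

    A directory counts as a virtual environment if it contains a
    pyvenv.cfg file (assumption: the environment was made with venv).
    """
    root = Path(project_root)
    # rglob walks the whole tree, so nested environments are found too.
    return sorted(cfg.parent for cfg in root.rglob("pyvenv.cfg"))
```

Running find_virtualenvs(".") from the project folder lists every environment directory; anything it returns belongs in .gitignore rather than in the commit.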
Here is my tiny little project, if you wanna roast/commit anything: https://github.com/techlism/indian_stock_market_data
Peace ✌️.