Scraping does not work and how to configure Cache-Control

I deployed my web service, which is designed to scrape item data from Amazon.

https://ku-search-tarurata.koyeb.app/?k=aaa

However, it does not work. It fails to retrieve any information from Amazon. Initially, after deployment, it shows item data once (and it’s accurate only about 50% of the time right after deployment), but after a reload, it no longer displays any information from Amazon.

I assumed the issue might be related to caching, so I tried to set Cache-Control to “no-store”.

However, I’m unsure where to configure Cache-Control. I’m using a buildpack, which automatically selected PHP for me. PHP itself seems to work fine, but the scraping functionality does not.
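For reference, one place to set this in a PHP app is the script itself, before any output is sent. This is only a minimal sketch, assuming the page is generated by a PHP script; the helper name noCacheHeaders is hypothetical and not part of my app.

```php
<?php
// Hypothetical helper: the header lines that ask browsers and proxies
// not to store or reuse the response.
function noCacheHeaders(): array
{
    return [
        'Cache-Control: no-store',
        'Pragma: no-cache', // for old HTTP/1.0 caches
    ];
}

// header() must be called before the script prints anything.
foreach (noCacheHeaders() as $line) {
    header($line);
}
```

Whether a platform-level cache also sits in front of the app is a separate question that this sketch does not address.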

Does anyone know the possible reason for this issue or how to configure the cache?

Thanks.

Hi @Wataru_Murata,

Have you been able to confirm your assumption? I don’t see how the cache would be at fault for the problem you describe. It seems that it is not working properly at any time.

I tried using your app. When I submit a request, I do not get the data back. When I refresh the browser with the same request, I randomly get data returned.

@David
Thank you for your reply, and sorry for my late response.

“When I refresh the browser with the same request, I randomly get data returned.”
Yes, I get the data only about once in every five tries.

I solved the problem. It was not because of the cache.
The reason was that the script did not wait for the scraping results to come in.

I added timeouts so that the script waits for the scraping result, and now it works:
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
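
For anyone landing here later, this is roughly how those two options could fit into a complete request with error checking. It is a sketch rather than my exact script: the helper names scrapeOptions and fetchPage are hypothetical, and CURLOPT_RETURNTRANSFER and CURLOPT_FOLLOWLOCATION are my assumptions about what a scraping request typically needs.

```php
<?php
// Hypothetical helper: collects the cURL options for one scraping request.
function scrapeOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // make curl_exec() return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects (assumption)
        CURLOPT_CONNECTTIMEOUT => 10,   // seconds allowed for establishing the connection
        CURLOPT_TIMEOUT        => 30,   // seconds allowed for the whole transfer
    ];
}

// Hypothetical helper: returns the response body, or null if the request failed.
function fetchPage(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, scrapeOptions($url));

    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason (timeout, DNS failure, ...) instead of silently showing nothing.
        error_log('cURL error: ' . curl_error($ch));
        return null;
    }
    return $body;
}
```

Calling fetchPage() with the target URL and checking for null before parsing makes a failed or timed-out request visible in the logs, rather than producing an empty page.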

Thank you so much.
