Using web scraping methods for price data analysis in economics
Abstract and keywords
Abstract (English):
The article examines the theoretical and practical aspects of using web scraping for automated data collection in economics. Web scraping enables the extraction of large volumes of information from web resources, making it particularly valuable for price analysis, market trend monitoring, and evaluating competitive activity. The focus is on the application of modern tools such as Python and its libraries (BeautifulSoup, Scrapy, Selenium), as well as the use of analytical platforms, databases, and cloud solutions for data storage and processing. The article describes the key stages of the web scraping process, including source identification, data extraction, parsing, storage, and analysis. Special attention is paid to legal and ethical aspects, such as compliance with copyright laws and data confidentiality, along with recommendations for the lawful use of technology. Practical examples illustrate how web scraping is applied to monitor prices in the Russian market, analyze consumer reviews, and predict price changes. The article also explores the prospects for web scraping development, including integration with artificial intelligence and machine learning, positioning it as a vital tool in the digital transformation of the economy.

Keywords:
web scraping, data analysis, prices, economics, monitoring, automation, information technologies
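
The stages named in the abstract (data extraction, parsing, storage, and analysis) can be illustrated with a minimal sketch. It is not the article's own implementation. To stay runnable without installation, it uses Python's standard library (html.parser and sqlite3) in place of BeautifulSoup, Scrapy, or Selenium. The HTML fragment, the class names "name" and "price", the table name, and all product names and prices are hypothetical, and a real collector would first download the page from an identified source.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded product listing.
# Products and prices are invented for illustration only.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Milk 1 l</span><span class="price">89.90</span></li>
  <li class="product"><span class="name">Bread</span><span class="price">54.00</span></li>
  <li class="product"><span class="name">Butter 180 g</span><span class="price">219.50</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collects (name, price) pairs from <span class="name"> / <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # which span we are currently inside, if any
        self._name = None   # most recently seen product name

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price":
            self.rows.append((self._name, float(data)))

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_prices(html):
    """Parsing stage: turn raw HTML into a list of (name, price) tuples."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def store_prices(conn, rows):
    """Storage stage: persist the parsed rows in a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS prices (name TEXT, price REAL)")
    conn.executemany("INSERT INTO prices VALUES (?, ?)", rows)
    conn.commit()


def average_price(conn):
    """Analysis stage: a single aggregate over the stored prices."""
    (avg,) = conn.execute("SELECT AVG(price) FROM prices").fetchone()
    return avg


rows = parse_prices(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
store_prices(conn, rows)
print(rows)
print(round(average_price(conn), 2))
```

Keeping each stage in its own function means the parser can later be swapped for one of the libraries the abstract names without touching the storage or analysis code.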
