This paper presents the design and implementation of a distributed online crawler for stock quote data, which is large in volume and updated frequently. It combines Java web crawling and parsing, Oracle storage, and a distributed design, which effectively increases both the crawling speed and the amount of data collected. Specifically, these three techniques are applied to the distributed online crawling of daily stock quote data.
Key technologies
This paper mainly relies on the following key technologies:
1 Java web crawling technology
Java's URL class is used to fetch web page data, and the Pattern class
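As an illustration of the URL and Pattern usage named above, the following is a minimal sketch and not the paper's actual code. The class name QuotePageScraper, the helper names fetch and parse, the table-row layout matched by the regular expression, and the UTF-8 encoding are all assumptions made for this example. A real quote page would need its own pattern and possibly a different character set.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotePageScraper {

    // Hypothetical row layout: <tr><td>2012-03-01</td><td>10.25</td></tr>
    // (trading date, closing price). This layout is assumed, not from the paper.
    private static final Pattern ROW = Pattern.compile(
        "<tr>\\s*<td>(\\d{4}-\\d{2}-\\d{2})</td>\\s*<td>([0-9]+(?:\\.[0-9]+)?)</td>\\s*</tr>");

    /** Downloads the page body as text through java.net.URL (UTF-8 assumed). */
    static String fetch(String address) throws IOException {
        URL url = URI.create(address).toURL();
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    /** Extracts {date, close} pairs from the page text using the ROW pattern. */
    static List<String[]> parse(String html) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = ROW.matcher(html);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2) });
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // With a URL argument the live page is fetched;
        // with no arguments an inline sample row is parsed instead.
        String html = args.length > 0
            ? fetch(args[0])
            : "<tr><td>2012-03-01</td><td>10.25</td></tr>";
        for (String[] row : parse(html)) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```

Keeping the download and the pattern matching in separate methods means the parsing step can be checked against saved page text without network access, and each crawler node in a distributed setup could reuse the same parse method on the pages assigned to it.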