Big data distributed file export method

Abstract

The invention discloses a distributed file export method for big data, comprising the following steps: a query server splits and parses a source query statement into several sub-query statements according to predetermined conditions and distributes them to multiple data retrieval servers; each data retrieval server that receives a sub-query statement executes it in parallel with the others; the data retrieval servers write their sub-retrieval results to sub-files in parallel; and a file merging server fetches the sub-files according to predetermined conditions, merges them, and exports the result. Because large volumes of data are exported in a distributed manner, the method preserves data completeness and correctness and supports horizontal scaling, so that large batches of data can be exported to files quickly and safely.
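The four steps in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the abstract does not say what the "predetermined conditions" are, so the sketch assumes the source query is a range over a numeric `id` column and splits it into contiguous sub-ranges. The in-memory `ROWS` table, the thread pool standing in for separate data retrieval servers, the CSV format, and all function and file names are hypothetical.

```python
import csv
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical in-memory table of (id, value) rows standing in for the data store.
ROWS = [(i, f"value-{i}") for i in range(1, 101)]


def split_query(lo, hi, parts):
    """Query server: split the id range [lo, hi] into contiguous sub-ranges.

    Assumes lo <= hi and parts >= 1.
    """
    step = -(-(hi - lo + 1) // parts)  # ceiling division
    return [(start, min(start + step - 1, hi)) for start in range(lo, hi + 1, step)]


def run_subquery(index, bounds, out_dir):
    """Data retrieval server: run one sub-query and write its own sub-file."""
    lo, hi = bounds
    path = Path(out_dir) / f"part-{index:04d}.csv"
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(r for r in ROWS if lo <= r[0] <= hi)
    return path


def merge_subfiles(paths, target):
    """File merging server: concatenate sub-files in sub-query order."""
    with open(target, "wb") as out:
        for path in sorted(paths):  # zero-padded names sort in sub-query order
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return target


def export(lo, hi, parts, out_dir):
    """Split, query in parallel, write sub-files, then merge into one export file."""
    subqueries = split_query(lo, hi, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        paths = list(pool.map(run_subquery, range(len(subqueries)), subqueries,
                              [out_dir] * len(subqueries)))
    return merge_subfiles(paths, Path(out_dir) / "export.csv")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        result = export(1, 100, 4, work_dir)
        print(result.name, result.stat().st_size, "bytes")
```

Because each worker writes only its own sub-file, no locking is needed during the parallel phase, and the merge step restores a deterministic row order by concatenating the sub-files in sub-query order. Adding workers only changes the `parts` argument, which is the horizontal-scaling property the abstract describes.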

Patent Citations (3)

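    Publication number | Publication date | Assignee | Title
    CN-101996067-A | March 30, 2011 | Alibaba Group Holding Limited | Method and device for data export
    CN-102521406-A | June 27, 2012 | Institute of Computing Technology, Chinese Academy of Sciences | Distributed query method and system for complex query tasks on massive structured data
    CN-102737016-A | October 17, 2012 | China UnionPay Co., Ltd. | A system and a method for generating information files based on parallel processing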
