L. Boyanov. Architectures and Tools for Internet of Things Big Data Processing

Key Words: Internet of Things (IoT); big data; big data processing tools; Hadoop.

Abstract. The Internet of Things (IoT) is a modern paradigm of interconnected things/objects in the global digital network, the Internet. This model differs significantly from the traditional approach of connecting computers, laptops and servers to the Internet. The variety of connected devices is huge, ranging from sensors, RFID tags and mobile phones to data centres and supercomputers. They all create, transmit and process digital data in a quantity and variety unimaginable until recently. All this leads to new requirements for the means and environment for data processing. The paper presents a classification of the architectural models used for IoT data. They are placed in four groups: those of standardization organizations, those of commercial organizations, those for the Industrial Internet of Things, and those proposed by researchers. The Lambda Architecture, a well-known architecture that separates the data path according to the speed of data processing, is also presented. The paper also looks at the most popular products, programs and software libraries for big data processing. Particular attention is given to the Hadoop software library, which allows the processing of large data sets. Other products and tools for ETL (Extract, Transform and Load), distributed event streaming, data storage, data processing and analytics are also presented. The paper describes a simplified architecture, which has been implemented and demonstrated to work on a 40-node cluster. Its software comes from the open-source Hadoop environment. The next tasks and future work on this architecture are presented.