Introduction and Background
In most areas of science, from DNA and molecular design at the micro level to earth and environmental sciences at the macro level, it is important to extract meaningful information from big data, that is, huge volumes of data that appear useless on the surface. The techniques for this extraction are called data mining. Data mining is so costly that it is difficult to carry out in traditional ways. To make data mining much more efficient and thereby drive innovative science and technology, we have to enhance parallelization and distribution in both algorithms and execution styles.
The previous division, the Division of Next Generation Data Mining Technology, focused especially on medical and biological systems and developed next-generation data mining software together with researchers in artificial intelligence and statistics. In the course of that work, we found that parallelization and distribution must be enhanced to achieve new innovative technologies. In the Division of Super Distributed Intelligent Systems, we will build on the results of the previous division and develop new parallelization and distribution techniques based on the performance issues that those results exposed. For example, we will improve execution efficiency at the low level, which involves programming languages, parallel/distributed algorithms, and network protocols. In addition, we will design new parallel/distributed models based on knowledge drawn from cell signal processing and social insects. Eventually, we will apply these techniques and models to areas such as image processing, power systems, machine learning, robot systems, and software engineering tools, as well as data mining.
As shown in Fig. 1, we address the issues of parallelization and distribution at three hierarchical levels, “applications”, “models”, and “infrastructures”, as follows:
- 1. Parallel/Distributed Applications
At the application level, specialists in three applications, “data mining & machine learning”, “image processing”, and “distributed robot control”, improve system performance using application-level techniques such as cloud computing.
- 2. Parallel/Distributed Infrastructures
At the infrastructure level, specialists in “programming languages”, “language processors”, and “network protocols” directly improve parallelization and distribution techniques on various infrastructures.
- 3. Parallel/Distributed Models
At the model level, specialists in “evolutionary computation”, “cell communications”, and “biological systems” develop models that make infrastructures work more efficiently. They also develop new models through which improvements in infrastructures lead directly to faster applications.
Products developed and knowledge gained at each level can be shared quickly across all levels. As a result, we can provide effective domain-specific solutions, as shown in Fig. 2. For example, in the previous division we developed a system that detects driver distraction based on eye movement. The system reveals a driver's cognitive distraction by having AI integrate environmental information with eye movement data. Because the AI has to process a huge amount of varied sensor data, the system requires parallel learning and inference algorithms together with their parallel or distributed execution. Truly parallel execution is therefore achieved by improving the system at multiple levels, through cooperation among specialists in several areas.
We believe that the challenges taken on by this division will lead to breakthroughs in many traditional techniques and open new horizons for parallel or distributed systems.
Fig. 1 Members of the division and their relations
Fig. 2 Expected effectiveness
Future Development Goals
Development of highly parallelized/distributed AI systems that can handle huge data currently processed manually, and of multiple-robot systems for practical missions.
This research division aims to provide effective domain-specific parallelization and distribution solutions for each system at various levels. The solutions include the design of parallel models inspired by cell signal processing and social insects. We believe that the challenges taken on by this division will open new horizons for parallel or distributed systems.