NEC Develops Distributed Heterogeneous Mixture Learning Technology on Spark that Rapidly Discovers Patterns Hidden in Super-Large-Scale Data
Tokyo, May 26, 2016 - NEC Corporation (NEC; TSE: 6701) today announced that it has developed a novel distributed Heterogeneous Mixture Learning technology on Spark. The technology builds on NEC's Heterogeneous Mixture Learning Technologies (*1, *2), which use artificial intelligence (AI) to discover massive patterns hidden in big data, and adds enhanced functions that create prediction models from super-large-scale data using distributed computing systems.
Heterogeneous Mixture Learning is a technology that automatically performs two important processes for big data analysis: (1) discovering how to partition the data based on conditions such as the day of the week and the weather, and (2) discovering the combinations of factors that are important in making forecasts for each partition.
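The idea of one prediction model per data partition can be illustrated with a minimal sketch. It is not NEC's algorithm: the partition rule (weekday or weekend) is fixed by hand here, whereas Heterogeneous Mixture Learning is described as discovering it automatically. The records, the field names and the helper functions `fit_line` and `fit_by_partition` are hypothetical.

```python
# Illustrative sketch only. The partition rule is fixed by hand; the
# technology in this release is described as discovering it automatically.

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_by_partition(records):
    """Group records by their condition and fit one line per group."""
    groups = {}
    for day_type, temperature, sales in records:
        groups.setdefault(day_type, []).append((temperature, sales))
    return {day_type: fit_line(pts) for day_type, pts in groups.items()}

# Hypothetical records: (day type, temperature, sales).
records = [
    ("weekday", 10.0, 25.0), ("weekday", 20.0, 45.0), ("weekday", 30.0, 65.0),
    ("weekend", 10.0, 60.0), ("weekend", 20.0, 65.0), ("weekend", 30.0, 70.0),
]

models = fit_by_partition(records)
for day_type, (slope, intercept) in sorted(models.items()):
    print(day_type, slope, intercept)
```

In this toy data, temperature affects sales differently on weekdays and weekends, so a separate model per partition fits better than a single model over all records.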
With conventional analysis methods, super-large-scale data of tens of millions of samples that exceeds the onboard memory of one computer must be broken down into segments in advance. In addition, there is a limit to how many high-performance, many-core CPUs can be installed in one computer, which makes it a challenge to improve performance for massive data analysis.
NEC's distributed Heterogeneous Mixture Learning technology on Spark distributes heterogeneous mixture learning across multiple computers while keeping the overall analysis results consistent. This makes it possible to create prediction models without any limitation on data scale by increasing the number of computers.
"The utilization of this technology is expected to enable analysis of super-large-scale data based on more than tens of millions of samples, for example, predictions of the balance at a major financial institution or contract cancellations of a large-scale telecommunications carrier. This technology has demonstrated a learning speed that is approximately 110 times faster than our conventional technologies. In addition, forecast accuracy also increased by about 17%. NEC will continue to develop this technology, aiming to launch commercial services in 2017," said Akio Yamada, General Manager, Data Science Research Laboratories, NEC Corporation.
The following are features of the new technology:
1. Distributed algorithm suitable for distributed computing systems
This algorithm has two significant features: (1) the computers share only information about prediction models, not the massive data samples, and (2) an original algorithm merges the individual prediction models. As a result, it can create highly accurate prediction models whose consistency is maintained even though the individual computers execute the learning algorithm in parallel.
2. Distributed analysis software suitable for "Apache Spark," an in-memory distributed computing system
This software implements a distributed heterogeneous mixture learning algorithm powered by the distributed computing framework of Apache Spark. It loads all data to be analyzed once into distributed memory, which is composed of the memories of the individual computers. For the rest of the algorithm's execution, it requires neither network transfer of the massive data between computers nor additional loading of the data through slow disk access. This design maximizes the computing performance of the in-memory distributed computing architecture of Apache Spark and results in high-speed execution of the algorithm.
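The principle behind these features, in which each computer keeps its own data and shares only compact model information that is then merged, can be illustrated with a minimal sketch. NEC's merging algorithm is not described in this release, so the sketch substitutes a well-known stand-in: least-squares regression, where each partition is summarised by five sums that can simply be added together. The partitions and the function names `local_stats`, `merge_stats` and `solve` are hypothetical, and plain Python lists stand in for the per-computer memory of a Spark cluster.

```python
from functools import reduce

# Illustrative stand-in, not NEC's merge algorithm: each "computer"
# summarises its local samples and only the summaries are shared.

def local_stats(partition):
    """Summarise one computer's (x, y) samples as five numbers."""
    n = len(partition)
    sx = sum(x for x, _ in partition)
    sy = sum(y for _, y in partition)
    sxx = sum(x * x for x, _ in partition)
    sxy = sum(x * y for x, y in partition)
    return (n, sx, sy, sxx, sxy)

def merge_stats(a, b):
    """Merge two summaries; no raw samples are needed."""
    return tuple(u + v for u, v in zip(a, b))

def solve(stats):
    """Recover y = slope * x + intercept from the merged summary."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Hypothetical data, already split across three computers.
partitions = [
    [(1.0, 5.0), (2.0, 7.0)],
    [(3.0, 9.0), (4.0, 11.0)],
    [(5.0, 13.0)],
]

summaries = [local_stats(p) for p in partitions]        # runs where the data lives
slope, intercept = solve(reduce(merge_stats, summaries))  # only summaries are moved
print(slope, intercept)
```

Because the summaries add up exactly, the merged model is identical to one fitted on all samples in a single place, which is one simple way a distributed result can stay consistent. Each partition is also read only once, mirroring the load-once, in-memory design described above.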
NEC will present this technology at the following two events:
A scheduled presentation on June 8 at Spark Summit 2016, which will be held in San Francisco, California, U.S.A., from June 6 (Monday) to 8 (Wednesday).
https://spark-summit.org/2016/events/distributed-heterogeneous-mixture-learning-on-spark/
A scheduled presentation on June 30 at Hadoop Summit San Jose 2016, which will be held in San Jose, California, U.S.A., from June 28 (Tuesday) to 30 (Thursday).
http://hadoopsummit.org/san-jose/
***
Notes:
(*1) June 22, 2012
NEC Technology Automatically Detects Patterns from Big Data
http://www.nec.com/en/press/201206/global_20120622_02.html
(*2) June 19, 2014
NEC strengthens Heterogeneous Mixture Learning Technologies that automatically discover massive patterns hidden in big data
http://www.nec.com/en/press/201406/global_20140619_01.html
About NEC Corporation
NEC Corporation is a leader in the integration of IT and network technologies that benefit businesses and people around the world. By providing a combination of products and solutions that cross-utilize the company's experience and global resources, NEC's advanced technologies meet the complex and ever-changing needs of its customers. NEC brings more than 100 years of expertise in technological innovation to empower people, businesses and society. For more information, visit NEC at http://www.nec.com.
                  
The NEC Group globally provides "Solutions for Society" that promote the safety, security, efficiency and equality of society. Under the company's corporate message of "Orchestrating a brighter world," NEC aims to help solve a wide range of challenging issues and to create new social value for the changing world of tomorrow. For more information, please visit
http://www.nec.com/en/global/about/solutionsforsociety/message.html.

 
NEC is a registered trademark of NEC Corporation. All Rights Reserved. Other product or service marks mentioned herein are the trademarks of their respective owners. © NEC Corporation.

